  1. Nov 2025
    1. In the face of cuts and government inaction beyond the dictates of the market, public universities have proven self-sufficient while the number of students grew, faculty accepted precarious conditions, ordinary operations continued despite deteriorating infrastructure and, above all, societies maintained their trust.

      Self-sufficient enough to operate as they have operated, far from the ideal of the university that the author expressed earlier, which I think has significantly reduced society's trust in them.

    2. Universities and communities have grown in a symbiotic relationship that stems from the academic freedom guaranteed to universities through their autonomy

      One can therefore expect that communities that have had serious problems with growth will have universities with serious problems in their operation.

    1. Reviewer #1 (Public review):

      Summary:

      Noell et al have presented a careful study of the dissociation kinetics of Kinesin (1,2,3) classes of motors moving in vitro on a microtubule. These motors move against the opposing force from a ~1 micron DNA strand (DNA tensiometer) that is tethered to the microtubule and also bound to the motor via specific linkages (Figure 1A). The authors compare the time for which motors remain attached to the microtubule when they are tethered to the DNA, versus when they are not. If the former is longer, the interpretation is that the force on the motor from the stretched DNA (presumed to be working solely along the length of the microtubule) causes the motor's detachment rate from the microtubule to be reduced. Thus, the specific motor exhibits "catch-bond" like behaviour.

      Strengths:

      The motivation is good - to understand how kinesin competes against dynein through the possible activation of a catch bond. Experiments are well done, and there is an effort to model the results theoretically.

      Weaknesses:

      The motivation of these studies is to understand how kinesin (1/2/3) motors would behave when they are pitted in a tug of war against dynein motors as they transport cargo in a bidirectional manner on microtubules. Earlier work on dynein and kinesin motors using optical tweezers has suggested that dynein shows a catch bond phenomenon, whereas such signatures were not seen for kinesin. Based on their data with the DNA tensiometer, the authors would like to claim that (i) Kinesin1 and Kinesin2 also show catch-bonding and (ii) the earlier results using optical traps suffer from vertical forces, which complicates the catch-bond interpretation.

      While the motivation of this work is reasonable, and the experiments are careful, I find significant issues that the authors have not addressed:

      (1) Figure 1B shows the PREDICTED force-extension curve for DNA based on a worm-like chain model. Where is the experimental evidence for this curve? This issue is crucial because the F-E curve will decide how and when a catch-bond is induced (if at all it is) as the motor moves against the tensiometer. Unless this is actually measured by some other means, I find it hard to accept all the results based on Figure 1B.

      (2) The authors can correct me on this, but I believe that all the catch-bond studies using optical traps have exerted a load force that exceeds the actual force generated by the motor. For example, see Figure 2 in reference 42 (Kunwar et al). It is in this regime (load force > force from motor) that the dissociation rate is reduced (catch-bond is activated). Such a regime is never reached in the DNA tensiometer study because of the very construction of the experiment. I am very surprised that this point is overlooked in this manuscript. I am therefore not even sure that the present experiments even induce a catch-bond (in the sense reported for earlier papers).

      (3) I appreciate the concerns about the Vertical force from the optical trap. But that leads to the following questions that have not at all been addressed in this paper:

      (i) Why is the Vertical force only a problem for Kinesins, and not a problem for the dynein studies?

      (ii) The authors state that "With this geometry, a kinesin motor pulls against the elastic force of a stretched DNA solely in a direction parallel to the microtubule". Is this really true? What matters is not just how the kinesin pulls the DNA, but also how the DNA pulls on the kinesin. In Figure 1A, what is the guarantee that the DNA is oriented only in the plane of the paper? In fact, the DNA could even be bending transiently in a manner that it pulls the kinesin motor UPWARDS (Vertical force). How are the authors sure that the reaction force between DNA and kinesin is oriented SOLELY along the microtubule?

      (4) For this study to be really impactful and for some of the above concerns to be addressed, the data should also have included DNA tensiometer experiments with Dynein. I wonder why this was not done?

      While I do like several aspects of the paper, I do not believe that the conclusions are supported by the data presented in this paper for the reasons stated above.

    2. Author response:

      Reviewer 1 (Public review):

      (1) Figure 1B shows the PREDICTED force-extension curve for DNA based on a worm-like chain model. Where is the experimental evidence for this curve? This issue is crucial because the F-E curve will decide how and when a catch-bond is induced (if at all it is) as the motor moves against the tensiometer. Unless this is actually measured by some other means, I find it hard to accept all the results based on Figure 1B.

      The Worm-Like-Chain model for the elasticity of DNA was established by early work from the Bustamante lab (Smith et al., 1992) and Marko and Siggia (Marko and Siggia, 1995), and was further validated and refined by the Block lab (Bouchiat et al., 1999; Wang et al., 1997). The 50 nm persistence length is the consensus value, and was shown to be independent of force and extension in Figure 3 of Bouchiat et al. (Bouchiat et al., 1999). However, we would like to stress that for our conclusions, the precise details of the Force-Extension relationship of our dsDNA are immaterial. The key point is that the motor stretches the DNA and stalls when it reaches its stall force. Our claim of the catch-bond character of kinesin is based on the longer duration at stall compared to the run duration in the absence of load. Provided that the motor is indeed stalling because it has stretched out the DNA (which is strongly supported by the repeated stalling around the predicted extension corresponding to ~6 pN of force), then the stall duration depends on neither the precise value for the extension nor the precise value of the force at stall.
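
For readers who want to see the shape of the predicted curve, the following is a minimal sketch of the Marko-Siggia interpolation formula for a worm-like chain. The 50 nm persistence length comes from the response above. The 1020 nm contour length and the value of kT are assumptions chosen only so that the example lands near the ~6 pN at ~960 nm quoted elsewhere in this exchange; they are not taken from the manuscript. The function names are hypothetical.

```python
import math

KT_PN_NM = 4.11  # assumed thermal energy near 25 C, in pN*nm


def wlc_force(extension_nm, contour_nm=1020.0, persistence_nm=50.0):
    """Marko-Siggia interpolation formula for a worm-like chain; returns pN.

    contour_nm is an assumed value for illustration, not a number from the text.
    """
    if not 0.0 <= extension_nm < contour_nm:
        raise ValueError("extension must lie in [0, contour length)")
    z = extension_nm / contour_nm  # fractional extension
    return (KT_PN_NM / persistence_nm) * (0.25 / (1.0 - z) ** 2 - 0.25 + z)


def extension_at_force(force_pn, contour_nm=1020.0, persistence_nm=50.0):
    """Invert wlc_force by bisection; force rises monotonically with extension."""
    lo, hi = 0.0, contour_nm * (1.0 - 1e-9)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if wlc_force(mid, contour_nm, persistence_nm) < force_pn:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for x in (500, 800, 900, 960):
        print(f"{x} nm -> {wlc_force(x):.2f} pN")
    print(f"6 pN reached near {extension_at_force(6.0):.0f} nm")
```

The steep rise near full extension is why the stall position is fairly insensitive to the exact stall force: a large change in force maps onto a small change in extension.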

      (2) The authors can correct me on this, but I believe that all the catch-bond studies using optical traps have exerted a load force that exceeds the actual force generated by the motor. For example, see Figure 2 in reference 42 (Kunwar et al). It is in this regime (load force > force from motor) that the dissociation rate is reduced (catch-bond is activated). Such a regime is never reached in the DNA tensiometer study because of the very construction of the experiment. I am very surprised that this point is overlooked in this manuscript. I am therefore not even sure that the present experiments even induce a catch-bond (in the sense reported for earlier papers).

      It is true that Kunwar et al measured binding durations at super-stall loads and used that to conclude that dynein does act as a catch-bond (but kinesin does not) (Kunwar et al., 2011). However, we would like to correct the reviewer on this one. This approach of exerting super-stall forces and measuring binding durations is in fact less common than the approach of allowing the motor to walk up to stall and measuring the binding duration. This ‘fixed trap’ approach has been used to show catch-bond behavior of dynein (Leidel et al., 2012; Rai et al., 2013) and kinesin (Kuo et al., 2022; Pyrpassopoulos et al., 2020). For the non-processive motor Myosin I, a dynamic force clamp was used to keep the actin filament in place while the myosin generated a single step (Laakso et al., 2008). Because the motor generates the force, these are not superstall forces either.

      (3) I appreciate the concerns about the Vertical force from the optical trap. But that leads to the following questions that have not at all been addressed in this paper:

      (i) Why is the Vertical force only a problem for Kinesins, and not a problem for the dynein studies?

      Actually, we do not claim that vertical force is not a problem for dynein; our data do not speak to this question. There is debate in the literature as to whether dynein has catch bond behavior in the traditional single-bead optical trap geometry - while some studies have measured dynein catch bond behavior (Kunwar et al., 2011; Leidel et al., 2012; Rai et al., 2013), others have found that dynein has slip-bond or ideal-bond behavior (Ezber et al., 2020; Nicholas et al., 2015; Rao et al., 2019). This discrepancy may relate to vertical forces, but not in an obvious way.

      (ii) The authors state that "With this geometry, a kinesin motor pulls against the elastic force of a stretched DNA solely in a direction parallel to the microtubule". Is this really true? What matters is not just how the kinesin pulls the DNA, but also how the DNA pulls on the kinesin. In Figure 1A, what is the guarantee that the DNA is oriented only in the plane of the paper? In fact, the DNA could even be bending transiently in a manner that it pulls the kinesin motor UPWARDS (Vertical force). How are the authors sure that the reaction force between DNA and kinesin is oriented SOLELY along the microtubule?

      We acknowledge that “solely” is an absolute term that is too strong to describe our geometry. We will soften this term in our revision to “nearly parallel to the microtubule”. In the Geometry Calculations section of Supplementary Methods, we calculate that if the motor and streptavidin are on the same protofilament, the vertical force will be <1% of the horizontal force. We also note that if the motor is on a different protofilament, there will be lateral forces and forces perpendicular to the microtubule surface, except they are oriented toward rather than away from the microtubule. The DNA can surely bend due to thermal forces, but because inertia plays a negligible role at the nanoscale (Howard, 2001; Purcell, 1977), any resulting upward forces will only be thermal forces, which the motor is already subjected to at all times.

      (4) For this study to be really impactful and for some of the above concerns to be addressed, the data should also have included DNA tensiometer experiments with Dynein. I wonder why this was not done?

      As much as we would love to fully characterize dynein here, this paper is about kinesin and it took a substantial effort. The dynein work merits a stand-alone paper.

      While I do like several aspects of the paper, I do not believe that the conclusions are supported by the data presented in this paper for the reasons stated above.

      The three key points the reviewer makes are the validity of the worm-like-chain model, the question of superstall loads, and the role of DNA bending in generating vertical forces. We hope that we have fully addressed these concerns in our responses above.

      Reviewer #2 (Public review):

      Major comments:

      (1) The use of the term "catch bond" is misleading, as the authors do not really mean consistently a catch bond in the classical sense (i.e., a protein-protein interaction having a dissociation rate that decreases with load). Instead, what they mean is that after motor detachment (i.e., after a motor protein dissociating from a tubulin protein), there is a slip state during which the reattachment rate is higher as compared to a motor diffusing in solution. While this may indeed influence the dynamics of bidirectional cargo transport (e.g., during tug-of-war events), the used terms (detachment (with or without slip?), dissociation, rescue, ...) need to be better defined and the results discussed in the context of these definitions. It is very unsatisfactory at the moment, for example, that kinesin-3 is at first not classified as a catch bond, but later on (after tweaking the definitions) it is. In essence, the typical slip/catch bond nomenclature used for protein-protein interaction is not readily applicable for motors with slippage.

      We appreciate the reviewer’s point and we will work to streamline and define terms in our revision.

      (2) The authors define the stall duration as the time at full load, terminated by >60 nm slips/detachments. Isn't that a problem? Smaller slips are not detected/considered... but are also indicative of a motor dissociation event, i.e., the end of a stall. What is the distribution of the slip distances? If the slip distances follow an exponential decay, a large number of short slips are expected, and the presented data (neglecting those short slips) would be highly distorted.

      The reviewer brings up a good point that there may be undetected slips. To address this question, we plotted the distribution of slip distances for kinesin-3, which by far had the most slip events. As the reviewer suggested, it is indeed an exponential distribution. Our preliminary analysis suggests that roughly 20% of events are missed due to this 60 nm cutoff. This will change our unloaded duration numbers slightly, but this will not alter our conclusions.
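
As a rough illustration of how a detection cutoff translates into a missed fraction, the sketch below assumes slip distances follow a single exponential, as stated above. The mean slip distance is not given in the response, so the code back-calculates the value that would correspond to a 20% miss rate at a 60 nm cutoff; that number is an inference for illustration only, and the function names are hypothetical.

```python
import math


def missed_fraction(mean_slip_nm, cutoff_nm=60.0):
    """Fraction of exponentially distributed slips that fall below the cutoff."""
    return 1.0 - math.exp(-cutoff_nm / mean_slip_nm)


def mean_slip_for_missed_fraction(fraction, cutoff_nm=60.0):
    """Mean slip distance that would produce the given missed fraction."""
    return -cutoff_nm / math.log(1.0 - fraction)


def corrected_event_count(observed_count, mean_slip_nm, cutoff_nm=60.0):
    """Estimated total number of slips, including those below the cutoff."""
    return observed_count / (1.0 - missed_fraction(mean_slip_nm, cutoff_nm))


if __name__ == "__main__":
    mean_nm = mean_slip_for_missed_fraction(0.20)
    print(f"implied mean slip distance: {mean_nm:.0f} nm")
    print(f"80 detected slips imply about {corrected_event_count(80, mean_nm):.0f} in total")
```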

      (3) Along the same line: Why do the authors compare the stall duration (without including the time it took the motor to reach stall) to the unloaded single motor run durations? Shouldn't the times of the runs be included?

      The elastic force of the DNA spring is variable as the motor steps up to stall, and so if we included the entire run duration then it would be difficult to specify what force we were comparing to unloaded. More importantly, if we assume that any stepping and detachment behavior is history independent, then it is mathematically proper to take any arbitrary starting point (such as when the motor reaches stall), start the clock there, and measure the distribution of detachment durations relative to that starting point.

      In addition, in Fig. 3 we separate the ramps from the stalls and, using a statistical model, compute a separate duration parameter (the inverse of the off-rate) for the ramp and for the stall. What we find is that the relationship between ramp, stall, and unloaded durations is different for the three motors, which is interesting in itself.
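
The history-independence argument above is the memoryless property of a constant-rate process. A small simulation can make it concrete: if detachment occurs at a constant rate, the mean remaining time is the same no matter when the clock is started. The 3 s mean duration and the start times below are hypothetical values picked for illustration.

```python
import random
import statistics


def residual_durations(durations, start):
    """Durations re-measured from a later clock start, keeping only survivors."""
    return [t - start for t in durations if t > start]


rng = random.Random(0)
tau = 3.0  # hypothetical mean attachment duration, in seconds
durations = [rng.expovariate(1.0 / tau) for _ in range(200_000)]

for start in (0.0, 0.5, 2.0):
    rest = residual_durations(durations, start)
    print(f"clock started at {start:.1f} s: mean remaining {statistics.fmean(rest):.2f} s")
```

The equality holds only for a single-exponential (constant hazard) process; if the detachment rate changed with time since attachment, the choice of starting point would matter.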

      (4) At many places, it appears too simple that for the biologically relevant processes, mainly/only the load-dependent off-rates of the motors matter. The stall forces and the kind of motor-cargo linkage (e.g., rigid vs. diffusive) do likely also matter. For example: "In the context of pulling a large cargo through the viscous cytoplasm or competing against dynein in a tug-of-war, these slip events enable the motor to maintain force generation and, hence, are distinct from true detachment events." I disagree. The kinesin force at reattachment (after slippage) is much smaller than at stall. What helps, however, is that due to the geometry of being held close to the microtubule (either by the DNA in the present case or by the cargo in vivo) the attachment rate is much higher. Note also that upon DNA relaxation, the motor is likely kept close to the microtubule surface, while, for example, when bound to a vesicle, the motor may diffuse away from the microtubule quickly (e.g., reference 20).

      We appreciate the reviewer’s detailed thinking here, and we offer our perspective. As to the first point, we agree that the stall force is relevant and that the rigidity of the motor-cargo linkage will play a role. The goal of the sentence on pulling cargo that the reviewer highlights is to set up our analysis of slips, which we define as rearward displacements that don’t return to the baseline before force generation resumes. We agree that force after slippage is much smaller than at stall, and we plan to clarify that section of text. However, as shown in the model diagram in Fig. 5, we differentiate between the slip state (and recovery from this slip state) and the detached state (and reattachment from this detached state). This delineation is important because, as the reviewer points out, if we are measuring detachment and reattachment with our DNA tensiometer, then the geometry of a vesicle in a cell will be different and diffusion away from the microtubule or elastic recoil perpendicular to the microtubule will suppress this reattachment.

      Our evidence for a slip state in which the motor maintains association with the microtubule comes from optical trapping work by Toleikis et al. (Toleikis et al., 2020) and Sudhakar et al. (Sudhakar et al., 2021). In particular, Sudhakar used small, high-index germanium microspheres that had a low drag coefficient. They showed that during ‘slip’ events, the relaxation time constant of the bead back to the center of the trap was nearly 10-fold slower than the trap response time, consistent with the motor exerting drag on the microtubule. (With larger beads, the drag of the bead swamps the motor-microtubule friction.) Another piece of support for the motor maintaining association during a slip is work by Ramaiya et al., who used birefringent microspheres to exert and measure rotational torque during kinesin stepping (Ramaiya et al., 2017). In most traces, when the motor returned to baseline following a stall, the torque was dissipated as well, consistent with a ‘detached’ state. However, a slip event is shown in S18a where the motor slips backward while maintaining torque. This is best explained by the motor slipping backward in a state where the heads are associated with the microtubule (at least sufficiently to resist rotational forces). Thus, we term the resumption after slip to be a rescue from the slip state rather than a reattachment from the detached state.

      To finish the point, with the complex geometry of a vesicle, during slip events the motor remains associated with the microtubule and hence primed for recovery. This recovery rate is expected to be the same as for the DNA tensiometer. Following a detachment, however, we agree that there will likely be a higher probability of reattachment in the DNA tensiometer due to proximity effects, whereas with a vesicle any elastic recoil or ‘rolling’ will pull the detached motor away from the microtubule, suppressing reattachment. We plan to clarify these points in the text of the revision.

      (5) Why were all motors linked to the neck-coil domain of kinesin-1? Couldn't it be that for normal function, the different coils matter? Autoinhibition can also be circumvented by consistently shortening the constructs.

      We chose this dimerization approach to focus on how the mechanochemical properties of kinesins vary between the three dominant transport families. We agree that in cells, autoinhibition of both kinesins and dynein likely plays a role in regulating bidirectional transport, as will the activity of other regulatory proteins. The native coiled-coils may act as ‘shock absorbers’ due to their compliance, or they might slow the motor reattachment rate due to the relatively large search volumes created by their long lengths (10s of nm). These are topics for future work. By using the neck-coil domain of kinesin-1 for all three motors, we eliminate any differences in autoinhibition or other regulation between the three kinesin families and focus solely on differences in the mechanochemistry of their motor domains.

      (6) I am worried about the neutravidin on the microtubules, which may act as roadblocks (e.g. DOI: 10.1039/b803585g), slip termination sites (maybe without the neutravidin, the rescue rate would be much lower?), and potentially also DNA-interaction sites? At 8 nM neutravidin and the given level of biotinylation, what density of neutravidin do the authors expect on their microtubules? Can the authors rule out that the observed stall events are predominantly the result of a kinesin motor being stopped after a short slippage event at a neutravidin molecule?

      We will address these points in our revision.

      (7) Also, the unloaded runs should be performed on the same microtubules as in the DNA experiments, i.e., with neutravidin. Otherwise, I do not see how the values can be compared.

      We will address this point in our revision.

      (8) If, as stated, "a portion of kinesin-3 unloaded run durations were limited by the length of the microtubules, meaning the unloaded duration is a lower limit." corrections (such as Kaplan-Meier) should be applied, DOI: 10.1016/j.bpj.2017.09.024.

      (9) Shouldn't Kaplan-Meier also be applied to the ramp durations ... as a ramp may also artificially end upon stall? Also, doesn't the comparison between ramp and stall duration have a problem, as each stall is preceded by a ramp ...and the (maximum) ramp times will depend on the speed of the motor? Kinesin-3 is the fastest motor and will reach stall much faster than kinesin-1. Isn't it obvious that the stall durations are longer than the ramp duration (as seen for all three motors in Figure 3)?

      The reviewer rightly notes the many challenges in estimating the motor off-rates during ramps. To estimate ramp off-rates and as an independent approach to calculating the unloaded and stall durations, we developed a Markov model coupled with Bayesian inference methods to estimate a duration parameter (equivalent to the inverse of the off-rate) for the unloaded, ramp, and stall duration distributions. With the ramps, we have left censoring due to the difficulty in detecting the start of the ramps in the fluctuating baseline, and we have right censoring due to reaching stall (with different censoring of the ramp duration for the three motors due to their different speeds). The Markov model assumes a constant detachment probability and history independence, and thus is robust even in the face of left and right censoring (details in the Supplementary section). This approach is preferred over Kaplan-Meier because, although these non-parametric methods make no assumptions for the distribution, they require the user to know exactly where the start time is.

      Regarding the potential underestimate of the kinesin-3 unloaded run duration due to finite microtubule lengths, we make two points. The first is that the unloaded duration data in Fig. 2C are quite linear up to 6 s and are well fit by the single-exponential fit (the points above 6 s don’t affect the fit very much). The second is that when we used our Markov model (which is robust against right censoring) to estimate the unloaded and stall durations, the results agreed with the single-exponential fits very well (Table S2). For instance, the single-exponential fit for the kinesin-3 unloaded duration was 2.74 s (2.33 – 3.17 s 95% CI) and the estimate from the Markov model was 2.76 s (2.28 – 3.34 s 95% CI). Thus, we chose not to make any corrections due to finite microtubule lengths.
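
To illustrate why a constant-hazard assumption tolerates both a late start and an early cutoff, here is a minimal sketch using the closed-form maximum-likelihood estimate for an exponential process (total observed time divided by the number of observed detachments). This is a simplified stand-in, not the Markov model with Bayesian inference described in the response, and all parameter values and function names are hypothetical.

```python
import random


def exponential_mle(observed_times, detached_flags):
    """Mean-duration estimate for a constant hazard: total time / detachments."""
    n_events = sum(detached_flags)
    if n_events == 0:
        raise ValueError("need at least one observed detachment")
    return sum(observed_times) / n_events


def simulate_censored(tau, n, late_start, censor_at, rng):
    """Simulate durations seen only from late_start and cut off at censor_at."""
    times, flags = [], []
    for _ in range(n):
        t = rng.expovariate(1.0 / tau)
        if t <= late_start:
            continue  # event ended before it could be detected
        times.append(min(t, censor_at) - late_start)
        flags.append(t < censor_at)  # True if a detachment was actually seen
    return times, flags


rng = random.Random(1)
times, flags = simulate_censored(tau=2.0, n=100_000, late_start=0.3, censor_at=1.5, rng=rng)
naive = sum(t for t, f in zip(times, flags) if f) / sum(flags)
print(f"naive mean of uncensored events: {naive:.2f} s")
print(f"censoring-aware estimate:        {exponential_mle(times, flags):.2f} s")
```

Averaging only the events that ended inside the observation window underestimates the true duration, whereas counting the censored time in the numerator recovers it under the constant-hazard assumption.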

      (10) It is not clear what is seen in Figure S6A: It looks like only single motors (green, w/o a DNA molecule) are walking ... Note: the influence of the attached DNA onto the stepping duration of a motor may depend on the DNA conformation (stretched and near to the microtubule (with neutravidin!) in the tethered case and spherically coiled in the untethered case).

      In Figure S6A kymograph, the green traces are GFP-labeled kinesin-1 without DNA attached (which are in excess) and the red diagonal trace is a motor with DNA attached. There are also two faint horizontal red traces, which are labeled DNA diffusing by (smearing over a large area during a single frame). Panel S6B shows run durations of motors with DNA attached. We agree that the DNA conformation will differ if it is attached and stretched (more linear) versus simply being transported (random coil), but by its nature this control experiment is only addressing random coil DNA.

      (11) Along this line: While the run time of kinesin-1 with DNA (1.4 s) is significantly shorter than the stall time (3.0 s), it is still larger than the unloaded run time (1.0 s). What do the authors think is the origin of this increase?

      Our interpretation of the unloaded kinesin-DNA result is that the much slower diffusion constant of the DNA relative to the motor alone enables motors to transiently detach and rebind before the DNA cargo has diffused away, thus extending the run duration. In contrast, such detachment events for motors alone normally result in the motor diffusing away from the microtubule, terminating the run. This argument has been used to reconcile the longer single-motor run lengths in the gliding assay versus the bead assay (Block et al., 1990). Notably, this slower diffusion constant should not play a role in the DNA tensiometer geometry because if the motor transiently detaches, then it will be pulled backward by the elastic forces of the DNA and detected as a slip or detachment event. We will address this point in the revision.

      (12) "The simplest prediction is that against the low loads experienced during ramps, the detachment rate should match the unloaded detachment rate." I disagree. I would already expect a slight increase.

      Agreed. We will change this text to: “The prediction for a slip bond is that against the low loads experienced during ramps, the detachment rate should be equal to or faster than the unloaded detachment rate.”

      (13) Isn't the model over-defined by fitting the values for the load-dependence of the strong-to-weak transition and fitting the load dependence into the transition to the slip state?

      Yes, the model is overdefined, but that is by design, and it is still very useful. Our goal here was to make as simple a model as possible that could account for the data and use it to compare model parameters for the different motor families. Ignoring the complexity of the slip and detached states, a model with a strong and weak state in the stepping cycle and a single transition out of the stepping cycle is the simplest formulation possible. And having rate constants (k<sub>S-W</sub> and k<sub>slip</sub> in our case) that vary exponentially with load makes thermodynamic sense for modeling mechanochemistry (Howard, 2001). Thus, we were pleasantly surprised that this bare-bones model could recapitulate the unloaded and stall durations for all three motors (Fig. 5C-E).
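
A common way to write a rate constant that varies exponentially with load is the Bell-type form k(F) = k0 · exp(F·δ/kT). The sketch below assumes that form; the response does not state the exact expression or parameter values used in the model, so the unloaded rate, the distance parameter δ, and the function name are all hypothetical.

```python
import math

KT_PN_NM = 4.11  # assumed thermal energy near 25 C, in pN*nm


def load_dependent_rate(k0_per_s, force_pn, delta_nm):
    """Bell-type rate constant: k0 * exp(F * delta / kT).

    A positive delta_nm speeds the transition up under load;
    a negative delta_nm slows it down.
    """
    return k0_per_s * math.exp(force_pn * delta_nm / KT_PN_NM)


if __name__ == "__main__":
    for force in (0.0, 2.0, 4.0, 6.0):
        faster = load_dependent_rate(1.0, force, delta_nm=1.0)
        slower = load_dependent_rate(1.0, force, delta_nm=-1.0)
        print(f"{force:.0f} pN: delta=+1 nm -> {faster:.2f} /s, delta=-1 nm -> {slower:.2f} /s")
```

With two such load-dependent transitions in series, different pairs of distance parameters can give similar net detachment rates, which is one way to see the reviewer's concern about over-definition.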

      (14) "When kinesin-1 was tethered to a glass coverslip via a DNA linker and hydrodynamic forces were imposed on an associated microtubule, kinesin-1 dissociation rates were relatively insensitive to loads up to ~3 pN, inconsistent with slip-bond characteristics (37)." This statement appears not to be true. In reference 37, very similar to the geometry reported here, the microtubules were fixed on the surface, and the stepping of single kinesin motors attached to large beads (to which defined forces were applied by hydrodynamics) via long DNA linkers was studied. In fact, quite a number of statements made in the present manuscript have been made already in ref. 37 (see in particular sections 2.6 and 2.7), and the authors may consider putting their results better into this context in the Introduction and Discussion. It is also noteworthy to discuss that the (admittedly limited) data in ref. 37 does not indicate a "catch-bond" behavior but rather an insensitivity to force over a defined range of forces.

      The reviewer misquoted our sentence. The actual wording of the sentence was: “When kinesin-1 was connected to micron-scale beads through a DNA linker and hydrodynamic forces parallel to the microtubule imposed, dissociation rates were relatively insensitive to loads up to ~3 pN, inconsistent with slip-bond characteristics (Urbanska et al., 2021).” The sentence the reviewer quoted was in a previous version that is available on BioRxiv and perhaps they were reading that version. Nonetheless, in the revision we will note in the Discussion that this behavior was indicative of an ideal bond (not a catch-bond), and we will also add a sentence in the Introduction highlighting this work.

      Reviewer #3 (Public review):

      The authors attribute the differences in the behaviour of kinesins when pulling against a DNA tether compared to an optical trap to the differences in the perpendicular forces. However, the compliance is also much different in these two experiments. The optical trap acts like a ~ linear spring with stiffness ~ 0.05 pN/nm. The dsDNA tether is an entropic spring, with negligible stiffness at low extensions and very high compliance once the tether is extended to its contour length (Fig. 1B). The effect of the compliance on the results should be addressed in the manuscript.

      This is an interesting point. To address it, we calculated the predicted stiffness of the dsDNA by taking the slope of the theoretical force-extension curve in Fig. 1B. Below 650 nm extension, the stiffness is <0.001 pN/nm; it reaches 0.01 pN/nm at 855 nm, and at 960 nm, where the force is 6 pN, the stiffness is roughly 0.2 pN/nm. That value is higher than the quoted 0.05 pN/nm trap stiffness, but for reference, at this stiffness, an 8 nm step leads to a 1.6 pN jump in force, which is reasonable. Importantly, the stiffness of kinesin motors has been estimated to be in the range of 0.3 pN/nm (Coppin et al., 1996; Coppin et al., 1997). Granted, this stiffness is also nonlinear, but what this means is that even at stall, our dsDNA tether has a similar predicted compliance to the motor that is pulling on it. We will address this point in our revision.
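
The stiffness values quoted above can be approximated by differentiating the Marko-Siggia worm-like-chain formula with respect to extension. The sketch below does this analytically. The 50 nm persistence length is from the response; the 1020 nm contour length and the value of kT are assumptions chosen because they give numbers close to those quoted, and are not stated in the source.

```python
KT_PN_NM = 4.11  # assumed thermal energy near 25 C, in pN*nm


def wlc_stiffness(extension_nm, contour_nm=1020.0, persistence_nm=50.0):
    """Slope dF/dx of the Marko-Siggia force-extension curve, in pN/nm.

    contour_nm is an assumed value for illustration, not a number from the text.
    """
    if not 0.0 <= extension_nm < contour_nm:
        raise ValueError("extension must lie in [0, contour length)")
    z = extension_nm / contour_nm  # fractional extension
    prefactor = KT_PN_NM / (persistence_nm * contour_nm)
    return prefactor * (0.5 / (1.0 - z) ** 3 + 1.0)


if __name__ == "__main__":
    for x in (650, 855, 960):
        k = wlc_stiffness(x)
        # Linear estimate of the force change for one 8 nm step at this stiffness.
        print(f"{x} nm: {k:.4f} pN/nm, 8 nm step ~ {8 * k:.2f} pN")
```

Because the stiffness itself rises steeply with extension, the per-step force change is a local, linearized estimate rather than an exact increment.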

      Compared to an optical trapping assay, the motors are also tethered closer to the microtubule in this geometry. In an optical trap assay, the bead could rotate when the kinesin is not bound. The authors should discuss how this tethering is expected to affect the kinesin reattachment and slipping. While likely outside the scope of this study, it would be interesting to compare the static tether used here with a dynamic tether like MAP7 or the CAP-GLY domain of p150glued.

      Please see our response to Reviewer #2 Major Comment #4 above, which asks this same question in the context of intracellular cargo. We plan to address this in our revision. Regarding a dynamic tether, we agree that’s interesting – there are kinesins that have a second, non-canonical binding site that achieves this tethering (ncd and Cin8); p150glued likely does this naturally for dynein-dynactin-activator complexes; and we speculated in a review some years ago (Hancock, 2014) that during bidirectional transport kinesin and dynein may act as dynamic tethers for one another when not engaged, enhancing the activity of the opposing motor.

      In the single-molecule extension traces (Figure 1F-H; S3), the kinesin-2 traces often show jumps in position at the beginning of runs (e.g., the four runs from ~4-13 s in Fig. 1G). These jumps are not apparent in the kinesin-1 and -3 traces. What is the explanation? Is kinesin-2 binding accelerated by resisting loads more strongly than kinesin-1 and -3?

      Due to the compliance of the dsDNA, the 95% limits for the initial attachment position are +/- 290 nm (Fig. S2). Thus, some apparent ‘jumps’ from the detached state are expected. We will take a closer look at why there are jumps for kinesin-2 that aren’t apparent for kinesin-1 or -3.

      When comparing the durations of unloaded and stall events (Fig. 2), there is a potential for bias in the measurement, where very long unloaded runs cannot be observed due to the limited length of the microtubule (Thompson, Hoeprich, and Berger, 2013), while the duration of tethered runs is only limited by photobleaching. Was the possible censoring of the results addressed in the analysis?

      Yes. Please see response to Reviewer #2 points (8) and (9) above.
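
      For readers less familiar with the censoring issue raised here, one standard correction can be sketched as follows. The sketch assumes that run durations are exponentially distributed and that runs cut short (by reaching the microtubule end or by photobleaching) are right-censored; under those assumptions the maximum-likelihood mean duration is the total observed time divided by the number of true detachment events. The function name and all numbers are hypothetical, and this is not necessarily the analysis used in the manuscript.

```python
def censored_mean_duration(durations, observed_detachment):
    """MLE of the mean of an exponential distribution with right-censoring.

    durations: observed run durations in seconds.
    observed_detachment: True where the run ended by detachment,
        False where it was cut short (censored).
    """
    if len(durations) != len(observed_detachment):
        raise ValueError("inputs must have the same length")
    n_events = sum(1 for seen in observed_detachment if seen)
    if n_events == 0:
        raise ValueError("at least one uncensored run is required")
    # Censored runs contribute their observed time but not an event.
    return sum(durations) / n_events


# Hypothetical durations (s); the last two runs reached the microtubule end.
runs = [0.8, 1.5, 0.4, 2.0, 3.0, 3.0]
ended_by_detachment = [True, True, True, True, False, False]

naive = sum(runs) / len(runs)
corrected = censored_mean_duration(runs, ended_by_detachment)
print(f"naive mean: {naive:.2f} s, censoring-corrected mean: {corrected:.2f} s")
```

      Ignoring censoring treats truncated runs as if they had detached, which biases the mean duration downward; the corrected estimate is always at least as large as the naive one.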

      The mathematical model is helpful in interpreting the data. To assess how the "slip" state contributes to the association kinetics, it would be helpful to compare the proposed model with a similar model with no slip state. Could the slips be explained by fast reattachments from the detached state?

      In the model, the slip state and the detached states are conceptually similar; they only differ in the sequence (slip to detached) and the transition rates into and out of them. The simple answer is: yes, the slips could be explained by fast reattachments from the detached state. In that case, the slip state and recovery could be called a “detached state with fast reattachment kinetics”. However, the key data for defining the kinetics of the slip and detached states is the distribution of Recovery times shown in Fig. 4D-F, which required a triple exponential to account for all of the data. If we simplified the model by eliminating the slip state and incorporating fast reattachment from a single detached state, then the distribution of Recovery times would be a single-exponential with a time constant equivalent to t<sub>1</sub>, which would be a poor fit to the experimental distributions in Fig. 4D-F.
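
      The argument can be illustrated with a short sketch. A single detached state with one exit rate gives a single-exponential recovery-time distribution whose hazard (the instantaneous recovery rate) is constant, whereas a sum of three exponentials has a hazard that falls over time as the fast components are depleted. The amplitudes and time constants below are placeholders chosen for illustration, not the fitted values from Fig. 4D-F.

```python
import math


def survival(t, amplitudes, taus):
    """Fraction of events not yet recovered at time t for a sum of exponentials."""
    return sum(a * math.exp(-t / tau) for a, tau in zip(amplitudes, taus))


def hazard(t, amplitudes, taus):
    """Instantaneous recovery rate: probability density divided by survival."""
    density = sum((a / tau) * math.exp(-t / tau) for a, tau in zip(amplitudes, taus))
    return density / survival(t, amplitudes, taus)


# Placeholder parameters (assumptions, not values from the paper).
single = ([1.0], [0.05])                      # one state, one time constant (t1)
triple = ([0.6, 0.3, 0.1], [0.05, 0.5, 5.0])  # three time constants

for t in (0.0, 0.1, 1.0, 10.0):
    print(f"t={t:5.1f} s  single hazard={hazard(t, *single):6.2f}  "
          f"triple hazard={hazard(t, *triple):6.2f}")
```

      On a semi-log plot of the survival curve, the single-exponential case is a straight line, while the three-component case bends; that curvature is what a single detached state with fast reattachment could not reproduce.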

      We appreciate the efforts and helpful suggestions of all three reviewers and the Editor.

      References:

      Block, S.M., L.S. Goldstein, and B.J. Schnapp. 1990. Bead movement by single kinesin molecules studied with optical tweezers. Nature. 348:348-352.

      Bouchiat, C., M.D. Wang, J. Allemand, T. Strick, S.M. Block, and V. Croquette. 1999. Estimating the persistence length of a worm-like chain molecule from force-extension measurements. Biophys J. 76:409-413.

      Coppin, C.M., J.T. Finer, J.A. Spudich, and R.D. Vale. 1996. Detection of sub-8-nm movements of kinesin by high-resolution optical-trap microscopy. Proc Natl Acad Sci U S A. 93:1913-1917.

      Coppin, C.M., D.W. Pierce, L. Hsu, and R.D. Vale. 1997. The load dependence of kinesin's mechanical cycle. Proc Natl Acad Sci U S A. 94:8539-8544.

      Ezber, Y., V. Belyy, S. Can, and A. Yildiz. 2020. Dynein Harnesses Active Fluctuations of Microtubules for Faster Movement. Nat Phys. 16:312-316.

      Hancock, W.O. 2014. Bidirectional cargo transport: moving beyond tug of war. Nat Rev Mol Cell Biol. 15:615-628.

      Howard, J. 2001. Mechanics of Motor Proteins and the Cytoskeleton. Sinauer Associates, Inc., Sunderland, MA. 367 pp.

      Kunwar, A., S.K. Tripathy, J. Xu, M.K. Mattson, P. Anand, R. Sigua, M. Vershinin, R.J. McKenney, C.C. Yu, A. Mogilner, and S.P. Gross. 2011. Mechanical stochastic tug-of-war models cannot explain bidirectional lipid-droplet transport. Proc Natl Acad Sci U S A. 108:18960-18965.

      Kuo, Y.W., M. Mahamdeh, Y. Tuna, and J. Howard. 2022. The force required to remove tubulin from the microtubule lattice by pulling on its alpha-tubulin C-terminal tail. Nature communications. 13:3651.

      Laakso, J.M., J.H. Lewis, H. Shuman, and E.M. Ostap. 2008. Myosin I can act as a molecular force sensor. Science. 321:133-136.

      Leidel, C., R.A. Longoria, F.M. Gutierrez, and G.T. Shubeita. 2012. Measuring molecular motor forces in vivo: implications for tug-of-war models of bidirectional transport. Biophys J. 103:492-500.

      Marko, J.F., and E.D. Siggia. 1995. Stretching DNA. Macromolecules. 28:8759-8770.

      Nicholas, M.P., F. Berger, L. Rao, S. Brenner, C. Cho, and A. Gennerich. 2015. Cytoplasmic dynein regulates its attachment to microtubules via nucleotide state-switched mechanosensing at multiple AAA domains. Proc Natl Acad Sci U S A. 112:6371-6376.

      Purcell, E.M. 1977. Life at low Reynolds Number. Amer J. Phys. 45:3-11.

      Pyrpassopoulos, S., H. Shuman, and E.M. Ostap. 2020. Modulation of Kinesin's Load-Bearing Capacity by Force Geometry and the Microtubule Track. Biophys J. 118:243-253.

      Rai, A.K., A. Rai, A.J. Ramaiya, R. Jha, and R. Mallik. 2013. Molecular adaptations allow dynein to generate large collective forces inside cells. Cell. 152:172-182.

      Ramaiya, A., B. Roy, M. Bugiel, and E. Schaffer. 2017. Kinesin rotates unidirectionally and generates torque while walking on microtubules. Proc Natl Acad Sci U S A. 114:10894-10899.

      Rao, L., F. Berger, M.P. Nicholas, and A. Gennerich. 2019. Molecular mechanism of cytoplasmic dynein tension sensing. Nature communications. 10:3332.

      Smith, S.B., L. Finzi, and C. Bustamante. 1992. Direct mechanical measurements of the elasticity of single DNA molecules by using magnetic beads. Science. 258:1122-1126.

      Sudhakar, S., M.K. Abdosamadi, T.J. Jachowski, M. Bugiel, A. Jannasch, and E. Schaffer. 2021. Germanium nanospheres for ultraresolution picotensiometry of kinesin motors. Science. 371.

      Toleikis, A., N.J. Carter, and R.A. Cross. 2020. Backstepping Mechanism of Kinesin-1. Biophys J. 119:1984-1994.

      Urbanska, M., A. Ludecke, W.J. Walter, A.M. van Oijen, K.E. Duderstadt, and S. Diez. 2021. Highly-Parallel Microfluidics-Based Force Spectroscopy on Single Cytoskeletal Motors. Small. 17:e2007388.

      Wang, M.D., H. Yin, R. Landick, J. Gelles, and S.M. Block. 1997. Stretching DNA with optical tweezers. Biophys J. 72:1335-1346.

  2. www.planalto.gov.br
    1. more advantageous

      Acórdão 2450/2025 Plenário (Representation, Rapporteur Minister Jorge Oliveira)

      Public procurement. Feasibility study. Leasing (procurement). Vehicle. Preliminary technical study. Cost analysis. Benefits. Option. Acquisition. Technology. Life cycle.

      • In the preliminary technical study for a procurement to lease vehicles, a cost-benefit analysis of leasing compared with purchasing must be carried out, along with an examination of the life-cycle cost of the object and an assessment of the possible technological alternatives (such as a comparative study of combustion and hybrid vehicles), in compliance with art. 11, item I, of Lei 14.133/2021.
    1. Author response:

      Reviewer #1 (Public review):

      Fombellida-Lopez and colleagues describe the results of an ART intensification trial in people with HIV infection (PWH) on suppressive ART to determine the effect of increasing the dose of one ART drug, dolutegravir, on viral reservoirs, immune activation, exhaustion, and circulating inflammatory markers. The authors hypothesize that ART intensification will provide clues about the degree to which low-level viral replication is occurring in circulation and in tissues despite ongoing ART, which could be identified if reservoirs decrease and/or if immune biomarkers change. The trial design is straightforward and well-described, and the intervention appears to have been well tolerated. The investigators observed an increase in dolutegravir concentrations in circulation, and to a lesser degree in tissues, in the intervention group, indicating that the intervention has functioned as expected (ART has been intensified in vivo). Several outcome measures changed during the trial period in the intervention group, leading the investigators to conclude that their results provide strong evidence of ongoing replication on standard ART. The results of this small trial are intriguing, and a few observations in particular are hypothesis-generating and potentially justify further clinical trials to explore them in depth. However, I am concerned about over-interpretation of results that do not fully justify the authors' conclusions.

      We thank Reviewer #1 for their thoughtful and constructive comments, which helped us clarify and improve the manuscript. Below, we address each of the reviewer’s points and describe the changes that we implemented in the revised version. We acknowledge the reviewer’s concern regarding potential overinterpretation of certain findings, and in the revised version we took particular care to ensure that all conclusions are supported by the data and framed within the exploratory nature of the study.

      (1) Trial objectives: What was the primary objective of the trial? This is not clearly stated. The authors describe changes in some reservoir parameters and no changes in others. Which of these was the primary outcome? No a priori hypothesis / primary objective is stated, nor is there explicit justification (power calculations, prior in vivo evidence) for the small n, unblinded design, and lack of placebo control. In the abstract (line 36, "significant decreases in total HIV DNA") and conclusion (lines 244-246), the authors state that total proviral DNA decreased as a result of ART intensification. However, in Figures 2A and 2E (and in line 251), the authors indicate that total proviral DNA did not change. These statements are confusing and appear to be contradictory. Regarding the decrease in total proviral DNA, I believe the authors may mean that they observed transient decrease in total proviral DNA during the intensification period (day 28 in particular, Figure 2A), however this level increases at Day 56 and then returns to baseline at Day 84, which is the source of the negative observation. Stating that total proviral DNA decreased as a result of the intervention when it ultimately did not is misleading, unless the investigators intended the day 28 timepoint as a primary endpoint for reservoir reduction - if so, this is never stated, and it is unclear why the intervention would then be continued until day 84? If, instead, reservoir reduction at the end of the intervention was the primary endpoint (again, unstated by the authors), then it is not appropriate to state that the total proviral reservoir decreased significantly when it did not.

      We agree with the reviewer that the primary objective of the study was not explicitly stated in the submitted manuscript. We clarified this in the revised manuscript (lines 361-364). As registered on ClinicalTrials.gov (NCT05351684), the primary outcome was defined as “To evaluate the impact of treatment intensification at the level of total and replication-competent reservoir (RCR) in blood and in tissues”, with a time frame of 3 months. Accordingly, our aim was to explore whether any measurable reduction in the HIV reservoir (total or replication-competent) occurred during the intensification period, including at day 28, 56, or 84. The protocol did not prespecify a single time point for this effect to occur, and the exploratory design allowed for detection of transient or sustained changes within the intensification window.

      We recognize that this scope was not clearly articulated in the original text and may have led to confusion in interpreting the transient drop in total HIV DNA observed at day 28. While total DNA ultimately returned to baseline by the end of intensification, the presence of a transient reduction during this 3-month window still fits within the framework of the study’s registered objective. Moreover, although the change in total HIV DNA was transient, it aligns with the consistent direction of changes observed across the multiple independent measures, including CA HIV RNA, RNA/DNA ratio and intact HIV DNA, collectively supporting a biological effect of intensification.

      We would also like to stress that this is the first clinical trial in which ART intensification is performed not by adding an extra drug but by increasing the dosage of an existing drug. Therefore, we were more interested in the overall, cumulative effect of intensification throughout the entire trial period than in differences between groups at individual time points. We clarified in the revised manuscript that this was a proof-of-concept phase 2 study, designed to reveal biological effects of ART intensification rather than confirm efficacy in a powered comparison. The absence of a prespecified statistical endpoint or sample size calculation reflects the exploratory nature of the trial.

      (2) Intervention safety and tolerability: The results section lacks a specific heading for participant safety and tolerability of the intervention. I was wondering about clinically detectable viremia in the study. Were there any viral blips? Was the increased DTG well tolerated? This drug is known to cause myositis, headache, CPK elevation, and hepatotoxicity. Were any of these observed? What is the authors' interpretation of the CD4:8 ratio change (line 198)? Is this a significant safety concern for a longer duration of intensification? Was there also a change in CD4% or only in absolute counts? Was there relative CD4 depletion observed in the rectal biopsy samples between days 0 and 84? Interestingly, T cells dropped at the same timepoints that reservoirs declined... how do the authors rule out that reservoir decline reflects transient T cell decline that is non-specific (not due to additional blockade of replication)?

      We improved the Methods section to clarify how safety and tolerability were assessed during the study (lines 389-396). Safety evaluations were conducted on day 28 and day 84 and included a clinical examination and routine laboratory testing (liver function tests, kidney function, and complete blood count). Medication adherence was also monitored through pill counts performed by the study nurses.

      No virological blips above 50 copies/mL were observed and no adverse events were reported by participants during the 3-month intensification period. Although CPK levels were not included in the routine biological monitoring, no participant reported muscle pain or other symptoms suggestive of muscle toxicity.

      The CD4:CD8 ratio decrease noted during intensification was not associated with significant changes in absolute CD4 or CD8 counts, as shown in Figure 5. We interpret this ratio change as a transient redistribution rather than an immunological risk; therefore, we do not consider it to represent a safety concern.

      We would like to clarify that CD4⁺ T-cell counts did not significantly decrease in any of the treatment groups, as shown in Figure 5. The apparent decline observed concerns the CD4/CD8 ratio, which transiently dropped, but not the absolute number of CD4⁺ T cells. Moreover, although the dynamics of total HIV DNA is indeed similar to that of CD4/CD8 ratio (both declined transiently and then returned to baseline by day 84), the dynamics of unspliced RNA and unspliced RNA/total DNA ratio are clearly different, as these markers demonstrated a sustained decrease that was maintained throughout the trial period, even when the CD4/CD8 ratio already returned to baseline. Also, we observed a significant decrease in intact HIV DNA at day 84 compared to day 0. These effects cannot be easily explained by a transient decline in CD4+ cells.

      (3) The investigators describe a decrease in intact proviral DNA after 84 days of ART intensification in circulating cells (Figure 2D), but no changes to total proviral DNA in blood or tissue (Figures 2A and 2E; IPDA does not appear to have been done on tissue samples). It is not clear why ART intensification would result in a selective decrease in intact proviruses and not in total proviruses if the source of these reservoir cells is due to ongoing replication. These reservoir results have multiple interpretations, including (but not limited to) the investigators' contention that this provides strong evidence of ongoing replication. However, ongoing replication results in the production of both intact and mutated/defective proviruses that both contribute to reservoir size (with defective proviruses vastly outnumbering intact proviruses). The small sample size and well-described heterogeneity of the HIV reservoir (with regard to overall size and composition) raise the possibility that the study was underpowered to detect differences over the 84-day intervention period. No power calculations or prior studies were described to justify the trial size or the duration of the intervention. Readers would benefit from a more nuanced discussion of reservoir changes observed here.

      We sincerely thank the reviewer for this insightful comment. We fully agree that the reservoir dynamics observed in our study might raise several possible interpretations, and that its complexity, resulting from continuous cycles of expansion and contraction, reflects the heterogeneity of the latent reservoir. 

      Total HIV DNA in PBMCs showed a transient decline during intensification (notably at day 28), ultimately returning to baseline by day 84. This biphasic pattern likely reflects the combined effects of suppression of ongoing low-level replication by an increased DTG dosage, followed by the expansion of infected cell clones (mostly harbouring defective proviruses). In other words, the transient decrease in total (intact + defective) DNA at day 28 may be due to an initial decrease in newly infected cells upon ART intensification; at the subsequent time points, however, this effect was masked by proliferation (clonal expansion) of infected cells with defective proviruses. Recent studies suggest that intact and defective proviruses are subjected to different selection pressures by the immune system on ART (PMID: 38337034) and that their decay on therapy is different (intact proviruses are cleared much more rapidly than defectives). In addition, defective proviruses can be preferentially expanded as they can reprogram the host cell proliferation machinery (https://doi.org/10.1101/2025.09.22.676989). This explains why in our study the intact proviruses decreased, but the total proviruses did not change, between days 0 and 84, in the intensification group. Interestingly, in the control group, we observed a significant increase in total DNA at day 84 compared to day 0, with no difference for the intact DNA, which is also in line with the clonal expansion of defective proviruses.

      Importantly, we observed a significant decrease in intact proviral DNA between day 0 and day 84 in the intensification group (Figure 2D). This result directly addresses the study’s primary objective: assessing the impact of intensification on the replication-competent reservoir. In comparison, as the reviewer rightly points out, total HIV DNA includes over 90% defective genomes, which limits its interpretability as a biomarker of biologically relevant reservoir changes. In addition, other reservoir markers, such as cell-associated unspliced RNA and RNA/DNA ratios, also showed consistent trends supporting a biologically relevant effect of intensification. Even in the absence of sustained changes in total HIV DNA, the coherence across the different independent measures of the reservoir (intact DNA, unspliced RNA), suggests an effect indicative of ongoing replication pre-intensification.

      Regarding tissue reservoirs, the lack of substantial change in total HIV DNA between days 0 and 84 is also in line with the predominance of defective sequences in these compartments. Moreover, the limited increase in rectal tissue dolutegravir levels during intensification (from 16.7% to 20% of plasma concentrations) may have limited the efficacy of the intervention in this site.

      As for the IPDA on rectal biopsies, we attempted the assay using two independent DNA extraction methods (Promega Reliaprep and Qiagen Puregene), but both yielded high DNA shearing index values, and intact proviral detection was successful in only 3 of 40 samples. Given the poor DNA integrity, these results were not interpretable.

      That said, we fully acknowledge the limitations of our study, especially the small sample size, and we agree with the reviewer that caution is needed when interpreting these findings. In the revised manuscript, we adopted a more measured tone in the discussion (lines 340-346), stating that these observations are exploratory and hypothesis-generating, and require confirmation in larger, better-powered studies. Nonetheless, we believe that the convergence of multiple reservoir markers pointing in the same direction constitutes a meaningful biological effect that deserves further investigation.

      (4) While a few statistically significant changes occurred in immune activation markers, it is not clear that these are biologically significant. Lines 175-186 and Figure 3: The change in CD4 cells + for TIGIT looks as though it declined by only 1-2%, and at day 84, the confidence interval appears to widen significantly at this timepoint, spanning an interquartile range of 4%. The only other immune activation/exhaustion marker change that reached statistical significance appears to be CD8 cells + for CD38 and HLA-DR, however, the decline appears to be a fraction of a percent, with the control group trending in the same direction. Despite marginal statistical significance, it is not clear there is any biological significance to these findings; Figure S6 supports the contention that there is no significant change in these parameters over time or between groups. With most markers showing no change and these two showing very small changes (and the latter moving in the same direction as the control group), these results do not justify the statement that intensifying DTG decreases immune activation and exhaustion (lines 38-40 in the abstract and elsewhere).

      We agree with the reviewer that the observed changes in immune activation and exhaustion markers were modest. We revised the abstract and the manuscript text (including a section header) to reflect this more accurately (lines 39, 175, 185, 253). We noted that these differences, while statistically significant (e.g., in TIGIT+ CD4+ T cells and CD38+HLA-DR+ CD8+ T cells), were limited in magnitude. We explicitly acknowledged these limitations and interpreted the findings with appropriate caution.

      (5) There are several limitations of the study design that deserve consideration beyond those discussed at line 327. The study was open-label and not placebo-controlled, which may have led to some medication adherence changes that confound results (authors describe one observation that may be evidence of this; lines 146-148). A randomized, blinded, or cross-over design would be more robust and help separate signal from noise, given the relatively small changes observed in the intervention arm. There does not seem to be a measurement of key outcome variables after treatment intensification ceased - evidence of an effect on replication through ART intensification would be enhanced by observing changes once intensification was stopped. Why was intensification maintained for 84 days? More information about the study duration would be helpful. Table 1 indicates that participants were 95% male. Sex is known to be a biological variable, particularly with regard to HIV reservoir size and chronic immune activation in PWH. Worldwide, 50% of PWH are women. Research into improving management/understanding of disease should reflect this, and equal participation should be sought in trials. Table 1 shows differing baseline reservoir sizes between the control and intervention groups. This may have important implications, particularly for outcomes where reservoir size is used as the denominator.

      We expanded the limitations section to address several key aspects raised by the reviewer: the absence of blinding and placebo control, the predominantly male study population, and the lack of post-intervention follow-up. While we acknowledge that open-label designs can introduce behavioural biases, including potential changes in adherence, we now explicitly state that placebo-controlled, blinded trials would provide a more robust assessment and are warranted in future research (lines 340-346).

      The 84-day duration of intensification was chosen based on previous studies and provided sufficient time for observing potential changes in viral transcription and reservoir dynamics. However, we agree that including post-intervention follow-up would have strengthened the conclusions, and we highlighted this limitation and future direction in the revised manuscript (lines 340-346). 

      The sex imbalance is now clearly acknowledged as a limitation in the revised manuscript, and we fully support ongoing efforts to promote equitable recruitment in HIV research. We would like to add that, in our study, rectal biopsies were coupled with anal cancer screening through HPV testing. This screening is specifically recommended for younger men who have sex with men (MSM), as outlined in the current EACS guidelines (see: https://eacs.sanfordguide.com/eacs part2/cancer/cancerscreening-methods). As a result, MSM participants had both a clinical incentive and medical interest to undergo this procedure, which likely contributed to the higher proportion of male participants in the study.

      Lastly, although baseline total HIV DNA was higher in the intensified group, our statistical approach is based on a within-subject (repeated-measures) design, in which the longitudinal change of a parameter within the same participant during the study was the main outcome. In other words, we are not comparing absolute values of any marker between the groups; we are looking at changes of parameters from baseline within participants, and these are not expected to be affected by baseline imbalances.
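
      As a sketch of what a within-subject analysis looks like in practice (not the authors' actual statistical code), the snippet below computes each participant's change from baseline and tests whether the changes are centred on zero with an exact sign-flip permutation test. All values are invented for illustration, and the choice of test is an assumption made here for simplicity.

```python
from itertools import product


def changes_from_baseline(baseline, followup):
    """Per-participant change (follow-up minus baseline)."""
    return [after - before for before, after in zip(baseline, followup)]


def sign_flip_p_value(changes):
    """Exact two-sided sign-flip permutation test on the summed change.

    Under the null hypothesis each change is equally likely to be
    positive or negative, so all 2**n sign patterns are enumerated.
    """
    observed = abs(sum(changes))
    n_extreme = 0
    n_total = 0
    for signs in product((1, -1), repeat=len(changes)):
        n_total += 1
        flipped = abs(sum(s * c for s, c in zip(signs, changes)))
        # Small tolerance guards against floating-point rounding.
        if flipped >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / n_total


# Invented log10 values (e.g., copies per million cells) for six participants.
day0 = [3.1, 2.8, 3.4, 2.9, 3.0, 3.3]
day84 = [2.9, 2.7, 3.1, 2.8, 2.8, 3.0]

deltas = changes_from_baseline(day0, day84)
print("changes from baseline:", [round(d, 2) for d in deltas])
print("two-sided p-value:", sign_flip_p_value(deltas))
```

      Because only the paired differences enter the test, adding the same offset to both of a participant's measurements leaves the result unchanged.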

      (6) Figure 1: the increase in DTG levels is interesting - it is not uniform across participants. Several participants had lower levels of DTG at the end of the intervention. Though unlikely to be statistically significant, it would be interesting to evaluate if there is a correlation between change in DTG concentrations and virologic / reservoir / inflammatory parameters. A positive relationship between increasing DTG concentration and decreased cell-associated RNA, for example, would help support the hypothesis that ongoing replication is occurring.

      We agree with the reviewer that assessing correlations between DTG concentrations and virological, immunological, or inflammatory markers would be highly informative. In fact, we initially explored this question in a preliminary way by examining whether individuals who showed a marked increase in DTG levels after intensification also demonstrated stronger changes in the viral reservoir. While this exploratory analysis did not reveal any clear associations, we would like to emphasize that correlating biological effects with DTG concentrations measured at a single timepoint may have limited interpretability. A more comprehensive understanding of the relationship between drug exposure and reservoir dynamics would ideally require multiple pharmacokinetic measurements over time, including pre-intensification baselines. This is particularly important given that DTG concentrations vary across individuals and over time, depending on adherence, metabolism, and other individual factors.

      (7) Figure 2: IPDA in tissue- was this done? scRNA in blood (single copy assay) - would this be expected to correlate with usCaRNA? The most unambiguous result is the decrease in cell-associated RNA - accompanying results using single-copy assay in plasma would be helpful to bolster this result.

      As mentioned in our response to point 3, we attempted IPDA on tissue samples, but technical limitations prevented reliable detection of intact proviruses. Regarding residual viremia, we did perform ultra-sensitive plasma HIV RNA quantification, but a technical issue (an inadvertent PBMC contamination during plasma separation) affected the reliability of the results, so we felt uncomfortable including these data in the manuscript.

      The use of the US RNA / Total DNA ratio is not helpful/difficult to interpret since the control and intervention arms were unmatched for total DNA reservoir size at study entry.

      We respectfully disagree with this comment. The US RNA/total DNA ratio is commonly used to assess the relative transcriptional activity of the viral reservoir, rather than its absolute size. While we acknowledge that the total HIV-1 DNA levels differed at baseline between the two groups, the US RNA/total DNA ratio specifically reflects the relationship between transcriptional activity and reservoir size within each individual, and is therefore not directly confounded by baseline differences in total DNA alone.

      Moreover, our analyses focus on within-subject longitudinal changes from baseline, not on direct between-group comparisons of absolute marker values. As such, the observed changes in the US RNA/total DNA ratio over time are interpreted relative to each participant's baseline, mitigating concerns related to baseline imbalances between groups.
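
      To make the normalization concrete, a minimal sketch with invented numbers follows: the ratio is computed per participant at each time point and then expressed as a log2 fold change from that participant's own baseline. The two hypothetical participants differ tenfold in reservoir size but show the same relative change in transcription per provirus.

```python
import math


def us_rna_per_dna(us_rna, total_dna):
    """Unspliced RNA copies per copy of total HIV DNA (same cell input)."""
    if total_dna <= 0:
        raise ValueError("total DNA must be positive to form a ratio")
    return us_rna / total_dna


def log2_fold_change(baseline_ratio, followup_ratio):
    """Within-participant change in the ratio, on a log2 scale."""
    return math.log2(followup_ratio / baseline_ratio)


# Invented copies per million cells: (US RNA, total DNA) at day 0 and day 84.
participants = {
    "small_reservoir": {"day0": (200.0, 100.0), "day84": (100.0, 100.0)},
    "large_reservoir": {"day0": (2000.0, 1000.0), "day84": (1000.0, 1000.0)},
}

for name, visits in participants.items():
    r0 = us_rna_per_dna(*visits["day0"])
    r84 = us_rna_per_dna(*visits["day84"])
    print(name, "log2 fold change:", log2_fold_change(r0, r84))
```

      In this toy case both participants give the same fold change despite their different total DNA levels, which is the sense in which the ratio is read relative to each individual's baseline.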

      Reviewer #2 (Public review):

      Summary:

      An intensification study with a double dose of a 2nd-generation integrase inhibitor, on a background of nucleoside analog inhibitors of the HIV reverse transcriptase, in 20 individuals randomized with controls, with an impact on the levels of viral reservoirs and inflammation markers. Viral reservoirs in HIV are the main impediment to an HIV cure, and inflammation is associated with the development of co-morbidities.

      Strengths:

      The intervention that leads to a decrease of viral reservoirs and inflammation is quite straightforward, as a doubling of the INSTI is used in some individuals with INSTI resistance, with good tolerability.

      This is a very well documented study, both in blood and tissues, which is a great achievement due to the difficulty of body sampling in well-controlled individuals on antiretroviral therapy. The laboratory assays are performed by specialists in the field with state-of-the art quantification assays. Both the introduction and the discussion are remarkably well presented and documented.

      The findings also have a potential impact on the management of chronic HIV infection.

      Weaknesses:

      I do not think that the size of the study can be considered a weakness, nor the fact that it is open-label either.

      We thank Reviewer #2 for their constructive and supportive comments. We appreciate their positive assessment of the study design, the translational relevance of the intervention, and the technical quality of the assays. We also take note of their perspective regarding sample size and study design, which supports our positioning of this trial as an exploratory, hypothesis-generating phase 2 study.

      Reviewer #3 (Public review):

      The introduction does a very good job of discussing the issue around whether there is ongoing replication in people with HIV on antiretroviral therapy. Sporadic, non-sustained replication likely occurs in many PWH on ART related to adherence, drug-drug interactions and possibly penetration of antivirals into sanctuary areas of replication, and, as the authors point out, proving it does not occur is likely not possible and proving it does occur is likely very dependent on the population studied and the design of the intervention. Whether the consequences of this replication in the absence of evolution toward resistance have clinical significance is a challenging question to address.

      It is important to note that INSTI-based therapy may have a different impact on HIV replication events that results in differences in virus release for specific cell types (those responsible for "second phase" decay) by blocking integration in cells that have completed reverse transcription prior to ART initiation but have yet to be fully activated. In a PI or NNRTI-based regimen, those cells will release virus, whereas with an INSTI-based regimen, they will not.

      Given the very small sample size, there is a substantial risk of imbalance between the groups in important baseline measures. Unfortunately, with the small sample size, a non-significant P value is not helpful when comparing baseline measures between groups. One suggestion would be to provide the full range as opposed to the inter-quartile range (essentially only 5 or 6 values). The authors could also report the proportion of participants with baseline HIV RNA target not detected in the two groups.

      We thank Reviewer #3 for their thoughtful and balanced review. We are grateful for the recognition of the strength of the Introduction, the complexity of evaluating residual replication, and the technical execution of the assays. We also appreciate the insightful suggestions for improving the clarity and transparency of our results and discussion.

      We revised the manuscript to address several of the reviewer’s key concerns. We agree that the small sample size increases the risk of baseline imbalances. We acknowledged these limitations in the manuscript (lines 327-330). For transparency, we now provide both the full range and the IQR for all parameters in Table 1. However, we would like to stress that our statistical approach is based on a within-subject (repeated-measures) design, in which the longitudinal change of a parameter within the same participant during the study was the main outcome. In other words, we are not comparing absolute values of any marker between the groups; we are looking at changes of parameters from baseline within participants, and these are not expected to be affected by baseline imbalances.

      A suggestion that there is a critical imbalance between groups is that the control group has significantly lower total HIV DNA in PBMC, despite the small sample size. The control group also has numerically longer time of continuous suppression, lower unspliced RNA, and lower intact proviral DNA. These differences may have biased the ability to see changes in DNA and US RNA in the control group.

      We acknowledge the significant baseline difference in total HIV DNA between groups, which we have clearly reported. However, the other variables mentioned, such as duration of continuous viral suppression, unspliced RNA levels, and intact proviral DNA, did not differ significantly between groups at baseline, despite differences in the median values (that are always present). These numerical differences do not necessarily indicate a critical imbalance.

      Notably, there was no significant difference in the change in US RNA/DNA between groups (Figure 2C).

      The nonsignificant difference in the change in US RNA/total DNA between groups is not unexpected, given the significant between-group differences for both US RNA and total DNA changes. Since the ratio combines both markers, it is likely to show attenuated between-group differences compared to the individual components. However, while the difference did not reach statistical significance (p = 0.09), we still observed a trend towards a greater reduction in the US RNA/total DNA ratio in the intervention group.

      The fact that the median relative change appears very similar in Figure 2C, yet there is a substantial difference in P values, is also a comment on the limits of the current sample size. 

      Although we surely agree that in general, the limited sample size impacts statistical power, we would like to point out that in Figure 2C, while the medians may appear similar, the ranges do differ between groups. At days 56 and 84, the median fold changes from baseline are indeed close but the full interquartile range in the DTG group stays below 1, while in the control group, the interquartile range is wider and covers approximately equal distance above and below 1. This explains the difference in p values between the groups.
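
      The point about similar medians but different p values can be illustrated with a small sketch. The fold-change values below are invented for illustration (they are not study data), and the exact Wilcoxon signed-rank test used here is an assumption, since this response does not state which test produced the reported p values. Both hypothetical groups have the same median fold change (0.70), but only one has all values below 1.

```python
from itertools import product
from math import log
from statistics import median


def exact_signed_rank_p(fold_changes):
    """Two-sided exact Wilcoxon signed-rank p-value for H0: median fold change = 1.

    Works on log fold changes; assumes no fold change equals 1 and no tied
    absolute log values (true for the illustrative data below).
    """
    logs = [log(fc) for fc in fold_changes]
    # Rank absolute log changes from smallest (rank 1) to largest (rank n).
    order = sorted(range(len(logs)), key=lambda i: abs(logs[i]))
    ranks = [0] * len(logs)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    observed = sum(r for r, x in zip(ranks, logs) if x > 0)
    center = sum(ranks) / 2
    # Under H0 every sign pattern is equally likely, so enumerate all 2**n.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - center) >= abs(observed - center):
            extreme += 1
    return extreme / 2 ** len(ranks)


# Hypothetical fold changes from baseline (NOT data from the study).
all_below_one = [0.55, 0.62, 0.68, 0.72, 0.80, 0.90]
straddling_one = [0.35, 0.50, 0.68, 0.72, 1.60, 2.40]

for name, values in (("all below 1", all_below_one), ("straddling 1", straddling_one)):
    print(name, "median:", round(median(values), 2), "p:", exact_signed_rank_p(values))
```

      With six participants, the group whose values all lie below 1 reaches the smallest p value the exact test can produce at that sample size, whereas the group with the same median but values on both sides of 1 does not approach significance.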

      The text should report the median change in US RNA and US RNA/DNA when describing Figures 2A-2C.

      These data are already reported in the Results section (lines 164–166): "By day 84, US RNA and US RNA/total DNA ratio had decreased from day 0 by medians (IQRs) of 5.1 (3.3–6.4) and 4.6 (3.1–5.3) fold, respectively (p = 0.016 for both markers)."

      This statistical comparison of changes in IPDA results between groups should be reported. The presentation of the absolute values of all the comparisons in the supplemental figures is a strength of the manuscript.

      In the assessment of ART intensification on immune activation and exhaustion, the fact that none of the comparisons between randomized groups were significant should be noted and discussed.

      We would like to point out that a statistically significant difference between the randomized groups was observed for the frequency of CD4⁺ T cells expressing TIGIT, as shown in Figure 3A and reported in the Results section (p = 0.048).

      The changes in CD4:CD8 ratio and sCD14 levels appear counterintuitive to the hypothesis and are commented on in the discussion.

      Overall, the discussion highlights the significant changes in the intensified group, which are suggestive. There is limited discussion of the comparisons between groups where the results are less convincing.

      We observed statistically significant differences between the randomized groups for total DNA (p<0.001) and US RNA (p=0.01), as well as for the frequency of CD4⁺ T cells expressing TIGIT (p=0.048). We would like to stress that US RNA is a key marker of residual replication as it is very sensitive to de novo infection events. As discussed in the manuscript (lines 291-294), a newly infected CD4+ T lymphocyte can contain hundreds to thousands of US HIV RNA copies at the peak of infection. Therefore, a change in the US RNA level upon ART intensification is a very sensitive indicator of new infections. The fact that for US RNA we observed both a significant reduction in the intensified group and a significant difference between the groups is a strong indicator that some new infections had been occurring prior to intensification.

      The limitations of the study should be more clearly discussed. The small sample size raises the possibility of imbalance at baseline. The supplemental figures (S3-S5) are helpful in showing the differences between groups at baseline, and the variability of measurements is more apparent. The lack of blinding is also a weakness, though the PK assessments do help (note that 3TC levels rise substantially in both groups for most of the time on study; Figure S2).

      The many assays and comparisons are listed as a strength. The many comparisons raise the possibility of finding significance by chance. In addition, if there is an imbalance at baseline, outcomes measuring related parameters will move in the same direction.

      We agree that the multiple comparisons raise the possibility of chance findings but would like to stress that in an exploratory study like this it is very important to avoid a type II error. In addition, the consistent directionality of the most relevant outcomes (US RNA and intact DNA) lends biological plausibility to the observed effects.
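
      The trade-off between chance findings and type II error can be made concrete with a minimal sketch. It assumes independent tests at a nominal alpha of 0.05, and the numbers of comparisons are hypothetical, since the number of tests in the study is not stated in this response.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among n independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


# Hypothetical numbers of comparisons, for illustration only.
for m in (1, 5, 10, 20):
    print(m, round(familywise_error_rate(m), 3), round(bonferroni_alpha(m), 4))
```

      A stricter per-test threshold reduces chance findings but also lowers power, which is the type II error concern raised above for an exploratory study with few participants.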

      The limited impact on activation and inflammation should be addressed in the discussion, as they are highlighted as a potentially important consequence of intermittent, not sustained replication in the introduction.

      The study is provocative and well executed, with the limitations listed above. Pharmacokinetic analyses help mitigate the lack of blinding. The major impact of this work is if it leads to a much larger randomized, controlled, blinded study of a longer duration, as the authors point out.

      Finally, we fully endorse the reviewer’s suggestion that the primary contribution of this study lies in its value as a proof-of-concept and foundation for future randomized, blinded trials of greater scale and duration. We highlighted this more clearly in the revised Discussion (lines 340-346).

      Reviewer #1 (Recommendations for the authors):

      (1) Lines 84-87: How would chronic immune activation/inflammation be expected to differ if viral antigen is being released from stable reservoirs rather than low-level replication?

      This is a very insightful question. Although release of viral antigens from stable reservoirs could certainly also trigger immune activation/inflammation, the reservoir cells in PWH on long-term ART are constantly being negatively selected by the immune system (PMID: 38337034; PMID: 36596305) so that after a number of years on therapy, most proviruses are either transcriptionally silent or express only a low amount of viral RNA/antigen. Recent evidence suggests that these selected cells possess specific biological properties that include mechanisms that limit proviral gene expression (PMID: 36599977; PMID: 36599978). In comparison, low-level replication would result in de novo infection of unselected, activated CD4+ cells that are expected to produce much more viral antigen than preselected reservoir cells.

      (2) Lines 249-253: There are multiple ways to explain this observation - alternatively, the total proviral DNA declined due to transient CD4 depletion.

      As discussed above, CD4⁺ T-cell counts did not significantly decrease in any of the treatment groups, as shown in Figure 5. The apparent decline observed concerns the CD4/CD8 ratio, which transiently dropped, but not the absolute number of CD4⁺ T cells. Moreover, although the dynamics of total HIV DNA is indeed similar to that of CD4/CD8 ratio (both declined transiently and then returned to baseline by day 84), the dynamics of unspliced RNA and unspliced RNA/total DNA ratio is clearly different, as these markers demonstrated a sustained decrease that was maintained throughout the trial period. Also, we observed a significant decrease in intact HIV DNA at day 84 compared to day 0. These effects cannot be easily explained by a transient decline in CD4+ cells.

      (3) Lines 301-305: This is a confusing explanation for not seeing an effect in tissue. Overall, there was no change in total proviral DNA in blood between days 0 and 84 either - yet the explanation for this observation is different (249-253). Was IPDA not performed on the tissue? Wouldn't this be the preferred test for reservoir depletion?

      We thank the reviewer for bringing this point to our attention. We modified the Discussion to prevent the confusion (lines 303-305). As for the IPDA on tissue, we attempted this assay on the tissue samples using two independent DNA extraction methods (Promega Reliaprep and Qiagen Puregene), but both yielded high DNA shearing index values, and intact proviral detection was successful in only 3 of 40 samples. Given the poor DNA integrity, these results were not interpretable.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Weaknesses:

      Only 1 gene (katG) gave a strong defect and 1 (Mab_1456c) exhibited a minor defect. Two of the clones did not show any persistence phenotype (blaR and recR) and one (pafA) showed a minor phenotype.

      We have now carried out more detailed validation studies on the Tn-Seq, with analysis of time-dependent killing over 14 d. This more comprehensive analysis shows that 4 of 5 genes analyzed do indeed have antibiotic tolerance defects under the conditions that Tn-Seq predicted a survival defect (Revised Figure 3). In addition, we found that even before actual cell death, several mutants had delayed resumption of growth after antibiotic removal (Figure 3 Supplemental).

      Fig 3 - Why is there such a huge difference in the extent of killing of the control strain in media, when exposed to TIG/LZD, when compared to Fig. 1C and Fig. 4. In Fig. 1C, M. abs grown in media decreases by >1 log by Day 3 and >4 log by Day 6, whereas in Fig. 3, the bacterial load decreases by <1 log by Day 3 and <2 log by Day 6. This needs to be clarified, if the experimental conditions were different, because if comparing to Fig. 1C data then the katG mutant strain phenotype is not very different.

      We agree with the reviewer that there is variability in the timing and extent of cell death from experiment to experiment. As noted by the reviewer, in Figure 1C the largest decrement in survival is between day 1 - day 3 (also seen in Figure 6A). As they noted, in Figure 4 the largest decrement is between day 3 – day 6 (also seen in Figure 3A, Figure 5F). In each experiment with katG mutants we carefully compare the mutant vs. the control strain within that experiment, which is more accurate than comparing the behavior of a mutant in one experiment to a control in another experiment.

      Reviewer #2 (Public review):

      Weaknesses:

      First, word-choice decisions could better conform to the published literature. Alternatively, novel definitions could be included. In particular, the data support the concept of phenotypic tolerance, not persistence.

      We appreciate the reviewer's comments; text modified.

      Second, two of the novel observations could be explored more extensively to provide mechanistic explanations for the phenomena. 

      We have added several additional experiments; these are detailed below in response to specific comments.

      Reviewer #3 (Public review):

      Weaknesses:

      The findings could not be validated in clinical strains.

      We understand the reviewer’s concern that the katG phenotype was only observed in one of the two clinical strains we studied. We feel that our findings are relevant beyond the ATCC 19977 strain for two reasons:

      (1) We have performed additional analyses of the two clinical isolates and indeed find significant accumulation of ROS following antibiotic exposure in both of these strains (revised Figure 6A).

      (2) We do in fact see a role for katG in starvation-induced antibiotic tolerance in Mabs clinical strain-2. It is not surprising that different strains from a particular species may have some different responses to stresses – for example, there is wide strain-specific variability in susceptibility to different phages within a species based on which particular phage defense modules a given strain carries (for example PMID: 37160116). We speculate that different Mabs strains may express varying levels of other antioxidant factors and note that the genes encoding several such factors were identified by our Tn-Seq screen including the peroxidases ahpC, ahpD, and ahpE. Our analysis of the genetic interactions between katG and these other factors is ongoing. 

      Comments/Suggestions

      (1) In Fig1E, the authors show no difference in killing Mtb with or without adaptation in PBS. These data are contrary to the data presented in Figure 1B. These also do not align with the data of M. smegmatis and M. abscessus. Please discuss these observations in light of the Duncan model of persistence (Mol Microbiol. 2002 Feb;43(3):717-31).

      The above referenced Duncan laboratory study found tolerance after prolonged starvation but did not actually examine tolerance at early time points. While some of the transcriptional and metabolic changes seen by Duncan and others are slow, other groups have described starvation responses in Mtb that are quite rapid. For example, the stringent response mediator ppGpp accumulates within a few hours after onset of starvation in Mtb (PMID: 30906866). We suspect that a rapid signaling response such as this underlies the phenotype we observe. Regarding the difference between Mtb and other mycobacterial species we also find it surprising that Mtb had a much more rapid starvation response. This is a clear species-specific difference that may reflect an adaptation of Mtb to the nutrient-limited physiologic niche within host macrophages.

      (2) Line 151, the authors state that they have used an M. abscessus Tn mutant library of ~55,000 mutant strains. The manuscript will benefit from a description of the coverage of total TA sites by the mutants.

      Text modified to add this detail. There are 91,559 TA sites in the M. abscessus genome. Thus, our Tn density is ~60%.
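
      The ~60% figure follows from the two numbers quoted in this exchange. The short check below assumes, as a simplification not stated in the response, that each of the ~55,000 mutants carries an insertion at a distinct TA site, so the result is best read as an upper bound on site coverage.

```python
def tn_density(n_mutants, n_ta_sites):
    """Fraction of TA sites hit, assuming one distinct site per mutant (upper bound)."""
    return n_mutants / n_ta_sites


# Library size and TA-site count as quoted in the review and response.
density = tn_density(55_000, 91_559)
print(f"Tn density: {density:.1%}")
```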

      (3) Line 155: Please explain how long the cells were kept in an Antibiotic medium.

      This technical detail was noted above on line 153 in the original text: “…and then exposed them to TIG/LZD for 6 days”. To clarify the overall conditions, we have also revised the text of the manuscript and added the detail of how long cells were passaged after removal of antibiotics.

      (4) Line 201: data not shown. Delayed resumption of growth after removal of antibiotic would be helpful in indicating drug resilience. This data could enhance the manuscript.

      Data now provided in Figure 3 Supplemental

      (5) Figures 4C and 4F represent the kill curve. It would be good to show the data with CFU against the drug concentration in place of OD600. CFU rather than OD600 best reflects growth inhibition.

      Figures 4C and 4F are measuring the minimum inhibitory concentration (MIC) to stop the overall growth of the bacterial population. While we agree that CFU could be analyzed, this would be measuring a different outcome – cell death and the minimum bactericidal concentration (MBC). In these experiments we sought to specifically examine the MIC so as to separate growth inhibition from cell death. For this we used the standard method employed by clinical microbiology laboratories for MIC, which is optical density of the culture (PMID: 10325306).
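
      As a rough sketch of how an MIC can be read off an OD600 dilution series: the concentrations, OD readings, and the growth cutoff of 0.1 below are hypothetical illustrations, not values from the manuscript or from a clinical standard.

```python
def mic_from_od(concentrations, od600, growth_cutoff=0.1):
    """Lowest concentration at which this well and every higher one stay below the cutoff.

    Returns None if even the highest concentration shows growth.
    growth_cutoff is an assumed, illustrative threshold.
    """
    mic = None
    # Walk down from the highest concentration until a well shows growth.
    for conc, od in sorted(zip(concentrations, od600), reverse=True):
        if od >= growth_cutoff:
            break
        mic = conc
    return mic


# Hypothetical two-fold dilution series (ug/mL) and OD600 readings.
concs = [0.25, 0.5, 1, 2, 4, 8]
ods = [0.82, 0.79, 0.41, 0.06, 0.05, 0.04]
print(mic_from_od(concs, ods))
```

      A well at or above the MIC may still contain viable, non-growing cells, which is why plating for CFU answers a different question (the MBC) than this optical-density readout.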

      (6) Figure 5C. The authors should show the effect of TIG/LZD on M. abscessus ROS production without the PBS adaptation. It is important to conclude that TIG/LZD induces ROS in cells. Authors should utilize ROS scavengers such as Thiourea, DFO, etc., to conclude ROS's contribution to bacterial killing following inhibition of transcription and translation.

      New data added (revised Figure 5 and Figure 5 Supplemental)  

      (7) Line 303. Remove "note".

      Text revised. We thank the reviewer for identifying this typographical error.  

      (8) The introduction and Discussion are very similar, and several lines are repeated.

      Text revised with overlapping content removed.

      Reviewer #1 (Recommendations for the authors):

      It appears that the same datasets for PBS adapted cultures were plotted in A-C and D-F. Either this should be specifically mentioned in the legend or it might be better to integrate the non-adapted plots into A-C which would also allow easier comparison.

      Appreciate the reviewer’s suggestion; text modified with added clarification to figure legend.

      This manuscript is focused on M. abs and the antibiotics TIG/LZD, so the Mtb data or data using the antibiotics INH/RIF/EMB serve more as a distraction and can be removed.

      We appreciate the reviewer’s perspective. However, we wish to include these data to show the similarities (and differences) in starvation-induced tolerance between the three organisms.

      Fig 3 - As mentioned for Fig. 1, it appears that the same dataset was used for the control in all the figures A-E. This should be explicitly stated in the Figure legend.

      Appreciate the reviewer’s suggestion; text modified with added clarification to figure legend.

      The divergent results from the clinical strains are extremely interesting. It would be helpful to determine the oxidative stress levels (similar to the cellROX data shown in 5E), to tease out if the difference in katG role is because of lack of ROS induction in these strains or due to expression of alternate anti-oxidative stress defense mechanisms.

      We have performed additional cellROX analysis as suggested by the reviewer and found that the ROS induction is indeed present across all three Mabs strains, but that katG is only required in one of the two strains (Strain #2). These data are now included in the revised Figure 6.

      Reviewer #2 (Recommendations for the authors):

      GENERAL COMMENTS

      This is a nice piece of work that uses the pathogen Mabs as a test subject.

      The work has findings that likely apply generally to antibiotics and mycobacteria: 1) phenotypic tolerance is associated with suppression of ROS, 2) lethal protein synthesis inhibitors act via accumulation of ROS, and 3) levofloxacin behaves in an unexpected way. Each is a new observation. However, I believe that each topic requires more work to be firmly established to be suitable for eLife.

      Phenotypic tolerance: Association with suppression of ROS is important but expected. I would solidify the conclusion by performing several additional experiments. For example, confirm the lethal effect of ROS by reducing it with an iron chelator and a radical scavenger. There is a large literature on effects of iron uptake, levels, etc. on antibiotic lethality that could be applied to this question. In 2013 Imlay argued against the validity of fluorescent probes. Perhaps getting the same results with another probe would strengthen the conclusion.

      We have carried out additional experiments with both an iron chelator and small molecule ROS scavengers to further test this idea but note that these experiments have several inherent limitations: 1) These compounds have highly pleiotropic effects. For example, while N-acetyl cysteine (NAC) is an antioxidant, it also increases mycobacterial respiration and was shown to paradoxically decrease antibiotic tolerance in M. tuberculosis (PMID: 28396391). 2) It has been shown by the Imlay group that small-molecule antioxidants are often ineffective in quenching ROS in bacteria (PMID: 388893820), making negative results difficult to interpret. Nonetheless, we present new experimental data showing that iron chelation does indeed improve the survival of antibiotic-treated Mabs (revised Figure 5). However, small molecule antioxidants such as thiourea do not restore antibiotic tolerance and actually increased bacterial cell death, suggesting that they may be affecting respiration in Mabs in a manner similar to that seen for NAC in Mtb. We also note that our genetic analysis, which identified numerous other genes encoding proteins with antioxidant function (Figure 2), is a strong additional argument in support of the importance of ROS in antibiotic-mediated lethality.

      Regarding the concern raised by Imlay about the validity of oxidation-sensitive dyes: this relates to concern about bacterial autofluorescence induced by antibiotics, which can confound analyses in some species. We have ruled this out in our analyses by using bacteria unstained by cellROX as controls to confirm that there is negligible autofluorescence in Mabs (<0.1%, Figure 5E, Figure 6A).

      Protein synthesis inhibitors: At present, this is simply an observation. More work is needed to suggest a mechanism. For example, with E. coli the aminoglycosides are protein synthesis inhibitors that also cause membrane damage. Membrane damage is known to stimulate ROS-mediated killing. Your observation needs to be extended because chloramphenicol, another protein synthesis inhibitor, blocks ROS production. The lethality may be a property of mycobacteria: does it occur with E. coli (note that rifampicin is bacteriostatic with E. coli but lethal to Mtb)?

      We agree with the reviewer that the mechanism underlying ROS accumulation following transcription or translational inhibition in Mabs is of significant interest. It is likely to be a mechanism different from E. coli, because in E. coli tetracyclines and rifamycins are both bacteriostatic, whereas in Mabs they are both bactericidal. Determining the mechanism by which translation inhibitors cause ROS accumulation in Mabs is an ongoing effort in our laboratory using proteomics and metabolomics, but is outside the scope of this manuscript.

      Levofloxacin: This is also at the observational stage but is unexpected. In other studies, ROS is involved in quinolone-mediated killing of bacteria. Why is this not the case with Mabs? The observation should be solidified by showing the contrast with moxifloxacin, since this compound has been studied with mycobacteria (Shee 2022 AAC). With E. coli, quinolone structure can affect the relative contribution of ROS to killing (Malik 2007 AAC), as is also seen with Mtb (Malik 2006 AAC). What is happening in the present work with levofloxacin, an important anti-tuberculosis drug? Is there a structural explanation (compare with ofloxacin)?

      While these are interesting questions, a detailed exploration of the structure-function relationships between different fluoroquinolone antibiotics and their varying activities on Mtb and Mabs is outside the scope of this manuscript.  

      The writing is generally easy to follow. However, the concept of persistence should be changed to phenotypic tolerance with text changes throughout. I base this suggestion on the definitions of tolerance and persistence as stated in the consensus review (Balaban 2019 Nat Micro Rev). Experimentally, tolerance is seen as a gradual decline in survival following antibiotic addition; the decline is slower than seen with wild-type cells. The data presented in this paper fit that definition. In contrast, persistence refers to a rapid drop in survival followed by a distinct plateau (Balaban 2019 Nat Micro Rev; for example, see Wu Lewis AAC 2012 ). Moreover, to claim persistence, it would be necessary to demonstrate subpopulation status, which is not done. The Balaban review is an attempt to bring order to the field with respect to persistence and tolerance, since the two are commonly used without regard for a consistent definition.

      We appreciate the reviewer’s suggestion; text modified in multiple places to clarify.

      Another issue requiring clarification is the relationship between resistance and tolerance. Killing by antibiotics is a two-step process, as most clearly seen with quinolones. First a reversible bacteriostatic event occurs. Resistance blocks that bacteriostatic damage. Then a lethal metabolic response to that damage occurs. Tolerance selectively blocks the second, killing event, a distinct process that often involves the accumulation of ROS. Direct antibiotic-mediated damage is an additional mode of killing that also stems from the reversible, bacteriostatic damage created by antibiotics. The authors recognize the distinction but could make it clearer. Take a look at Zheng (JJ Collins) 2020, 2022.

      Text modified to clarify this point

      Many readers would also like to see a bit more background on Mabs. For example, does it grow rapidly? Are there features that make it a good model for studying mycobacteria or bacteria in general? The more general, the better.

      Text modified, background added

      Below I have listed specific comments that I hope are useful in bringing the work to publication and making it highly cited.

      SPECIFIC COMMENTS

      Line 30 unexpectedly. I would delete this word because the result is expected from the ROS work of Shee et al 2022 with mycobacteria. Moreover, Zeng et al 2022 PNAS showed that ROS participates in antimicrobial tolerance, and persistence is a form of tolerance (Balaban et al, 2019, Nat Micro Rev).

      Text modified as per review suggestion

      Line 39 key goal: this is probably untrue in the general sense stated, since bacteriostatic antibiotics are sufficient to clear infection (Wald-Dickler 2019 Clin Infect Dis). However, it is likely to be the goal for Mtb infections.

      We agree with the reviewer that bacteriostatic antibiotics are effective in treating most types of infections and do not claim otherwise in the manuscript. However, from a clinical standpoint, eradication of the pathogen causing the infection is indeed the goal of antibiotic therapy in virtually all circumstances (with the exception of specific scenarios such as cystic fibrosis where it is recognized that the infecting organism cannot be fully eliminated). In most cases, the combination of bacteriostatic antibiotics and the host immune response is sufficient to achieve eradication. We have modified the manuscript text to reflect this nuance noted by the reviewer.

      Line 62 several: you list three, but hipAB works via ppGpp, so the sentence needs fixing

      Text modified  

      Line 70 uncertain: this uncertainty is unreferenced. Since everything is uncertain, this vague phrase does not add to the story.

      The reviewer makes an interesting philosophical argument. However, we would submit that some aspects of biology, for example the regulation of glycolysis, are understood in great detail. In contrast, other mechanisms, such as the precise mechanisms of lethality for diverse antibiotics in different bacterial species, are far more uncertain and remain a subject of debate (for example PMID: 39910302). Text not modified.

      Line 72 somewhat controversial: I would delete this, because the points in the Science papers by Lewis and Imlay have been clarified and in some cases refuted by prior and subsequent work.

      Text modified

      Line 72 presumed: this suggests that it is wrong and perhaps a different idea has replaced it. Another, and more likely view is that there is an additional mode of killing. I suggest rephrasing to be more in line with the literature.

      Text modified for clarity. In this sentence “presume” refers to the historical concept that direct target inhibition was solely responsible for antibiotic lethality. As the reviewer notes, there is now significant literature that ROS (and perhaps other secondary effects) also contribute to bacterial killing.  

      Line 73 However and the following might also: this phrasing, plus the presumed, misleads the reader from your intent. I suggest rephrasing.

      See above re: line 72

      Line 75 citations: these are inappropriate and should be changed to fit the statement. I suggest the initial paper by Collins (Kohanski 2007 Cell), a recent paper by Zhao (Zeng PNAS 2022), and a review (Drlica Expert Rev Anti-infect Therapy 2021). The present citations are fine if you want to narrow the statement to mycobacteria, but the history is that the E. coli work came first and was then generalized to mycobacteria. A mycobacterial paper for ROS is Shee 2022 AAC.

      We thank the reviewer for noticing that we inadvertently omitted several important E. coli-related references. These have been added.

      Line 75 and 76: Conversely ... unresolved. Compelling arguments have been made that show major flaws in the two papers cited, and a large body of evidence has now accumulated showing the validity of the idea promoted by the Collins lab, beginning with Kohanski 2007. In addition to many papers by Collins, see Hong 2019 PNAS and Zeng 2022 PNAS. It is fine if you want to counter the arguments against the Lewis and Imlay papers (summarized in Drlica & Zhao 2021 Expert Rev Anti-infect Therapy), but making a blanket statement suggests that the authors are unfamiliar with the literature.

      We agree with the reviewer that the weight of the evidence supports a role for antibiotic-induced ROS as an important mechanism for antibiotic lethality under many (though not all) conditions. We have revised the text to better reflect this nuance.

      Line 78. Advantages over what?

      Text modified

      Line 80 exposure: to finish the logic you need to show that E. coli and S. aureus persisters fail to do this.

      We thank the reviewer for their suggestion but studying these other organisms is outside the scope of this study. 

      Line 82 whereas: this misdirects the reader. It would seem that a simple "and" is better

      Text modified

      Line 89 I think this paragraph is about the need to study Mabs, the subject of the present report. This paragraph could use a more appropriate topic sentence to guide the reader so that no guessing is involved. I suggest rephrasing this paragraph to make the case for studying more compelling.

      Text modified

      Line 96. I suggest citing several references after subinhibitory concentration of antibiotic.

      The references are in the following sentence alongside the key observations.

      Line 99. Genetic analysis: how does this phrase fit with the idea of persister cells arising stochastically?

      There are two issues: 1) We would argue that persister formation is not completely stochastic, but rather a probability that can be modified both genetically and by environment (for example hipA PMID: 6348026). 2) Even if persister formation were totally stochastic, the survival of these cells may depend on specific genes – as we indeed find in our Tn-Seq analysis of Mabs.  

      Line 106. In this paragraph you need to define persister. The consensus definition (Balaban 2019 Nat Micro Rev) is a subpopulation of tolerant cells. Tolerance is defined as the slowing or absence of killing while an antibiotic retains its ability to block growth. See Zeng 2022 PNAS for an example with rapidly growing cells. Phenotypic tolerance is the absence of killing due to environmental perturbations, most notably nutrient starvation, dormancy, and growth to stationary phase. By extension, phenotypic persistence would be the subpopulation status of phenotypically tolerant cells. If you have a different definition, it is important to state it and emphasize that you disagree with the consensus statement.

      Text modified  

      Line 109 unexpectedly. I would delete this word, because the literature leads the reader to expect this result unless you make a clear case for Mabs being fundamentally different from other bacteria with respect to how antibiotics kill bacteria (this is unlikely, see Shee 2022 AAC). Indeed, lines 111-113 state extensions of E. coli work, although suppression of ROS in phenotypic tolerance and genetic persistence have not been demonstrated.

      Text modified

      Line 124 you might add, in parentheses and with references, that a property of persisters is crosspersistence to multiple antibiotic classes. This is also true for tolerance, both genetic and phenotypic. An addition will support your approach.

      Text modified

      Line 128 minimal

      Text not modified. We appreciate the reviewer’s preference, but “minimal” and “minimum” are both widely accepted terms. Indeed, the Balaban et al. 2019 consensus statement on definitions cited by the reviewer above also uses “minimum” (PMID: 30980069), as do IDSA clinical guidelines (PMID: 39108079).

      Line 130 is MIC somehow connected to killing or did you also measure killing? Note that blocking growth and killing cells are mechanistically distinct phenomena, although they are related. By being upstream from killing, blockage of growth will also interfere with killing.

      Text modified

      Line 133 PBS is undefined

      Text modified

      Line 134 increase in persisters ... you need to establish that these are not phenotypically tolerant cells. Do they constitute the entire population (tolerance)? Your data would be more indicative of persisters if you saw a distinct plateau with the PBS samples, as such data are often used to document persistence (retardation of killing is a property of tolerance, Balaban 2019). Fig. 1B is clearly phenotypic tolerance, as the entire population grows. Your data suggest that you are not measuring persistence as defined in the literature (Balaban 2019). Line 139 persister should be tolerance

      Text modified

      Lines 142, 143, 144, 159, 163, 171, 181, 211, 226, 238, 246, 277, 279, 289 persistent should be tolerant

      Text modified

      Line 146 fig 1E Mtb does not show the adaptation phenomenon and it is clearly tolerant, not persistent. This should be pointed out. As stated, you may be misleading the reader.

      Text modified  

      Line 169. Please make it clear whether these genes are affecting antibiotic susceptibility (MIC will affect killing because blocking growth is upstream) or if you are dealing with tolerance (no change in MIC). These measurements are essential and should be included as a table. By antibiotic response, do you mean that antibiotics change expression levels?

      Regarding MICs, the data for MICs in control and katG mutant are presented in Figure 4C and 4F. Regarding ‘response’ we have clarified the text of this sentence.

      Line 174 Interestingly should be as expected

      Text not modified; tetracyclines do not induce ROS in E. coli and oxazolidinones have not been studied in this regard.

      Line 183 you need to include citations. You can cite the ability of chloramphenicol to block ROS-mediated killing of E. coli. That allows you to use the word unexpected

      Text modified

      Line 199. All of the data in Fig. 3 shows tolerance, not persistence, requiring word changes in this paragraph.

      Text modified

      Line 226. The MIC experiment is important. You can add that this result solidifies the idea that blocking growth and killing cells are distinct phenomena. You can cite Shee 2022 AAC for a mycobacterial paper

      Text modified

      Line 241. The result with levofloxacin is unexpected, because the fluoroquinolones are widely reported to induce ROS, even with mycobacteria (see Shee 2022 AAC). You need to point this out and perhaps redo the experiment to make sure it is correct.

      We appreciate the reviewer’s interest in this question. All experiments in this paper were repeated multiple times. This particular experiment was repeated 3 times, and in all replicates the katG mutant was sensitized to translation inhibitors but not levofloxacin. Shee et al. examined Mtb treated with moxifloxacin and found ROS generation, but did not assess whether an Mtb katG mutant had impaired survival. Thus, in addition to differences in i) the species studied and ii) the particular fluoroquinolone used, the two sets of experiments were designed to address different questions (ROS accumulation vs protection by katG). A cell might accumulate ROS without a katG mutant having impaired survival if genetic redundancy exists – a result we indeed see in our clinical Mabs strains under some conditions (new data included in revised Figure 6A).

      Line 269 Additional controls would bolster the conclusion: use of an antioxidant such as thiourea and an iron chelator (dipyridyl) both should reduce ROS effects.

      New experiments performed, revised Figure 5.

      Line 276 the word no is singular

      Text modified

      Line 284 this suggested ... in fact previous work suggested. This summary paragraph might go better as the first paragraph of the Discussion

      Text modified to specify that this is in reference to the work in this manuscript

      Lines 294-299 Most of this is redundant and should be deleted.

      Text modified

      Line 299 this species is vague

      Text modified

      Line 310 Do you want to discuss spoT?

      Text not modified

      Line 313 paragraph is largely redundant

      Text modified

      Line 314 controversial. As above, I would delete this, especially since it is not referenced and is unlikely to be true. If you believe it, you have the obligation to show why the ROS-lethality idea is untrue. If you are referring to Lewis and Imlay, there were almost a dozen supporting papers before 2013 and many after. This statement does not make the present work more important, so deletion costs you nothing.

      Text modified

      Line 314 direct disruption of targets. This is clearly not a general principle, because the quinolones rapidly kill while inhibition of gyrase by temperature-sensitive mutations does not (Kreuzer 1979 J.Bact; Steck 1985). Indeed, formation of drug-gyrase-DNA complexes is reversible: death is not.

      Text modified

      Line 318 as pointed out above, you have not brought this story up to date. The two papers mainly focused on Kohanski 2007, ignoring other available evidence.

      Text modified

      Line 326 you need to cite Shee 2022 AAC

      Text modified

      Line 342 the idea of mutants being protective is not novel, as several have been reported with E. coli studies. Thus, there is a general principle involved.

      We agree that this suggests a potential general principle

      Line 344. It depends on the inhibitor. For example, aminoglycosides are translation inhibitors and they also cause the accumulation of ROS.

      We agree that ROS generation depends on the inhibitor, and indeed upon other variables including drug concentration, growth conditions, and bacterial species as well.  

      Line 347. You need to point out the considerable data showing that the absence of catalase increases killing

      Text modified

      Line 363 look at Shee 2022 AAC and Jacobs 2021 AAC

      Text modified, reference added.

      Line 585 I suggest having a colleague provide critical comments on the manuscript and acknowledge that person.

      Text not modified

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Pavel et al. analyzed a cohort of atrial fibrillation (AF) patients from the University of Illinois at Chicago, identifying TTN truncating variants (TTNtvs) and TTN missense variants (TTNmvs). They reported a rare TTN missense variant (T32756I) associated with adverse clinical outcomes in AF patients. To investigate its functional significance, the authors modeled the TTN-T32756I variant using human induced pluripotent stem cell-derived atrial cardiomyocytes (iPSC-aCMs). They demonstrated that mutant cells exhibit aberrant contractility, increased activity of the cardiac potassium channel KCNQ1 (Kv7.1), and dysregulated calcium homeostasis. Interestingly, these effects occurred without compromising sarcomeric integrity. The study further identified increased binding of the titin-binding protein Four-and-a-Half Lim domains 2 (FHL2) with KCNQ1 and its modulatory subunit KCNE1 in the TTN-T32756I iPSC-aCMs.

      Strengths:

      This work has translational potential, suggesting that targeting KCNQ1 or FHL2 could represent a novel therapeutic strategy for improving cardiac function. The findings may also have broader implications for treating patients with rare, disease-causing variants in sarcomeric proteins and underscore the importance of integrating genomic analysis with experimental evidence to advance AF research and precision medicine.

      Weaknesses

      (1) Variant Identification: It is unclear how the TTN missense variant (T32756I) was identified using REVEL, as none of the patients' parents reportedly carried the mutation or exhibited AF symptoms. Are there other TTN variants identified in the three patients carrying TTN-T32756I? Clarification on this point is necessary.  

      We thank the reviewer for their insightful comment. We have now clarified these in the method section.

      Line 484-491: “The TTN-T32756I variant (REVEL Score: 0.58758, Supplementary Table 1) was prioritized due to its occurrence in multiple unrelated individuals within our clinical AF cohort, despite no reported family history of AF in affected individuals. While no parental inheritance was observed, the possibility of de novo origin cannot be excluded. Furthermore, this variant is located within a region overlapping a deletion mutation recently shown to cause AF in a zebrafish model, supporting its potential pathogenicity [37]. Notably, the affected individuals did not carry additional loss-of-function TTN variants.”

      (2) Patient-Specific iPSC Lines: Since the TTN-T32756I variant was modeled using only one healthy iPSC line, it is unclear whether patient-specific iPSC-derived atrial cardiomyocytes would exhibit similar AF-related phenotypes. This limitation should be addressed.

      We have now acknowledged this limitation in the revised manuscript.

      Line 505-509: “Due to the patients' unavailability of peripheral blood mononuclear cells (PBMCs), we utilized a healthy iPSC line and introduced the TTN-T32756I variant using CRISPR/Cas9 genome editing. This approach ensures an isogenic background, thereby minimizing genetic variability and providing a controlled system to study the direct effects of the mutation.”

      (3) Hypertension as a Confounding Factor: The three patients carrying TTN-T32756I also have hypertension. Could the hypertension associated with this variant contribute secondarily to AF? The authors should discuss or rule out this possibility.

      We have now explicitly discussed this in the revised manuscript.

      Line 362-367: “Hypertension is a common comorbidity in patients with AF and could contribute to disease progression. However, all three individuals carrying TTN-T32756I exhibited early-onset AF (onset before 66 years), with one case occurring as early as 36 years. This suggests a potential two-hit mechanism, where genetic predisposition and comorbidities influence disease risk. Importantly, our iPSC model isolates the genetic effects of TTN-T32756I from other factors, supporting a direct pathogenic role.”

      (4) FHL2 and KCNQ1-KCNE1 Interaction: Immunostaining data demonstrating the colocalization of FHL2 with the KCNQ1-KCNE1 (MinK) complex in TTN-T32756I iPSC-aCMs are needed to strengthen the mechanistic findings.

      We thank the reviewer for this insightful suggestion. We agree that additional immunostaining data would further strengthen the evidence for FHL2 colocalization with the KCNQ1-KCNE1 complex in TTN-T32756I iPSC-aCMs. In line with this, we have expanded our analysis to include both co-immunoprecipitation and confocal microscopy.  As described in the revised manuscript (Lines 282–287), the colocalization between KCNE1 and FHL2 was increased by approximately threefold in TTN-T32756I iPSC-aCMs compared with WT, supporting an enhanced interaction between these proteins (Figure 5A, Supplementary Figure 6). We are generating additional immunostaining data to validate and extend these findings, and we will incorporate them into the revised submission to further substantiate the mechanistic link proposed.

      Line 282-287: “…if TTN-T32756I increases I<sub>Ks</sub> by modulating the interaction between KCNQ1-KCNE1 and FHL2, we performed co-immunoprecipitation studies and confocal microscopy in both WT and TTN-T32756I-iPSC-aCMs. The co-localization between KCNE1 and FHL2 increased ~3-fold in TTN-T32756I-iPSC-aCMs, suggesting an increased interaction between them (Figure 5A, Supplementary Figure 7).”

      (5) Functional Characterization of FHL2-KCNQ1-KCNE1 Interaction: To further validate the proposed mechanism, additional functional assays are necessary to characterize the interaction between FHL2 and the KCNQ1-KCNE1 complex in TTN-T32756I iPSC-aCMs.

      We thank the reviewer for this valuable suggestion. We agree that additional functional assays would provide further validation of the proposed mechanism. However, we believe such in-depth characterization warrants a dedicated follow-up study and is beyond the scope of the current revision. In this work, our primary objective is to establish that the TTN missense variant can exert a detrimental effect and serve as a substrate for AF. 

      Line 418-419: “Further study is needed to validate the proposed mechanism and determine if TTNmvs in other regions are associated with AF by a similar process.”

      Reviewer #2 (Public review):

      Summary:

      The authors present data from a single-center cohort of African-American and Hispanic/Latinx individuals with atrial fibrillation (AF). This study provides insight into the incidences and clinical impact of missense variants in this population in the Titin (TTN) gene. In addition, the authors identified a single amino acid TTN missense variant (TTN-T32756I) that was further studied using human induced pluripotent stem cell-derived atrial cardiomyocytes (iPSC-aCMs). These studies demonstrated that the Four-and-a-Half Lim domains 2 (FHL2) has increased binding with KCNQ1 and its modulatory subunit KCNE1 in the TTN-T32756I-iPSC-aCMs, enhancing the slow delayed rectifier potassium current (Iks) and is a potential mechanism for atrial fibrillation. Finally, the authors demonstrate that suppression of FHL2 could normalize the Iks current.

      Strengths:

      The strengths of this manuscript/study are listed below:

      (1) This study includes a previously underrepresented population in the study of the genetic and mechanistic basis of AF.

      (2) The authors utilize current state-of-the-art methods to investigate the pathogenicity of a specific TTN missense variant identified in this underrepresented patient population.

      (3) The findings of this study identify a potential therapeutic for treating atrial fibrillation.

      Weaknesses:

      (1) The authors do not include a non-AF group when evaluating the incidence and clinical significance of TTN missense variants in AF patients.

      We appreciate the reviewer’s comment and acknowledge the limitation of not including a non-AF control group in our clinical analysis. As noted in the revised manuscript (Lines 347–353), our cohort was derived from a single-center registry of individuals with AF and therefore lacks a matched non-AF control population for direct comparison of TTN missense variant incidence. We agree that future studies incorporating larger, multiethnic validation cohorts with both AF and non-AF individuals, as well as evaluating AF-specific measures such as arrhythmia burden and treatment response, will be essential to fully elucidate the clinical significance of TTN missense variants in AF.

      Line 347-353: “Our cohort is derived from a single-center multi-ethnic registry of individuals with AF and lacks a matched cohort of non-AF controls to compare the incidence of TTN missense variants. Further study exploring these associations in multi-ethnic, larger validation cohorts that include both AF and non-AF individuals and examining AF-specific measures such as arrhythmia burden or treatment response will be necessary to fully understand the clinical importance of TTNmvs in AF.”

      (2) The authors do not provide evidence that TTN-T32756I-iPSC-aCMs are arrhythmogenic, only that there is an increase in the Iks current and associated action potential changes. More specifically, the authors report that "compared to the WT, TTN-T32756I-iPSC-aCMs exhibited increased arrhythmic frequency," yet it is unclear what they are referring to by "arrhythmic frequency."

      We thank the reviewer for this important point and for highlighting the need for clarification. In our study, the term “arrhythmic frequency” was intended to describe the increased spontaneous beating rate, irregular action potential patterns, and abnormal calcium handling observed in TTN-T32756I iPSC-aCMs compared with WT. These findings support the concept that the AF-associated TTN-T32756I variant promotes ion channel remodeling and perturbs excitation–contraction coupling, thereby creating a potential arrhythmogenic substrate for AF. To avoid ambiguity, we have removed the term “arrhythmic frequency” and revised the text for clarity and precision (Lines 222–223).

      Lines 222-223: “Compared to the WT, TTN-T32756I-iPSC-aCMs exhibited increased frequency along with a significant reduction of the time to 50% and 90% decline of calcium transients (Figure 3G-I, Supplementary Figure 4F).”

      (3) There seem to be discrepancies regarding the impact of the TTN-T32756I variant on mechanical function. Specifically, the authors report "both reduced contraction and abnormal relaxation in TTN-T32756I-iPSC-aCMs" yet, separately report "the contraction amplitude of the mutant was also increased . . . suggesting an increased contractile force by the TTN-T32756I-iPSC-aCMs and TTN-T32756I-iPSC-CMs exhibited similar calcium transient amplitudes as the WT."

      We thank the reviewer for highlighting this critical point and apologize for the lack of clarity. We intended to distinguish between changes in contractile force and contractile dynamics. Specifically, the increased contraction amplitude observed in TTN-T32756I iPSC-aCMs reflects enhanced contractile force, whereas the reduced contraction duration and impaired relaxation reflect abnormalities in contractile kinetics. Together, these findings indicate that the TTN-T32756I variant alters both the strength and the temporal dynamics of contraction, consistent with dysfunctional mechanical performance. We have revised the text accordingly to more accurately convey these results (Lines 187–192).

      Lines 187-192: “Compared to WT, the beating frequency of the TTN-T32756I-iPSC-aCMs was significantly increased (52 ± 7.8 vs. 98 ± 7.5 beats per min, P=0.001; Figure 2C) coupled with the reduction of the contraction duration (456.5 ± 61.45 vs 262.9 ± 48.16 msec, P=0.032; Figure 2D), the peak-to-peak time (1529 ± 195.5 vs 636.6 ± 135.8 msec, P=0.004; Supplementary Figure 3B),  and the relaxation (281.5 ± 42.95 vs 79.40 ± 21.14 msec, P=0.003; Supplementary Figure 3A).”

      Reviewer #3 (Public review):

      Summary:

      The authors describe the abnormal contractile function and cellular electrophysiology in an iPSC model of atrial myocytes with a titin missense variant. They provide contractility data by sarcomere length imaging, calcium imaging, and voltage clamp of the repolarizing current iKs. While each of the findings is interesting, the paper comes across as too descriptive because there is no data merging to support a cohesive mechanistic story/statement, especially from the electrophysiological standpoint. There is not enough support for the title "A Titin Missense Variant Causes Atrial Fibrillation", since there is no strong causative evidence. There is some interesting clinical data regarding the variant of interest and its association with HF hospitalization, which may lead to future important discoveries regarding atrial fibrillation.

      Strengths:

      The manuscript is well written, and a wide range of experimental techniques are used to probe this atrial fibrillation model.

      Weaknesses

      (1) While the clinical data is interesting, it is essential to rule out heart failure with preserved EF as a confounder. HFpEF leads to AF due to increased atrial remodeling, so the fact that patients with this missense variant have increased HF hospitalizations does not necessarily directly support the variant as causative of AF. It could be that the variant is associated directly with HFpEF instead, and this needs to be addressed and corrected in the analyses.

      We appreciate the reviewer’s insightful comment and agree that HFpEF-related atrial remodeling could represent a potential confounder in the association between TTN missense variants and AF. The primary aim of our clinical analysis was to assess the potential significance of TTNmv in AF, recognizing the inherent limitations of retrospective observational data in establishing causality. To complement this, our in vitro studies were specifically designed to demonstrate that TTNmv can alter the electrophysiological substrate, thereby predisposing to AF independent of clinical comorbidities.

      While HFpEF is an important consideration, to our knowledge, no existing literature directly implicates TTNmv in HFpEF pathogenesis. In contrast, loss-of-function TTN variants are more commonly associated with HFrEF and dilated cardiomyopathy, and even these associations remain an area of active debate. To address potential confounding in our cohort, we adjusted for reduced ejection fraction in multivariable analyses of clinical outcomes. Additionally, we performed a sensitivity analysis excluding patients with nonischemic dilated cardiomyopathy (Supplementary Table 6). Together, these approaches mitigate the potential impact of heart failure subtypes on our findings, while our mechanistic studies strengthen the argument that TTNmv may contribute directly to AF susceptibility.

      (2) All contractility and electrophysiologic data should be done with pacing at the same rate in both control and missense variant groups, to control for the effect of cycle length on APD and calcium loading. A shorter APD cannot be claimed when the firing rate of one set of cells is much faster than the other, since shorter APD is to be expected with a quicker rate. Similarly, contractility is affected by diastolic interval because of the influence of SR calcium content on the myocyte power stroke. So the cells need to be paced at the same rate in the IonOptix for any direct comparison of contractility. The authors should familiarize themselves with the concept of electrical restitution.

      We thank the reviewer for this crucial technical comment. iPSC-derived cardiomyocytes (iPSC-CMs) are known to exhibit spontaneous automaticity due to the presence of pacemaker-like currents and reduced I<sub>K1</sub>, which enables interrogation of their intrinsic electrophysiological properties and disease-relevant remodeling. In our study, we leveraged this feature to test the hypothesis that TTN missense variants alter electrophysiological properties through ion channel remodeling. That said, we fully agree with the reviewer that pacing iPSC-CMs at a controlled cycle length is essential for minimizing rate-dependent effects on APD, calcium handling, and contractility, and would improve the interpretability of group comparisons. While iPSC-CMs with matched genetic backgrounds are expected to display broadly comparable electrophysiological profiles, biological and technical variability can influence spontaneous beating rates, thereby confounding direct comparisons. To address this, we have incorporated pacing protocols into our revised experimental design to ensure that APD and contractility measurements are obtained under identical cycle lengths, consistent with the concept of electrical restitution.

      (3) It is interesting that the firing rate of the myocytes is faster with the missense variant. This should lead to a hypothesis and investigation of abnormal automaticity or triggered activity, which may also explain the increased contractility since all these mechanisms are related to the SR's calcium clock and calcium loading. See #2 above for suggestions on how to probe calcium handling adequately. Such an investigation into impulse initiation mechanisms would be compelling in supporting the primary statement of the paper since these are actual mechanisms thought to cause AF.

      We thank the reviewer for this insightful suggestion. We agree that the faster firing rate observed in TTN-T32756I iPSC-aCMs raises the possibility of abnormal automaticity or triggered activity, both of which are highly relevant to AF pathophysiology. As these mechanisms are tightly coupled to calcium handling and the SR calcium clock, further probing of calcium cycling abnormalities would provide valuable mechanistic insights. While this level of investigation is beyond the scope of the current study, we view it as a compelling future direction that could directly link TTN missense variants to impulse initiation abnormalities contributing to AF. 

      (4) The claim of shortened APD without correcting for cycle length is problematic. However, linking shortened APD in isolated cells alone to AF causation is more complicated. To have a setup for reentry, there must be a gradient of APD from short to long, and this can only be demonstrated at the tissue level, not at the cellular level, so reentry should not be invoked here. If shortened APD is demonstrated with correction of the cycle length problem, restitution curves can be made showing APD shortening at different cycle lengths. If restitution is abnormal (i.e. the APD does not shorten normally in relation to the diastolic interval), this may lead to triggered activity which is an arrhythmogenic mechanism. This would also tie in well with the finding of abnormally elevated iKs current since iKs is a repolarizing current directly responsible for restitution.

      We thank the reviewer for this necessary clarification. We agree that isolated cell studies cannot directly demonstrate reentrant circuits and that reentry should not be inferred solely from cellular APD data. Our observation of shortened APD and abnormal beating patterns in TTN-T32756I iPSC-aCMs suggests ion channel remodeling that may predispose to arrhythmogenic conditions. Still, we recognize that tissue-level gradients of APD are required to establish reentry as a mechanism. Accordingly, we have removed mention of “the reentrant mechanism” from the revised manuscript and limited our interpretation to the cellular findings. Future studies incorporating pacing protocols and restitution curve analyses will be valuable in determining whether abnormal APD restitution and elevated I<sub>Ks</sub> contribute to triggered activity, thereby providing a more direct mechanistic link to AF (Lines 101–105).

      Lines 101-105: “Our study showed that the TTN-T32756I iPSC-aCMs exhibited a striking AF-like EP phenotype in vitro, and transcriptomic analyses revealed that the TTNmv increases the activity of the FHL2, which then modulates the slow delayed rectifier potassium current (I<sub>Ks</sub>) to cause AF.” 
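      The restitution-curve analysis mentioned above as future work can be illustrated with a minimal sketch. It assumes steady-state pacing, so that the diastolic interval (DI) is approximately the pacing cycle length minus the APD. The function names and the numerical values are hypothetical and are not taken from the manuscript.

```python
def restitution_points(cycle_lengths_ms, apd_ms):
    """Return (DI, APD) pairs sorted by DI, assuming DI = cycle length - APD."""
    points = []
    for cycle_length, apd in zip(cycle_lengths_ms, apd_ms):
        di = cycle_length - apd
        if di <= 0:
            raise ValueError("APD must be shorter than the pacing cycle length")
        points.append((di, apd))
    return sorted(points)


def restitution_slopes(points):
    """Finite-difference slope of APD versus DI between neighbouring points."""
    return [
        (apd2 - apd1) / (di2 - di1)
        for (di1, apd1), (di2, apd2) in zip(points, points[1:])
    ]


# Hypothetical APD90 values (ms) measured at four pacing cycle lengths (ms).
cycle_lengths = [1000, 800, 600, 400]
apd90 = [300, 290, 270, 230]

points = restitution_points(cycle_lengths, apd90)
slopes = restitution_slopes(points)
```

      Comparing such curves between WT and variant cells paced at identical cycle lengths would show whether APD shortens normally as the diastolic interval decreases.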

      Reviewer #1 (Recommendations for the authors):

      Electrophysiological Phenotype in Ventricular CMs: Has the iPSC line carrying TTN-T32756I been differentiated into ventricular cardiomyocytes (iPSC-vCMs)? The reported cellular phenotype in iPSC-aCMs does not seem to specifically reflect an AF phenotype. Does the variant produce similar electrophysiological alterations in iPSC-vCMs?

      We thank the reviewer for this thoughtful comment. To date, we have not differentiated the TTN-T32756I iPSC line into ventricular cardiomyocytes (iPSC-vCMs). Our current work focuses on iPSC-aCMs, where we demonstrate that the AF-associated TTN-T32756I variant induces ion channel remodeling and abnormal beating patterns, thereby creating a potential arrhythmogenic substrate relevant to AF. We agree that investigating whether this variant produces similar or distinct electrophysiological alterations in iPSC-vCMs would provide essential insights into chamber-specific effects and broaden our mechanistic understanding. We have acknowledged this as a future direction in the revised manuscript (Lines 422–425).

      Lines 422-425: “While we have not yet explored the effect of TTN-T32756I in iPSC-derived ventricular cardiomyocytes, it would be interesting to investigate whether this variant produces similar or distinct electrophysiological alterations in the ventricular cardiomyocytes.”

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      This is a reply/revision plan, not definitive. Planned and already implemented revisions are underlined.

      First of all, we wish to express our gratitude to the reviewers: they helped to improve the paper.

      Reviewer #1:

      *Reviewer #1 wrote: Major Comments: 1. Differential gene/pathway analysis across epithelial clusters: What are the differential genes or pathways among the epithelial clusters? Without CCA/Harmony integration, do the tumor subgroups show distinct differences? In addition, I suggest applying NMF or hdWGCNA to identify shared modules and test whether ATC and PTC harbor overlapping regulatory modules.*

      Reply plan: Both reviewers suggested some regulatory network analysis. We propose to run SCENIC+ (Nature Methods, 2023, https://doi.org/10.1038/s41592-023-01938-4) on our data.

      *Reviewer #1 wrote: 2. Validation of TSHR/TPO-based subgrouping: While the TSHR/TPO grouping appears appropriate for stratification at the single-cell level, it is necessary to exclude sequencing depth as a confounding factor. Should validate the existence of these subpopulations using mIHC/IF on corresponding samples.*

      Reply plan: We made claims about RNA expression, not protein expression. Thus, validation should be at the RNA level:

      • We already replicated part of our analysis on the dataset published by Lu et al. (JCI 2023, https://doi.org/10.1172/JCI169653), see Figs. 3 and 4. This effort will be extended to all single cell analysis results from our study in the revised paper.
      • We will also present plots demonstrating that the sequencing depth is similar in the different cancer cell subgroups, further excluding it as a confounding factor.

      *Reviewer #1 wrote: 3. Impact of mutational differences on conclusions: According to Supplementary Table 1, almost all PTC cases carried BRAF mutations, whereas four ATC patients harbored no BRAF mutation. Could this difference influence the conclusions of the study? Although the authors briefly mention this in the Discussion, a more thorough clarification is warranted.*

      Reply plan: The dataset of Lu et al. includes BRAF-mutated ATCs along with BRAF-mutated PTCs. Therefore, the replication mentioned earlier will also address these concerns. In fact, Fig. 4E-I already confirms the ordered loss of markers in the Lu et al. data. Replication will be extended to the other results of the study and given more emphasis in the paper.

      Reviewer #1 wrote: 4. The statement "Myeloid and T cells also grouped in specific clusters" seems descriptive. Is this clustering biologically meaningful? Please elaborate.

      Reply plan: This is an important point. A cell mixing experiment was specifically designed to separate technical effects from biological effects. We therefore know with certainty that the patient-specific myeloid and T cell clusters result from biological variation (Fig. 1). We further demonstrate that part of this variation is associated with hypoxia (Supp. Fig. 4). So yes, the clustering is biologically meaningful.

      Reviewer #1 wrote: Minor Comments: In Figure 2C, the "Epith TSHR-" population resembles myeloid cells. Could the authors clarify why this is the case? For the correlation analysis in Figure 2C, were highly variable genes or all genes used?

      Reply plan: There is a simple explanation. The Epith TSHR- population the reviewer is referring to consists of cells from anaplastic thyroid cancers (ATC), tumors that are notoriously infiltrated by macrophages (Supp. Fig. 4). A high correlation between the proportions of Epith TSHR- cells and macrophages across our panel of ATC and papillary cancer (PTC) is therefore expected. Among other things, Fig. 2C shows that high correlation, but it is not meant to show, and does not show, that Epith TSHR- cells and macrophages "resemble" one another. It shows that their proportions are highly positively correlated. This correlation analysis relies on cell type proportions, not on gene expression. It measures co-occurrence rather than resemblance. The text has been clarified to prevent any confusion.
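The distinction drawn in this reply, correlating cell type proportions across tumors rather than comparing expression profiles, can be illustrated with a minimal sketch. The proportions below are hypothetical values invented for illustration, not data from the study, and `pearson_r` is an illustrative helper, not the authors' pipeline.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical fraction of each cell type per tumor (one entry per tumor).
# No gene expression values enter this calculation.
epith_tshr_neg = [0.02, 0.05, 0.40, 0.55, 0.10]
macrophages = [0.03, 0.04, 0.30, 0.35, 0.08]

r = pearson_r(epith_tshr_neg, macrophages)
print(f"Correlation of proportions across tumors: {r:.2f}")
```

A high value here only says that tumors rich in one population tend to be rich in the other; it carries no information about how similar the two transcriptomes are.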


      Reviewer #2:

      Reviewer #2 wrote: 1. This study largely confirms established facts that 1) PTC due to BRAF driver mutation is a heterogeneous tumour entity and 2) ATC is the most dedifferentiated of all thyroid cancers. Although interesting, observations of a highly variable tissue cell composition including immune cells and the gradual loss of thyroid differentiation markers, in part linked to tumor subclone development featured by altered chromosomal copy numbers, are thus not surprising.

      Reply plan: We wish to respectfully express our take on this perception of the work:

      • There is a difference between conjecturing high heterogeneity in the cell composition of thyroid cancers and establishing it with the level of accuracy and quantitative rigor our analysis provides. The extreme amplitude of that variation was surprising to us: the microenvironment makes up 8.4% to 80% of the cells in PTCs driven by the same BRAF mutation.
      • We don't simply show that a subclone characterized by a large number of copy number events is less differentiated. We go as far as demonstrating that those copy number alterations are associated with specific cell states that produce specific histology (Fig. 5). It required a combination of single-cell transcriptomics, spatial transcriptomics and sophisticated computational analysis to establish that connection between genomic changes and histology. The fragmentation of epithelial sheets uncovered by CNV analysis had at first escaped both our attention and that of our pathologist colleagues; to our knowledge, it is not a parameter typically assessed in diagnostics.
      • We don't simply show that there is a gradual loss of differentiation markers: this loss is ordered in a very specific way that mirrors the gain of markers during thyroid organoid differentiation.

      Reviewer #2 wrote: 2. Considering tumor progression, comparison of PTC and ATC should preferably include specimens with the same driver mutation (BRAF or RAS), which is not the case here. This notion should be more clearly explained to readers. An optional improvement would be to conduct similar analyses on an ATC specimen that contains more differentiated PTC tumor portions arguably suggesting that PTC progresses to ATC (by mechanisms that are still largely unexplored).

      Reply plan: This is clearly a limitation of our study. As already proposed in our reply to Reviewer #1, we will extend the replication of our analysis in the dataset of Lu et al., which includes ATCs and PTCs harboring BRAF mutations, to all our single-cell results.

      Reviewer #2 wrote: 3. Comments on findings of lymphocytic infiltration need to be balanced. Although autoimmune thyroid disease is inferred to be a risk factor for developing malignancy, it is unlikely that the majority of TCGA samples of PTC is associated with thyroiditis as indicated in Fig. 3 and Suppl Fig. 3. Immune cell abundance may rather reflect the tumor immune microenvironment (TIME).

      Reply plan: The figure the reviewer is referring to demonstrates that PTC occurring in a background of thyroiditis also has a higher proportion of B cells. We did not claim, and the figure did not show, that "the majority of TCGA samples of PTC is associated with thyroiditis", because that is not the case. This point has been clarified.

      Reviewer #2 wrote: 4. Some tissue sections seem of quite poor quality either shape-wise or containing rifts, e.g. PTC7 in Fig. 3 and PTC2 in Fig. 5. The authors should explain whether and how this might influence analysis.

      Reply plan: Spatial transcriptomics is typically performed on frozen sections, which are obviously of lower visual quality than sections from FFPE-preserved samples. Since no computational analyses were performed on the images, this lower quality has no impact on our results. Regarding RNA quality, the RINs were >7 for all tumors. RINs are now presented in Supp. Table S1.

      Reviewer #2 wrote: The experiment on mouse ESC/organoids (Fig. 6H-J) does not show much of an expected enhanced thyroid progenitor cell proliferation after induction of the mutant Braf allele by tamoxifen, which raises doubt whether the subsequent promoted growth by fibronectin at all is oncogene-related. This differs from the impact of BrafCA activation along with mouse thyroid development in vivo (Schoultz et al iScience 2023 PMID: 37534159). In the same experimental setup, it appears that mutant Braf prevents follicle formation (Fig. 6I). A control experiment investigating the influence of fibronectin in the absence of oncogene activation should be conclusive. The effect of Braf and fibronectin on thyroid organoid structure and function should be better explained, if necessary based on complementary experiments, and discussed in relation to the claimed association of fibronectin expression to "...low amounts of thyroid differentiation markers..." and "...loss of epithelial structure (PTC7, Figure 6E)." in the previous section of Results.

      Reply plan: Induction of the mutant Braf allele for 7 days increases the percentage of BrdU+ cells 1.43-fold (p-value for Wilcoxon test = 0.035). The effects observed by Schoultz et al. are certainly more dramatic, but they result from oncogenic activity spanning 1 to 6 months (4 to 26 times longer) in an in vivo model. Most importantly, oncogenic activity is initiated in Nkx2.1+ cells and not Tg+ cells, thus much earlier during development. These two models are thus not comparable. As for the effects of fibronectin on thyroid structure, we do not claim that our organoid model recapitulates the complex interactions between cancer cells and their microenvironment that shape tissue morphology in vivo. This is now clarified in the text.

      We presented controls with no oncogene expression and no Fn1, controls with oncogene induction and no Fn1, and organoids with oncogene induction and Fn1 treatment. This alone establishes the effect of Fn1 on induced organoids, which was our goal. We regard it as a novel and interesting but non-essential development in our paper.

      As the reviewer points out, while our results show increased proliferation in Braf-mutated organoids treated with Fn1, they do not allow us to draw conclusions about any potential interaction between Fn1 and the oncogenic process. The suggested experiment with Fn1 in the absence of oncogene activation would add information, but we cannot follow up for practical lab management reasons detailed in Section 4 below.
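The "4 to 26 times longer" comparison in the reply above can be checked with a short calculation. This is a sketch that assumes an average month of 30.44 days (365.25 / 12); that conversion factor and the helper name `duration_ratio` are assumptions made for illustration and are not stated in the reply.

```python
DAYS_PER_MONTH = 30.44  # assumed average month length (365.25 / 12)

def duration_ratio(months, reference_days=7.0):
    """How many times longer a period in months is than a reference period in days."""
    return months * DAYS_PER_MONTH / reference_days

# Compare the 1- and 6-month in vivo exposures with the 7-day organoid induction.
for months in (1, 6):
    print(f"{months} month(s) vs 7 days: {duration_ratio(months):.1f}x longer")
```

Under that assumption, both endpoints round to the figures quoted in the reply.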

      Reviewer #2 wrote: 6. Concerning EMT profiling (Supplementary Fig. 7B), there is a great similarity of ATC tumor cells and fibroblasts, and as stated in the text the malignant status of the former is indicated by chromosomal aberrations (referring to Suppl fig. 6). However, looking at Suppl. Fig. 7B it is evident that fractions of cells identified as fibroblasts express TG and TSHR suggesting mismatch. How was this comparison done in order to exclude mismatch? Are there no other profiled markers that distinguish cancer cells from stromal cells that can support conclusions?

      Reply plan: TG, a thyrocyte marker, appears to be expressed by fibroblasts in Supplementary Figure 7B. The reviewer suggests this could be caused by an incomplete distinction between bona fide fibroblasts and thyrocytes in an advanced EMT state. We argue that:

      • Ambient TG RNA leaking out of thyrocyte nuclei contaminates the transcriptomes of all cell types. It is a well-known technical problem, with dedicated software packages to mitigate it. We preprocessed our data with one of them, SoupX, which corrected for most, but not all, ambient RNA contamination.
      • The plot below shows that there is nothing special about fibroblasts in that respect. For example, B and T cells are contaminated by TG at levels comparable to fibroblasts, and endothelial cells and pericytes at higher levels.
      • In addition, the UMAP of Fig. 2A shows that EMT cells and fibroblasts form very distinct clusters. Furthermore, the fibroblast cluster, but not the two EMT clusters, contains cells from PTC, and the PTC cluster does not contain cells with DNA copy number aberrations. Thus, although both EMT cells and fibroblasts express the typical mesenchymal markers of Supplementary Fig. 7B, they are easy to distinguish on the basis of their overall transcriptomes.
      • The panel below has been added to the Supplementary Figure 7B. [Panel cannot be displayed here]

      Reviewer #2 wrote: In the same figure, it appears there are no clear differences in EMT marker expression among PTC samples regardless of differentiation state, suggesting that the gradual loss of thyroid differentiation in PTC tumor cells and EMT are not parallel and potentially linked phenomena? Please clarify this dissociation of results. Is it possible that refocusing on other EMT markers than the top 10, of which almost all concern various collagen genes, might better reveal partial EMT in PTCs?

      Reply plan: The technical basis of this comment is related to the previous point. Our perception is that the mesenchymal markers in Supplementary Fig. 7B show a binary effect, i.e. strong expression in ATC and no expression in PTC (beyond ambient RNA noise), not a gradual effect. Thus, there is no correlation of COL1A1 and other mesenchymal markers with dedifferentiation in PTC, as these markers are not expressed beyond the noise level of the experiments. A lot has been written about EMT in PTC, but one of the findings of our study is that while ATC undergoes full EMT, EMT in PTC is very limited. PTC expresses FN1 but no other major mesenchymal markers such as collagens I and III, for example.

      Reviewer #2 wrote: 7. According to Suppl. Table 1, the ATC2 tumor does not harbor any mutations. What about chromosomal aberrations, was that included in analysis? Considering previous consistent reports of a high mutation burden in ATC, if not supported by other data (clinical, pathological) the diagnosis might be questioned for this particular case included in multiple analyses of the present study.

      Reply: There is little doubt about the diagnosis of ATC2 by our pathologist collaborators:

      • The histology of this tumor is strikingly anaplastic, i.e. without structure, as shown in the image below.
      • This tumor has a high level of macrophage infiltration typical of ATCs (Supplementary Fig. 4).

      Reviewer #2 wrote: Minor comments: -The logical order of presentation of Results might benefit from first presenting specific PTC data followed by ATC ditto. I'm thinking of swapping the section of EMT in ATC to end of Results.

      Reply plan: We do not follow the reviewer's reasoning here. We believe that discussing the microenvironment first, then the tumor cells, brings conciseness and clarity about how we propose to stratify the latter. By contrast, the suggested tumor type-centered structure entails going back and forth between the microenvironment and tumor cells, diluting the messages about both.

      Reviewer #2 wrote: -Methods paragraph "Mouse ESC-derived thyroid organoids experiments" (starting with "ccc") seems to be missing some essential information.

      Reply plan: A sentence was indeed missing and has been re-introduced in the manuscript. We thank the reviewer for catching that error.

    1. this video

      Wow! Honestly, it is impressive how far technology has come. Thanks to this video, I better understand why it is important for "able-bodied" people to think about accessibility problems. Thanks for sharing.

    1. Reviewer #1 (Public review):

      Summary:

      The NF-kB signaling pathway plays a critical role in the development and survival of conventional alpha beta T cells. Gamma delta T cells are evolutionarily conserved T cells that occupy a unique niche in the host immune system and that develop and function in a manner distinct from conventional alpha beta T cells. Specifically, unlike the case for conventional alpha beta T cells, a large portion of gamma delta T cells acquire functionality during thymic development, after which they emigrate from the thymus and populate a variety of mucosal tissues. Exactly how gamma delta T cells are functionally programmed remains unclear. In this manuscript, Islam et al. use a wide variety of mouse genetic models to examine the influence of the NF-kB signaling pathway on gamma delta T cell development and survival. They find that the inhibitor of kappa B kinase complex (IKK) is critical to the development of gamma delta T1 subsets, but not adaptive/naïve gamma delta T cells. In contrast, IKK-dependent NF-kB activation is required for their long-term survival. They find that caspase 8-deficiency renders gamma delta T cells sensitive to RIPK1-mediated necroptosis, and they conclude that IKK repression of RIPK1 is required for the long-term survival of gamma delta T1 and adaptive/naïve gamma delta T cells subsets. These data will be invaluable in comparing and contrasting the signaling pathways critical for the development/survival of both alpha beta and gamma delta T cells.

      The conclusions of the paper are mostly well-supported by the data, but some aspects need to be clarified.

      (1) The authors appear to be excluding a significant fraction of the TCRlow gamma delta T cells from their analysis in Figure 1A. Since this population is generally enriched in CD25+ gamma delta T cells, this gating strategy could significantly impact their analysis due to the exclusion of progenitor gamma delta T cell populations.

      (2) The overall phenotype of the IKKDeltaTCd2 mice is not described in any great detail. For example, it is not clear if these mice possess altered thymocyte or peripheral T cell populations beyond that of gamma delta T cells. Given that gamma delta T cell development has been demonstrated to be influenced by gamma delta T cells (i.e., trans-conditioning), this information could have aided in the interpretation of the data. Related to this, it would have been helpful if the authors provided a comparison of the frequencies of each of the relevant subsets, in addition to the numbers.

      (3) The manner in which the peripheral gamma delta T cell compartment was analyzed is somewhat unclear. The authors appear to have assessed both spleen and lymph node separately. The authors show representative data from only one of these organs (usually the lymph node) and show one analysis of peripheral gamma delta T cell numbers, where they appear to have summed up the individual spleen and lymph node gamma delta T cell counts. Since gamma deltaT17 and gamma deltaT1 are distributed somewhat differently in these compartments (lymph node is enriched in gamma deltaT17, while spleen is enriched in gamma deltaT1), combining these data does not seem warranted. The authors should have provided representative plots for both organs and calculated and analyzed the gamma delta T cell numbers for both organs separately in each of these analyses.

      (4) The authors make extensive use of surrogate markers in their analysis. While the markers that they choose are widely used, there is a possibility that the expression of some of these markers may be altered in some of their genetic mutants. This could skew their analysis and conclusions. A better approach would have been to employ either nuclear stains (Tbx21, RORgammaT) or intracellular cytokine staining to definitively identify functional gamma deltaT1 or gamma deltaT17 subsets.

      (5) The analysis and conclusion of the data in Figure 3A are not convincing. Because the data are graphed on a log scale, the magnitude of the rescue by kinase dead RIPK1 appears somewhat overstated. A rough calculation suggests that in type 1 gamma delta T cells, there is a ~99% decrease in gamma delta T cells in the Cre+WT strain and a ~90% decrease in the Cre+KD+ strain. Similarly, it looks as if the numbers for adaptive gamma delta T cells are a 95% decrease and an 85% decrease, respectively. Comparing these data to the data in Figure 5, which clearly show that kinase dead RIPK1 can completely rescue the Caspase 8 phenotype, the conclusion that gamma delta T cells require IKK activity to repress RIPK1-dependent pathways does not appear to be well-supported. In fact, the data seem more in line with a conclusion that IKK has a significant impact on gamma delta T cell survival in the periphery that cannot be fully explained by invoking Caspase8-dependent apoptosis or necroptosis. Indeed, while the authors seem to ultimately come to this latter conclusion in the Discussion, they clearly state in the Abstract that "IKK repression of RIPK1 is required for survival of peripheral but not thymic gamma delta T cells." Clarification of these conclusions and seeming inconsistencies would greatly strengthen the manuscript. With respect to the actual analysis in Figure 3A, it appears that the authors used a succession of non-parametric t-tests here without any correction. It may be helpful to determine if another analysis, such as ANOVA, may be more appropriate.

      (6) The conclusion that the alternative pathway is redundant for the development and persistence of the major gamma delta T cell subsets is at odds with a previous report demonstrating that Relb is required for gamma delta T17 development (Powolny-Budnicka, I., et al., Immunity 34: 364-374, 2011). This paper also reported the involvement of RelA in gamma delta T17 development. The present manuscript would be greatly improved by the inclusion of a discussion of these results.

      (7) The data in Figures 1C and 3A are somewhat confusing in that while both are from the lymph nodes of IKKdeltaTCD2 mice, the data appear to be quite different (In Figure 3A, the frequency of gamma delta T cells increases and there is a near complete loss of the CD27+ subset. In Figure 1A, the frequency of gamma delta T cells is drastically decreased, and there is only a slight loss of the CD27+ subset.)
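The log-scale argument in comment (5) can be made concrete with a small sketch. The cell counts below are hypothetical, chosen only to reproduce the rough ~99% and ~90% decreases quoted in that comment; they are not values read from Figure 3A, and the function names are illustrative.

```python
import math

def percent_decrease(control, mutant):
    """Percentage of cells lost relative to the control count."""
    return 100.0 * (1.0 - mutant / control)

def log10_gap(control, mutant):
    """Vertical distance between two counts on a log10 axis."""
    return math.log10(control / mutant)

# Hypothetical counts matching the rough percentages in comment (5).
control = 100_000
cre_wt = 1_000    # about a 99% decrease
cre_kd = 10_000   # about a 90% decrease

for label, count in (("Cre+ WT", cre_wt), ("Cre+ KD", cre_kd)):
    print(f"{label}: {percent_decrease(control, count):.0f}% decrease, "
          f"{log10_gap(control, count):.1f} log10 units below control")
```

With these numbers the kinase-dead strain sits a full log unit above the Cre+ WT strain, which looks like a large rescue on a log axis, yet it still retains only about a tenth of the control population.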

    2. Reviewer #3 (Public review):

      Summary

      The regulation of NF-κB signaling is complex and central to the differentiation and homeostasis of αβT cells, essential to adaptive immunity. γδ T cells are a distinct population that responds to stress/injury-induced cues by producing inflammatory cytokines, representing an important bridge between innate and adaptive immunity. This study from Islam et al. demonstrates that the IKK complex, a central regulator of NF-κB signaling, plays distinct and essential roles in the differentiation and maintenance of γδ T cells. The authors use elegant murine genetic models to generate clear data that disentangle these requirements in vivo.

      Although NF-κB activity was found to be dispensable for specification of γδ T cell progenitors and the generation of adaptive γδ T cells, it was required for both the ontogeny of type 1 γδ T cells and the survival of mature adaptive γδ T cells. Subunit-specific analyses revealed parallels with αβ T cells: RELA was necessary for type 1 γδ T cell development, while maintenance of adaptive γδ T cells relied upon redundancy between REL subunits, with cREL and p50 compensating in the absence of RELA but not vice versa. These findings reflect distinct biological requirements for ontogeny versus maintenance, likely driven by differences in receptor signaling, such as TCR and TNFRSF family members. Moreover, IKK also maintained γδ T cell survival through repression of RIPK1-mediated cell death, echoing its dual role in αβ T cells, where it both prevents TNF-induced apoptosis and provides NF-κB-dependent survival signals.

      Strengths:

      The multiple, unique murine genetic models employed for detailed analysis of in vivo γδ T cell differentiation and homeostasis are a major strength of this paper. NF-κB signaling processes are devilishly complex. The conditional mutants generated for this study disentangle the requirements for the various IKK-regulated pathways in γδ T cell differentiation, cell survival, and homeostasis. Data are clearly presented and suitably interpreted, with a helpful synthesis provided in the Discussion. These data will provide a definitive account of the requirements for NF-κB signaling in γδ T cells and provide new genetic models for the community to further study the upstream signals.

      Weaknesses:

      The paper would benefit greatly from a graphical abstract that could summarize the key findings, making the key findings accessible to the general immunology or biochemistry reader. Ideally, this graphic would distinguish the requirements for NF-κB signals sustaining thymic γδ T cell differentiation from peripheral maintenance, taking into account the various subsets and signaling pathways required. In addition, the authors should consider adding further literature comparing the requirements for NF-κB /necroptosis pathways in regulating other non-conventional T cell populations, such as iNKT, MAIT, or FOXP3+ Treg cells. These data might help position the requirements described here for γδ T cells compared to other subsets, with respect to homeostatic cues and transcriptional states.

      Last and least, there are multiple grammatical errors throughout the manuscript, and it would benefit from further editing. Likewise, there are some minor errors in figures (e.g., Figure 3A, add percentage for plot from IKKDT.RIPK1D138N mouse; Figure 7, "Adative").

    1. The new Data Act is set to absorb three further laws: the Open Data Directive, the Regulation on the free flow of non-personal data, and the Data Governance Act.

      The DA will absorb the ODD, the DGA and the free flow of non-personal data regulation. Interesting, as the ODD is currently a Directive. The ODD is based on national access regimes, so it might be tricky as a Regulation, unless the focus is mostly on HVD.

    1. but also the cultural and socioeconomic biases that affect patients

      Beyond LLMs, for an understanding of these biases in the field of health, cf. Nathan, T. (2012). Médecins et sorciers. La Découverte.

    2. the acceleration and scalability of AI models not only hinder deep discussions

      Your article shows the opposite, since your analyses of the phenomenon lead you to discuss in depth what these AIs are. We can think with AIs, but also against them...

    3. be transformed into spaces of co-design and deliberation, sensitive to contexts, to difference and to conflict

      It is inevitable that "cognitive supports" impose a particular point of view: that is what they are for. What matters is that we have the power to discern these points of view, to reason about them, and to express a position in relation to what they impose... "Cognitive supports", like digital environments, can stimulate these powers or, on the contrary, confiscate them, but in every case it is up to each person to sense the increases and decreases in these powers in order to choose whether or not to use these "cognitive supports".

    4. To rethink these tools, it is necessary to transform the way they are produced so that they do not shape, but instead serve as mediators in socio-technical environments.

      How can one mediate without shaping? The mediator influences just as much as form does, because mediation is also a form.

    5. Is it possible to build normative frameworks from the point of view of the people affected by these technologies?

      No doubt, if those people do it themselves. Can one conduct an ethical deliberation on behalf of a third party?

    6. Starting from the idea that philosophy is a "conceptual design" (Floridi, 2019; a concept that is not necessarily compatible, but that is also present in Deleuze and Guattari, 1991; see Castro-Peña et al., 2021).

      In what way is it not compatible?

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Detailed point-by-point response

      The Reviewers provided suggestions to improve the manuscript, most notably by adding experiments to (1) further support the role of Stim and Orai in epidermal heat-off responses and (2) further characterize the thermosensory responses of epidermal cells. We additionally propose to include a new set of calcium imaging experiments to visualize nociceptor sensitization by epidermal cells.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Drosophila larvae are known to respond to noxious stimuli by rolling. The authors propose that this response arises not only by sensory response of nociceptive neurons but also by direct response of larval epidermal cells. They go on to test this idea by independently manipulating epidermal cells and nociceptive sensory neurons using GAL4 lines, GCAMPs and RNAis. The behavioural data are convincing and presented clearly with good statistical analysis. However, the involvement of epidermal cells in evoking the behaviour as well as STIM/Orai mediated Ca2+ entry requires further experiments. Use of another independent GAL4 strain for epidermal cells, alternate RNAi lines for STIM and Orai, mutants for STIM and Orai and overexpression constructs for STIM and Orai would significantly enhance the data. Thus, as of now the key results require more convincing. The following additional experiments would be required to support their claims:

      1) Either use a second epidermal GAL4 strain to show key results OR provide images of the epidermal GAL4 expression double-labelled with a ppk driver using a different fluorescent protein to establish NO overlap of the epidermal GAL4 with neurons. These strains should be available free in Bloomington.

      We agree that the specificity of the GAL4 driver is an important point. In a recent publication (Yoshino et al, eLife, 2025) we provide the most comprehensive analysis of larval epidermal GAL4 drivers published to date. Included in this study is expression analysis of R38F11-GAL4 demonstrating that it is indeed specifically expressed in the epidermis. Based on the detailed expression analysis and functional analysis provided in that paper, R38F11-GAL4 was chosen for these studies as it is both highly specific for epidermal cells and provides uniform expression across the body wall.

      In our revised manuscript, we will more clearly detail how the driver was chosen for this study and provide a citation to the prior work to accompany our description of R38F11-GAL4 as an epidermis-specific driver line.

      2) Authors need to provide better data for the involvement of STIM and Orai in the Calcium responses observed. A single RNAi for each gene with marginal change in response is insufficient. The authors also do not state if the RNAis used are validated by them or anyone else. Minimally they should repeat their experiments with at least one other validated RNAi and rescue these with overexpression constructs of STIM and Orai (available in Bloomington). It is well established in literature that overexpression of STIM/Orai can rescue SOCE in Drosophila. Ideally, to be fully convincing they should test a Drosophila knockout for STIM (available in Bloomington). Heterozygotes of this are viable and should be tested. Additionally a UAS Orai dominant negative (OraiDN) strain is available in Bloomington and can be tested.

      We appreciate the Reviewer’s perspective on the importance of characterizing the efficacy of the reagents we used in this study. However, we disagree with the characterization of the change in response as “marginal”. Our results demonstrate that epidermal knockdown of Stim or Orai causes a significant reduction in the heat-off response of epidermal cells and heat-induced nociceptive sensitization.

      In a prior published study (Yoshino et al, eLife, 2025) we validated the efficacy of these RNAi lines in combination with the same GAL4 driver at the same developmental stage. Specifically, we demonstrated that R38F11-GAL4-mediated expression of UAS-Stim RNAi or UAS-Orai RNAi significantly attenuated store-operated calcium entry following store depletion by thapsigargin. In the revised manuscript, we will add a statement referring to this prior validation along with a citation. In light of this prior characterization, we disagree that additional RNAi lines are required to corroborate the results.

      The most salient point of the Reviewer’s comment is that additional evidence should be provided to demonstrate more convincingly the requirement of Stim/Orai in epidermal heat-off responses. We detail our plans to address this point below, but first address the specific experimental suggestions the Reviewer provides.

      First, the Reviewer suggests the use of a dominant-negative version of Orai, and we agree that this could prove complementary to our RNAi experiments.

      The Reviewer suggests two additional genetic approaches which are well-reasoned but problematic. First, they suggest rescuing the RNAi knockdowns with overexpression approaches. In addition to requiring the generation of new, RNAi-refractory transgenes, this approach is confounded by the effects of overexpressing CRAC channel components. Orai channels exhibit highly cooperative activation by Stim, and we previously showed that epidermal Stim overexpression drove mechanical nociceptive sensitization. Although this dosage effect confounds the rescue assays, we will examine whether epidermal Stim overexpression similarly sensitizes larvae to noxious thermal inputs as we would predict from our model.

      The final experiment the Reviewer suggests – phenotypic analysis of Stim knockouts – is not possible due to the lethal phase of the mutants. Furthermore, it is not possible using traditional mosaic analysis to generate mutant epidermal clones that span the entire epidermis. Such an approach might be possible with a newly engineered FLP-out Stim allele, but generating that reagent is beyond the scope of this work. The Reviewer suggests characterization of Stim heterozygotes, but Drosophila genes rarely show strong dosage effects as heterozygotes (though we acknowledge that dosage effects can be amplified in the cases of genetic interactions), hence a negative result (no effect on heat-off responses) would not be meaningful. In principle we could test whether Stim heterozygosity enhances the effects of epidermal Stim RNAi. Although a negative result would not be telling, the experiment is straightforward, and an enhancement of the effect of Stim RNAi would support the model that RNAi provides an incomplete functional knockdown of Stim. We will therefore perform this experiment and incorporate the results into the revised manuscript, pending a positive outcome.

      To better define the contributions of Stim and Orai to heat-off responses of epidermal cells, we will incorporate results from the following new experiments into our revised manuscript:

      • We will monitor effects of epidermis-specific expression of a dominant negative form of Orai on epidermal heat-off responses (calcium imaging) and heat-induced nociceptive sensitization (behavioral assays).
      • We will monitor effects of epidermis-specific co-expression of Stim+Orai RNAi on epidermal heat-off responses (calcium imaging) and heat-induced nociceptive sensitization (behavioral assays).
      • Orai channels exhibit highly cooperative activation by Stim, therefore we will examine whether epidermal Stim overexpression increases the amplitude of heat-off responses (calcium imaging) and sensitizes larvae to noxious thermal inputs (behavioral assays) as we would predict from our model.

        Minor comments that can be addressed:

      1) Figure 1: Further details required on how the rolling response is measured. Figure is uninformative. A video would be really helpful.

      We appreciate the suggestion. We will add a more detailed explanation of how the behaviors were scored along with an annotated video.

      2) I could not find Figure 1I described in the text. This section should be explained properly.

      Figure 1I is described in the figure legend and we will add an in-text citation.

      3) Figure 3: There appears to be a small response at 32°C - why is this ignored in the text? It would be useful to have S3 in the main figure.

      The small response at 32°C is not ignored, though that individual response is better understood in the context of all responses plotted in Figure 3D. We will reword the phrase “At temperature maxima below 35°C epidermal cells rarely exhibited heat-off responses” to reflect the small response that is observed at lower temperatures. We will also replace the trace in the figure – the original submission contained the one outlier sample that exhibited robust responses at 32°C.

      We appreciate the suggestion to include Fig S3 in the main text – we initially included it, but moved it to the supplement for space considerations. We will include it as a main figure in our revised submission.

      4) Fig 4: The DF/F traces for the two RNAis should be included in this figure.

      We appreciate the suggestion; we will add these traces to our revised submission.

      5) Extent of knockdown in the epidermis by each RNAi should be shown by RTPCRs.

      We note that efficacy of the knockdowns has been validated by us in acutely dissociated epidermal cells. RTPCR validation as described would require FACS-sorting of acutely dissociated, GFP-labeled epidermal cells from each specimen, an extremely time- and resource-intensive experiment that provides limited information. The more relevant information is the physiological readout of Stim/Orai functional knockout using these reagents, which we previously conducted. As described above, we will add a description of these experiments and the relevant citation.

      6) The authors need to explain why only a small change in the Ca2+ response is seen with either RNAi. Are there other Ca2+ channels involved? Ideally they could test mutants/RNAi for the TRP channel family. Loss of SOCE in Drosophila neurons changes the expression of other membrane channels - is this possible here? Minimally, this possibility needs to be discussed.

      We agree with the Reviewer that this topic warrants further discussion. Pending the results of our planned experiments (Orai dominant negative, Stim+Orai RNAi), we will incorporate a discussion of other channels that may contribute to the heat-off response. We appreciate the Reviewer's point that loss of SOCE in Drosophila neurons can change the expression of membrane channels – that is an intriguing possibility that might explain the modest effects of Stim or Orai knockdown. We have not investigated effects of epidermal Stim/Orai knockdown on expression of other channels, but will incorporate this possibility into our discussion.

      7) In the methods section please explain how the % DF/F calculations are done and how are they normalised to the ionomycin response.

      We will incorporate these additional details in the methods section.

      8) Authors need to look at previous work on STIM and Orai in Drosophila and reference appropriately.

      We appreciate the suggestion and will incorporate additional discussion of relevant Drosophila work on STIM and Orai.

      **Referees cross-commenting**

      Reviewers 2 and 3 have raised some additional queries to what I had mentioned in my review. I agree with their comments. The authors should attempt to address all comments by all three reviewers.

      We address their comments below.

      Reviewer #1 (Significance (Required)):

      This is an interesting study that identifies epidermal cells in Drosophila with the ability to sense a drop in temperature after receiving noxious heat stimuli and invoke appropriate behaviour. Behaviour experiments are well conducted and convincing. So far only nociceptive neurons were thought to control such behavioural responses so the work is significant and important for the field. The mechanism identified needs further convincing and I have suggested experiments that would be of help. With the additional experiments suggested the work will be of interest to neuroethologists, Drosophila neuroscientists and scientists in the field of Ca signaling.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Noxious heat can have a strong adverse effect on animals, resulting in sensitization when noxious thermal stimuli are applied repeatedly. Noxious heat induces a characteristic rolling behavior in Drosophila melanogaster larvae. This study investigates sensitization, whereby a second heat stimulus evokes this behavior with significantly shorter latency (e.g., 3.4 seconds) than the initial exposure (e.g., 8.79 seconds). While prior research has implicated central and peripheral neurons in this process, recent findings in mammalian systems suggest a role for keratinocytes.

      In this manuscript, Yoshino et al. report that epidermal cells are necessary and sufficient to mediate heat sensitization in D. melanogaster larvae. Using an ex vivo epidermal imaging system, the authors demonstrate that calcium influx in epidermal cells is crucial for sensitization. Importantly, this calcium influx was observed only when the temperature was lowered from a dangerously high to a safe temperature. The calcium channel system Orai and Stim facilitates this influx.

      Major comments:

      (1) The authors clearly demonstrate the heat-off reaction using calcium influx imaging. However, all of the imaging shows the response to the first stimulation. Since the study focuses on sensitization, which shows a quicker response to the second heat stimulus, it would be helpful if the authors showed calcium influx when the second stimulus was applied. It would also be interesting to see how many times the epidermal cells can react to heat stimulation.

      We appreciate the suggestion from the Reviewer but note that the calcium influx we show occurs in epidermal cells, which signal to neurons to potentiate future responses in our model. We have emphasized this point in our revised manuscript.

      The relevant response to visualize the sensitization is the heat-evoked calcium response in nociceptors, not epidermal cells. We have verified that C4da neurons exhibit calcium responses to the warming stimulus we use in our heat-off paradigm and our preliminary studies suggest that the heat-off stimulus potentiates future responses to noxious heat in nociceptors. We will therefore examine (1) whether epidermal stimulation triggers a sensitization of nociceptors to thermal stimuli by monitoring heat-induced calcium responses using GCaMP, and (2) whether epidermal Stim and Orai are required for this sensitization.

      The second comment addresses the response of epidermal cells to repeated rounds of stimuli. We agree that this is an interesting point. We have verified that epidermal cells indeed respond to multiple rounds of heat-off stimuli. We will incorporate results from a paradigm in which epidermal cells are presented with two successive heat-off stimuli, spaced by 5 minutes to allow epidermal cytosolic calcium to return to baseline. We will incorporate new analysis examining the relative magnitude of epidermal cell responses to the first and second stimuli.

      (2) Figure 5 only shows one condition: a 30-second interval between the first and second heat application. While the rolling latency of the Luciferase RNAi control ranges from 4 to 12 seconds (with a median of 5 seconds), Fig. 1E shows a latency ranging from 6 to 12 seconds (with a median of 10 seconds) under the same 30-second interval conditions. This difference makes interpreting the effect of Stim and Orai confusing. The authors need to clarify whether the knockdowns accelerate the first response or delay the second response.

      The Reviewer notes that we assayed effects of Stim/Orai RNAi on heat-induced nociceptive sensitization in only one paradigm. Given the kinetics of cytosolic calcium increases following Stim or Orai RNAi in epidermal cells (Fig. 4F), we agree that an additional set of behavior experiments investigating sensitization following a 60 s recovery is warranted. For our revision we will conduct a time-course to assay requirements of epidermal Stim and Orai (using epidermal expression of Stim/Orai RNAi and Orai dominant negative transgenes) on heat-induced nociceptive sensitization. Our preliminary studies indicate that Stim and Orai RNAi significantly reduce heat-induced sensitization following 60 s of recovery (we present results from 30 s of recovery in the original submission).

      The Reviewer raises some questions about differences in behavioral latencies in Figure 1E and Figure 5B. We intentionally avoid such comparisons both because the genetic backgrounds are different and the experiments were conducted at very different times (more than 1 year apart). In both experiments the salient feature that we discuss is the presence or absence of sensitization, not the mean latency. We note that we do compare mean latency values in Figure 1B, but that was a distinct experimental paradigm (global heat of variable temperatures followed by focal noxious heat) designed specifically to define heat stimuli that generate the maximum level of sensitization. In that case, the genotype was fixed and all assays were conducted concurrently.

      Minor comments:

      (i) In Fig. 2C´´, the authors observed clear calcium influx in epidermal cells by combining the GCaMP genetic tool with an ex vivo thermal perfusion system. Although this system applies heat uniformly across the epidermal tissue, calcium influx is spatially restricted, appearing primarily in the head and tail regions of the epidermis. These results suggest that the heat-responsive epidermal cells are localized to these regions or that there are regional differences in sensitivity. The authors should explain the spatial relationship between the heat-applied epidermal cells and the occurrence of calcium influx.

      The Reviewer notes that the epidermal GCaMP signal is particularly intense in the anterior and posterior portions of the fillet preparation (Fig. 1B-1C), and we agree that it would be useful to include an explanation of this result, which is an artifact of the sample preparation.

      The specimens we use for calcium imaging are “butterfly” preparations – the body wall is filleted along the long axis with the exception of regions at the head and tail that are pinned down on Sylgard plates. Hence, the regions in the head and tail contain intact tissue (including a double layer of skin when we image in widefield), not a single layer of skin (the rest of the prep). More significantly, the head and tail regions are pinned down, creating a wound that triggers lasting local calcium transients (note signal in the absence of temperature stimulus, Figure 1B’ and 1B”, 1C’). We therefore exclude this region from our analysis. We note that our behavior studies relied on stimuli presented to the abdominal segments we sample in the semi-intact calcium imaging. Similarly, we dissociated epidermal cells exclusively from these segments for imaging of acutely isolated epidermal cells.

      We do note that there is a periodicity to the signal – within each segment there are local maxima and minima of signal, and we agree with the Reviewer that this spatial segregation is an interesting point for discussion. We will add 1-2 sentences to our discussion of the result to acknowledge this point.

      (ii) Related to comment (i) above, if heat stimuli are applied topically using a heat probe under the ex vivo imaging system, how large an area reacts to the stimuli?

      The Reviewer raises an interesting question about the local response to heat stimuli. In our dissociated cell experiments we found that the overwhelming majority of isolated epidermal cells exhibit heat-off responses, and we likewise find that the majority of cells in our semi-intact preparation respond to heat-off stimuli. However, our current probe for delivering local heat stimuli is not compatible with our imaging system. We are working to incorporate an IR laser to focally deliver heat stimulus to explore whether epidermal cells signal to neighbors following stimulation, but such studies are beyond the scope of the current work.

      (iii) Providing supplementary movie(s) of the calcium live imaging would enhance the reader's understanding.

      We agree with the Reviewer that this would be a useful supplement. We will add representative movies as experimental supplements in our revised manuscript.

      (iv) The time point of the image in Fig. 2C´ ("before heat") is not the most informative for demonstrating a "heat-off" response. The authors should replace it with an image taken during the heat application to provide a more direct comparison with the post-stimulus influx shown in Fig. 2C´´.

      We appreciate the Reviewer’s suggestion and agree this would be a better choice to visually represent the change in fluorescence induced by the heat-off response. We will make this change in our revised manuscript.

      (v) The authors state that sensitization occurs "primarily in the 30-45 ºC range." However, the rolling probability and latency developed oppositely at 45 ºC stimulation than at 40 ºC. It would be doubtful that 45 ºC may be approaching a noxious or damaging threshold that engages a different phenomenon. The authors should reconsider including 45 ºC within the optimal sensitization range or provide a justification.

      We agree with the Reviewer that a more detailed discussion of the effects of temperature at the end of the range (45°C) is warranted. Exposure to a 45°C global heat stimulus triggered temporary paralysis in some larvae, and we suspect that this accounts for the apparent reduction in roll probability following the second stimulus. We can add a plot depicting the proportion of larvae that exhibited paralysis during 45°C global heat, determine whether these heat-paralyzed larvae exhibited distinct responses from larvae that were not paralyzed, and provide a more detailed account of the optimal sensitization range.

      Treatment with 45°C stimuli still triggered a significant reduction in roll latency (sensitization), but we did not examine whether the latency was significantly different from what was observed at 40°C. We can add that analysis in the revision.

      (vi) In the sentence "To this end, we developed a perfusion system, that would deliver thermal ramps from ~20-45ºC ...," the tilde ~ should be replaced with "approximately".

      Noted. We will make the change.

      (vii) Throughout the manuscript, please clarify in the figure legends whether the sample size (n) refers to the number of individual animals or the number of cells.

      Noted. We will add the relevant details to our sample size notations.

      (viii) The Key Resources Table does not specify the wild-type (WT) strain used for the control experiments (e.g., in Fig. 1). Please provide the full genotype of the control strain used.

      We included the experimental genotypes in each figure legend, which we find more useful than the key resource table, which contains a list of all reagents used in the study (Drosophila alleles included).

      Reviewer #2 (Significance (Required)):

      General Assessment

      This study addresses a fundamental question in sensory biology: whether epidermal cells, long regarded as passive participants in somatosensation, actively contribute to noxious heat detection and avoidance behavior. While previous work has defined the neuronal circuits and TRP channel mechanisms underlying thermal nociception in Drosophila larvae, the potential sensory role of skin cells has remained largely unexplored. The authors integrate behavioral analysis with in vitro and ex vivo calcium imaging to provide a rigorous, multi-level investigation of epidermal thermosensitivity.

      Advancement

      The work advances the field by revealing that Drosophila epidermal cells are intrinsically thermosensitive and can acutely sensitize larval nociceptive responses to noxious heat through heat-off signaling. This discovery shifts the current paradigm of thermal nociception from a neuron-centric model to one that incorporates epidermal contributions, highlighting a conserved and previously underappreciated role of skin cells in active environmental sensing.

      The reviewer's expertise: Molecular genetics, developmental biology, insect physiology and endocrinology.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript describes the temperature responses of Drosophila larval epidermal cells. These cells are activated by cooling and also exhibit strong heat-off responses. Orai and Stim are required in epidermal cells for these heat-off responses. The heat-off responses sensitize the epidermal cells, leading to a greater proportion of animals displaying rolling behaviors and a reduced latency to initiate rolling following noxious heating treatment. The following comments are intended to help improve the manuscript.

      Major:

      1. In Figure 3A, the conclusion will be strengthened by testing heat-off responses from 10°C to 40°C.

      The Reviewer makes an important point. In our original experiment, the lack of response in the 10°C to 30°C experiment could be due to some cold-induced suppression of the off response. We have found that this is not the case: off responses following a 10°C to 40°C ramp are indistinguishable from responses to a 20°C to 40°C ramp. In our revised manuscript we will incorporate new results showing epidermal heat-off responses to a 10°C to 40°C ramp as well as normalization to 20°C to 40°C responses performed in parallel.

      2. Figure 4C shows that 2-APB suppresses the heat-off response. Since 2-APB blocks both Orai and TRP channels, it is unclear why the authors focused exclusively on the Orai pathway without testing TRP channels.

      We found that epidermal cells exhibited minimal responses to warming stimuli, as would be expected for the epidermally expressed TRP channel TRPA1. In addition, the heat-off response we identified was remarkably similar to characteristic heat-off responses of mammalian CRAC channels. Hence, we focused our attention on the Orai pathway. While we agree that contributions of TRP channels could be of interest, especially if our additional analyses (double RNAi and Orai Dominant Negative) support the model that additional channels likely contribute to the heat-off response, the characteristic temperature responses of CRAC channels made them the most plausible candidate.

      In parallel to the experiments to further characterize Stim/Orai contributions to the heat-off response, we will assay requirements of TRPA1 to heat-induced nociceptor sensitization.

      3. While 2-APB completely abolishes the heat-off response, Orai and Stim RNAi only slightly (although significantly) reduce calcium responses. The knockdown efficiency of the RNAi constructs should be validated. Furthermore, testing whether combining Orai RNAi and Stim RNAi produces a stronger reduction in calcium responses would be informative.

      We addressed the question of knockdown efficiency above, and agree that testing the effects of Orai RNAi and Stim RNAi in combination is worthwhile. We detailed our plans for these experiments above.

      4. The study uses third-instar larvae. Please specify whether early, mid, or late third instar were used.

      In our original submission we stated “Third-instar larvae (96-120 AEL) larvae were used in all experiments.” We provide additional details on the staging of larvae for all experiments in the methods section of our revised submission. To synchronize cultures, embryos were collected from experimental crosses for 24 h, aged for 96 h, and foraging mid-third instar larvae (96-120 h old) were used for all experiments.

      5. Please provide more details about the thin layer of water used. Specifically, indicate the size of the Peltier plate and the volume of water applied.

      We provide additional details on the application of global heat stimulus in the methods section of our revised manuscript. “For assays testing effects of varying the temperature of prior thermal stimuli on thermal nociception, larvae were individually transferred to a pre-warmed Peltier plate (11 x 7 cm; Torrey Pines Scientific). Peltier plates were warmed to the indicated temperatures, a thin layer of water was applied to the surface using a paint brush, and the temperature was verified using an infrared thermometer. Larvae were transferred individually to the Peltier plate, incubated for the indicated time, and recovered to 2% Agar Pads using a paint brush. Following 10 s of recovery, larvae were stimulated with a 41.5°C thermal probe, as above, and latency to the first complete roll was recorded.”

      Minor:

      1. There is an inconsistency between the text and the figure regarding the sample number in Figure 1D.

      We thank the reviewer for identifying the discrepancy. This inconsistency has been corrected in the revised submission.

      2. Please provide the raw representative data for the time course of heat-off calcium responses in Figure 1E.

      We will incorporate representative traces for the heat-off responses plotted in Figure 1E.

      3. A period is missing at the end of the sentence: "For curve fitting, sample-averaged fluorescence traces were fitted with a single exponential decay function using R to extract a representative time constant (τ) and assess response kinetics."

      We thank the reviewer for identifying the omission. The period has been added.

      4. In the sentence "Behavior Responses were analyzed post-hoc blind to genotype and were plotted according to roll probability and roll latency," the word Responses should begin with a lowercase r.

      This has been corrected in the revised submission.

      Reviewer #3 (Significance (Required)):

      This manuscript describes the heat-off responses of larval epidermal cells and investigates their underlying molecular mechanisms as well as associated behavioral consequences.

      The calcium responses and behavioral assays are clearly presented. However, the contribution of Stim and Orai to this process is not convincing.

      The study may be of interest to researchers working on Drosophila and temperature sensation, as well as to those studying Orai and Stim function.

      I am a researcher specializing in Drosophila thermosensation.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript describes the temperature responses of Drosophila larval epidermal cells. These cells are activated by cooling and also exhibit strong heat-off responses. Orai and Stim are required in epidermal cells for these heat-off responses. The heat-off responses sensitize the epidermal cells, leading to a greater proportion of animals displaying rolling behaviors and a reduced latency to initiate rolling following noxious heating treatment. The following comments are intended to help improve the manuscript.

      Major:

      1. In Figure 3A, the conclusion will be strengthened by testing heat-off responses from 10°C to 40°C.
      2. Figure 4C shows that 2-APB suppresses the heat-off response. Since 2-APB blocks both Orai and TRP channels, it is unclear why the authors focused exclusively on the Orai pathway without testing TRP channels.
      3. While 2-APB completely abolishes the heat-off response, Orai and Stim RNAi only slightly (although significantly) reduce calcium responses. The knockdown efficiency of the RNAi constructs should be validated. Furthermore, testing whether combining Orai RNAi and Stim RNAi produces a stronger reduction in calcium responses would be informative.
      4. The study uses third-instar larvae. Please specify whether early, mid, or late third instar were used.
      5. Please provide more details about the thin layer of water used. Specifically, indicate the size of the Peltier plate and the volume of water applied.

      Minor:

      1. There is an inconsistency between the text and the figure regarding the sample number in Figure 1D.
      2. Please provide the raw representative data for the time course of heat-off calcium responses in Figure 1E.
      3. A period is missing at the end of the sentence: "For curve fitting, sample-averaged fluorescence traces were fitted with a single exponential decay function using R to extract a representative time constant (τ) and assess response kinetics."
      4. In the sentence "Behavior Responses were analyzed post-hoc blind to genotype and were plotted according to roll probability and roll latency," the word Responses should begin with a lowercase r.

      Significance

      This manuscript describes the heat-off responses of larval epidermal cells and investigates their underlying molecular mechanisms as well as associated behavioral consequences.

      The calcium responses and behavioral assays are clearly presented. However, the contribution of Stim and Orai to this process is not convincing.

      The study may be of interest to researchers working on Drosophila and temperature sensation, as well as to those studying Orai and Stim function.

      I am a researcher specializing in Drosophila thermosensation.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary: Drosophila larvae are known to respond to noxious stimuli by rolling. The authors propose that this response arises not only by sensory response of nociceptive neurons but also by direct response of larval epidermal cells. They go on to test this idea by independently manipulating epidermal cells and nociceptive sensory neurons using GAL4 lines, GCaMPs and RNAis. The behavioural data are convincing and presented clearly with good statistical analysis. However, the involvement of epidermal cells in evoking the behaviour as well as STIM/Orai mediated Ca2+ entry requires further experiments. Use of another independent GAL4 strain for epidermal cells, alternate RNAi lines for STIM and Orai, mutants for STIM and Orai and overexpression constructs for STIM and Orai would significantly enhance the data. Thus, as of now the key results require more convincing. The following additional experiments would be required to support their claims:

      1) Either use a second epidermal GAL4 strain to show key results OR provide images of the epidermal GAL4 expression double-labelled with a ppk driver using a different fluorescent protein to establish NO overlap of the epidermal GAL4 with neurons. These strains should be available free in Bloomington.

      2) Authors need to provide better data for the involvement of STIM and Orai in the Calcium responses observed. A single RNAi for each gene with marginal change in response is insufficient. The authors also do not state if the RNAis used are validated by them or anyone else. Minimally they should repeat their experiments with at least one other validated RNAi and rescue these with overexpression constructs of STIM and Orai (available in Bloomington). It is well established in literature that overexpression of STIM/Orai can rescue SOCE in Drosophila. Ideally, to be fully convincing they should test a Drosophila knockout for STIM (available in Bloomington). Heterozygotes of this are viable and should be tested. Additionally a UAS Orai dominant negative (OraiDN) strain is available in Bloomington and can be tested.

      Minor comments that can be addressed:

      1) Figure 1: Further details required on how the rolling response is measured. Figure is uninformative. A video would be really helpful.

      2) I could not find Figure 1I described in the text. This section should be explained properly.

      3) Figure 3: There appears to be a small response at 32°C - why is this ignored in the text? It would be useful to have S3 in the main figure.

      4) Fig 4: The DF/F traces for the two RNAis should be included in this figure.

      5) Extent of knockdown in the epidermis by each RNAi should be shown by RTPCRs.

      6) The authors need to explain why only a small change in the Ca2+ response is seen with either RNAi. Are there other Ca2+ channels involved? Ideally they could test mutants/RNAi for the TRP channel family. Loss of SOCE in Drosophila neurons changes the expression of other membrane channels - is this possible here? Minimally, this possibility needs to be discussed.

      7) In the methods section please explain how the % DF/F calculations are done and how are they normalised to the ionomycin response.

      8) Authors need to look at previous work on STIM and Orai in Drosophila and reference appropriately.

      Referees cross-commenting

      Reviewers 2 and 3 have raised some additional queries to what I had mentioned in my review. I agree with their comments. The authors should attempt to address all comments by all three reviewers.

      Significance

      This is an interesting study that identifies epidermal cells in Drosophila with the ability to sense a drop in temperature after receiving noxious heat stimuli and invoke appropriate behaviour. Behaviour experiments are well conducted and convincing. So far only nociceptive neurons were thought to control such behavioural responses so the work is significant and important for the field. The mechanism identified needs further convincing and I have suggested experiments that would be of help. With the additional experiments suggested the work will be of interest to neuroethologists, Drosophila neuroscientists and scientists in the field of Ca signaling.

    1. due to the high vascularity of these areas, which leads to rapid absorption and peak plasma concentrations similar to intravenous administration.

      ③ due to the high vascularity of these areas, which leads to rapid absorption and peak plasma concentrations similar to intravenous administration. In other words, because these regions are highly vascular, the drug is absorbed rapidly and reaches peak plasma concentrations similar to those of intravenous administration.

    2. as well as roles in DNA transport in DNA vaccines.

      ③ as well as roles in DNA transport in DNA vaccines. ve DNA aşılarında DNA taşınmasında da rol oynarlar.

    Annotators

    1. a contribution to a hypothetical cultural movement for an ecology of knowledge in a media environment that is controversial, to say the least. Because one cannot think well if one is badly informed. The development of the current form of the digital economy, carried out through an innovation strategy inattentive to the quality of relationships, has produced plainly negative cultural outcomes, for which it is time society took responsibility, rethinking the media system so as to make it compatible with democratic goals and putting an end to the widespread sense of inevitability.

      The book aims for a cultural movement / ecology of knowledge in spite of the toxic platforms, because those erode thinking well (title). Innovation now disregards the quality of relationships. (Nice, this chimes with my [[Menselijk en digitaal netwerk zijn gelijksoortig 20200810142551]] and the unfulfilled potential of exploring that congruence). The book proposes to review the media system to make it align with democratic values.

    1. In my book, this leads to building a project of liberation from the dominance of today's mega-platforms and suggests the possibility of building many platforms as alternatives to the gigantic, exploitative ones that currently make up the media system. In Ferraris's book, this leads to a proposal to redistribute the value of data, in service of a political project leading to a new, modern, systemic and pragmatic communism.

      Luca compares two books (his and Ferraris') in how they solve that relative value problem. Luca proposes dismantling the dominant platforms and replacing them with a multitude of others (as the volume of personal data aids its exploitation, I presume; spreading it out means having to collect it, and collection is costly if users don't actively bring it to you already). Ferraris, otoh, proposes an active redistribution of the value gained, as a political project: a new 'pragmatic' and modern communism. De Biase's proposal seems more achievable imo (in the sense that we're already doing it).

    2. It is obvious: the value of personal data does not exist in itself but in relation to a context.

      Often not acknowledged is what Luca says here: that it's obvious that n:: the value of personal data is not intrinsic but w.r.t. a specific context of use. It's relative (and that is also why data protection of personal data isn't absolute, but weighed against other factors.) Perhaps capture it as Notie.

    1. intervenu au mois de décembre s de Pompidou

      It seems like Bernard is referring to an event with the Institute for Research and Innovation (IRI) at the Pompidou Centre in 2018: The intelligence of cities and the new urban revolution [L’intelligence des villes et la nouvelle révolution urbaine] as part of talks organised at the Pompidou Centre: The Discussions of New Industrial World [Les Entretiens du Nouveau Monde Industriel] on December 18–19th, 2018. Description

    1. Techniques of birth and obstetrics. Techniques of infancy, child-rearing and feeding of the child. Techniques of adolescence. Techniques of adult life. Techniques of activity and movement. Techniques of care of the body: rubbing, washing, soaping. Techniques of consumption: eating. Techniques of reproduction. Techniques of care, of the abnormal.

      And? For some things you can also use charts.

    1. The logic is clear: if artificial intelligence is going to be used in professional settings, human oversight must be built into the process. At IE University, I tell my students that they may use generative artificial intelligence models, but they must share their prompt sequences, present their review of the output, and take responsibility for it. If a professional fails in that check, their work cannot be considered valid. We are no longer talking about "new technologies" or things born the day before yesterday: if you do not know how to use a generative algorithm properly, you get absurd answers, and you not only accept them as good but go and present them as good, then you are simply irresponsible, a bad professional.

      Clear instructions for students

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers


      Reviewer #1


      Major comments


      1. The manuscript posits that the loss of function of MASh components (Ogc1 and Aralar) decreases adrenergic-stimulated lipolysis by altering the cytosolic NAD⁺/NADH ratio, with AMPK/ACC mentioned as possible mediators. However, this remains speculative. Please provide mechanistic data directly linking MASh-dependent NAD⁺/NADH changes to the regulation of lipolysis in brown adipocytes during adrenergic stimulation.

      Answer 1) The reviewer raises an important point regarding the direct assessment of cytosolic NAD⁺/NADH redox changes as a mechanistic link for altered lipolysis in brown adipocytes lacking MASh components. To address this point, we added new data to the revised manuscript showing the lactate/pyruvate ratio as measured by metabolomics. This is a well-established surrogate marker to monitor changes in redox balance. Notably, under basal (non-stimulated) conditions, the lactate/pyruvate ratio did not display any significant differences between Aralar 1 KD and control cells, suggesting preservation of cytosolic NAD⁺/NADH levels in the absence of functional MASh under these conditions. This finding is consistent with reports showing the robustness of NAD⁺ regeneration via multiple shuttles and the possibility of metabolic compensation when one shuttle is compromised (PMID: 40540398; PMID: 37647199).

      The results have been added as new Supplementary Figure 1 as follows:

      Our new metabolomics data also revealed substantial reductions in the aspartate/glutamate ratio in Aralar 1 knockdown cells, serving as a metabolomic signature of impaired MASh function and reduced exchange of these amino acids between the cytosol and mitochondria. Given that the MASh is a major mechanism for exporting cytosolic reducing equivalents into the mitochondria under high metabolic demand, its loss would be expected to impact redox homeostasis, particularly under adrenergic stimulation when glycolytic flux and lipolytic activity are elevated (PMID: 40540398).

      Importantly, although our redox surrogate marker did not detect alterations, this may be explained by activation of compensatory pathways, most notably the glycerol phosphate shuttle (GPSh), which is highly expressed and active in brown adipocytes. Indirect support for this compensation comes from the data in Figure 4I, which show reduced glycerol release in Aralar 1 KD cells upon norepinephrine stimulation, even when lipolysis is blocked. This suggests a redirection of glycolytically derived G3P away from release and toward enhanced cycling within the GPSh, supporting cytosolic NAD⁺ regeneration via mitochondrial FAD-dependent G3PDH and cytosolic NAD⁺-dependent G3PDH activity. This is consistent with studies documenting that the combined action of MASh and GPSh maintains NAD redox homeostasis in brown adipocytes, especially during non-thermogenic conditions (PMID: 168075; PMID: 40540398; PMID: 37647199). We have included a discussion of this possibility on page 9, third paragraph, as follows:

      “Previous studies have shown that BAT exhibits high activity of mitochondrial FAD-dependent glycerol-3-phosphate dehydrogenase (mG3PDH), which functions as an electron sink to sustain low cytosolic NADH levels essential for continuous glycolytic flux [11]. Accordingly, suppression of the MASh, either genetically or pharmacologically, is likely to induce a compensatory upregulation of the GPSh. This adaptation would enhance G3P turnover, contributing to the maintenance of cytosolic NAD redox balance. Moreover, the increased flux through the GPSh could favor fatty acid esterification and triglyceride synthesis or re-esterification, consistent with our findings in Ogc and/or Aralar 1 KD cells, where (i) triglyceride content rises (Fig. 3), (ii) overall respiratory rates remain largely unaltered (Figs. 2D–G), and (iii) glycerol release declines significantly (Fig. 4I). Notably, the decrease in glycerol release persists even when lipolysis is blocked by Atglistatin, suggesting that the available G3P pool is rerouted from dephosphorylation and extracellular release toward oxidation to DHAP by mG3PDH to regenerate cytosolic NAD⁺ under MASh-deficient conditions. We propose that interference with the MASh does not directly impact lipolysis but instead alters the cellular balance between DHAP and G3P owing to enhanced activity of the GPSh. This metabolic shift would favor the esterification of G3P with free fatty acids, thereby promoting triglyceride synthesis. These results support the notion that, even during adrenergic stimulation (when long-chain unsaturated fatty acids and their CoA esters strongly inhibit mG3PDH activity [11]), the residual flux through the glycerophosphate shuttle remains critical for sustaining cytosolic NAD redox equilibrium [11,19,32].”


      At the mechanistic level, adrenergic stimulation in brown adipocytes activates robust lipolysis and thermogenic gene programs, generating high NADH that must be efficiently reoxidized to sustain flux through glycolysis and lipolysis-linked pathways. Our findings are consistent with a model in which the loss of MASh does not prevent cytosolic NAD⁺ regeneration or lipolytic flux during acute adrenergic stimulation, due to compensatory upregulation of the GPSh, as suggested by the glycerol release changes. Thus, while MASh normally acts as a conduit for NADH export and aspartate/glutamate exchange, in its absence, the GPSh maintains cytosolic redox balance, thereby sustaining glycolytic and lipolytic capacity.

      We agree that future studies should employ direct measurements of cytosolic NAD⁺/NADH ratios (e.g., genetically-encoded redox sensors) during adrenergic stimulation and specific pharmacological inhibition of both shuttles to dissect these relationships in greater detail. We sincerely appreciate the reviewer's input, which has prompted us to clarify the indirect but robust evidence supporting a role for compensatory redox shuttle activity in preserving brown adipocyte lipolysis in the setting of MASh impairment.
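      For readers less familiar with the lactate/pyruvate surrogate invoked above, the reasoning can be summarized by the near-equilibrium relation for lactate dehydrogenase. This is a textbook-style sketch, not an analysis taken from the manuscript; the symbol $K'_{\mathrm{LDH}}$ (the apparent equilibrium constant at a fixed pH) is introduced here only for illustration.

```latex
\[
\text{pyruvate} + \text{NADH} + \text{H}^{+} \;\rightleftharpoons\; \text{lactate} + \text{NAD}^{+}
\]
\[
K'_{\mathrm{LDH}} = \left.\frac{[\text{pyruvate}]\,[\text{NADH}]}{[\text{lactate}]\,[\text{NAD}^{+}]}\right|_{\text{eq, fixed pH}}
\quad\Longrightarrow\quad
\frac{[\text{NAD}^{+}]_{\text{cyt}}}{[\text{NADH}]_{\text{cyt}}}
\;\approx\; \frac{1}{K'_{\mathrm{LDH}}}\cdot\frac{[\text{pyruvate}]}{[\text{lactate}]}
\]
```

      Under these assumptions, an unchanged lactate/pyruvate ratio implies an unchanged free cytosolic NAD⁺/NADH ratio only if LDH operates close to equilibrium and intracellular pH is constant; whole-cell metabolomics approximates, but does not isolate, the cytosolic pools.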

      We have further added a new paragraph in the Discussion section (page 10):

      “Mechanistically, the connection between the MASh and lipolysis appears to involve regulation of the cytosolic NAD⁺/NADH redox balance. MASh activity facilitates the regeneration of NAD⁺ from NADH in the cytosol, primarily through the reduction of oxaloacetate to malate by cytosolic malate dehydrogenase (Fig. 1G-H). Despite the theoretical expectation that reductions in MASh activity would disturb redox homeostasis, our metabolomic data show that the lactate/pyruvate ratio remains unchanged under conditions of MASh impairment, indicating that the overall cytosolic NAD⁺/NADH ratio is maintained (Figure S1A-C). While direct measurements of cytosolic NAD⁺/NADH were not performed, the preserved lactate/pyruvate ratio in Aralar 1 KD cells under basal conditions strongly suggests redox stability, likely due to compensatory activity by alternative mitochondrial shuttles or metabolic adaptations that maintain NAD redox homeostasis despite MASh impairment [18,33].

      Previous evidence indicates that BAT exhibits high activity of mitochondrial FAD-dependent glycerol-3-phosphate dehydrogenase (G3PDH), which acts as an electron sink to sustain low cytosolic NADH levels critical for glycolysis [34]. In this sense, it is conceivable that genetic or pharmacological suppression of MASh triggers compensatory enhancement of the G3P shuttle, increasing G3P availability and facilitating the maintenance of cytosolic NAD redox balance. This adaptation could also promote fatty acid esterification and triglyceride synthesis or re-esterification, aligning with our observations that in Ogc and/or Aralar 1 KD cells: (i) triglyceride levels increase (Fig. 3); (ii) overall respiratory rates are preserved (Figs. 2D–G); and (iii) glycerol release is significantly reduced (Fig 4I).”


      2. The absence of in vivo analysis of lipid-droplet size in MASh loss-of-function models is a major concern. In vitro results could be confounded by differences in differentiation stage between groups. Please document equivalent adipogenesis across groups (e.g., Pparg/Cebpa/Plin1/Fabp4 expression).

      Answer 2) We thank the reviewer for the thoughtful and constructive comment regarding potential confounding by differences in differentiation stage, and for highlighting the importance of documenting equivalence between experimental groups. We appreciate the opportunity to clarify and provide additional assurance on this point.

      As detailed in our manuscript, we have performed qPCR analysis of multiple well-established markers of brown adipocyte differentiation, including Ucp1, Elovl3, Prdm16, Pparg, Cebpa, Plin1, and Fabp4, in scramble, Aralar 1 KD, and Ogc KD cells (see Fig. S1A and accompanying text). Our results show no apparent effect of these genetic interventions on overall differentiation, as the expression levels of these key markers were consistently unaltered across groups. Furthermore, adenoviral-mediated knockdown of Ogc achieved an approximately 80% reduction in Ogc mRNA (see Fig. S1B), yet most differentiation markers remained unaffected. We did observe significant increases in Atgl, Pgc1α, and Tfam mRNA levels, which may indicate a degree of pathway reprogramming without affecting the general differentiation profile. We propose that interference with the MASh does not directly impact lipolysis but instead alters the cellular balance between DHAP and G3P owing to enhanced activity of the GPSh. This metabolic shift would favor the esterification of G3P with free fatty acids, thereby promoting triglyceride synthesis.

      Additional experimental support for equivalent differentiation can be drawn from our respirometry data presented in Figures 2E and 2G. These figures demonstrate that respiratory rates upon norepinephrine stimulation, a sensitive indicator of brown adipocyte thermogenic capacity, were essentially identical in scramble, Aralar 1 KD, and Ogc KD cells. Since norepinephrine-stimulated respiration requires both functional mitochondria and the full differentiation of brown adipocytes, these results strongly support the conclusion that silencing either MASh component does not impair the fundamental ability of cells to undergo brown adipocyte differentiation or achieve functional thermogenic competence.

      This is consistent with published findings showing that norepinephrine triggers robust respiration and thermogenic activation only in fully differentiated and functional brown adipocytes, making such measurements a widely accepted proxy for differentiation status and mitochondrial integrity. Thus, the equivalent respiratory responses observed in all groups further validate that differentiation was not compromised by the genetic interventions.

      We hope this clarifies that equivalent adipogenesis was carefully documented and that any observed phenotypes are unlikely to be attributable to differences in differentiation stages. Thank you again for your rigorous assessment and for helping to ensure the robustness of our study.

      3. Please include rescue experiments (add-back OGC1 and Aralar) to rule out siRNA/shRNA off-target effects and verify that the phenotype stems from MASh loss of function.

      Answer 3) We thank the reviewer for this important suggestion regarding the inclusion of rescue experiments with add-back of Ogc and Aralar to definitively exclude off-target effects of the siRNA/shRNA-mediated knockdowns.

      We would like to point out that although we did not perform add-back rescue experiments directly, the consistency of phenotypes observed across two independent genetic interventions (Aralar 1 KD and Ogc KD) strongly argues against off-target effects being responsible for the observed metabolic and functional alterations. Specifically, both knockdowns yielded remarkably similar phenotypes in multiple assays, including respirometry analyses, mitochondrial morphology, lipid droplet homeostasis, and lipid metabolism, supporting the conclusion that these effects stem from MASh loss of function rather than nonspecific silencing.

      Furthermore, our new supplementary data (new Supplementary Figure 1A-F) reveal a significant reduction in the aspartate/glutamate ratio in Aralar 1 KD cells, a compelling functional readout for MASh impairment. This molecular evidence corroborates that our genetic interventions effectively disrupted MASh activity as intended.

      We sincerely appreciate the reviewer’s thorough evaluation and understand the importance of rescue experiments. While recognizing their value, we believe the convergent genetic, metabolic, and functional evidence presented across two different MASh components provides strong and consistent support that the phenotypes observed are due to specific loss of MASh function.


      4. Please expand on physiological significance: What is the importance of MASh regulation of BAT lipolysis in long-term adaptive thermogenesis?

      Answer 4) This is a very interesting aspect, and we have included a new paragraph in the discussion section (page 14) to address it as follows:

      “Our results, supported by recent literature, strongly indicate that the malate–aspartate shuttle (MASh) plays a key role in facilitating fatty acid–dependent thermogenesis in brown adipocytes. Specifically, BAT-targeted overexpression of GOT1 has been shown to enhance β-oxidation and support acute cold-induced thermogenesis (PMID: 40540398). Interestingly, genetic ablation of GOT1—and thus MASh inhibition—preserves cold-induced thermogenesis by promoting a metabolic shift from fatty acid to glucose oxidation. Our findings corroborate and extend these observations by demonstrating that MASh impairment sustains overall respiratory activity in norepinephrine-stimulated brown adipocytes (Figures 2D–2G), while concurrently impairing lipolysis and resulting in an accumulation of small lipid droplets (Figures 3 and 4). Collectively, these data suggest that MASh not only modulates substrate preference towards fatty acid oxidation but also facilitates lipolysis, an essential upstream step that enables lipid oxidation and supports thermogenic heat production.”

      Minor comments

      1. Fig. 4 legend/title contains a typo ("lypolysis" → lipolysis).

      Answer 1) Corrected.

      2. In Fig. 2 legend line: "Adevirus-mediated" → Adenovirus-mediated; "OCAR" → OCR.

      Answer 2) Corrected.

      3. For lipolysis imaging, you already show Forskolin/Atglistatin/Etomoxir controls; add a vehicle-only time course overlay in the main figure (currently in text/legend) to aid visual comparison.

      Answer 3) We thank the reviewer for pointing this out. To improve clarity, we have updated the labeling in Figures 3 and 4: “basal” now clearly refers to the unstimulated/untreated condition, and the previously labeled “UT” condition has been clarified as “untransduced.” These changes make the figure legends and data presentation more consistent and easier to interpret.

      4. Ensure consistent gene symbols (Atgl/Pnpla2), and protein capitalization.

      Answer 4) Corrected.

      Reviewer #2

      Major points:

      1. In the current manuscript, mitochondrial morphology (area, aspect ratio, and roundness) was analyzed in OGC1 KD cells using TMRE, whereas MitoTracker Deep Red (MTDR) was used in Aralar1 KD cells. Notably, TMRE is a ΔΨm-dependent probe. The signal intensity can change, or the distribution may reflect alterations in membrane potential rather than true morphological changes. Therefore, the observed differences in OGC1 KD cells based on TMRE staining may be confounded by the dye's functional dependence, potentially biasing the conclusions. It is recommended to evaluate mitochondrial morphology with consistent trackers across conditions. In addition, in the subsequent OCR analysis, mitochondrial area was used for normalization. Please clarify which staining method was employed, and provide justification for its suitability.

      Answer 1) We thank the reviewer for this insightful comment. Indeed, TMRE is a membrane potential-sensitive dye and could therefore potentially affect measurements of mitochondria.

      We would like to point out that mitochondrial morphology was quantified based on mitochondrial area rather than fluorescence intensity. To create an accurate binary map of mitochondria, we used a low threshold, which allowed us to include even weakly stained mitochondria and thereby detect them independently of their membrane potential. In all imaged cells, TMRE signal was sufficient to reliably identify mitochondrial pixels. Moreover, these images were acquired using a confocal microscope, where the risk of pixel expansion due to higher fluorescence intensity is minimized. Lastly, given that overall mitochondrial oxygen consumption in these cells remains largely intact, we do not expect a substantial loss of membrane potential, although minor effects cannot be entirely excluded.

      We opted to use TMRE for imaging Ogc KD cells because the scramble control for these shRNA viruses carries an mKate fluorescent tag, which overlaps with the MTDR signal. Since accurate assessment of transduction efficiency relied on detecting mKate, MTDR could not be used in these experiments. Importantly, we only compare mitochondrial morphology within the same staining condition and do not draw conclusions across cells stained with different dyes.

      To ensure transparency, we have added a new section to the Discussion (page 17, 2nd paragraph) highlighting the potential influence of ΔΨm-dependent dyes on morphological measurements, as follows:

      “It is also important to note that mitochondrial morphology was quantified using MTDR in Aralar 1 KD cells and TMRE in Ogc KD cells due to experimental constraints (see Methods). TMRE is a membrane potential–dependent dye, which could potentially influence morphology measurements. To minimize this risk, we used confocal microscopy, which reduces the likelihood of pixel expansion due to higher fluorescence intensity, and set thresholds to detect even weakly stained mitochondria. Nonetheless, we cannot fully exclude the possibility that the differences in morphology observed between Aralar 1 and Ogc KD are influenced by the use of different dyes; however, statistical comparisons were never performed across samples stained with different dyes.”

      Also, we have expanded the Methods section (page 22, 2nd paragraph) to include a rationale for using these dyes and describe the analysis protocol, as follows:

      “TMRE was used for Ogc KD cells because the scramble control for the shRNA viruses carries an mKate fluorescent tag, which overlaps with MTDR fluorescence, preventing its use. MTDR was used for Aralar KD cells. Image analysis was performed in FIJI (ImageJ, NIH). For the quantification of mitochondrial morphology and area, images stained with TMRE or MTDR were analyzed. Thresholds were adjusted to ensure that even weakly stained mitochondria were detected and included in the analysis. Only the mitochondrial area was evaluated, independent of fluorescence intensity.”
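      The thresholding logic described in this Methods passage can be illustrated with a minimal sketch. It is not the authors' FIJI pipeline: the function names, the toy pixel values, and the threshold of 50 are all assumptions chosen for illustration. The point it shows is that, once a low threshold is used to build a binary mask, the measured area depends on which pixels are stained and not on how bright they are.

```python
from typing import List

Image = List[List[float]]  # 2D grid of pixel intensities (hypothetical toy data)


def binary_mask(image: Image, threshold: float) -> List[List[bool]]:
    """Mark every pixel whose intensity exceeds the threshold."""
    return [[px > threshold for px in row] for row in image]


def mask_area(mask: List[List[bool]], pixel_area: float = 1.0) -> float:
    """Area of the mask: number of True pixels times the area of one pixel."""
    return sum(sum(row) for row in mask) * pixel_area


# Two toy images with the same mitochondrial footprint but different
# brightness, mimicking high versus low dye accumulation.
bright = [[0, 0, 0, 0], [0, 900, 950, 0], [0, 880, 0, 0], [0, 0, 0, 0]]
dim = [[0, 0, 0, 0], [0, 120, 150, 0], [0, 110, 0, 0], [0, 0, 0, 0]]

LOW_THRESHOLD = 50   # assumed value; low enough to keep weakly stained pixels
HIGH_THRESHOLD = 500  # assumed value; would drop the dimly stained object

area_bright = mask_area(binary_mask(bright, LOW_THRESHOLD))
area_dim = mask_area(binary_mask(dim, LOW_THRESHOLD))
area_dim_high = mask_area(binary_mask(dim, HIGH_THRESHOLD))

print(area_bright, area_dim, area_dim_high)
```

      With the low threshold both images report the same area, whereas the higher threshold loses the dim object entirely, which is the bias the low-threshold choice is meant to avoid.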

      Minor points:

      1. In the introduction, the authors state that "LDH activity increases in the context of BAT activation". This point is important for the logic of the manuscript, but reference [10] cited here is not sufficient to support this claim. It is recommended to provide appropriate references to support this statement.

      Answer 1) We have substantially changed this paragraph in the revised manuscript to better explain why LDH would not act as a major player in contributing to NAD redox balance in the context of BAT thermogenesis, as follows:

      “In mammalian cells, cytosolic NAD⁺ is regenerated through lactate dehydrogenase (LDH), the glycerol-3-phosphate shuttle (GPSh), or the malate-aspartate shuttle (MASh). In BAT, however, lactate production rises only slightly with adrenergic activation and most lactate is oxidized via the TCA cycle, suggesting that LDH primarily consumes NAD⁺ rather than regenerating it [PMID: 30456392; PMID: 37337122; PMID: 37802078; PMID: 40982723]. Consequently, mitochondrial redox shuttles become critical for sustaining cytosolic NAD⁺ supply.”

      We have also provided additional references to support this new section in the Introduction.

      2. In Fig. 1A and B-D, there are inconsistencies and duplications in the abbreviation labels. Please check and revise accordingly.

      Answer 2) We thank the reviewer for this comment. We would like to clarify that Figure 1A is a schematic overview of the system, while Figures 1B–D show protein expression in specific contexts: whole BAT (B), whole liver (C), and BAT mitochondria (D). In Figures 1B and 1C, all components are shown because both cytosolic (MDH1 and GOT1) and mitochondrial proteins (MDH2, GOT2, Aralar 1 and 2 and OGC) are present. In contrast, Figure 1D shows only mitochondrial components (OGC, Aralar1, MDH2, and GOT2). Although Aralar2 is a mitochondrial protein, it was not detected in this study (Forner et al., 2009). Similarly, cytosolic components such as MDH1 and GOT1 are not shown in Figure 1D because they are absent in the mitochondrial fraction. We have revised the figure legend to make these distinctions clearer.

      3. In Fig. S1, the number of n indicated does not match the number of data points shown. Please clarify whether these represent technical replicates or biological replicates, and provide a detailed description of the statistical methods used throughout the manuscript.

      Answer 3) We thank the reviewer for catching this and allowing us to correct our mistakes. In the revised version, we have corrected the figure legend of Supplementary Figure 1 so that the number of n matches the data points shown.

      4. Please provide details on the normalization strategy used in the BODIPY-C12/BODIPY-493 staining analysis, such as whether fluorescence intensity was quantified as mean or integrated values, and whether the analysis was normalized to lipid droplet area, cell number, or baseline. Since lipolytic stimulation can reduce droplet size and increase droplet number, these factors may bias the results.

      Answer 4) We thank the reviewer for this important comment and apologize for the lack of detail regarding this analysis. The analysis of BODIPY-C12 and BODIPY-493 was performed by quantifying the mean fluorescence intensity of BODIPY-C12 detected within a mask generated from the BODIPY-493 signal. This approach allowed us to define all lipid droplets and measure the release of previously esterified C12. To account for variability across samples, the data were normalized to each sample’s individual baseline at time point 0 and expressed as fold change relative to this baseline. In the revised manuscript we have included this description in the Methods section (page 18, last paragraph) for clarity and reproducibility, as follows:

      “Lipid droplet area was defined based on BODIPY 493/503 signal, which was used to generate a mask identifying all lipid droplets. Within this mask, the mean fluorescence intensity of BODIPY C12 was quantified over time to monitor the release of previously esterified C12. To account for variability between samples, data were normalized to each sample’s individual baseline at time point 0 and expressed as fold change relative to this baseline.”
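      The normalization described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the authors' analysis code: the function names, the toy pixel values, the mask threshold of 100, and the use of a single mask for all time points are hypothetical choices made here.

```python
from statistics import mean
from typing import List

Image = List[List[float]]  # 2D grid of pixel intensities (hypothetical toy data)


def droplet_mask(bodipy493: Image, threshold: float) -> List[List[bool]]:
    """Define lipid droplets as pixels whose BODIPY 493/503 signal exceeds the threshold."""
    return [[px > threshold for px in row] for row in bodipy493]


def mean_in_mask(c12: Image, mask: List[List[bool]]) -> float:
    """Mean BODIPY C12 intensity restricted to pixels inside the droplet mask."""
    values = [
        px
        for row, mask_row in zip(c12, mask)
        for px, keep in zip(row, mask_row)
        if keep
    ]
    if not values:
        raise ValueError("mask contains no pixels")
    return mean(values)


def fold_change_vs_baseline(trace: List[float]) -> List[float]:
    """Express each time point relative to the sample's own value at time point 0."""
    baseline = trace[0]
    if baseline == 0:
        raise ValueError("baseline intensity is zero")
    return [value / baseline for value in trace]


# Toy data: one BODIPY 493/503 image and three BODIPY C12 frames over time.
bodipy493 = [[0, 200, 210], [0, 190, 0]]
c12_frames = [
    [[5, 100, 100], [5, 100, 5]],  # time point 0
    [[5, 80, 80], [5, 80, 5]],
    [[5, 50, 50], [5, 50, 5]],
]

mask = droplet_mask(bodipy493, threshold=100)  # assumed threshold
trace = [mean_in_mask(frame, mask) for frame in c12_frames]
normalized = fold_change_vs_baseline(trace)
print(normalized)
```

      Because a mean rather than an integrated intensity is taken inside the mask, the readout in this sketch does not scale with the number of masked pixels, and dividing by each sample's time-0 value removes differences in initial loading between samples.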

      5. The manuscript notes that the unexpected result in Fig. 3K-M in parallel with increased Atgl mRNA expression might be because it does not reflect protein levels or enzymatic activity. To strengthen this point, it is recommended to include data on ATGL and phosphorylated ATGL.

      Answer 5) We thank the reviewer for this constructive comment. We have clarified these aspects in the revised Results and Discussion sections to reflect this interpretation more accurately as follows:

      “Notably, Atgl mRNA measurement in our study was primarily used as a marker of brown adipocyte differentiation, rather than as a direct indicator of ATGL protein abundance or enzymatic activity. We detected increased Atgl expression only in Ogc KD cells (Fig. S1H), but not in Aralar 1 KD cells (Fig. S1G). This likely does not reflect a major difference in differentiation status, as other brown adipocyte markers and norepinephrine-stimulated respiration were comparable between scramble and knockdown cells (Fig. 2D-G and 2N-O and S1G-H). Although lipolysis was not evaluated in Ogc KD cells, in Aralar 1 KD cells basal lipolysis remained unchanged (Fig. 4D-E and 4G-I), whereas norepinephrine-stimulated lipolysis was delayed or partially inhibited. Notably, the enhanced fatty acid esterification observed in Ogc KD cells despite elevated Atgl expression is not contradictory, since in brown adipocytes lipolysis and re-esterification occur concurrently to sustain high lipid turnover [34].”

      6. Red-on-black is not a great color code for IMFs, how about black-and-white?

      Answer 6) We have changed the text color to white in Figures 2H and 2K as suggested.

      Reviewer #3

      Major points:

      1. Although in the manuscript Veliova and coworkers demonstrated that MAS is functional in brown adipocytes showing kinetic parameters equivalent to that previously described in other tissues, surprisingly, when its components are downregulated, no effect, or very little, on mitochondrial respiration is found (figure 2). This is an intriguing result since MAS disruption has been widely reported to impair respiration in different cell types and tissues. However, since no direct evidence of MAS dysfunction is provided, it is possible that MAS may still remain partially or fully functional under the conditions used by the authors, and therefore this point needs to be clarified to validate these results.

      Answer 1) We thank the reviewer for the insightful comment and the opportunity to clarify these important points regarding MASh dysfunction validation in our study. We acknowledge the reviewer’s observation that mitochondrial respiration was largely unaffected by MASh component knockdown, which is indeed intriguing. Importantly, as already indicated in our responses to Reviewer 1, we have provided new data showing direct molecular evidence of MASh impairment through substantial reductions in the aspartate/glutamate ratio in Aralar 1 KD cells (new Supplementary Figure S1F). This ratio is a well-established functional readout reflecting MASh activity and amino acid exchange between cytosol and mitochondria, as demonstrated in original experimental studies of MASh function in multiple tissues including brown adipocytes (PMID: 4436323). The reduction in the aspartate/glutamate ratio directly confirms loss of MASh functionality even though respiratory rates remained unchanged, likely due to metabolic compensation by robust glycerol phosphate shuttle (GPSh) activity, as further supported by our data showing reduced glycerol release upon norepinephrine stimulation in Aralar 1 KD cells (Figure 4I). This metabolic rerouting maintains cytosolic NAD⁺ regeneration and partially preserves respiration and energy metabolism under these experimental conditions (PMID: 168075; PMID: 40540398; PMID: 37647199). Thus, the combination of metabolomic, respirometry, and functional lipid data strongly indicates that MASh activity was disrupted specifically and effectively by our genetic interventions. This molecular evidence was already signposted in our original manuscript and responses, underscoring that MASh loss of function, and not residual or compensatory MASh activity, is responsible for the phenotypes reported. We greatly appreciate the reviewer’s insightful attention to this critical mechanistic issue and hope this provides clear reassurance that MASh impairment was indeed achieved and functionally validated within our study framework.

      Furthermore, strategies used to downregulate MAS components produce only a partial reduction in mRNA levels, about 70 %, but the outcome on protein levels has not been determined, and the remaining protein level could be sufficient to maintain shuttle activity. Therefore, the effect of silencing at the protein level should be analyzed because, as the authors also point out on page 16, "mRNA levels may not reflect actual protein levels or activity".

      Answer 2) We thank the reviewer for this important point. Our knockdowns resulted in ~70–80% reduction in mRNA levels. While not complete, this represents a substantial decrease and is sufficient to produce strong functional effects. At the time the experiments were performed, we did not have access to suitable antibodies, and the available antibodies did not provide reliable signals in our samples, which is why we used qPCR to estimate knockdown efficiency. Importantly, we observed clear phenotypic changes in both knockdowns (Aralar and OGC), and both showed very similar phenotypes. This suggests that the level of knockdown was sufficient to significantly impair MAS activity. In the revised version we added new data which further validated the functional impact of Aralar KD (given that this protein has an alternative isoform, as pointed out by the reviewer). We performed metabolomics experiments measuring aspartate and glutamate levels. Our new data shows that the aspartate-to-glutamate ratio is significantly reduced in Aralar KD cells. This ratio serves as a proxy for glutamate catabolism, and the observed decrease suggests reduced glutamate catabolism, likely due to impaired MAS activity. Therefore, the reduced whole-cell aspartate/glutamate ratio serves as a metabolic signature of MAS impairment, consistent with Aralar KD. These data indicate that Aralar is sufficiently downregulated to produce a functional effect, supporting our conclusion that MAS activity is impaired. The results have been added as new supplementary Figure 1 as follows:

      __ In the case of aspartate/glutamate carriers (AGCs) the role of citrin/slc25a13, the second AGC paralog, should also be analyzed. This AGC isoform is discarded based on proteomic data from brown adipose tissue, but, as shown in figure 1B, its levels are similar to those of Aralar/slc25a12, the only AGC silenced. Besides, primary brown adipocytes differentiated for 7 days are used here, and it is possible that factors such as culture conditions or differentiation itself could alter AGC levels. Therefore, it is necessary to determine the protein levels of citrin/AGC2, and, if necessary, downregulate it together with the Aralar/AGC1 isoform. citrin/AGC2 activity may be responsible for the observed difference between the OGC and Aralar/AGC1 KD adipocytes.__

      Answer 3) We thank the reviewer for this important point. We chose Aralar1 because it is the isoform predominantly expressed in brown adipose tissue (PMID: 23436904). We acknowledge, however, that compensatory increases in Citrin/AGC2 upon Aralar1 knockdown are possible. To address this, we have included new metabolomics data in the revised manuscript (added as Supplementary Figure 1), which provides additional support that downregulation of Aralar1, even if not complete, is sufficient to cause a metabolic change reflected by a reduced aspartate/glutamate ratio in these cells. This functional change supports that the knockdown of Aralar1 alone is sufficient to study its role in brown adipocytes, although minor compensation by Citrin/AGC2 cannot be entirely excluded.

      To address this explicitly, we have added a paragraph to the discussion (page 13, 2nd paragraph) highlighting the potential for partial compensation by Citrin/AGC2 and explaining why the observed metabolic effects are still attributable to Aralar 1 knockdown, as follows:

      “Phenotypes observed in Aralar 1 KD cells closely resemble those in Ogc KD cells, particularly in terms of lipid metabolism alterations and energy expenditure. The main difference lies in mitochondrial morphology, which is altered in Ogc KD cells but remains unchanged in Aralar 1-silenced cells (Fig. 2J,M). Unlike Ogc, which lacks an alternative isoform, Aralar 1 has a paralog Aralar 2 (Citrin, or SLC25A13) that may partially compensate for its loss. This potential compensation might explain the preservation of mitochondrial morphology in Aralar 1 KD cells. Nonetheless, our metabolomics data demonstrate that downregulation of Aralar 1 alone significantly reduces the aspartate/glutamate ratio (Fig. S1D-F). Since this ratio reflects glutamate catabolism, its decrease indicates impaired malate-aspartate shuttle activity and reduced glutamate catabolism. Therefore, although compensation by Aralar 2 cannot be entirely excluded, Aralar 1 KD alone suffices to cause substantial impairment of malate-aspartate shuttle function”.


      __ OGC and Aralar/AGC1 silencing is associated with the accumulation of smaller lipid droplets and impaired norepinephrine-induced lipolysis, but no mechanistic evidence is provided. The authors discuss a role for AMPK signaling associated with the redox imbalance generated by MAS dysfunction, but neither of them is proven.__

      Answer 4) We thank the reviewer for this insightful question, which was also raised by Reviewer 1 (see Reviewer 1, Question 1 above). Here, we aim to clarify the mechanistic basis by which MASh may regulate lipolysis in BAT in a complementary and refined manner.

      Our new data directly addresses this issue by examining cytosolic redox status through the lactate/pyruvate ratio, a well-established indicator of NAD⁺/NADH balance. Under basal conditions, Aralar 1 KD cells showed no change in this ratio compared to controls, indicating preserved cytosolic NAD⁺ regeneration despite reduced MASh activity. This observation is consistent with previous studies demonstrating the resilience of cellular redox homeostasis through overlapping NAD⁺-regenerating systems (PMID: 40540398; PMID: 37647199). The new results are shown in Supplementary Figure 1.

      At the same time, we detected a marked decrease in the aspartate/glutamate ratio in Aralar 1 KD cells, confirming impaired MASh function and reduced amino acid exchange between cytosol and mitochondria. The lack of redox imbalance likely reflects compensatory mechanisms, most notably the GPSh, which is highly active in brown adipocytes. Supporting this view, Aralar 1 KD cells displayed significantly reduced glycerol release upon norepinephrine stimulation (Fig. 4I), suggesting enhanced metabolic cycling of G3P through mitochondrial and cytosolic G3PDH, thereby sustaining NAD⁺ regeneration and redox equilibrium.

      We therefore propose that, although MASh normally facilitates NADH export and aspartate/glutamate exchange, its loss activates GPSh-mediated compensation that preserves cytosolic NAD⁺/NADH balance and maintains lipolytic flux during adrenergic stimulation. These findings refine our mechanistic understanding of how redox shuttle interplay supports glycolytic and lipolytic processes in BAT. Future studies employing NAD⁺/NADH sensors and simultaneous blockade of both shuttles will be essential to dissect these compensatory mechanisms in greater detail.

      Minor points;

      1. __ Is pyruvate present in respiration medium? If so, no effect on respiration is expected as pyruvate reverses the respiratory defects caused by MAS inactivation. __

      Answer 1) Thanks for this important insight. In fact, as indicated in the methods section (page 17, last paragraph), all respirometry experiments were carried out in the absence of pyruvate in the media. Therefore, the preserved overall respiratory rates in Aralar 1 and Ogc KD cells cannot be explained by compensatory oxidation of pyruvate supplied in the media.

      __ In figure 4, only data from Aralar KD cells in relation to norepinephrine-stimulated lipolysis are shown. What happens when OGC is silenced? __

      Answer 2) This is a very interesting and relevant question. We did not perform the norepinephrine-stimulated lipolysis experiments in Ogc-silenced cells, since in most of the other experiments presented in the manuscript Ogc and Aralar 1 silencing converged to very similar, if not identical, phenotypes. Based on these consistent overlaps, we anticipate that Ogc KD would likely lead to comparable effects on lipolysis as observed in Aralar 1 KD cells. Nonetheless, we fully agree that direct assessment of lipolysis upon Ogc KD would strengthen this conclusion, and we consider this an important aspect for future studies.

      __ Nomenclature used for mitochondrial carriers is confusing. Please do not use OGC1 as there is only one isoform. Furthermore, different names for OGC are used in the manuscript; oxoglutarate carrier, malate-ketoglutarate carrier or OGC1/SLC25A11. In the case of citrin/AGC2, Aralar2 is used and is an uncommon designation.__

      Answer 3) We corrected all OGC naming in the revised manuscript. We also replaced “aralar 2” with “citrin”, since the latter is more commonly used in the literature.

      __ Some panels of figures 3 and 4 should be improved. Panels 3J, 3L and 4G are difficult to see. In panel 3J please clarify UT line from untreated/NE, are they not transduced? No equivalent conditions are assayed in Aralar KD and OGC KO cells.__

      Answer 4) We thank the reviewer for giving us the opportunity to improve this figure and apologize for the confusing labeling. In the revised version, we have clarified the labels in panels 3J, 3L, and 4G to improve visibility, and we have added descriptions of all abbreviations to the figure legends, accordingly.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript presents novel findings on the role of the malate-aspartate shuttle (MASh) in brown adipose tissue (BAT). Building on the recent advances in elucidating the contribution of MASh to BAT metabolism, the present study provides new evidence by offering direct biochemical validation using a reconstituted BAT mitochondrial system and by introducing genetic data on the mitochondrial carriers OGC1 and Aralar1, thereby adding significant new insight. However, the following points require further clarification.

      Major points:

      1. In the current manuscript, mitochondrial morphology (area, aspect ratio, and roundness) was analyzed in OGC1 KD cells using TMRE, whereas MitoTracker Deep Red (MTDR) was used in Aralar1 KD cells. Notably, TMRE is a ΔΨm-dependent probe. The signal intensity can change, or the distribution may reflect alterations in membrane potential rather than true morphological changes. Therefore, the observed differences in OGC1 KD cells based on TMRE staining may be confounded by the dye's functional dependence, potentially biasing the conclusions. It is recommended to evaluate mitochondrial morphology with consistent trackers across conditions. In addition, in the subsequent OCR analysis, mitochondrial area was used for normalization. Please clarify which staining method was employed, and provide justification for its suitability.

      Minor points:

      1. In the introduction, the authors state that "LDH activity increases in the context of BAT activation". This point is important for the logic of the manuscript, but reference [10] cited here is not sufficient to support this claim. It is recommended to provide appropriate references to support this statement.
      2. In Fig. 1A and B-D, there are inconsistencies and duplications in the abbreviation labels. Please check and revise accordingly.
      3. In Fig. S1, the number of n indicated does not match the number of data points shown. Please clarify whether these represent technical replicates or biological replicates, and provide a detailed description of the statistical methods used throughout the manuscript.
      4. Please provide details on the normalization strategy used in the BODIPY-C12/BODIPY-493 staining analysis, such as whether fluorescence intensity was quantified as mean or integrated values, and whether the analysis was normalized to lipid droplet area, cell number, or baseline. Since lipolytic stimulation can reduce droplet size and increase droplet number, these factors may bias the results.
      5. The manuscript notes that the unexpected result in Fig. 3K-M in parallel with increased Atgl mRNA expression might be because it does not reflect protein levels or enzymatic activity. To strengthen this point, it is recommended to include data on ATGL and phosphorylated ATGL.
      6. Red-on-black is not a great color code for IMFs, how about black-and-white?

      Referees cross-commenting

      In my opinion, all three reviewers have provided constructive criticism of the work.

      Significance

      The work dives deeper into mitochondrial function and metabolism of brown adipocytes and, thus, advances our understanding of thermogenesis in an incremental fashion. The work will be relevant to brown adipose tissue researchers and mitochondrial biologists.

    1. Author response:

      Reviewer 1:

      Comment 1. The reviewer was under the impression that we did not perform biological replicates of our ChIP-seq experiments. All ChIP-seq (and ATAC-seq) experiments were performed with biological replicates and the Pearson’s correlations (all >0.9) between replicates were provided in Supplementary Table 1. We had indicated this in the text and methods but will try to make this even clearer.
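
      As a minimal sketch of the replicate check described above, the snippet below computes a Pearson correlation between two replicate signal vectors and compares it with the 0.9 level mentioned in the response. The function name `pearson_r` and the binned signal values are hypothetical and purely illustrative; they are not taken from the authors' pipeline or from Supplementary Table 1.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical binned signal values for two biological replicates (invented numbers)
rep1 = [12.0, 30.5, 8.2, 55.1, 20.3]
rep2 = [11.4, 32.0, 9.0, 53.8, 19.5]

r = pearson_r(rep1, rep2)
print(f"Pearson r = {r:.3f}; above 0.9: {r > 0.9}")
```

      In practice such a comparison would typically be run on genome-wide binned coverage rather than a handful of values; the short vectors here only keep the example readable.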

      Reviewer 2:

      Comment 2. The reviewer states that our claim of H3K115ac being associated with fragile nucleosomes is based solely on MNase sensitivity and fragment length. This is not correct. Figure 3C and D show the results of sucrose gradient sedimentation experiments followed by ChIP-seq, clearly showing that H3K115ac fractionates with chromatin particles that are enriched for fragile nucleosomes and subnucleosomes. By contrast, H3K115ac is not enriched in stable mononucleosomes.

      Comment 3. The reviewer states that our H3K122ac and H3K64ac comparisons rely on publicly available datasets. We would emphasize that these are our own datasets, generated and published previously (Pradeepa et al., 2016) using exactly the same native MNase ChIP protocol as used here for H3K115ac, and processed with identical computational pipelines.

      Reviewer 3:

      Reviewer 3 is mistaken in thinking our ChIP experiments are performed under cross-linked conditions. As clearly stated in the main text and methods, all our ChIP-seq for histone modifications is done on native MNase-digested chromatin – with no cross-linking. This includes the spike-in experiment shown in Fig S1B to test H3K115ac antibody specificity against the bar-coded SNAP-ChIP® K-AcylStat Panel from Epicypher. We could not include H3K115ac bar-coded nucleosomes in that experiment since they are not available in the panel. 

      Following that, we would propose to make minor revisions in response to specific reviewer recommendations before posting a version of record. These would include:

      (1) Figure 2: the title needs to change to "H3K115ac marks CpG island promoters poised for activation". This ensures that it matches the title of the corresponding section in the main text. Also see Reviewer 1, comment 7 in the Recommendations part.

      (2) Figure S2B: legend should read: "Gene ontology analysis for the set of genes analysed in Figure 2C"

      (3) Figure F4D: Provide the replicates for western blot 

      (4) Figure 4A,B: Corrected formatting issues.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      The manuscript by Bru et al. focuses on the role of vacuoles as a phosphate buffering system for yeast cells. The authors describe here the crosstalk between the vacuole and the cytosol using a combination of in vitro analyses of vacuoles and in vivo assays. They show that the luminal polyphosphatases of the vacuole can hydrolyse polyphosphates to generate inorganic phosphate, yet they are inhibited by high concentrations. This balances the synthesis of polyphosphates against the inorganic phosphate pool. Their data further show that the Pho91 transporter provides a valve for the cytosol as it gets activated by a decline in inositol pyrophosphate levels. The authors thus demonstrate how the vacuole functions as a phosphate buffering system to maintain a constant cytosolic inorganic phosphate pool.

      This is a very consistent and well-written manuscript with a number of convincing experiments, where the authors use isolated vacuoles and cellular read-out systems to demonstrate the interplay of polyphosphate synthesis, hydrolysis, and release. The beauty of this system the authors present is the clear correlation between product inhibition and the role of Pho91 as a valve to release Pi to the cytosol to replenish the cytosolic pool. I find the paper overall an excellent fit and only have a few issues, including: 

      (1) Figure 3: The authors use in their assays 1 mM ZnCl2 or 1mM MgCl2. Is this concentration in the range of the vacuolar luminal ion concentration? Did they also test the effect of Ca2+, as this ion is also highly concentrated in the lumen? 

      The concentrations inside vacuoles reach those values. However, given that polyP can chelate divalent metal ions, what would matter are the concentrations of free Zn2+ or Mg2+ inside the organelle. These are not known. This is not critical since we use those two conditions only as a convenient tool to differentiate Ppn1 and Ppn2 activity in vitro. In our initial characterisation of Ppn2 (10.1242/jcs.201061), we had also tested Mn, Co, Ca, Ni, Cu. Only Zn and Co supported activity. Ca did not. Andreeva et al. (10.1016/j.biochi.2019.06.001) reached similar conclusions and extended our results.

      (2) Regarding the concentration of 30 mM K-PI, did the authors also use higher and lower concentrations? I agree that there is inhibition by 30 mM, but they cannot derive conclusions on the luminal concentration if they use just one in their assay. A titration is necessary here. 

      The concentration of 30 mM was not chosen arbitrarily. It is the luminal Pi concentration that the vacuoles reached through polyP synthesis and hydrolysis when they entered a plateau of luminal Pi. We consider this an upper limit because polyP kept increasing while luminal Pi did not. Thus, there is no physiological motivation for trying higher values. We have nevertheless added a titration to the revised version (new Fig. 3A).

      (3) What are the consequences on vacuole morphology if the cells lack Pho91? 

      We had not observed significant abnormalities during a screen of the genome-wide deletion collection of yeast (10.1371/journal.pone.0054160), nor in other experiments with pho91 mutants, which we have not included in this manuscript due to a lack of effect.

      (4) Discussion: The authors do not refer to the effect of calcium, even though I would expect that the levels of the counterion should affect the phosphate metabolism. I would appreciate it if they would extend their discussion accordingly. 

      The situation is much more complex because Ca2+ is not the only counterion. Major pools of counterions (up to hundreds of mM) are constituted by vacuolar lysine, arginine, polyamines, Mg, Zn etc. Their interplay with polyP is probably complex and worth addressing in a dedicated project. To go beyond the simple statement that this complexity is not understood, which is not very useful, we would have to engage in a lot of speculation. We feel that this would make the discussion lose focus and not contribute concrete insights.

      (5) I would appreciate a brief discussion on how phosphate sensing and control are done in human cells. Do they use a similar lysosomal buffer system? 

      Mammalian cells have their Pi exporter XPR1 mainly on a lysosome-like compartment (10.1016/j.celrep.2024.114316). Whether and how it functions there for Pi export from the cytosol is not entirely clear. We have addressed this situation in the revised discussion section.

      Reviewer #2 (Public review): 

      Summary: 

      This manuscript presents a well-conceived and concise study that significantly advances our understanding of polyphosphate (polyP) metabolism and its role in cytosolic phosphate (Pi) homeostasis in a model unicellular eukaryote. The authors provide evidence that yeast vacuoles function as dynamic regulatory buffers for Pi homeostasis, integrating polyP synthesis, storage, and hydrolysis in response to cellular metabolic demands. The work is methodologically sound and offers valuable insights into the conserved mechanisms of phosphate regulation across eukaryotes. 

      Strengths: 

      The results demonstrate that the vacuolar transporter chaperone (VTC) complex, in conjunction with luminal polyphosphatases (Ppn1/Ppn2) and the Pi exporter Pho91, establishes a finely tuned feedback system that balances cytosolic Pi levels. Under Pi-replete conditions, inositol pyrophosphates (InsPPs) promote polyP synthesis and storage while inhibiting polyP hydrolysis, leading to vacuolar Pi accumulation. 

      Conversely, Pi scarcity triggers InsPP depletion, activating Pho91-mediated Pi export and polyP mobilization to sustain cytosolic phosphate levels. This regulatory circuit ensures metabolic flexibility, particularly during critical processes such as glycolysis, nucleotide synthesis, and cell cycle progression, where phosphate demand fluctuates dramatically. 

      From my viewpoint, one of the most important findings is the demonstration that vacuoles act as a rapidly accessible Pi reservoir, capable of switching between storage (as polyP) and release (as free Pi) in response to metabolic cues. The energetic cost of polyP synthesis, driven by ATP and the vacuolar proton gradient, highlights the evolutionary importance of this buffering system. The study also draws parallels between yeast vacuoles and acidocalcisomes in other eukaryotes, such as Trypanosoma and Chlamydomonas, suggesting a conserved role for these organelles in phosphate homeostasis.

      Weaknesses: 

      While the manuscript is highly insightful, referring to yeast vacuoles as "acidocalcisome-like" may warrant further discussion. Canonical acidocalcisomes are structurally and chemically distinct (e.g., electron-dense, in most cases spherical, not routinely subjected to morphological changes, and enriched with specific ions), whereas yeast vacuoles have well-established roles beyond phosphate storage. A comment on this terminology could strengthen the comparative analysis and avoid potential confusion in the field.

      Yeast vacuoles show all major chemical features of acidocalcisomes. They are acidified, contain high concentrations of Ca, polyP (which makes them electron-dense, too), other divalent ions, such as Mg, Zn, Mn etc, and high concentrations of basic amino acids. Thus, they clearly have an acidocalcisome-like character. In addition, they have hydrolytic, lysosome-like functions and, depending on the strain background, they can be larger than acidocalcisomes described e.g. in protists. We have elaborated on this point in the introduction of the revised version.

      Reviewer #3 (Public review): 

      Bru et al. investigated how inorganic phosphate (Pi) is buffered in cells using S. cerevisiae as a model. Pi is stored in cells in the form of polyphosphates in acidocalcisomes. In S. cerevisiae, the vacuole, which is the yeast lysosome, also fulfills the function of Pi storage organelle. Therefore, yeast is an ideal system to study Pi storage and mobilization. 

      They can recapitulate in their previously established system, using isolated yeast vacuoles, findings from their own and other groups. They integrate the available data and propose a working model of feedback loops to control the level of Pi on the cellular level. 

      This is a solid study, in which the biological significance of their findings is not entirely clear. The data analysis and statistical significance need to be improved and included, respectively. The manuscript would have benefited from rigorously testing the model, which would also have increased the impact of the study. 

      It is not clear to us what the reviewer would see as a more rigorous test of the model.  

      Reviewer #1 (Recommendations for the authors): 

      (1) Figure 2: Why do the authors label the blue curve in A and B as BY and in C and D as WT? Is this a different genetic background they used here? This should be specified in the legend. 

      No, it is the same background. The figures had been reshuffled before submission and we neglected to replace "BY" with "WT". This has been corrected. We now consistently use WT in all figures.

      (2) Figure 4 has different scaling for the two panels, which should be labeled as A and B. I am aware that the authors do this for comparison, but it is rather confusing at first glance. I recommend having them at the same scale. 

      We chose this representation on two separate scales because this figure is primarily meant to illustrate that the shift between pho91 and WT curves vanishes in the presence of IP7. We now highlight in the figure legend that the scales are different to avoid confusion.

      (3) Figure 8: I would appreciate a model with normal and low Pi concentrations in comparison, as this is what the authors worked out. 

      We have modified the figure. It now compares Pi-rich and Pi-limited scenarios.

      (4) Minor issue: Wouldn't it make more sense to show the molar concentration in the Figures rather than the nmol of Pi/ug of protein? I am aware that this would require information on the vacuole volume rather than the reaction volume, and the authors do this calculation later on. 

      It depends. We often chose this representation because it illustrates the price to pay (metabolic input in terms of protein that must be dedicated to this task) to sequester a certain quantity of Pi. But, as we provide the corresponding Pi concentration in the text, this information is accessible to the reader, too.

      Reviewer #2 (Recommendations for the authors): 

      As stated above in the weaknesses section, while functional parallels exist, canonical acidocalcisomes are structurally and chemically distinct, typically smaller, electron-dense, and enriched with cations, whereas yeast vacuoles are larger, multifunctional organelles with well-established roles beyond phosphate storage. Explicitly addressing these differences would strengthen the comparative framework and prevent potential confusion in interpreting the evolutionary relationships between these organelles.

      We agree to some degree, which is the reason why we refer to vacuoles as acidocalcisome-like organelles. In fact, vacuoles share virtually all defining chemical traits of acidocalcisomes. They just have a second functional domain as hydrolytic, lysosome-like organelles. Given the plasticity of endo-lysosomal compartments, and acidocalcisomes belong to this group because of their biogenesis through the AP3 pathway, this is not shocking to us. But the reviewer's comment made us realize that it is better to explicitly address this point. We have added a section to the introduction to do this.

      Reviewer #3 (Recommendations for the authors): 

      (1) Page 8: It is unclear why the authors only estimated the Pi concentration in wild-type vacuoles. This should also be done for vacuoles from other strains. 

      This information is inherent in Figure 2. PolyP hyperaccumulating strains show the same plateau as the wildtype, meaning that they also reach around 30 mM luminal Pi concentration, whereas vtc4 vacuoles reach only around 1/10th of that increase, indicating that they remain at 3 mM. We mention this now in the text.

      (2) The attempts at localizing Pho91 through tagging are not satisfactory. The authors described different localizations for Pho91 depending on whether it was tagged on the N- or C-terminus or when N-terminally tagged and overexpressed using two strong promoters. While it is not uncommon that proteins show different localization patterns, depending on where the tag is inserted, it is possible that one of the tags would reflect the localization of the endogenous protein. There is an easy way to test this, in particular when Pho91 is endogenously tagged. pho91∆ has reported phenotypes such as abnormal vacuolar morphology or increased autophagy. They could also measure Pi content in vacuoles. The authors could compare the phenotypes of the endogenously tagged strains with WT and a pho91∆ strain.

      Indeed, the attempts to localise the protein through fluorescent tags are unsatisfactory, in our hands as in the hands of others. We would not have created a series of many different tagged versions (we present only a selection of these in the manuscript) if the creation of a faithful reporter for Pho91 localisation were so straightforward. Expression from the endogenous promoter yields quite low signals (which is why others have overexpressed their GFP fusion from strong promoters). But overexpression brings at least a significant part of the protein to the cell surface, where it can then function as a Pi importer and suffices to restore much of the maximal Pi uptake capacity that genuine plasma membrane transporters provide and to support normal growth of the cells (Wykoff & O’Shea, 2001). But the localisation pattern of Pho91-GFP, likewise overexpressed from a strong promoter, does not reflect this plasma membrane localisation (see the references that the reviewer mentioned under (3)). The published overexpressed GFP-fusions localise only to the vacuole, suggesting that even in this case the GFP tag may create an artefact. Therefore, we went through a large variety of Pho91 gene fusions, which led us to the conclusion that the protein is very sensitive to tags at both ends and that fusion proteins hence are unlikely to reliably report the correct location of the protein. Given this, we resorted to quantitative proteomics to clarify the issue. This quantitative experiment goes beyond previously published proteomics analyses that the reviewer mentions under (3), which found the protein in the vacuolar fraction but did not calculate the enrichment factors, which is crucial.

      A strong phenotype of abnormal vacuolar morphology is not apparent in our cultures. 

      (3) Moreover, Pho91 has been identified as a component enriched in vacuolar-mitochondria contact sites (vCLAMP), and this localization was confirmed with GFP-Pho91 (PMID: 25026036). Likewise, PMID: 35175277 also detected Pho91 by mass spectrometry as a vacuolar protein and showed endogenously tagged GFP-Pho91 on the vacuole (co-staining with Vph1). The authors may request the strains from the authors of these papers and use them for their experiments. PMID: 17804816, the oldest of the three reports (from 2007), reports a GFP-Pho91 under either TEF or ADH promoter that localizes to the vacuole. They also showed that the fusion protein is functional. These and other experiments led them to conclude that Pho91 exports phosphate from the vacuolar lumen to the cytoplasm.

      We have now included these references. As argued above, we have also analysed the strains from PMID: 17804816. The observed clear localisation of the fusion protein to vacuoles is only visible upon overexpression, not upon expression from the endogenous locus. Apparently, this construct is also unlikely to report Pho91 localisation reliably (though, by chance, overexpression leads it to the correct location). Thus, we maintain our conclusion that C- or N-terminally GFP-tagged versions of Pho91 are unreliable tools for localising the protein.

      (4) The impact of pho91∆ on Pho4-GFP nuclear localization is modest at best (increase from 5% of cells showing Pho4-GFP in the nucleus in WT vs 10% in pho91∆), and only somewhat stronger in ppn1∆/ppn2∆. This means 90% of pho91∆ cells do not respond, and Pho4-GFP stays cytoplasmic. It is unclear how the authors can derive a meaningful conclusion from these data. Moreover, do these data really support the model, or do they rather indicate that additional factors/pathways are needed? What is the biological significance of the marginal increase from 5% to 10% of cells that would respond? What happens to the cells that cannot respond? Will they die or at least have a growth disadvantage? It would be useful to provide some functional studies. 

      We should have explained the nature of the assay better. The experiment exploits the fact that dividing yeast cells transiently fall into a state of Pi scarcity during S-phase. Since S-phase is less than a quarter of the cell cycle, only a small fraction of the cells transiently activates the PHO pathway. These cells cannot be well characterised by ensemble assays, but microscopy circumvents the background of the whole population and picks them up very clearly, allowing them to be quantified. We have adapted the respective chapter in the results section to improve the description of this experiment.

      (5) The quantification of the data is suboptimal, as in most assays the mean and standard error of the mean (SEM) are given. SEM is not really appropriate in these cases because it gives only the error of the mean and not of the entire data. Therefore, the standard deviation (SD) is needed, which reports on the variability of the data, and which is usually much larger than the SEM. Using the SD would also allow the authors to do proper statistical analysis, which is missing entirely in this manuscript. 

      SEM also comprises the variability of the data. It is linked with the SD (SEM=SD/SQRT(n)), but SEM also considers the number of the experiments n. The main goal is to compare the means, and SEM is an appropriate and frequently used tool for this because it illustrates how well the arithmetic mean may estimate the true mean of the population. Therefore, we kept the SEM but have added tests of significance for the differences shown.
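
      The relation quoted above, SEM = SD/SQRT(n), can be illustrated with a minimal sketch. The replicate values are hypothetical and are not taken from the manuscript, and the function name `sd_and_sem` is purely illustrative. The sketch only shows how the two quantities relate: the SD describes the spread of the individual measurements, while the SEM scales that spread by the number of experiments n.

```python
import math
import statistics


def sd_and_sem(values):
    """Return (sample SD, SEM) for a list of replicate measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    sd = statistics.stdev(values)  # sample SD, uses the n - 1 denominator
    sem = sd / math.sqrt(n)        # SEM = SD / sqrt(n)
    return sd, sem


# Hypothetical replicate values, for illustration only
replicates = [4.0, 6.0, 5.0, 7.0, 3.0]
sd, sem = sd_and_sem(replicates)
print(f"n={len(replicates)} SD={sd:.3f} SEM={sem:.3f}")
```

      With these made-up values the SD is about 1.58 and the SEM about 0.71; adding further replicates with the same spread would lower the SEM but leave the SD roughly unchanged.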

      (6) Statistical testing in Figure 7 is essential as the effects are very small. Again, are these changes big enough for a biologically meaningful response? The authors should at least discuss this. 

      Our previous time course analyses of InsPP dynamics, performed under conditions comparable to those in this study, showed that InsP8 decreases by around 50% in the first 30 min after transfer to Pi starvation (DOI: https://doi.org/10.7554/eLife.87956) and that this decline is already sufficient to trigger the PHO starvation program, as assessed by Pho4-GFP translocation into the nucleus. Thus, a 50% decrease, which is observed in ppn1 ppn2 mutants, is functionally significant. We have now also evaluated statistical significance in Fig. 7; it is reached for the 50% reduction of InsP8 and 1-InsP7 in ppn1 ppn2. 

      Minor points: 

      (1) There are a number of smaller edits (use of italic or better the absence thereof, lacking information in the reference list, and some typos). 

      Thank you. We have corrected those.

      (2) The exact n should be given in the Figure legend. 

      Corrected.

      (3) Page 8, line 8: it would be nice to have a picture of the wild-type vacuoles and what you measured. 

      We now present a sample image in the new Suppl. Fig. 1.

      (4) PMID: 11779791 showed already that Pho91 cannot rescue the absence of the plasma membrane Pi transporters. This study should be at least cited. 

      This is not quite correct. The study that the reviewer mentions showed that Pho91 supports slower growth, and the authors concluded that "A synthetic lethal phenotype was observed when (all) five phosphate transporters were inactivated...". We had cited the same group and the same first author, just using their later study (Wykoff et al., 2007), which recapitulated the results from PMID: 11779791 and showed in addition quite good growth of the PHO91-expressing strain on YPD (Suppl. Fig. 2). We had obtained the strains from this group. In reproducing their experiments, we noticed that the growth of the PHO91-expressing strain that these authors had observed is due to incomplete repression of Pho84. They had overexpressed Pho84 from a galactose-inducible promoter to generate a background with a regulatable Pi transporter. This trick allowed them to conveniently manipulate the strain and reduce (but not abolish) Pho84 expression by transferring the cells from galactose to glucose for their experiments. Therefore, we chose a more rigorous plasmid shuffling strategy to test the individual Pi transporters, which allows an assessment without the leaky background expression of Pho84 on glucose. In contrast to O'Shea and colleagues, we observed zero growth of a strain expressing only PHO91. We have revised the results section to make this discrepancy more evident and provide a better motivation for our experiment.

      (5) It would be nice to see the actual data in Figure 6; not only a quantification. 

      We illustrate the phenotype of nuclear Pho4-GFP in panel A. Showing all the images necessary to appreciate the differences between the strains would require including many dozens of images into the figure, which would not be useful.

    1. Webinar Summary: "My child is different. So what?"

      Summary

      This summary document analyses the key information from the webinar "My child is different. So what?" ("Mon enfant est différent. Et alors ?"), organised by the Fédération des Conseils de Parents d'Élèves (FCPE).

      The event aimed to inform, to take the drama out of the subject, and to give practical tools to families of children with neurodevelopmental specificities.

      In partnership with three expert associations (HyperSupers - TDAH France, the Fédération Française des DYS (FFDYS), and the Association Nationale Pour les Enfants Intellectuellement Précoces (ANPEIP)), the webinar covered:

      • Attention Deficit Disorder with or without Hyperactivity (ADHD; TDAH in French),
      • the DYS disorders, and
      • High Intellectual Potential (HPI).

      The key points that emerge are:

      1. Prevalence and normalisation: The disorders and specificities covered are common, accounting on average for more than one pupil per class in France.

      It is crucial to understand that these are neurodevelopmental conditions, not the consequences of poor parenting or of a lack of effort by the child.

      2. Importance of diagnosis: Early identification and a precise, differential diagnosis are fundamental. They make it possible to put suitable support in place, to avoid misreading the child's behaviour (as laziness or provocation) and to prevent self-esteem from deteriorating.

      3. Towards an inclusive school: Inclusion at school is a right and a necessity. The key lies in close collaboration between parents, teaching teams and associations.

      The FCPE reaffirms that "an inclusive school is not a separate school; it is the school for everyone".

      4. Resources and support: School support schemes (PAI, PAP, PPS) exist to meet pupils' specific needs.

      Associations play an indispensable role by offering expertise, documentary resources, training and peer support, thereby breaking the isolation that families often feel.

      Context and Objectives of the Webinar

      Organised by the FCPE and hosted by Aline, deputy general secretary, the webinar was designed as a "useful, supportive and practical moment of exchange".

      The main objective was to respond to a concern shared by many parents: "My child doesn't quite fit into the boxes; how can I help them thrive at school?"

      The starting observation is that these differences, although "part of the ordinary landscape of school", are too often "neither sufficiently identified nor sufficiently supported".

      Prevalence of Disorders and Specificities in Schools

      | Category | Prevalence | Representation in class |
      |---|---|---|
      | DYS disorders | 5 to 8% of children | About 1 to 2 pupils per class |
      | ADHD (TDAH) | About 5% of children | About 1 pupil per class |
      | High Intellectual Potential (HPI) | 2 to 3% of children | About 1 pupil per class |
      | Combined total | > 10% of pupils | More than one child per class on average |

      Analysis of the Disorders and Specificities

      1. Attention Deficit Disorder with or without Hyperactivity (ADHD; TDAH in French)

      As presented by Daniel of HyperSupers - TDAH France, ADHD is a neurodevelopmental disorder (TND) that affects the brain functions involved in organising thought, memory, communication and learning.

      Core symptoms: ADHD shows up along three main axes, whose intensity varies between individuals:

      ◦ Inattention: difficulty sustaining attention, frequent forgetfulness, a tendency to be distracted ("head in the clouds"), avoidance of tasks that require sustained concentration. This is the symptom that persists most into adulthood.

      ◦ Hyperactivity: constant motor restlessness in children, which often turns into mental hyperactivity (racing ideas) in adolescence and adulthood.

      ◦ Impulsivity: difficulty waiting one's turn, a tendency to interrupt others, answers blurted out before the question is finished.

      Prevalence and comorbidities:

      ◦ Affects about 5% of children (350,000 in France) and 3% of adults.

      ◦ Boys are diagnosed twice as often, but the disorder is under-diagnosed in girls, in whom inattention is often the predominant symptom.

      ◦ 50% of people with ADHD have at least one associated disorder (comorbidity), such as DYS disorders, autism spectrum disorder, anxiety or depressive disorders, or oppositional defiant disorder.

      Diagnosis and management:

      ◦ Diagnosis is clinical and relies on validated questionnaires and diagnostic criteria (e.g. DSM-5), which require symptoms to be present before age 12, in at least two areas of life (school, family), and to have a significant impact on quality of life.

      ◦ Management is multimodal: psychoeducation (explaining the disorder to the child and the parents), school accommodations (PAP, PPS), parent training (e.g. the Barkley method), and possibly medication.

      Impact at school: A pupil with ADHD may be seen as dreamy, disruptive or lazy.

      They struggle to follow instructions, lose their belongings and achieve poor school results despite expending a great deal of energy, which leads to considerable fatigue and lower self-esteem.

      2. Specific Learning Disorders (DYS Disorders)

      As presented by Fabienne of the Fédération Française des DYS (FFDYS), the DYS disorders are also neurodevelopmental disorders.

      Basic principles:

      ◦ They are neither an illness (they cannot be cured), nor a mental disorder, nor an intellectual or sensory impairment. Intelligence is preserved.

      ◦ They are not caused by a lack of stimulation or by a disadvantaged sociocultural environment.

      ◦ Their central feature is difficulty automating certain cognitive functions, which keeps the child in permanent cognitive overload and causes marked slowness and intense fatigue.

      The different DYS disorders:

      ◦ Dyslexia / dysorthographia: a disorder of written-word identification. Reading is slow and halting (decoding), which hinders access to meaning. It is almost always accompanied by dysorthographia (difficulty automating spelling rules).

      ◦ Dysphasia (Developmental Language Disorder): a disorder of oral communication, affecting comprehension and/or expression. The child has to make a major effort to understand spoken instructions and to make themselves understood.

      ◦ Dyscalculia: a disorder of logical-mathematical cognition, affecting the understanding of number sense, quantities and operations.

      ◦ Dyspraxia (Developmental Coordination Disorder) / dysgraphia: a disorder of the planning and automation of movements. The child is described as "clumsy" and has difficulties with fine motor skills (handwriting, tying laces, using cutlery), spatial organisation (geometry, reading tables) and time management.

      3. High Intellectual Potential (HPI)

      As presented by Frédéric of ANPEIP, HPI is not a disorder but a specificity recognised by the Éducation Nationale as a "special educational need".

      Definition and identification:

      ◦ It is characterised by qualitatively different intellectual functioning, which the presenter described as validated by neuroimaging studies.

      ◦ Identification relies on a full psychological assessment carried out by a professional, and is not reduced to an IQ score (above 130). The assessment looks at self-esteem, anxiety, social relationships, etc. An individual cannot be reduced to a number.

      Characteristics:

      ◦ Constant questioning on existential subjects (life, death), great curiosity.

      ◦ Very quick understanding, ability to make connections and take shortcuts.

      ◦ Great sensitivity and a critical sense developed very early.

      Key concepts:

      ◦ Dyssynchrony: a gap between intellectual development (often advanced) and emotional, social or psychomotor development (which matches actual age). A 6-year-old may think in a very mature way yet have the motor skills typical of that age, which makes handwriting difficult.

      ◦ Double or triple specificity: a child with HPI may also have ADHD and/or DYS disorders. HPI can then mask the disorders for a while, making diagnosis complex and often late (end of collège or lycée).

      Impact at school: The gap can lead to boredom, disengagement and difficulties socialising. The report comment "could do better, has ability but does not use it" is common.

      Supporting Children and Families

      The Role of the FCPE

      The FCPE, as a national association of pupils' parents, is present at every level of the education system.

      Representation: It sits on national, regional, departmental and local bodies (school council, board of governors, CESCE, appeal committees, etc.).

      Partnerships: It works with municipalities, the académies (regional education authorities) and ministries, and also with bodies such as the CDAPH (Commission des droits et de l'autonomie des personnes handicapées), the CPAM, the ARS and the MDA (Maison des Adolescents).

      Missions: It pays particular attention to children's rights and to respect for special educational needs. It belongs to collectives such as the Réseau Éducation Sans Frontières (RESF) in order to support families in precarious situations.

      Support from the Partner Associations

      Each association offers crucial support based on expertise and peer support.

      Key actions and resources, by association:

      HyperSupers - TDAH France

      • Peer support: discussion groups (GSP), online forums, "SOS Rentrée Scolaire" hotline.
      • Resources: website (tdah-france.fr), brochures, books, web documentaries, YouTube channel.
      • Training: online training modules for members.
      • Advocacy: representation on national health and disability bodies.

      Fédération Française des DYS (FFDYS)

      • Local network: federation of 150 local associations, accessible via a map on the ffdys.com website.
      • Events: Journée Nationale des DYS, scientific conferences (available as replays).
      • Information: podcasts, videos, practical guides (career guidance, employment).
      • Training: training body for professionals in education, health and employment.

      ANPEIP

      • Regional network: federation of 14 regional associations.
      • Breaking isolation: parents' cafés, outings, and workshops for children and teenagers so that they can meet their peers.
      • Information: talks and workshops to demystify HPI.
      • Advocacy: partner of the Éducation Nationale, working to harmonise practices and to update official documents (Vade-mecum HPI).

      Key Quotes

      Aline (FCPE): "An inclusive school is not a separate school; it is the school for everyone."

      Fabienne (FFDYS): "[DYS disorders] are not illnesses, which means they cannot be cured. You keep these disorders throughout your life."

      Frédéric (ANPEIP): "An individual cannot be reduced to a number."

      Daniel (TDAH France), on adults diagnosed late: "Believe me, it is a liberation for those adults; they set off again on a new footing."

    1. In teaching, García and Wei point out that while there are no specific pedagogical practices that can be ascribed to translanguaging, it would be wise to look from the ground up as these are developed and later theorized into practice. The authors do point to strategies like pairing students with greater English proficiency; grouping students of similar language repertoires to discuss meaning; highlighting cognates; and incorporating multimodal ways of communication. Above all, the essence of translanguaging demands openness to co-learning with students in an atmosphere that is safer for taking risk and open to resistance and social justice. The historic marginalization of certain languages necessitates a proactive stance towards justice education in the classroom.

      There aren’t strict rules for teaching with translanguaging, but the authors suggest things like pairing students by skill level, grouping them by language background, using words that sound similar in both languages, and using different ways to communicate. They stress the importance of being open, taking risks, and focusing on fairness and justice in the classroom. However, schools rarely teach or test translanguaging directly.

    2. García and Wei deviate by taking a Derridian view and believe that while we are all ‘inscribed’ with language, ‘languaging’ is not enough to capture the complexity multilinguals faced today. Instead, ‘A translanguaging approach to bilingualism extends the repertoire of semiotic practices of individuals and transforms them into dynamic mobile resources that can adapt to global and local sociolinguistic situations. At the same time, [Language and Education, 2015, Vol. 29, No. 6, 566–576]

      The authors argue that just talking about ‘languaging’ isn’t enough. Translanguaging means using all your language skills in flexible ways, depending on the situation, and recognizing that language and bilingualism are shaped by society.

    3. As García and Wei explain, language crossing was seen to be straying from structuralist notions of language autonomy. This dual vision of language has slowly evolved to a dynamic vision, where ‘the language practices of bilinguals are complex and interrelated; they do not emerge in a linear way or function separately since there is only one linguistic system’ (14). They borrow from dynamic systems theory and admire the radical idea proposed by Makoni and Pennycook (2007) that language is a European invention having been forged to perpetuate and enhance colonial thinking. English is only English when it is contrasted with other languages like French or Chinese. All languages exist among and borrow from each other. Labels used by linguists and never by speakers themselves only serve to fortify nation-state interests

      The authors explain that the way people think about language has changed over time. They note that languages are not natural categories but are created by society, often to support things like colonialism or national identity, something we still see today.

    4. From the beginning of the book, García and Wei begin to ground translanguaging on firm theoretical underpinnings. Challenging Saussurean notions of the formal system of language (signified and signifier) and Chomsky’s proposal of one Universal Grammar, the authors instead embrace Bakhtin’s heteroglossia, which is explicitly context driven and dynamically conceived. The term ‘languaging’ was coming into use by the middle of the twentieth century. Language was no longer conceived as monolithic or context-free, but as agentive and capable of making new meaning in a rapidly changing world.

      García and Wei base their ideas on newer theories about language. They disagree with older thinkers who saw language as fixed and separate. Instead, they support the view that language is always changing and shaped by its social context.

    5. Ofelia García and Li Wei argue for a dynamic style of bilingualism rooted in what they refer to as ‘translanguaging’. The authors begin the book with the Welsh origins of the term and progress through its treatment by various scholars. They point out that because people translanguage all the time, it is a resource that should be utilized in the classroom. Indeed, teachers use translanguaging as a scaffolding technique to help students access content. What the authors are more interested in working toward is beyond translanguaging as scaffold or technique. It is the acceptance of translanguaging as a legitimate practice, at once transformative and transgressive as it seeks to challenge dominant narratives and communicative structures

      The book “Translanguaging: language, bilingualism and education” by Ofelia García and Li Wei introduces the idea of translanguaging. The authors believe it is more than just a teaching trick: it should be seen as a powerful classroom tool that can change how we think about language and challenge old rules about which languages matter most.


    1. Deficits in social information processing may also offer a potential explanation for the association between na

      It could be a cognitive distortion (a faulty way of processing social cues) that develops in response to the chronic negative feedback from peers.

  3. Local file Local file
    1. Studio studies

      chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/pdfjs/web/viewer.html?file=file%3A%2F%2F%2FUsers%2Fprestontaylor%2FDownloads%2F2015_Studio_studies_Notes_for_a_research.pdf


    1. First consequence: the Russian energy constraint. Russian refining capacity was cut by about 500,000 barrels per day (about 79,500 m³/day) in autumn 2025, according to the International Energy Agency. This level of loss is not expected to be fully made up before mid-2026. This means Russia will have to import more refined fuel or redirect crude oil to more distant and potentially less efficient facilities. It is a lasting pressure on its war economy. Second consequence: the Western reaction. Western allies have stepped up sanctions on the Russian oil giants and have restricted Russia's access to the technologies needed to repair its refineries quickly. Washington and Brussels know that the oil sector funds the Russian military effort. To put it simply: cutting Russian refining capacity means reducing the cash available to pay for bombs that then fall on Kharkiv or Odessa. It is brutal, but it is accurate. Third consequence: Ukrainian strategic autonomy. Kyiv is showing that it can strike far away without waiting for a Western green light to use long-range missiles such as Tomahawk. The drones are designed locally, modified locally, launched locally. This technical autonomy reduces the Western allies' political leverage over Ukraine's operational tempo. It has a human price: the operators speak openly about learning "live", about losing comrades, and about seeing the mission as a moral obligation towards their children. This is not military marketing. It reflects a reality: Ukraine has understood that it cannot afford a purely defensive war. It must impose a direct cost on Russia, on Russian territory, and it is doing so.

      5.

  4. jpf-projects.bubbleapps.io jpf-projects.bubbleapps.io
    1. [Word cloud image. Terms shown: origo, folder, clues, meta, reflective, integral, autopoietic, design, word, cloud, manually, maintained, outline, inter, related, salient, words, adjacenses, neologism, indwelling, gestalt, focal, subsidiary, situational, awareness, omni-optional, co-evolutional, generative]

    1. Webinar Summary: Reconciling the Challenges of Sustainable Food and Precarity

      Summary

      This summary document recaps the discussions of the webinar "How can the 4 challenges of sustainable food be reconciled with precarity?" ("Comment concilier les 4 enjeux de l'alimentation durable et la précarité ?"), organised by CRES and GRAINE PACA.

      It highlights the complexity of food insecurity, a heterogeneous phenomenon that is hard to quantify and is thought to affect about 8 million people in France.

      The PACA region stands out for its high poverty rate, the third highest in France, which exacerbates inequalities in access to good-quality food.

      The scientific presentations showed that the four pillars of sustainable food (nutrition/health, environment, socio-economic, socio-cultural) do not naturally converge.

      However, in-depth studies show that a diet that is both healthy and low in environmental impact can be cheaper.

      The key lies in a "healthy shift towards plant-based eating" ("végétalisation saine"): reduced consumption of animal products, especially ruminant meat, offset by a higher intake of whole grains, pulses, fruit and vegetables.

      The PACA region has a structured ecosystem for tackling these challenges, with coordination bodies such as COALIM and thematic networks (Précalim, Éducalim, Régalim, PAT) that aim to break down silos between approaches.

      National programmes such as "Mieux Manger Pour Tous" and regulations such as the EGalim law provide financial and legal frameworks for transforming food systems, including food aid.

      Finally, the case study of the Mouans-Sartoux social grocery illustrates a successful transition from an aid model based on unsold goods to an offer of fresh, organic and local products.

      This transformation, made possible by political will, strategic partnerships (Biocoop, local producers) and access to dedicated funding, shows that it is possible to radically improve the quality and sustainability of food aid while respecting the dignity of beneficiaries.

      --------------------------------------------------------------------------------

      1. Introduction and Context of the Webinar

      Organized by CRES PACA (Comité Régional d'Éducation pour la Santé, the regional health education committee) and GRAINE PACA (Réseau Régional pour l'Éducation à l'Environnement et au Développement Durable, the regional network for environmental and sustainable development education), the webinar received financial support from the DREAL and was run in partnership with the DRAF and ADEME Provence-Alpes-Côte d'Azur.

      It is part of the Mieux Manger Pour Tous program and belongs to a cycle of two webinars led by two major regional networks:

      Précalim: Regional network for fighting food precarity.

      Éducalim: Regional network for education in sustainable food and taste.

      The main objectives of the webinar were to:

      • Deepen knowledge of the notion of sustainable food and of the levers for reconciling its challenges for people living in precarity.

      • Identify the main regulations relating to sustainable food for all.

      • Discover an inspiring, reproducible field initiative.

      2. The Strategic Framework and Stakeholder Networks in the PACA Region

      2.1. The Regional Ecosystem for Sustainable Food

      As presented by Peggy Bucas (DRAF), the regional network in PACA is designed to maximize the effectiveness of action on sustainable food.

      The COALIM: This body brings together the regional institutions (DRAF, DREAL, DREETS, ARS, ADEME, Région) that steer missions and funding related to sustainable food. It ensures that their actions are coordinated and complementary.

      The Regional Thematic Networks: Four main networks provide thematic and methodological support to project leaders.

      ◦ Précalim: Focused on fighting food precarity.

      ◦ Éducalim: Centered on education in sustainable food and taste.

      ◦ Régalim: Dedicated to fighting food waste and losses.

      ◦ Réseau des PAT: Coordinates the region's 29 Projets Alimentaires Territoriaux (PAT, territorial food projects).

      The PATs are essential levers for fostering cooperation, breaking down silos and developing a systemic approach. Their mission includes a "social justice" component aimed at reducing food precarity.

      2.2. The Précalim Network and the "Mieux Manger Pour Tous" Program

      As presented by Sandrine Fort (DREETS), the Précalim network and the MMPT program are pillars of the fight against food precarity in the region.

      The Précalim Network:

      Members: Nearly 600 members (institutions, associations, local authorities). A call has been issued to bring in more agricultural stakeholders.

      Objectives:

      ◦ Help stakeholders get to know one another.

      ◦ Encourage the sharing of initiatives and feedback from experience.

      ◦ Promote synergies and cooperation.

      ◦ Showcase actions and the people behind them.

      Actions: Networking days, thematic webinars, "project accelerator" workshops and a collaborative platform hosted on ADEME's space.

      The "Mieux Manger Pour Tous" (MMPT) Program:

      Origin: It stems from the action plan for transforming food aid.

      National budget: 60 million euros in 2023, with a planned increase of 10 million per year until 2027.

      Objectives:

      1. Improve the nutritional and taste quality of food aid.

      2. Support the participation of, and assistance for, people in precarity.

      3. Transform local schemes for fighting food precarity (e.g. solidarity baskets, group purchasing).

      4. Reduce the environmental impact of the food aid system.

      Program figures in PACA:

      ◦ 2023: 51 projects funded for 1.7 million euros.

      ◦ 2024: 62 projects funded for 2.5 million euros.

      ◦ 2025: Budget of 3.3 million euros, with 46 additional projects under review.
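
      The budget trajectory and the regional figures above lend themselves to a quick arithmetic check. The sketch below assumes the national increase is a flat 10 million euros each year from the 2023 baseline (the summary does not say whether it is strictly linear) and computes the average funding per project for the two years where both a project count and an amount are given. The function and variable names are illustrative, not part of the program.

```python
# Illustrative arithmetic only; the figures come from the summary above.
BASE_YEAR = 2023
BASE_BUDGET_M_EUR = 60      # national MMPT budget in 2023, in millions of euros
YEARLY_INCREASE_M_EUR = 10  # assumption: flat yearly increase through 2027


def national_budget(year: int) -> int:
    """Projected national budget (millions of euros) under a linear assumption."""
    if not BASE_YEAR <= year <= 2027:
        raise ValueError("the summary only covers 2023-2027")
    return BASE_BUDGET_M_EUR + YEARLY_INCREASE_M_EUR * (year - BASE_YEAR)


# PACA: year -> (number of funded projects, total funding in euros)
PACA_FUNDING = {2023: (51, 1_700_000), 2024: (62, 2_500_000)}


def average_grant(year: int) -> float:
    """Mean funding per project in PACA for a given year."""
    projects, total = PACA_FUNDING[year]
    return total / projects


if __name__ == "__main__":
    for y in range(2023, 2028):
        print(y, national_budget(y), "M EUR")
    for y in PACA_FUNDING:
        print(y, round(average_grant(y)), "EUR per project")
```

      Under that linear assumption the national budget would reach 100 million euros in 2027, and the average PACA grant rises from roughly 33,000 euros in 2023 to roughly 40,000 euros in 2024.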

      3. Food Precarity: Definitions and Key Figures

      3.1. Core Definitions

      | Term | Definition |
      |---|---|
      | Sustainable food (FAO) | Diets that help protect biodiversity, are culturally acceptable, economically fair and accessible, and nutritionally safe and healthy. It rests on four challenges: nutrition/health, environment, socio-economic and socio-cultural. |
      | Fighting food precarity | Promoting access to safe, varied, good-quality food in sufficient quantity for people in vulnerable situations, while respecting their dignity and developing their capacity to act. |
      | Food aid | Supplying foodstuffs to vulnerable people, together with an offer of support. |
      | Food insecurity (FAO) | A situation in which a person lacks regular access to enough safe and nutritious food for growth and an active, healthy life. It is measured with the FIES (Food Insecurity Experience Scale). |

      3.2. The State of Food Precarity

      Measuring food precarity in France is difficult because there is no uniform, regular counting method.

      The data come from cross-referencing several sources (public statistics, one-off studies such as INCA 3, data from associations).

      National figures (estimates):

      People experiencing food insecurity: 8 million, or 11% of the population (Anses).

      Food dissatisfaction: 16% of people say they do not have enough to eat and 45% say they do not have the foods they would like (CRÉDOC, 2022).

      Food aid beneficiaries: Between 2 and 9 million. The DGCS counts 5.3 million people registered with accredited associations.

      Non-take-up of food aid: 75% of people experiencing food insecurity do not use food aid (INCA 3 study, 2015).

      Financial difficulties: 38% of French people have financial difficulty eating fresh fruit and vegetables every day (Ipsos/Secours Populaire barometer, 2024).

      Health impacts:

      • The prevalence of obesity is nearly four times higher among the poorest adults.

      • Fruit and vegetable consumption is half as high among people experiencing food insecurity (230 g/day on average, against a recommendation of 400 g/day).

      Situation in the PACA region:

      Poverty rate: Third highest in France, affecting about 850,000 people.

      Median standard of living of people in poverty: €10,600 per year, less than half the median standard of living of the region's population as a whole (€22,000).

      Poorest département: Vaucluse, with a poverty rate of 20% (fifth highest in France).

      Most affected groups:

      ◦ Households whose reference person is under 30 (25% poverty rate).

      ◦ Single-parent families (30.2%).

      ◦ Seniors (retirees account for 30.4% of poor households).

      4. Scientific Insights: Toward Sustainable and Affordable Food

      Florent Vieux (MS-Nutrition) presented several studies that quantify the dimensions of sustainable food (nutrition, environment, cost) using reference databases (INCA 3, Ciqual, Agribalyse, Kantar).

      4.1. Ranking Food Groups

      This study shows that how foods rank in terms of cost and environmental impact depends heavily on the functional unit chosen.

      | Functional unit | Key findings |
      |---|---|
      | Per kilogram (€/kg) | Most expensive / highest impact: ruminant meat, seafood. <br> Cheapest / lowest impact: fruit, vegetables, pulses. |
      | Per 100 kilocalories (€/100 kcal) | Fruit and vegetables become very expensive and high-impact because of their low energy density. <br> Dairy products and eggs remain in an intermediate position. |
      | Per unit of nutritional quality | Seafood becomes more "affordable" again. <br> Fruit, vegetables and pulses remain very sound choices (low cost and impact relative to their nutrient density). |

      Main conclusion: Food categories rank very similarly on cost and on environmental impact.

      Some foods, such as pulses, potatoes and whole grains, are consistently cheap and low-impact whatever the functional unit.
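
      A small numerical sketch can show why the ranking shifts with the functional unit. The prices and energy densities below are hypothetical round numbers chosen for illustration only; they are not taken from INCA 3, Ciqual, Agribalyse or Kantar, and the helper names are made up for this example.

```python
# Hypothetical figures chosen only to illustrate the effect of the functional
# unit; they are NOT values from the studies cited in the text.
FOODS = {
    # name: (price in EUR per kg, energy in kcal per 100 g)
    "ruminant meat": (18.0, 250),
    "vegetables": (3.0, 30),
    "pulses (dry)": (2.5, 330),
}


def price_per_100_kcal(price_per_kg: float, kcal_per_100g: float) -> float:
    """Convert a price per kilogram into a price per 100 kcal."""
    kcal_per_kg = kcal_per_100g * 10
    return price_per_kg / kcal_per_kg * 100


def rank(metric: dict) -> list:
    """Return food names sorted from most to least expensive."""
    return sorted(metric, key=metric.get, reverse=True)


per_kg = {name: price for name, (price, _) in FOODS.items()}
per_100_kcal = {name: price_per_100_kcal(p, k) for name, (p, k) in FOODS.items()}

print(rank(per_kg))        # ranking by EUR/kg
print(rank(per_100_kcal))  # ranking by EUR/100 kcal
```

      With these made-up inputs, meat is the most expensive item per kilogram, vegetables become the most expensive per 100 kcal because of their low energy density, and dry pulses stay cheapest under both units, which mirrors the pattern described above.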

      4.2. The "Positive Deviance" Approach

      This study compared the diets of individuals with good nutritional quality but different environmental impacts.

      The "more sustainable" group (good nutrition, low impact) also had a lower food cost.

      Markers of good nutritional quality (common to both groups):

      ◦ High consumption of fruit and vegetables.

      ◦ High consumption of dairy products.

      ◦ Low consumption of sugary drinks.

      What sets the low-environmental-impact group apart:

      ◦ Much lower consumption of ruminant meat.

      ◦ Markedly higher consumption of whole grains to compensate.

      4.3. Conclusion and Recommendations

      All the studies converge on one main message: "végétalisation saine", a healthy shift toward plant-based foods.

      This means reducing consumption of animal products (especially meat) and replacing them with informed plant-based choices (whole grains, pulses, fruit and vegetables).

      A specific issue for people in precarity: Increasing fruit and vegetable consumption is the priority, because their starting level of consumption is particularly low.

      Carbon footprint: While the poorest have a much smaller overall carbon footprint than the richest, the gap is narrower for the "food" category. Acting on this lever therefore remains relevant for everyone.

      5. Regulatory Framework and Levers for Action

      5.1. The EGalim Law as a Model

      Clara Vigan (DRAF) presented the EGalim law, which applies to institutional catering, as a powerful lever that can inspire action beyond that sector.

      Objectives of the law:

      ◦ 50% quality and sustainable products, including at least 20% organic products.

      ◦ Diversifying protein sources by introducing vegetarian menus, which helps reduce costs.

      ◦ Fighting food waste.

      ◦ Reducing the use of plastic.

      These principles can be carried over to food aid to improve the quality of what is offered while keeping budgets under control.

      5.2. The Imperative of Food Safety

      Peggy Bucas (DRAF) recalled the basic rules of the "Paquet Hygiène" (the EU hygiene package), which are crucial for any organization distributing foodstuffs.

      Key principles: traceability of donations, maintaining the cold and hot chains, hygiene of premises and staff.

      Essential distinction:

      ◦ DLC (Date Limite de Consommation, the use-by date): Going past it is strictly prohibited.

      ◦ DDM (Date de Durabilité Minimale, the best-before date): "best before"; the product can still be eaten after the date without any health risk.

      6. Case Study: The Transformation of the Mouans-Sartoux Social Grocery

      Rémy Georgon (CCAS of Mouans-Sartoux) shared lessons learned from transforming the town's social grocery (épicerie sociale).

      The trigger: A collective realization in 2020, as the quality of donations from unsold goods declined. The organization realized that it was distributing "products that nobody bought".

      The transformation strategy:

      1. Strategic partnerships: Working with the local Biocoop store made it possible to introduce a range of bulk products (food and hygiene) and to create a shelf of purchased organic products.

      2. Seeking funding: Use of the "France Relance" calls for projects (to renew refrigeration equipment) and of "Mieux Manger Pour Tous".

      3. Local, seasonal sourcing: Setting up a group ordering system for fresh, seasonal vegetables from a local producer.

      4. Synergy with the town's policy: The MMPT project funded the hiring of a market gardener by the CCAS, who was seconded to the municipal farm (régie agricole) to increase the production of organic vegetables for the grocery.

      5. Involving beneficiaries: Users were consulted to decide which fresh products should be bought first (dairy products).

      Quantitative results:

      ◦ In 2024, organic products made up 7% of the stock (with 0% fruit and vegetables).

      ◦ In the first half of 2025, this figure rose to 46% organic products by weight, 62% of which are fruit and vegetables.

      ◦ The food purchasing budget went from €4,000 to €25,000, supported by grants.
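
      Two derived figures follow from the results above by simple arithmetic. The sketch assumes that the 62% share of fruit and vegetables is measured by weight, like the 46% organic share, so that the two percentages can be multiplied; the source does not state this explicitly. Variable names are illustrative.

```python
# Figures quoted in the case study; the derived values are simple arithmetic.
organic_share_2025 = 0.46        # organic products as a share of stock weight, H1 2025
fruit_veg_within_organic = 0.62  # fruit and vegetables as a share of organic products

# Assumption: the 62% is also measured by weight, so the shares can be multiplied.
organic_fruit_veg_share = organic_share_2025 * fruit_veg_within_organic

budget_before_eur = 4_000
budget_after_eur = 25_000
budget_multiplier = budget_after_eur / budget_before_eur

print(f"Organic fruit and vegetables: {organic_fruit_veg_share:.1%} of stock")
print(f"Purchasing budget multiplied by {budget_multiplier:.2f}")
```

      Under that assumption, organic fruit and vegetables would account for a little under 29% of the stock by weight, and the purchasing budget was multiplied by 6.25.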

      Key success factors:

      ◦ The conviction and commitment of the person in charge.

      ◦ Strong political will and the support of the town hall.

      ◦ The ability to seek out partners and external funding.

      ◦ The choice to put quality ahead of quantity.

    1. The secure cluster showed similar scores in all dimensions to the mean scores of the healthy subjects in the Andersson et al. study while the insecure anxious cluster differed regarding DC and CO scores and the insecure/avoidant-anxious cluster regarding DC, CO, NA and PR scores. All three clusters showed similar scores in the RS dimension.

      The "insecure/avoidant-anxious" cluster (b) is mostly BPD, not ADHD. The "insecure/avoidant" cluster (a) has 11 of the 20 ADHD patients. This separates the two disorders.

  5. bafybeigcrsxbwkpxjgdjyhjsl675hejqzn5swdg7xktlw5w47fa6jkkc74.ipfs.localhost:8080 bafybeigcrsxbwkpxjgdjyhjsl675hejqzn5swdg7xktlw5w47fa6jkkc74.ipfs.localhost:8080
    1. ♒/ℹ️/=/🗺️/MEMEplEX/Map/index.html

      Saving an indy0pad-created HTML file with the Hypothesis embed.js script preloaded

      Just download the HTML file

      and add the embed.js script tag:

      ``` html

      <script src="https://hypothes.is/embed.js" async></script>

      ```
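
      For anyone who would rather script this step than edit the file by hand, here is a minimal sketch. It assumes the downloaded page is an ordinary HTML file with a closing `</body>` tag; `add_embed_script` and `patch_file` are hypothetical helper names, not part of Hypothesis or indy0pad.

```python
import re
from pathlib import Path

EMBED_TAG = '<script src="https://hypothes.is/embed.js" async></script>'
_BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def add_embed_script(html: str) -> str:
    """Return html with the embed tag inserted just before the last </body>.

    The input is returned unchanged if the script is already referenced; if
    there is no </body> tag, the script tag is appended at the end instead.
    """
    if "hypothes.is/embed.js" in html:
        return html
    matches = list(_BODY_CLOSE.finditer(html))
    if not matches:
        return html + "\n" + EMBED_TAG + "\n"
    index = matches[-1].start()
    return html[:index] + EMBED_TAG + "\n" + html[index:]


def patch_file(path: str) -> None:
    """Rewrite the HTML file at path in place (hypothetical helper)."""
    file = Path(path)
    patched = add_embed_script(file.read_text(encoding="utf-8"))
    file.write_text(patched, encoding="utf-8")
```

      Running `patch_file` twice on the same file is harmless, since the function skips pages that already reference embed.js.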

      http://bafybeigcrsxbwkpxjgdjyhjsl675hejqzn5swdg7xktlw5w47fa6jkkc74.ipfs.dweb.link/?filename=hyperpost%3D%F0%9F%97%BA%EF%B8%8FMEMEplEX.Map.html

    1. by insisting on the importance of the bond in the construction of the psyche.

      This sentence reinforces the central idea of my essay: the Internet does not destroy the bond, it transforms the way we are together.

    2. He shows how his own relationship to digital technologies was built and suggests avenues for a psycho(patho)logy of the everyday virtual

      Rinaudo concludes by emphasizing subjectivity in the face of the digital.

    3. not give up their place to the machine, so as to preserve speech and transference phenomena

      This sentence sums up the need to keep the human at the center of the bond. The Internet must not replace real-life encounters.

    4. the drive for mastery exercised through digital technology and the Internet, an object supposed to know, reinforces experiences of psychic unbinding and of the possibility of knowledge without a subject.

      Rinaudo (via Méloni) speaks here of the danger of knowledge without a subject. This ties in with the idea that digital relationships can lose their embodied dimension.

    5. She identifies the paradox between, on the one hand, a social demand placed on adolescents to decide on their educational and career path and, on the other, the experience of adolescence, characterized by the collapse of old reference points

      This passage shows that the digital can heighten psychic tensions in young people, an echo of the “relational dependence” Janssen speaks of.

    6. They are therefore invested with a function, that is, with a link, in Bion's sense (1957/1983),

      This reference to Bion highlights that digital tools play a symbolic role in the construction of the psychic bond.

    7. For others, digital technologies are precisely one more element alongside other “tools” such as their voice or their posture, which allows them to assert their signature.

      Rinaudo shows that the digital can also support creativity and professional identity, a more positive view close to Tisseron's.

    8. they experience their relationship to digital technologies as an attack on linking that helps undo or fragment their identity as professional subjects.

      Some professionals experience the digital as a threat to the bond. This illustrates the “psychic unbinding” raised in the debate on digital loneliness.

    9. the relational professions are a praxis, an art of doing carried by subjects in interaction.

      Some professionals experience the digital as a threat to the bond. This illustrates the “psychic unbinding” raised in the debate on digital loneliness.

    10. a muted fear that humans might be replaced in their activities by a machine

      Rinaudo evokes the fear of human replacement. This ties in with the idea of a dehumanized relationship, close to the emotional loneliness described by Janssen.

    11. researchers study the subjectivizing as well as the alienating dimensions of digital practices

      Rinaudo underlines the ambivalence of the digital: it can support the construction of the subject (subjectivation) or, on the contrary, alienate. This is the idea of the pharmakon.

    12. Digital technologies have invaded our daily lives.

      Rinaudo shows that the digital has become omnipresent, which echoes the hyperconnection Janssen evokes in “la solitude du lien” (the loneliness of the bond).

    13. they help shed light on the psycho(patho)logy of the everyday virtual

      This expression shows that our digital uses have a psychic dimension. This can be linked to the idea of connected loneliness: the Internet influences the way we experience relationships.

    1. Summary of a Research Project on Associations and Their Territories

      Summary

      This briefing presents the main lessons of a doctoral research project analyzing the complex relationships between associations and their territories.

      The research shows that an association's territory is not a simple geographical given but a dynamic, relational construct shaped by interactions, by the resources mobilized and by proximities (geographical, organizational, institutional) with an ecosystem of actors.

      The study distinguishes a zone of activity, often local, from a much wider zone of influence (tied to the association's project), and stresses that the two dimensions are complementary.

      It emerges that cooperation, which is central to this process, is strongly guided by sector membership and shared values, with direct implications for fundraising and for the legitimacy of associations.

      The mixed methodology, combining a national quantitative analysis of 1,600 bassins de vie (living areas) with an in-depth qualitative study of eight territories, makes these conclusions considerably more robust.

      This work provides concrete arguments for advocacy, enabling associations to better showcase their multidimensional contribution to the development and attractiveness of territories.

      1. Research Context and Problem

      An Institutional Context Under Strain

      The research takes place in an institutional context marked by "a real questioning today and an endangering of the associative world". This tension is illustrated by a fundamental dichotomy:

      On the one hand, growing recognition of the importance of associations. A September report by the Cour des comptes (the French court of auditors) notes that "associations carry out social activities that fall within the remit of the State", attesting to their crucial role in the "sustainability and durability, or even the inclusiveness, of our society".

      On the other hand, a systematic challenge to their funding and to their very existence.

      This situation creates a paradox between the national view, which recognizes their systemic contribution, and territorial realities, where the legitimacy of associations to act and to be funded is constantly questioned.

      What Is at Stake in the Relationship to Territory

      The central question driving the research is how to characterize the relationship between associations and territories. An association's funding and legitimacy in a territory are often tied to its geographical perimeter of influence and activity. The research therefore aims to move beyond the idea that associations simply "cannot be relocated". An association can in fact close positions in one town and open them in another, which is a form of relocation. The research sets out to deconstruct the notion of "local" in order to analyze how an association moves from mere location (presence in a space) to anchoring (established relationships) and then to territorialization (becoming a specific, inseparable component of the territory).

      2. The Framework of the Doctoral Research Project

      This research is being carried out as a CIFRE thesis (Convention Industrielle de Formation par la Recherche) within the Réseau National des Maisons des Associations (RNMA), begun in 2022.

      The Réseau National des Maisons des Associations (RNMA)

      The RNMA is a national network whose members are Maisons des Associations (MDA), whether they have associative status or are municipal services. Its missions include:

      • Relaying issues from the local level to the national level.

      • Supporting the profession of those who assist associations.

      • Developing engineering capacity, in particular by supporting the creation of local observatories of associative life, which are seen as tools for co-constructing public policy.

      The Objectives of the Thesis

      The thesis aims to characterize and interpret the relationships between associations and territory through three main objectives:

      1. Identify and characterize the socio-economic variables that explain the presence of employer association establishments in a territory (quantitative approach).

      The hypothesis is that the associative fabric is linked to the historical, geographical and cultural characteristics of a place.

      2. Test and demonstrate the relationships between the characteristics of the territory and the organizational and sectoral characteristics of associations (qualitative approach).

      3. Identify the factors behind the diversity of associations by analyzing the process that leads from location to territorialization.

      3. Conceptual Framework and Methodology

      Working Definitions: Association and Territory

      The research relies on precise definitions to structure its analysis:

      The Association:

      It is understood not through its legal definition (the 1901 law) but as a "constructed organizational form" that intrinsically combines an activity (a response to needs) with a collective project.

      To do so it mobilizes human resources (volunteers, activists, employees) and material resources.

      The Territory: It is not regarded as a static geographical space but as "a construct that comes from the history of how the organization was founded [...] and above all from the interactions it will have with other organizations". The territory is therefore the product of relationships between actors.

      A Mixed Methodological Approach

      The robustness of the study rests on a two-stage methodology:

      1. Quantitative Phase:

      Scope: About 1,600 "bassins de vie" (living areas, as defined by INSEE) in metropolitan France.

      Analysis: A statistical method was used to cross-reference the socio-demographic characteristics of the bassins de vie (age pyramid, income, types of jobs) with the presence of employer association establishments (INSEE Flores 2021 data).

      Result: The analysis sorted all the bassins de vie into three broad groups ("clusters"), that is, three families sharing similar characteristics in how their socio-economic profile relates to associative presence.

      2. Qualitative Phase:

      Sample: Eight bassins de vie were selected, representative of the three clusters and characterized by the presence of a Maison des Associations (associative or municipal).

      The territories studied are: Grenoble, Dijon, Amiens, Concarneau, Montrevault-sur-Èvre, Niort, Crayon and Mauguio.

      Data collection: 28 semi-structured interviews were conducted (3 to 4 per bassin de vie) with associations from all sectors, both employers and non-employers.

      The initial focus on employer associations in the quantitative part is explained by the availability of reliable, consolidated national statistics (via URSSAF declarations), which do not exist for non-employer associations.

      4. Case Study: The Dijon Bassin de Vie

      To illustrate the analytical approach, the case of a cultural association in the Dijon bassin de vie is presented.

      Socio-economic Characteristics of the Territory

      | Indicator | Figure |
      |---|---|
      | Département | Côte-d'Or |
      | Population change (2016-2021) | +2.5% |
      | Demographic structure | 23% under 20, 26% retirees |
      | Socio-professional categories | 11% managers and professionals (cadres) |
      | Economy | 83% of employment in the service sector |
      | Employer associative fabric | 947 establishments, generating nearly 13,000 salaries |

      Local actors see the territory as offering a good quality of life ("bon vivre"), with a university and a substantial cultural offering, but also a "slightly bourgeois" side.

      Modeling a Cultural Association

      Purpose: An electronic music association, created in 2004 by enthusiasts after a club closed.

      Structure: Initially volunteer-run, it began to professionalize in 2012 and now has 3 employees and a governing body of 8 people.

      Activities:

      ◦ Programming/Production: Concerts, festivals.

      ◦ Creation: Mixing studios.

      ◦ Activism: Promoting electronic music, professionalizing the sector, and championing the values of tolerance and diversity.

      ◦ Diverse audiences: Workshops in EHPAD (nursing homes), awareness sessions for young people in MJC (youth and culture centers), "booms" (dance parties) for children.

      Ecosystem: The association interacts with many actors at different scales (municipality, metropolitan area, département, national):

      • other associations (cultural, environmental),
      • the municipal MDA,
      • institutional funders (the City, the DRAC, the Regional Council), and
      • networks (Ligue de l'enseignement, cultural federations).

      Analysis Through the Lens of Proximities

      The relationship between the association and its ecosystem is analyzed through three types of proximity:

      Geographical Proximity: Evident with its employees, its local audience, the EHPAD, the MJC and other local associations. It makes meeting and cooperating easier.

      Organizational Proximity: Sharing ways of operating.

      It exists with all associations (democratic governance, non-profit status) but is much stronger with associations in the same cultural sector, which share common rules and logics of action (e.g. organizing a festival).

      Institutional Proximity: Sharing values and norms. Likewise, while values such as solidarity are widely shared across the associative world, this proximity is markedly stronger at the sector level.

      5. Main Results and Cross-cutting Conclusions

      The case study and the other interviews support more general conclusions.

      Territory: A Dynamic, Relational Construct

      The results show that associations are not simply "located".

      Their ability to anchor themselves and become territorialized rests on complex mechanisms in which cooperation plays a central part.

      The territory is therefore "not at all frozen or fixed, it is thoroughly dynamic"; it is multi-scale and multi-actor, and is embodied in the ability of associations to mobilize resources and proximities.

      Distinguishing the Zone of Activity from the Zone of Influence

      A fundamental distinction is drawn:

      The Zone of Activity: The space in which the association's concrete activities take place. In the Dijon case, it is mainly concentrated in the municipality and the metropolitan area.

      The Zone of Influence: The much wider space across which the association's project radiates (activism, values, recognition of the movement). It "goes far beyond all those perimeters".

      These two zones are complementary and dynamic.

      The Central Role of Proximities and of the Sector of Activity

      Geographical Proximity is structuring: It is the first condition for meeting and getting to know one another.

      The MDAs play a key role as "resource places" and "facilitators".

      The Sector of Activity is decisive: Organizational and institutional proximities are greatly amplified within a single sector.

      Associations in the same field share rules, logics and, above all, much stronger specific values.

      Values as a guide for action: Institutional proximity (shared values) is a crucial factor in cooperation and in the search for funding.

      As one striking quotation from the study puts it: "You just don't work with someone if you don't share the same values".

      6. Implications for Associative Advocacy

      The results of this research offer concrete avenues for associations to showcase their activities to public actors and funders.

      1. Move beyond the logic of activity alone: It is crucial to show that an association is more than its activity (which contributes to local attractiveness and development) and that it also has a project and a zone of influence that reach far beyond.

      2. Demonstrate the leverage effect: Local funding (for example, "€1 from an actor in a municipality") is not a mere subsidy.

      It has a leverage effect that makes it possible to seek other funding at other scales (regional, national), thereby contributing to the overall reach of both the association and the territory.

      3. Highlight the attraction of external resources: Through their network and zone of influence, associations attract outside resources (artists, expertise, funding) that they make available to residents and to the territory, thereby strengthening its attractiveness.

    1. Regular Expressions Notepad++ regular expressions (“regex”) use the Boost regular expression library v1.85 (as of NPP v8.6.6), which was originally based on PCRE (Perl Compatible Regular Expression) syntax, only departing from it in very minor ways. Complete documentation on the precise implementation is to be found on the Boost pages for search syntax and replacement syntax. (Some users have misunderstood this paragraph to mean that they can use one of the regex-explainer websites that accepts PCRE and expect anything that works there to also work in Notepad++; this is not accurate. There are many different “PCRE” implementations, and Boost itself does not claim to be “PCRE”, though both Boost and PCRE variants have the same origins in an early version of Perl’s regex engine. If your regex-explainer does not claim to use the same Boost engine as Notepad++ uses, there will be differences between the results from your chosen website and the results that Notepad++ gives.) The Notepad++ Community has a FAQ on other resources for regular expressions. Note: Regular expression “backward” search is disallowed due to sometimes surprising results. (For example, in the text “to the test they travelled”, a forward regex t\w+ will find 5 results; the same regex searching backward will find 17 matches.) If you really need this feature, please see Allow regex backward search to learn how to activate this option. Important Note: Syntax that works in the Find What: box for searching will not always work in the Replace with: box for replacement. There are different syntaxes. The Control Characters and Match by character code syntax work in both; other than that, see the individual sections for Searches vs Substitutions for which syntaxes are valid in which fields. Regex Special Characters for Searches In a regular expression (shortened into regex throughout), special characters interpreted are: Single-character matches . or \C ⇒ Matches any character. 
If you check the box which says . matches newline, or use the (?s) search modifier, then . or \C will match any character, including newline characters (\r or \n). With the option unchecked, or using the (?-s) search modifier, . or \C only match characters within a line, and do not match the newline characters. Any Unicode character within the Basic Multilingual Plane (BMP) (with a codepoint from U+0000 through U+FFFF) will be matched per these rules. Any Unicode character that is beyond the BMP (with a codepoint from U+10000 through U+10FFFF) will be matched as two separate characters instead, since the “surrogate code” uses two characters. (See the Match by Character Code section for more on how surrogate codes work.) \X ⇒ Matches a single non-combining character followed by any number (zero or more) of combining characters. You can think of \X as a “. on steroids”: it matches the whole grapheme as a unit, not just the base character itself. This is useful if you have a Unicode-encoded text with accents as separate, combining characters. For example, the letter ǭ̳̚, with four combining characters after the o, can be found either with the regex (?-i)o\x{0304}\x{0328}\x{031a}\x{0333} or with the shorter regex \X (the latter, being generic, matches more than just ǭ̳̚, including but not limited to ą̳̄̚ or o alone); if you want to limit the \X in this example to just match a possibly-modified o (so “o followed by 0 or more modifiers”), use a lookahead before the \X: (?=o)\X, which would match o alone or ǭ̳̚, but not ą̳̄̚. \$ , \( , \) , \* , \+ , \. , \? , \[ , \] , \\ , \| ⇒ Prefixing a special character with \ to “escape” the character allows you to search for a literal character when the regular expression syntax would otherwise have that character have a special meaning as a regex meta-character. The characters $ ( ) * + . ? 
[ ] \ | all have special meaning to the regex engine in normal circumstances; to get them to match as a literal (or to show up as a literal in the substitution), you will have to prefix them with the \ character. There are also other characters which are special only in certain circumstances (any time a character is used with a non-literal meaning throughout the Regular Expression section of this manual); if you want to match one of those sometimes-special characters as a literal character in those situations, those sometimes-special characters will also have to be escaped in those situations by putting a \ before them. Please note: if you escape a normal character, it will sometimes gain a special meaning; this is why so many of the syntax items listed in this section have a \ before them. Match by character code It is possible to match any character using its character code. This allows searching for any character, even if you cannot type it into the Find box, or the Find box doesn’t seem to match your emoji that you want to search for. If you are using an ANSI encoding in your document (that is, using a character set like Windows 1252), you can use any character code with a decimal codepoint from 0 to 255. If you are using Unicode (one of the UTF-8 or UTF-16 encodings), you can actually match any Unicode character. These notations require knowledge of hexadecimal or octal versions of the character code. (You can find such character code information on most web pages about ASCII, or about your selected character set, and about UTF-8 and UTF-16 representations of Unicode characters.) \0ℕℕℕ ⇒ A single byte character whose code in octal is ℕℕℕ, where each ℕ is an octal digit. (That’s the number 0, not the letter o or O.) This notation works for codepoints 0-255 (\0000 - \0377), which covers the full ANSI character set range, or the first 256 Unicode characters. 
For example, \0101 looks for the letter A, as 101 in octal is 65 in decimal, and 65 is the character code for A in ASCII, in most of the character sets, and in Unicode. \xℕℕ ⇒ Specify a single character with code ℕℕ, where each ℕ is a hexadecimal digit. What this stands for depends on the text encoding. This notation works for codepoints 0-255 (\x00 - \xFF), which covers the full ANSI character set range, or the first 256 Unicode characters. For instance, \xE9 may match an é or a θ depending on the character set (also known as the “code page”) in an ANSI encoded document. These next two only work with Unicode encodings (so the various UTF-8 and UTF-16 encodings): \x{ℕℕℕℕ} ⇒ Like \xℕℕ, but matches a full 16-bit Unicode character, which is any codepoint from U+0000 to U+FFFF. \x{ℕℕℕℕ}\x{ℕℕℕℕ} ⇒ For Unicode characters above U+FFFF, in the range U+10000 to U+10FFFF, you need to break the single 5-digit or 6-digit hex value and encode it into two 4-digit hex codes; these two codes are the “surrogate codes” for the character. For example, to search for the 🚂 STEAM LOCOMOTIVE character at U+1F682, you would search for the surrogate codes \x{D83D}\x{DE82}. If you want to know the surrogate codes for a given character, search the internet for “surrogate codes for character” (where character is the fancy Unicode character you need the codes for); the surrogate codes are equivalent to the two-word UTF-16 encoding for those higher characters, so UTF-16 tables will also work for looking this up. Any site or tool that you are likely to be using to find the U+###### for a given Unicode character will probably already give you the surrogate codes or UTF-16 words for the same character; if not, find a tool or site that does. You can also compute surrogate codes yourself from the character code, but only if you are comfortable with hexadecimal and binary. Skip the following bullets if you are prone to mathematics-based PTSD. 
- Start with your Unicode U+######, labelling the hexadecimal digits PPWXYZ.
- The PP digits indicate the plane. Subtract one and convert to the 4 binary bits pppp (so PP=01 becomes 0000, PP=0F becomes 1110, and PP=10 becomes 1111).
- Convert each of the other digits into 4 bits (W as wwww, X as xxvv, Y as yyyy, and Z as zzzz; you will see in a moment why two different characters are used in xxvv).
- Write those 20 bits in sequence: ppppwwwwxxvvyyyyzzzz.
- Group into two equal groups: ppppwwwwxx and vvyyyyzzzz (you can see that the X ⇒ xxvv was split between the two groups, hence the notation).
- Before the first group, insert the binary digits 110110 to get 110110ppppwwwwxx, and split into the nibbles 1101 10pp ppww wwxx. Convert those nibbles to hex: it will give you a value from \x{D800} thru \x{DBFF}; this is the High Surrogate code.
- Before the second group, insert the binary digits 110111 to get 110111vvyyyyzzzz, and split into the nibbles 1101 11vv yyyy zzzz. Convert those nibbles to hex: it will give you a value from \x{DC00} thru \x{DFFF}; this is the Low Surrogate code.
- Combine those into the final \x{ℕℕℕℕ}\x{ℕℕℕℕ} for searching.

For more on this, see the Wikipedia article on Unicode Planes, and the discussion in the Notepad++ Community Forum about how to search for non-ASCII characters. Collating Sequences [[._col_.]] ⇒ The character the col “collating sequence” stands for. For instance, in Spanish, ch is a single letter, though it is written using two characters. That letter would be represented as [[.ch.]]. This trick also works with symbolic names of control characters, like [[.BEL.]] for the character of code 0x07. See also the discussion on character ranges. Control characters \a ⇒ The BEL control character 0x07 (alarm). \b ⇒ The BS control character 0x08 (backspace). This is only allowed inside a character class definition. Otherwise, this means “a word boundary”. \e ⇒ The ESC control character 0x1B. \f ⇒ The FF control character 0x0C (form feed). 
\n ⇒ The LF control character 0x0A (line feed). This is the regular end of line under Unix systems. \r ⇒ The CR control character 0x0D (carriage return). This is part of the DOS/Windows end of line sequence CR-LF, and was the EOL character on Mac 9 and earlier. OSX and later versions use \n. \t ⇒ The TAB control character 0x09 (tab, or hard tab, horizontal tab). \c☒ ⇒ The control character obtained from character ☒ by stripping all but its 5 lowest order bits. For instance, \cA and \ca both stand for the SOH control character 0x01. You can think of this as “\c means ctrl”, so \cA is the character you would get from hitting Ctrl+A in a terminal. (Note that \c☒ will not work if ☒ is outside of the Basic Multilingual Plane (BMP) – that is, it only works if ☒ is in the Unicode character range U+0000 - U+FFFF. The intention of \c☒ is to mnemonically escape the ASCII control characters obtained by typing Ctrl+☒, it is expected that you will use a simple ASCII alphanumeric for the ☒, like \cA or \ca.) Special Control escapes \R ⇒ Any newline sequence. Specifically, the atomic group (?>\r\n|\n|\x0B|\f|\r|\x85|\x{2028}|\x{2029}). Please note, this sequence might match one or two characters, depending on the text. Because its length is variable-width, it cannot be used in lookbehinds. Because it expands to a parentheses-based group with an alternation sequence, it cannot be used inside a character class. If you accidentally attempt to put it in a character class, it will be interpreted like any other literal-character escape (where \☒ is used to make sure that the next character is literal) meaning that the R will be taken as a literal R, without any special meaning. 
For example, if you try [\t\R]: you may be intending to say, “match any single character that’s a tab or a newline”, but what you are actually saying is “match the tab or a literal R”; to get what you probably intended, use [\t\v] for “a tab or any vertical spacing character”, or [\t\r\n] for “a tab or carriage return or newline but not any of the weird verticals”. Ranges or kinds of characters Character Classes [_set_] ⇒ This indicates a set of characters, for example, [abc] means any of the literal characters a, b or c. You can also use ranges by putting a hyphen between characters, for example [a-z] for any character from a to z. You can use a collating sequence in character ranges, like in [[.ch.]-[.ll.]] (these are collating sequences in Spanish). Certain characters require special treatment inside character classes: To use a literal - in a character class: Use it directly as the first or last character in the enclosing class notation, like [-abc] or [abc-]; OR use it “escaped” at any position, like [\-abc] or [a\-bc] . To use a literal ] in a character class: Use it directly right after the opening [ of the class notation, like []abc]; OR use it “escaped” at any position, like [\]abc] or [a\]bc] . To use a literal [ in a character class: Use it directly like any other character, like [ab[c]; “escaping” is not necessary, but is permissible, like [ab\[c] . This character is not special when used alone inside a class; however, there are cases where it is special in combination with another: If used with a colon in the order [: inside a class, it is the opening sequence for a named class (described below); if you want to include both a [ and a : inside the same character class, do not use them unescaped right next to each other; either change the order, like [:[], or escape one or both, like [\[:] or [[\:] or [\[\:] . 
If used with an equals sign in the order [= inside a class, it is the opening sequence for an equivalence class (described below); if you want to include both a [ and a = inside the same character class, do not use them unescaped right next to each other; either change the order, like [=[], or escape one or both, like [\[=] or [[\=] or [\[\=] . To use a literal \ in a character class, it must be doubled (i.e., \\) inside the enclosing class notation, like [ab\\c] . To use a literal ^ in a character class: Use it directly as any character but the first, such as [a^b] or [ab^]; OR use it “escaped” at any position, such as [\^ab] or [a\^b] or [ab\^] . [^_set_] ⇒ The complement of the characters in the set. For example, [^A-Za-z] means any character except an alphabetic character. Care should be taken with a complement list, as regular expressions are always multi-line, and hence [^ABC]* will match until the first A, B or C (or a, b or c if match case is off), including any newline characters. To confine the search to a single line, include the newline characters in the exception list, e.g. [^ABC\r\n]. [[:_name_:]] or [[:☒:]] ⇒ The whole character class named name. For many, there is also a single-letter “short” class name, ☒. Please note: the [:_name_:] and [:☒:] must be inside a character class [...] to have their special meaning. 
| short | full name | description | equivalent character class |
|---|---|---|---|
| | alnum | letters and digits | |
| | alpha | letters | |
| h | blank | spacing which is not a line terminator | [\t\x20\xA0] |
| | cntrl | control characters | [\x00-\x1F\x7F\x81\x8D\x8F\x90\x9D] |
| d | digit | digits | |
| | graph | graphical character, so essentially any character except for control chars, \0x7F, \x80 | |
| l | lower | lowercase letters | |
| | print | printable characters | [\s[:graph:]] |
| | punct | punctuation characters | [!"#$%&'()*+,\-./:;<=>?@\[\\\]^_{\|}~] |
| s | space | whitespace (word or line separator) | [\t\n\x0B\f\r\x20\x85\xA0\x{2028}\x{2029}] |
| u | upper | uppercase letters | |
| | unicode | any character with code point above 255 | [\x{0100}-\x{FFFF}] |
| w | word | word characters | [_\d\l\u] |
| | xdigit | hexadecimal digits | [0-9A-Fa-f] |

Note that letters include any Unicode letters (ASCII letters, accented letters, and letters from a variety of other writing systems); digits include ASCII numeric digits, and anything else in Unicode that’s classified as a digit (like superscript numbers ¹²³…). Note that those character class names may be written in upper or lower case without changing the results. So [[:alnum:]] is the same as [[:ALNUM:]] or the mixed-case [[:AlNuM:]]. As stated earlier, the [:_name_:] and [:☒:] (note the single brackets) must be a part of a surrounding character class. However, you may combine them inside one character class, such as [_[:d:]x[:upper:]=], which is a character class that would match any digit, any uppercase, the lowercase x, and the literal _ and = characters. These named classes won’t always appear with the double brackets, but they will always be inside of a character class. If the [:_name_:] or [:☒:] are accidentally not contained inside a surrounding character class, they will lose their special meaning. 
For example, [:upper:] is the character class matching :, u, p, e, and r; whereas [[:upper:]] is similar to [A-Z] (plus other unicode uppercase letters) [^[:_name_:]] or [^[:☒:]] ⇒ The complement of character class named name or ☒ (matching anything not in that named class). This uses the same long names, short names, and rules as mentioned in the previous description. Character classes may not contain parentheses-based groups of any kind, including the special escape \R (which expands to a parentheses-based group when evaluated, even though \R doesn’t look like it contains parentheses). Character Properties These properties behave similar to named character classes, but cannot be contained inside a character class. \p☒ or \p{_name_} ⇒ Same as [[:☒:]] or [[:_name_:]], where ☒ stands for one of the short names from the table above, and name stands for one of the full names from above. For instance, \pd and \p{digit} both stand for a digit, just like the escape sequence \d does. \P☒ or \P{_name_} ⇒ Same as [^[:☒:]] or [^[:_name_:]] (not belonging to the class name). Character escape sequences \☒ ⇒ Where ☒ is one of d, w, l, u, s, h, v, described below. These single-letter escape sequences are each equivalent to a class from above. The lower-case escape sequence means it matches that class; the upper-case escape sequence means it matches the negative of that class. (Unlike the properties, these can be used both inside or outside of a character class.) 
| Description | Escape Sequence | Positive Class | Negative Escape Sequence | Negative Class |
|---|---|---|---|---|
| digits | \d | [[:digit:]] | \D | [^[:digit:]] |
| word chars | \w | [[:word:]] | \W | [^[:word:]] |
| lowercase | \l | [[:lower:]] | \L | [^[:lower:]] |
| uppercase | \u | [[:upper:]] | \U | [^[:upper:]] |
| word/line separators | \s | [[:space:]] | \S | [^[:space:]] |
| horizontal space | \h | [[:blank:]] | \H | [^[:blank:]] |
| vertical space | \v | see below | \V | |

Vertical space: This encompasses all the [[:space:]] characters that aren’t [[:blank:]] characters: The LF, VT, FF, CR, NEL control characters and the LS and PS format characters: 0x000A (line feed), 0x000B (vertical tabulation), 0x000C (form feed), 0x000D (carriage return), 0x0085 (next line), 0x2028 (line separator) and 0x2029 (paragraph separator). There isn’t a named class which matches. Note: despite its similarity to \v, even though \R matches certain vertical space characters, it is not a character-class-equivalent escape sequence (because it evaluates to a parentheses()-based expression, not a class-based expression). So while \d, \l, \s, \u, \w, \h, and \v are all equivalent to a character class and can be included inside another bracket[]-based character class, the \R is not equivalent to a character class, and cannot be included inside a bracketed[] character-class. Equivalence Classes [[=_char_=]] ⇒ All characters that differ from char by case, accent or similar alteration only. For example [[=a=]] matches any of the characters: A, À, Á, Â, Ã, Ä, Å, a, à, á, â, ã, ä and å. Multiplying operators + ⇒ This matches 1 or more instances of the previous character, as many as it can. For example, Sa+m matches Sam, Saam, Saaam, and so on. [aeiou]+ matches consecutive strings of vowels. * ⇒ This matches 0 or more instances of the previous character, as many as it can. For example, Sa*m matches Sm, Sam, Saam, and so on. ? ⇒ Zero or one of the last character. Thus Sa?m matches Sm and Sam, but not Saam. *? 
⇒ Zero or more of the previous group, but minimally: the shortest matching string, rather than the longest string as with the “greedy” operator. Thus, m.*?o applied to the text margin-bottom: 0; will match margin-bo, whereas m.*o will match margin-botto. +? ⇒ One or more of the previous group, but minimally. {ℕ} ⇒ Matches ℕ copies of the element it applies to (where ℕ is any decimal number). {ℕ,} ⇒ Matches ℕ or more copies of the element it applies to. {ℕ,ℙ} ⇒ Matches ℕ to ℙ copies of the element it applies to, as many as it can (where ℙ ≥ ℕ). {ℕ,}? or {ℕ,ℙ}? ⇒ Like the above, but minimally. *+ or ?+ or ++ or {ℕ,}+ or {ℕ,ℙ}+ ⇒ These so-called “possessive” variants of greedy repeat marks do not backtrack. This allows failures to be reported much earlier, which can boost performance significantly. But they will eliminate matches that would require backtracking to be found. As an example, see how the matching engine handles the following two regexes:

When regex “.*” is run against the text “abc”x:
- `“` matches `“`
- `.*` matches `abc”x`
- `”` doesn't match (end of line) => backtracking
- `.*` matches `abc”`
- `”` doesn't match letter `x` => backtracking
- `.*` matches `abc`
- `”` matches `”` => 1 overall match `“abc”`

When regex “.*+”, with a possessive quantifier, is run against the text “abc”x:
- `“` matches `“`
- `.*+` matches `abc”x` (catches all remaining characters)
- `”` doesn't match (end of line)

Notice there is no match at all in this version, because the possessive quantifier prevents backtracking to a possible solution. Anchors Anchors match a zero-length position in the line, rather than a particular character. ^ ⇒ This matches the start of a line (except when used inside a set, see above). $ ⇒ This matches the end of a line. \< ⇒ This matches the start of a word using Boost’s definition of words. \> ⇒ This matches the end of a word using Boost’s definition of words. \b ⇒ Matches either the start or end of a word. \B ⇒ Not a word boundary. 
It represents any location between two word characters or between two non-word characters. \A or \` ⇒ Matches the start of the file. \z or \' ⇒ Matches the end of the file. \Z ⇒ Matches like \z with an optional sequence of newlines before it. This is equivalent to (?=\v*\z), which departs from the traditional Perl meaning for this escape. \G ⇒ This “Continuation Escape” matches the end of the previous match, or matches the start of the text being matched if no previous match was found. In Find All or Replace All circumstances, this will allow you to anchor your next match at the end of the previous match. If it is the first match of a Find All or Replace All, and any time you use a single Find Next or Replace, the “end of previous match” is defined to be the start of the search area – the beginning of the document, or the current caret position, or the start of the highlighted text. Because of that, if you are using it in an alternation, where you want to say “find any occurrence of something after some prefix, or after a previous match”, you will want to make sure that your prefix includes the start-of-file \A, otherwise the \G portion may accidentally match start-of-file when you don’t want that to occur. Capture Groups and Backreferences (_subset_) ⇒ Numbered Capture Group: Parentheses mark a part of the regular expression, also known as a subset expression or capture group. The string matched by the contents of the parentheses (indicated by subset in this example) can be re-used with a backreference or as part of a replace operation; see Substitutions, below. Groups may be nested. (?<name>_subset_) or (?'name'_subset_) ⇒ Named Capture Group: Names the value matched by subset as the group name. Please note that group names are case-sensitive. \ℕ, \gℕ, \g{ℕ}, \g<ℕ>, \g'ℕ', \kℕ, \k{ℕ}, \k<ℕ> or \k'ℕ' ⇒ Numbered Backreference: These syntaxes match the ℕth capture group earlier in the same expression. 
(Backreferences are used to refer to the capture group contents only in the search/match expression; see the Substitution Escape Sequences for how to refer to capture groups in substitutions/replacements.) A regex can have multiple subgroups, so \2, \3, etc. can be used to match others (numbers advance left to right with the opening parenthesis of the group). You can have as many capture groups as you need, and are not limited to only 9 groups (though some of the syntax variants can only reference groups 1-9; see the notes below, and use the syntaxes that explicitly allow multi-digit ℕ if you have more than 9 groups). Example: ([Cc][Aa][Ss][Ee]).*\1 would match a line such as Case matches Case but not Case doesn't match cASE. \ℕ ⇒ This form can only have ℕ as digits 1-9, so if you have more than 9 capture groups, you will have to use one of the other numbered backreference notations, listed in the next bullet point. Example: the expression \10 matches the contents of the first capture group \1 followed by the literal character 0, not the contents of the 10th group. \gℕ, \g{ℕ}, \g<ℕ>, \g'ℕ', \kℕ, \k{ℕ}, \k<ℕ> or \k'ℕ' ⇒ These forms can handle any non-zero ℕ. For positive ℕ, it matches the ℕth subgroup, even if ℕ has more than one digit. \g10 matches the contents from the 10th capture group, not the contents from the first capture group followed by the literal 0. If you want to match a literal number after the contents of the ℕth capture group, use one of the forms that has braces, brackets, or quotes, like \g{ℕ} or \k'ℕ' or \k<ℕ>: For example, \g{2}3 matches the contents of the second capture group, followed by a literal 3, whereas \g23 would match the contents of the twenty-third capture group. For clarity, it is highly recommended to always use the braces or brackets form for multi-digit ℕ. For negative ℕ, groups are counted backwards relative to the last group, so that \g{-1} is the last matched group, and \g{-2} is the next-to-last matched group. 
Please note the difference between absolute and relative backreferences. For instance, an exact four-letter palindrome can be matched with the regex (?-i)\b(\w)(\w)\g{2}\g{1}\b, when using absolute (positive) coordinates, or with the regex (?-i)\b(\w)(\w)\g{-1}\g{-2}\b, when using relative (negative) coordinates. \g{name}, \g<name>, \g'name', \k{name}, \k<name> or \k'name' ⇒ Named Backreference: The string matching the subexpression named name. (As with the Numbered Backreferences above, these Named Backreferences are used to refer to the capture group contents only in the search/match expression; see the Substitution Escape Sequences for how to refer to capture groups in substitutions/replacements.)
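
The surrogate-code arithmetic described in the "Match by character code" part of this passage can be sketched in a few lines of Python. This is only an illustration of the bit manipulation, not anything Notepad++ ships: the function names `surrogate_pair` and `notepad_regex` are hypothetical, and subtracting 0x10000 is used as the equivalent of "subtract one from the plane digits PP".

```python
def surrogate_pair(codepoint: int) -> tuple[int, int]:
    """Return the (high, low) UTF-16 surrogate codes for a codepoint above U+FFFF."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF are encoded with surrogates")
    offset = codepoint - 0x10000      # the 20 bits ppppwwwwxxvvyyyyzzzz
    high = 0xD800 | (offset >> 10)    # 110110 followed by the top 10 bits
    low = 0xDC00 | (offset & 0x3FF)   # 110111 followed by the bottom 10 bits
    return high, low


def notepad_regex(codepoint: int) -> str:
    """Format the pair in the \\x{HHHH}\\x{LLLL} search notation (hypothetical helper)."""
    high, low = surrogate_pair(codepoint)
    return f"\\x{{{high:04X}}}\\x{{{low:04X}}}"


# U+1F682 STEAM LOCOMOTIVE, the example used in the quoted text
print(notepad_regex(0x1F682))  # \x{D83D}\x{DE82}
```

The printed value agrees with the surrogate codes quoted for the STEAM LOCOMOTIVE character, which is a quick way to check the sketch against the manual's own example.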

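A few of the examples quoted in this passage (the forward `t\w+` count, greedy versus minimal repetition, and a numbered backreference) can be tried outside the editor. The sketch below uses Python's `re` module, which is not the Boost engine that Notepad++ uses; it is limited to constructs that are assumed to behave the same way in both engines, and it should not be read as a statement about Boost itself.

```python
import re

# Forward search: the quoted text says t\w+ finds 5 results here.
text = "to the test they travelled"
forward_hits = re.findall(r"t\w+", text)

# Greedy versus minimal repetition on the sample from the quoted text.
css = "margin-bottom: 0;"
greedy = re.search(r"m.*o", css).group()   # longest candidate
lazy = re.search(r"m.*?o", css).group()    # shortest candidate

# Numbered backreference: \1 must repeat exactly what group 1 captured.
backref = re.compile(r"([Cc][Aa][Ss][Ee]).*\1")
same_case = backref.search("Case matches Case") is not None
other_case = backref.search("Case doesn't match cASE") is not None

print(len(forward_hits), greedy, lazy, same_case, other_case)
```

If a pattern behaves differently in Notepad++ than in this sketch, the editor's result is the one that matters, as the quoted text itself warns about non-Boost engines.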
      regular expression

    1. Briefing Note on Hazing (Bizutage): Definition, Risks and Actions

      Summary

      Hazing (bizutage) is a serious criminal offence, not a mere student tradition, and is defined by article 225-16-1 of the French Code pénal.

      It consists of leading a person, whether or not they consent, to undergo or commit humiliating or degrading acts, often accompanied by excessive alcohol consumption.

      The phenomenon mainly affects higher education and boarding schools, and is generally orchestrated by students in later years (second or third year) against newcomers.

      The consequences of hazing run deep and can be psychological (lasting trauma, depression), physical (injuries, lifelong disabilities) and sometimes fatal.

      The acts range from forced ingestion of substances to simulated sexual acts, insults and the circulation of degrading images on social media.

      Group dynamics and social pressure make refusal extremely difficult for victims, which invalidates any notion of consent.

      Parents have a crucial role to play in prevention, by spotting warning signs before integration weekends (inappropriate questionnaires, requests to bring alcohol, liability waivers) and by keeping up a dialogue with their children.

      When hazing has taken place, it is imperative to support the victim without judging them, to gather evidence (medical certificates, witness statements, photos) and to contact the institution's management, which is legally obliged to refer the matter to the public prosecutor.

      The Comité National Contre le Bizutage (CNCB) is an essential resource for listening, advice and mediation.

      --------------------------------------------------------------------------------

      1. Legal Definition and Characteristics of Hazing

      Hazing is not a harmless practice but an offence formally prohibited and punished under French law. Understanding it requires analysing its legal definition and how it differs from other phenomena such as harassment.

      1.1. The Legal Framework: Article 225-16-1 of the Code pénal

      Hazing is defined as the act, by a person, of "leading others, against their will or not, to undergo or commit humiliating or degrading acts or to consume alcohol excessively" in the context of events or gatherings linked to school, sports and socio-educational settings.

      Penalties: This offence is punishable by six months' imprisonment and a fine of 7,500 euros.

      Origin of the law: The Comité National Contre le Bizutage (CNCB) took part in drafting this law in 1998.

      1.2. Fundamental Concepts

      Two key notions in the law deserve particular attention:

      "Humiliating or degrading acts": The perception of humiliation is subjective. An act can be experienced as deeply degrading by one person and not by another.

      There is no scale for measuring humiliation. An act is considered humiliating as soon as it makes the person uncomfortable or undermines their dignity.

      "Against their will or not": This is the most crucial element. The notion of consent does not exist in hazing.

      A young person who takes part in the challenges, even while appearing to enjoy themselves, is not considered to be consenting in the eyes of the law. Group pressure, the desire to fit in and alcohol consumption wipe out free will.

      1.3. Distinction from Harassment

      It is essential not to confuse hazing with harassment:

      Harassment: Targets a single person (or a small group) for specific reasons (physical appearance, origin, etc.). It involves one or more harassers against a targeted victim.

      Hazing: Targets an entire group, the "newcomers", and is carried out by another group, the "seniors".

      The one and only reason for hazing is the status of new arrival. The stated aim, however perverted, is a rite of passage to "integrate" into the year group.

      2. Forms and Context of Hazing

      Hazing follows recurring patterns, involving specific actors in environments conducive to abuse of power.

      2.1. Actors and Places Concerned

      The hazers: Generally second- or third-year students, often organised by the Bureau des Élèves (BDE).

      Their motivations vary: taking revenge for hazing they themselves suffered, or a feeling of omnipotence and perversity.

      The hazed: The newcomers (first-year students).

      Places: The phenomenon affects every type of higher-education institution (universities, business schools, medicine, architecture, BTS), sports centres (CREPS), and is particularly prevalent in boarding schools, which are closed settings conducive to abuse.

      2.2. Forms and Examples of Hazing Acts

      The practices vary but often follow an escalation, a "spiral" that starts out in a supposedly "fun" way before degenerating.

      | Category of acts | Concrete examples from testimonies |
      |---|---|
      | Physical humiliation | - Being covered in a "sticky, stinking" mixture (eggs, flour, rabbit litter, fish soup).<br>- Being tied to others, sometimes in degrading positions.<br>- Going through a pipe filled with oil or a tub of soda. |
      | Forced consumption | - Being made to drink large quantities of alcohol (vodka is very common).<br>- Swallowing disgusting food or drinks. |
      | Sexual abuse | - Forcing a girl to simulate fellatio or perform a striptease.<br>- Singing obscene songs.<br>- Sexist insults aimed at girls and homophobic insults aimed at boys. |
      | Cyber-violence | - Undressing the hazed students, filming or photographing them.<br>- Posting the images on social media. |
      | Threats | - Threatening those who refuse to take part, calling them "useless" or "no fun". |

      2.3. The Psychology of the Hazer

      The main justification put forward by hazers is to "bond the cohort" and create ties.

      In reality, the underlying logic is a dominant-dominated relationship.

      One testimony from a former hazer is particularly revealing:

      "What I take away from hazing is an intoxicating feeling of power. It was while shouting 'drink and shut your mouth' at a first-year [...] that I understood the pleasure of being a tyrant for a day. I loved making first-years submit."

      3. Serious Consequences and Human Harm

      The CNCB's formula sums up the impact of hazing: "It sometimes kills, it often traumatizes, and it always humiliates."

      3.1. Psychological Consequences

      Long-term trauma: Victims contact the CNCB 5, 10, even 30 years after the events, never having managed to forget.

      Depression and dropping out: Many young people develop depression and abandon their studies so as not to cross paths with their "tormentors" in the corridors.

      Shame and guilt: Victims feel deep shame at having gone along with it and at not having been able to say no, which often prevents them from speaking out.

      3.2. Physical and Fatal Consequences

      Hazing can cause serious injury, even death.

      Serious injuries: One young person was blind for three weeks after being bathed in toxic liquids; another is disabled for life after a three-story fall during a hazing in Lille in 2012.

      Deaths: Several deaths directly linked to hazing have been recorded.

      | Year | Place | Context |
      |---|---|---|
      | 2012 | Saint-Cyr | Drowning |
      | 2013 | École des Mines | Death |
      | 2017 | Fac de Nanterre / Dentaire de Rennes | Death |
      | 2021 | Lille | Death of Simon Monray |

      Crucial prevention message: Never leave a heavily intoxicated young person alone. Call the emergency services (fire brigade, SAMU) and stay with them. A young person died in Rennes of an alcohol-induced coma after being left alone to "sleep it off".

      4. Role of Parents and Strategies for Action

      Parents are on the front line in preventing hazing and in acting if it occurs.

      4.1. Prevention Beforehand (Before an Integration Weekend)

      Talk: Use the request for money for the weekend as an opportunity to raise the subject of hazing, without frightening but while forewarning.

      Analyze the invitation and the packing list: Certain signs should raise the alarm.

      ◦ A "strange" questionnaire with intimate questions or questions about alcohol.

      ◦ A request to bring clothes "that can be ruined".

      ◦ A request to bring alcohol.

      ◦ A request to sign a liability waiver, which has no legal value.

      Demand clear information: Parents must know the location and the precise program of the weekend. A location kept secret is a major warning sign.

      Ensure communication: The young person must always keep their mobile phone.

      Confiscating phones is meant to cut victims off from the outside world and should trigger an immediate alert to the institution's management.

      Advise refusal: Encourage the young person to say no if they feel uncomfortable and to band together with others who share their reluctance.

      4.2. Reaction After a Hazing

      Identify the signs of distress:

      ◦ Refusal to talk about the weekend, unease.

      ◦ Change in behavior: withdrawal, anxiety, disturbed sleep.

      ◦ Wish to leave the institution.

      ◦ Physical marks or injuries.

      Listen and support:

      ◦ Stay calm, do not panic.

      ◦ Listen without passing judgment on the young person's inability to say no.

      ◦ Never minimize what they went through.

      ◦ Relieve the victim of guilt: the only guilty parties are the hazers.

      Gather evidence:

      ◦ Obtain medical certificates (physical and psychological).

      ◦ Keep all evidence: messages, photos, names of the organizers and of other victims, dates, places.

      4.3. Institutional and Legal Steps

      Contact the institution: Inform the head of the institution, the student life services, or the hazing officer (référent bizutage).

      The CNCB can act as a mediator while guaranteeing anonymity.

      Obligation of the institution: The head of the institution is legally obliged to refer the matter to the public prosecutor (procureur de la République) upon learning of criminal offenses.

      They must also initiate disciplinary proceedings.

      File a complaint: It is advisable to consult a lawyer before initiating legal proceedings.

      The justice system is often very slow, and many complaints are closed without further action.

      The process can be long (the 2012 case concluded in 2025) and costly.

      Purpose of sanctions: Sanctions must be swift and exemplary to deter future attempts, since hazers are generally not repeat offenders.

      5. The Comité National Contre le Bizutage (CNCB, National Committee Against Hazing)

      The CNCB is a central actor in the fight against this phenomenon in France.

      Composition: It brings together direct members and legal entities such as parents' federations, teachers' unions, the Conférence des présidents d'universités, and the Conférence des grandes écoles.

      Missions:

      1. Collect testimonies, listen to victims, and advise them.

      2. Challenge the heads of institutions and the ministries.

      3. Intervene in institutions to prevent and eradicate hazing.

      4. Work with young people on respectful and caring ways of welcoming newcomers.

      Partnerships: The CNCB works in close partnership with the Ministry of Higher Education and the Ministry of Sports, which fund it.

      By contrast, collaboration with the Ministry of National Education is described as nonexistent ("it's a wall").

      Resources: The CNCB website provides many tools (slide decks, brochures) to enable other actors (parents, teachers) to run awareness-raising activities.

      6. Key Quotes

      On the nature of hazing: "Hazing sometimes kills, it often traumatizes, and it always humiliates."

      On consent: "The hazers tell us, 'But ma'am, we didn't force anyone. Everyone agreed.' [...] And that is when we really have to set the record straight and think it through with them, because it is false. The newcomer has no choice."

      On the hazer's motivation: "What I take away from hazing is an intoxicating feeling of power. [...] I understood the pleasure of being a tyrant for a day. I loved making first-years submit."

      A victim's testimony: "I felt dirty in every sense of the word. You're forced to, because they basically tell you that if you don't do it, you're no fun. You're losers."

      On the importance of reporting: "Reporting a hazing means reporting a criminal offense, and reporting a criminal offense is every citizen's duty. And if the facts are not reported, well, it's quite simple: they will happen again the following year."

    1. Creating outlines can help clarify your purpose before you begin writing.

      It may feel like an extra step along the way, but the final draft process is way easier when you prep for it.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to Reviewer’s Comments

      We thank all three reviewers for their thoughtful and detailed comments, which will help us to improve the quality and clarity of our manuscript.


      __Reviewer #1 (Evidence, reproducibility and clarity (Required)): __ Summary: In this work, Tripathi et al address the open question of how the Fat/Ds pathway affects organ shape, using the Drosophila wing as a model. The Fat/Ds pathway is a conserved but complex pathway, interacting with Hippo signalling to affect growth and providing planar cell polarity that can influence cellular dynamics during morphogenesis. Here, authors use genetic perturbations combined with quantification of larval, pupal, and adult wing shape and laser ablation to conclude that the Ft/Ds pathway affects wing shape only during larval stages in a way that is at least partially independent of its interaction with Hippo and rather due to an effect on tissue tension and myosin II distribution. Overall the work is clearly written and well presented. I only have a couple major comments on the limitations of the work.

      Major comments: 1. Authors conclude from data in Figures 1 and 2 that the Fat/Ds pathway only affects wing shape during larval stages. When looking at the pupal wing shape analysis in Figure 2L, however, it looks like there is a difference in wt over time (6h-18h, consistent with literature), but that difference in time goes away in RNAi-ds, indicating that actually there is a role for Ds in changing shape during pupal stages, although the phenotype is clearly less dramatic than that of larval stages. No statistical test was done over time (within the genotype), however, so it's hard to say. I recommend the authors test over time - whether 6h and 18h are different in wild type and in ds mutant. I think this is especially important because there is proximal overgrowth in the Fat/Ds mutants, much of which is contained in the folds during larval stages. That first fold, however, becomes the proximal part of the pupal wing after eversion and contracts during pupal stages to elongate the blade (Aigouy 2010, Etournay 2015). Also, according to Trinidad Curr Biol 2025, there is a role for Fat/Ds pathway in pupal stages. All of that to say that it seems likely that there would be a phenotype in pupal stages. It's true it doesn't show up in the adult wing in the experiments in Fig 1, but looking at the pupal wing itself is more direct - perhaps the very proximal effect is less prominent later, as there is potential for further development after 18hr before adulthood and the most proximal parts are likely anyway excluded in the analysis.

      Response: Our main purpose in examining pupal wing shape was to emphasize that wings lacking ds are visibly abnormal even at early pupal stages. The reviewer makes the point that the change in shape from 6h to 18h APF is greater in control wings than in RNAi-ds wings. We have added quantitation of this to the revised manuscript as suggested. This difference could be interpreted as indicating that Ds-Fat signaling actively contributes to wing shape during pupal morphogenesis. However, given the genetic evidence that Ds-Fat signaling influences wing shape only during larval growth, we favor the interpretation that it reflects consequences of Ds-Fat action during larval stages – eg, overgrowth of the wing, particularly the proximal wing and hinge as occurs in ds and fat mutants, could result in relatively less elongation during the pupal hinge contraction phase. This wouldn’t change our key conclusions, but it is something that we discuss in a revised manuscript.

      I think there needs to be a mention and some discussion of the fact that the wing is not really flat. While it starts out very flat at 72h, by 96h and beyond, there is considerable curvature in the pouch that may affect measurements of different axes and cell shape. It is not actually specified in the methods, so I assume the measurements were taken using a 2D projection. It is not clear whether the curvature of the pouch was taken into account, either for cell shape measurements presented in Fig 4 or for the wing pouch dimensional analysis shown in Fig 3, 6, and supplements. Do perturbations in Ft/Ds affect this curvature? Are they more or less curved in one or both axes? Such a change could affect the results and conclusions. The extent to which the fat/ds mutants fold properly is another important consideration that is not mentioned. For example, maybe the folds are deeper and contain more material in the ds/fat mutants, and that's why the pouch is a different shape? At the very least, this point about the 3D nature of the wing disc must be raised in discussion of the limitations of the study. For the cell shape analysis, you can do a correction based on the local curvature (calculated from the height map from the projection). For the measurement of A/P, D/V axes of the wing pouch, best would be to measure the geodesic distance in 3D, but this is not reasonable to suggest at this point. One can still try to estimate the pouch height/curvature, however, both in wild type and in fat/ds mutants.
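      To illustrate the point about curvature, the following is a minimal sketch, not the authors' or the reviewer's method, of how a length measured along a curved surface compares with its 2D projection. It assumes a hypothetical height profile sampled along one axis of the pouch (for example, read off a height map); the function names and the example values are illustrative only.

```python
import numpy as np

def geodesic_length(x_um, z_um):
    """Arc length of a profile sampled at positions x_um with heights z_um.

    Sums the straight-line distance between consecutive samples, so it
    approximates the length along the curved surface.
    """
    x = np.asarray(x_um, dtype=float)
    z = np.asarray(z_um, dtype=float)
    return float(np.sum(np.hypot(np.diff(x), np.diff(z))))

def projected_length(x_um):
    """Length of the same profile as seen in a 2D projection."""
    x = np.asarray(x_um, dtype=float)
    return float(x[-1] - x[0])

# Hypothetical profile (micrometres), not data from the manuscript.
x = np.linspace(0.0, 100.0, 201)
z = 20.0 * np.sin(np.pi * x / 100.0)  # a dome 20 um high at its centre
ratio = geodesic_length(x, z) / projected_length(x)  # > 1 when the surface is curved
```

      A ratio close to 1 would suggest the 2D projection is a fair approximation along that axis; comparing the ratio between genotypes would indicate whether curvature differs between them.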

      Response: The wing pouch measurements were done on 2D projections of wing discs that were already slightly flattened by coverslips, so there is not much curvature outside of the folds. We will revise the methods to make sure this is clear. While we recognize that the absolute values measured can be affected by this, our conclusions are based on the qualitative differences in proportions between genotypes and time points, and we wouldn’t expect these to differ significantly even if 3D distances were measured. Obtaining accurate 3D measures is technically more challenging - it requires having spacers matching the thickness of the wing disc, which varies at different time points and genotypes, and then measuring distances across curved surfaces. To address this, we propose to do a limited set of 3D measures on wild-type and ds mutant wing discs at early and late stages. We expect these will confirm that the conclusions of our analysis are unaffected, while at the same time providing an indication of how much curvature affects the values obtained. We will also make sure the issue of wing disc curvature and folds is discussed in the text.

      Minor comments: 1. The analysis of the laser ablation is not really standard - usually one looks at recoil velocity or a more complicated analysis of the equilibrium shape using a model (e.g. Shivakumar and Lenne 2016, Piscitello-Gomez 2023, Dye et al 2021). One may be able to extract more information from these experiments - nevertheless, I doubt the conclusions would change, given that there seems to be a pretty clear difference between wt and ds (OPTIONAL).

      Response: We will add measurements of recoil velocities to complement our current analysis of circular cuts.

      Figure 7G: I think you also need a statistical test between RNAi-ds and UAS-rokCA+RNAi-ds.

      Response: We include this statistical test in the revised manuscript (it shows that they are significantly different).

      In the discussion, there is a statement: "However, as mutation or knock down of core PCP components, including pk or sple, does not affect wing shape... 59." Reference 59 is quite old and as far as I can tell shows neither images nor quantifications of the wing shape phenotype (not sure it uses "knockdown" either - unless you mean hypomorph?). A more recent publication Piscitello-Gomez et al Elife 2023 shows a very subtle but significant wing shape phenotype in core PCP mutants. It doesn't change your logic, but I would change the statement to be more accurate by saying "mutation of core PCP components has only subtle changes in adult wing shape"

      Response: Thank-you for pointing this out, we have revised the manuscript accordingly.

      **Referee cross-commenting**

      Reviewer2: Reviewer 2 makes the statement: "The distance along the AP boundary from the pouch border to DV midline is topologically comparable to the PD length of the adult wing. The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing."

      I disagree - the DV boundary wraps around the entire margin of the adult wing (as correctly drawn with the pink line in Fig 2A). It is not the same as the wide axis of the adult wing (perpendicular to the AP boundary). It is not trivial to map the proximal-distal axis of the larval wing to the proximal-distal axis of the adult, due to the changes in shape that occur during eversion. Thus, I find it much easier to look at the exact measurement that the authors make, and it is much more standard in the field, rather than what the reviewer suggests. Alternatively, one could I guess measure in the adult the ratio of the DV margin length (almost the circumference of the blade?) to the AP boundary length. That may be a more direct comparison. Actually the authors leave out the term "boundary" - what they call AP is actually the AP boundary, not the AP axis, and likewise for the DV - what they measure is DV boundary, but I only noticed that in the second read-through now. Just another note, these measurements of the pouch really only correspond to the very distal part of the wing blade, as so much of the proximal blade comes from the folds in the wing disc. Therefore, a measurement of only distal wing shape would be more comparable.

      Response: We thank Reviewer 1 for their comments here. In terms of the region measured, we measure to the inner Wg ring in the disc, the location of this ring in the adult is actually more proximal than described above (eg see Fig 1B of Liu, X., Grammont, M. & Irvine, K. D. Roles for scalloped and vestigial in regulating cell affinity and interactions between the wing blade and the wing hinge. Developmental Biology 228, 287–303 (2000)), and this defines roughly the region we have measured in adult wings (with the caveat noted above that the measurements in the disc can be affected by curvature and the hinge/pouch fold, which we will address).

      Reviewer 2 states that authors cannot definitively conclude anything about mechanical tension from their reported cutting data because the authors have not looked at initial recoil velocity. I strongly disagree. __The wing disc tissue is elastic on much longer timescales than what's considered after laser ablation (even hours), and the shape of the tissue after it equilibrates from a circular cut (1-2min) can indeed be used to infer tissue stresses (see Dye et al Elife 2021, Piscitello-Gomez et al eLife 2023, Tahaei et al arXiv 2024).__ In the wing disc, the direction of stresses inferred from initial recoil velocity is correlated with the direction of stresses inferred from analysing the equilibrium shape after a circular cut. Rearrangements, a primary mechanism of fluidization in epithelia, do not occur within 1'. Analysing the equilibrium shape after circular ablation may be more accurate for assessing tissue stresses than initial recoil velocity - in Piscitello-Gomez et al 2023, the authors found that a prickle mutation (PCP pathway) affected initial recoil velocity but not tissue stresses in the pupal wing. Such equilibrium circular cuts have also been used to analyze stresses in the avian embryo, where it correlates with directions of stress gathered from force inference methods (Kong et al Scientific Reports 2019). The Tribolium example noted by the reviewer is on the timescale of tens to hundreds of minutes - much longer than the timescale of laser ablation retraction. It is true the analysis of the ablation presented in this paper is not at the same level as those other cited papers and could be improved. But I don't think the analysis would be improved by additional experiments doing timelapse of initial retraction velocity.

      Response: Thank-you, we agree with Reviewer 1 here.

      Reviewer 2 states "If cell anisotropy is caused by polarized myosin activity, that activity is typically polarized along the short edges not long edges". Not true in this case. Myosin II accumulates along long boundaries (Legoff and Lecuit 2013). "Therefore, interpreting what causes the cell anisotropy and how DS regulates it is difficult," Agreed - but this is well beyond the scope of this manuscript. The authors clearly show that there is a change of cell shape, at least in these two regions. Better would be to quantify it throughout the pouch and across multiple discs. Similar point for myosin quantifications - yes, polarity would be interesting and possible to look at in these data, and it would be better to do so on multiple discs, but the lack of overall myosin on the junctions shown here is not nothing. Interpreting what Ft/Ds does to influence tension and myosin and eventually tissue shape is a big question that's not answered here. I think the authors do not claim to fully understand this though, and maybe further toning down the language of the conclusions could help.

      Response: We agree with Reviewer 1 here and will also add quantitation of myosin across multiple discs and will include higher magnification myosin images and polarity tests.

      Reviewer 3: I agree with many of the points raised by Reviewer 3, in particular that relevant for Fig 1. The additional experiments looking at myosin II localization and laser ablation in the other perturbations (Hippo and Rok mutants/RNAi) would certainly strengthen the conclusions.

      Response: Reviewer 3 comment on Fig 1 requests Ab stains to assess recovery of expression after downshift, which we will do.

      We will add examination of myosin localization in hpo RNAi wing discs, and in the ds/rok combinations. We note that the effects of Rok manipulations on myosin and on recoil velocity have been described previously (eg Rauskolb et al 2014).

      Reviewer #1 (Significance (Required)): I think the work provides a clear conceptual advance, arguing that the Ft/Ds pathway can influence mechanical stress independently of its interaction with Hippo and growth. Such a finding, if conserved, could be quite important for those studying morphogenesis and Fat function in this and other organisms. For this point, the genetic approach is a clear strength. Previous work in the Drosophila wing has already shown an adult wing phenotype for Ft/Ds mutations that was attributed to its role in the larval growth phase, as marked clones show aberrant growth in mutants. The novelty of this work is the dissection of the temporal progression of this phenotype and how it relates to Hippo and myosin II activation. It remains unclear exactly how Ft/Ds may affect tissue tension, except that it involves a downregulation of myosin II - the mechanism of that is not addressed here and would involve considerable more work. I think the temporal analysis of the wing pouch shape was quite revealing, providing novel information about how the phenotype evolves in time, in particular that there is already a phenotype quite early in development. As mentioned above, however, the lack of consideration of the wing disc as a 3D object is a potential limitation. While the audience is likely mostly developmental biologists working in basic research, it may also interest those studying the pathway in other contexts, including in vertebrates given its conservation and role in other processes.

      __Reviewer #2 (Evidence, reproducibility and clarity (Required)): __ The manuscript begins with very nice data from a ts sensitive period experiment. Instead of a ts mutation, the authors induced RNAi in a temperature dependent manner. The results are striking and strong. Knockdown of FT or DS during larval stages to late L3 changed shape while knockdown of FT or DS during later pupal stages did not. This indicates they are required during larval, not pupal stages of wing development for this shape effect. They did shift-up or shift-down at "early pupa stage" but precisely what stage that means was not described anywhere in the manuscript. White prepupal? Time? Likewise a shift-down was done at "late L3" but that meaning is also vague. Moreover, I was surprised to see they did not do a shift-up at the late L3 stage, to give completeness to the experiment. Why?

      Response: We have added more precise descriptions of the timing, and we will also add the requested late L3 shift-up experiment.

      Looking at the "shape" of the larval wing pouch they see a difference in the mutants. The pouch can be approximated as an ellipse, but with differing topology to the adult wing. Here, they muddled the analysis. The adult wing surface is analogous to one hemisphere of the larval wing pouch, i.e., either dorsal or ventral compartment. The distance along the AP boundary from the pouch border to DV midline is topologically comparable to the PD length of the adult wing. The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing. They confusingly call this latter metric the "DV length" and the former metric the "AP length", and in fact they do not measure the PD length but PD+DP length. Confusing. Please change to make this consistent with earlier analysis of the adult and invert the reported ratio and divide by two.

      Then you would find the larval PD/AP ratio is smaller in the FT and DS mutants than wildtype, which resembles the smaller PD/AP ratio seen in the mutant adult wings. Totally consistent and also provides further evidence with the ts experiments that FT and DS exert shape effects in the larval phase of life.

      Response: As noted by Reviewer 1 in cross-referencing, some of the statements made by Reviewer 2 here are incorrect, eg “The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing.” They are correct where they note that the A-P length we measure in the discs is actually equivalent to 2x the adult wing length, since we are measuring along both the dorsal and ventral wing, but this makes no difference to the analysis as the point is to compare shape between time points and genotypes, not to make inferences based on the absolute numbers obtained. The numerical manipulations suggested are entirely feasible but we think they are unnecessary.

      The remainder of the manuscript has experimental results that are more problematic, and really the authors do not figure out how the shape effect in larval stages is altered. I outline below the main problems.

      1. They compare the FT DS shape phenotypes to those of mutants or knockdowns in Hippo pathway genes (Hippo is known to be downstream of FT and DS). They find these Hippo perturbations do have shape effects trending in same direction as FT and DS effects. Knockdown reduces the PD/AP ratio while overexpressing WARTS increases the PD/AP ratio. The effect magnitudes are not as strong, but then again, they are using hypomorphic alleles and RNAi, which often induces partial or hypomorphic phenotypes. The effect strength is comparable when wing pouches are young but then dissipates over time, while FT and DS effects do not dissipate over time. The complexity of the data do not negate the idea that Hippo signaling is also playing some role and could be downstream of FT and DS in all of this. But the authors really downplay the data to the point of stating "These results imply that Ds-Fat influences wing pouch shape during wing disc growth separately from its effects on Hippo signaling." I think a more expansive perspective is needed given the caveats of the experiments.

      Response: Our results emphasize that the effects of Ds-Fat on wing shape cannot be explained solely by effects on Hippo signaling, eg as we stated on page 7 “These observations suggest that Hippo signaling contributes to, but does not fully explain, the influence of ds or fat on adult wing shape.” We also note that impairment of Hippo signaling has similar effects in younger discs, but very different effects in older discs, which clearly indicates that they are having very different effects during disc growth; we will revise the text to make sure our conclusions are clear.

      The reviewer wonders whether some of the differences could be due to the nature of the alleles or gene knockdown. First, the *ex*, *ds*, and *fat* alleles that we use are null alleles (eg see FlyBase), so it is not correct to say that we use only hypomorphic alleles and RNAi. We do use a hypomorphic allele for wts, and RNAi for hpo, for the simple reason that null alleles in these genes are lethal, so adult wings could not be examined. A further issue that is not commented on by the reviewer, but is more relevant here, is that there are multiple inputs into Hippo signaling, so of course even a null allele for ex, ds or fat is not a complete shutdown of Hippo signaling. Nonetheless, one can estimate the relative impairment of Hippo signaling by measuring the increased size of the wings, and from this perspective the knockdown conditions that we use are associated with roughly comparable levels of Hippo pathway impairment, so we stand by our results. We do however, recognize that these issues could be discussed more clearly in the text, and will do so in a revised manuscript.

      Puzzlingly, this lack of taking seriously a set of complex results does not transfer to another set of experiments in which they inhibit or activate ROK, the rho kinase. When ROK is perturbed, they also see weak effects on shape when compared to FT or DS perturbation. This weakness is seen in adults, larvae, clones and in epistasis experiments. The epistasis experiment in particular convincingly shows that constitutive ROK activation is not epistatic to loss of DS; in fact if anything the DS phenotype suppresses the ROK phenotype. These results also show that one cannot simply explain what FT and DS are doing with some single pathway or effector molecule like ROK. It is more complex than that.

      What I really think was needed were experiments combining FT and DS knockdown with other mutants or knockdowns in the Hippo and Rho pathways, and even combining Hippo and Rho pathway mutants with FT or DS intact, to see if there are genetic interactions (additive, synergistic, epistatic) that could untangle the phenotypic complexity.

      Response: We’re puzzled by these comments. First, we never claimed that what Fat or Ds do could be explained simply by manipulation of Rok (eg, see Discussion). Moreover, examination of wings and wing discs where ds is combined with Rho manipulations is in Fig 7, and Hippo and Rho pathway manipulation combinations are in Fig S5. We don’t think that combining ds or fat mutations with other Hippo pathway mutations would be informative, as it is well established that Ds-Fat are upstream regulators of Hippo signaling.

      Laser cutting experiments were done to see if there is anisotropy in tissue tension within the wing pouch. This was to test a favored idea that FT and DS activity generates anisotropy in tissue tension, thereby controlling overall anisotropic shape of the pouch. However there is a fundamental flaw to their laser cutting analysis. Laser cutting is a technique used to measure mechanical tension, with initial recoil velocity directly proportional to the tissue's tension. By cutting a small line and observing how quickly the edges of the cut snap apart, people can quantify the initial recoil velocity and infer the stored mechanical stress in the tissue at the time of ablation. Live imaging with high-speed microscopy is required to capture the immediate response of the tissue to the cut since initial recoil velocity occurs in the first few seconds. A kymograph is created by plotting the movement of the tissue edges over this time scale, perpendicular to the cut. The initial recoil velocity is the slope of the kymograph at time zero, representing how fast the severed edges move apart. A higher recoil velocity indicates higher mechanical tension in the tissue. However, the authors did not measure this initial recoil velocity but instead measured the distance between the severed edges at one time point: 60 seconds after cutting. This is much later than the time point at which the recoil usually begins to dissipate or decay. This decay phase typically lasts a minute or two, during which time the edges continue to separate but at a progressively slower rate. This time-dependent decay of the recoil reveals whether the tissue behaves more like a viscous fluid or an elastic solid. Therefore, the distance metric at 60 seconds is a measurement of both tension and the material properties of the cells. One cannot know then whether a difference in the distance is due to a difference in tension or fluidity of the cells. 
If the authors made measurements of edge separation at several time points in the first 10 seconds after ablation, they could deconvolute the two. Otherwise their analysis is inconclusive. Anisotropy in recoil could be caused by greater tissue fluidity along one axis. A gradient of cell fluidity along one axis of a tissue has been observed in the amnioserosa of Tribolium, for example. (Related and important point: was the anisotropy of recoil oriented along the PD axis, the AP axis, or neither? This key point was never stated.)
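The distinction drawn above between initial recoil velocity and a single late separation measurement can be illustrated with a minimal sketch. It assumes a simple exponential relaxation of edge separation toward a plateau, a common simplification and not a model taken from the manuscript. The function names, plateau value, and time constants are all hypothetical. The sketch only shows the arithmetic of the argument; it does not address whether such a model applies to the wing disc on these timescales.

```python
import numpy as np

def edge_separation(t, d_inf, tau):
    """Assumed relaxation model: separation rises toward d_inf with time constant tau."""
    return d_inf * (1.0 - np.exp(-t / tau))

def initial_recoil_velocity(t, d, n_points=5):
    """Slope of a straight-line fit through the first n_points samples,
    approximating the kymograph slope near time zero."""
    slope, _intercept = np.polyfit(t[:n_points], d[:n_points], 1)
    return slope

# Two hypothetical tissues: same plateau separation, different time constants.
t = np.arange(0.0, 10.5, 0.5)                    # seconds after ablation
fast = edge_separation(t, d_inf=4.0, tau=5.0)    # illustrative values only
slow = edge_separation(t, d_inf=4.0, tau=20.0)

v_fast = initial_recoil_velocity(t, fast)
v_slow = initial_recoil_velocity(t, slow)

# Separation at a single late time point (60 s) for the same two tissues.
d60_fast = edge_separation(60.0, d_inf=4.0, tau=5.0)
d60_slow = edge_separation(60.0, d_inf=4.0, tau=20.0)

print(f"initial velocity: fast={v_fast:.3f}, slow={v_slow:.3f}")
print(f"separation at 60 s: fast={d60_fast:.3f}, slow={d60_slow:.3f}")
```

Under these assumed parameters the early slopes differ severalfold while the 60-second separations are nearly equal, which is why sampling several early time points carries information that one late time point does not.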

      The authors cannot definitively conclude anything about mechanical tension from their reported cutting data.

      Response: As noted by Reviewer 1 in cross-commenting, there is no fluidity on a time scale of 1 minute in the wing disc, and circular ablations are an established method to investigate tissue stress. We chose the circular ablation method in part because it interrogates stress over a larger area, whereas cutting individual junctions is subject to more variability, particularly as the orientation of the junction (eg radial vs tangential) impacts the tension detected in the wing disc. Nonetheless, we will add recoil measurements to the revised manuscript to complement our circular ablations, which we expect will provide independent confirmation of our results and address the Reviewer’s concern here.

      They measured the eccentricity of wing pouch cells near the pouch border, and found they were highly anisotropic compared to DS mutant cells at comparable locations. Cells were elongated, but again, which axis (PD or AP), if either, they were elongated along was never stated. If cell anisotropy is caused by polarized myosin activity, that activity is typically polarized along the short edges, not the long edges. Thus, recoil velocity after laser cutting would be stronger along the axis aligned with short cell edges. It looks like the cutting anisotropy they see is greater along the axis aligned with long cell edges. Of course, if the cell anisotropy is caused by a pulling force exerted by the pouch boundary, then it would stretch the cells. This would in fact fit their cutting data. But then again, the observed cell anisotropy could also be caused by variation in the fluid-solid properties of the wing cells as discussed earlier. Compression of the cells then would deform them anisotropically and produce the anisotropic shapes that were observed. Therefore, interpreting what causes the cell anisotropy and how DS regulates it is difficult.

      Response: As noted by Reviewer 1 in cross-commenting, it is well established that tension and myosin are higher along long edges in the proximal wing. However, we acknowledge that we could do a better job of making the location and orientation of the regions shown in these experiments clear, and we will address this in a revised manuscript.

      The imaging and analysis of the myosin RLC by GFP tagging is also flawed. SQH-GFP is a tried and true proxy for myosin activity in Drosophila. Although the authors imaged the wing pouch of wildtype and DS mutants, they did so under low magnification to capture the entire pouch. This gives a "low-res" perspective of overall myosin, but what they needed to do was image at high magnification in that proximal region of the pouch and see if Sqh-GFP is polarized in wildtype cells along certain cell edges aligned with an axis. And if such a polarity is observed, is it present or absent in the DS mutant? From the data shown in Figure 5, I cannot see any significant difference between wildtype and knocked down samples at this low resolution. Any difference, if there is any, is not really interpretable.

      Response: We agree that examination of myosin localization at high resolution to see if it is polarized is a worthwhile experiment. We did in fact do this, and myosin (Sqh:GFP) appeared unpolarized in ds mutants. However, the levels of myosin were so low that we didn’t feel confident in our assessment, so we didn’t include it. We now recognize that this was a mistake, and we will include high resolution myosin images and assessments of (lack of) polarity in a revised manuscript to address this comment.

      In conclusion, the manuscript has multiple problems that make it impossible for the authors to make the claims they make in the current manuscript. And even if they calibrated their interpretations to fit the data, there is not much of a simple, clear picture as to how FT and DS regulate pouch eccentricity in the larval wing.

      Response: We think that the legitimate issues raised are addressable, as described above, while some of the criticisms are incorrect (as noted by Reviewer 1).

      Reviewer #2 (Significance (Required)): This manuscript describes experiments studying the role that the protocadherins FAT and DACHSOUS play in determining the two dimensional "shape" of the fruit fly wing. By "shape", the manuscript really means how much the wing's outline, when approximated as an ellipse, deviates from a circle. The elliptical approximations of FT and DS mutant wings more closely resemble a circle compared to the more eccentric wildtype wings. This suggests the molecules contribute to anisotropic growth in some way. A great deal of attention has been paid on how FT and DS regulate overall organ growth and planar cell polarity, and the Irvine lab has made extensive contributions to these questions over the years. Somewhat understudied is how FT and DS regulate wing shape, and this manuscript focuses on that. It follows up on an interesting result that the Irvine lab published in 2019, in which mud mutants randomized spindle pole orientation in wing cells but did not change the eccentricity of wings, ruling out biased cell division orientation as a mechanism for the anisotropic growth.

      __Reviewer #3 (Evidence, reproducibility and clarity (Required)): __ Summary The authors investigate the mechanisms underlying epithelial morphogenesis using the Drosophila wing as a model system. Specifically, they analyze the contribution of the conserved Fat/Ds pathway to wing shape regulation. The main claim of the manuscript is that Ds/Fat controls wing shape by regulating tissue mechanical stress through MyoII levels, independently of Hippo signaling and tissue growth.

      Major Comments To support their main conclusions, the authors should address the following major points and consider additional experiments where indicated. Most of the suggested experiments are feasible within a reasonable timeframe, while a few are more technically demanding but would substantially strengthen the manuscript's central claims.

      Figure 1: The authors use temperature-sensitive inactivation of Fat or Ds to determine the developmental window during which these proteins regulate wing shape. To support this claim, it is essential to demonstrate that upon downshift during early pupal stages, Ds or Fat protein levels are restored to normal. For consistency, please include statistical analyses in Figure 1P and ensure that all y-axis values in shape quantifications start at 1.

      Response: We will do the requested antibody stains for Fat (Ds antibody is unfortunately no longer available, but the point made by the reviewer can be addressed by Fat as the approach and results are the same for both genes). We have also added the requested statistical analysis to Fig 1P, and adjusted the scales as requested.

      Figure 2: The authors propose that wing shape is regulated by Fat/Ds during larval development. However, Figure 2L suggests that wing elongation occurs in control conditions between 6 and 12 h APF, while this elongation is not observed upon Ds RNAi. The authors should therefore perform downshift experiments while monitoring wing shape during the pupal stage to substantiate their main claim. In addition, equivalent data for Fat loss of function should be included to support the assertion that Fat and Ds act similarly.

      Response: As noted in our response to point 1 of Reviewer 1, we agree that there does seem to be relatively more elongation in control wings than in ds RNAi wings, but we think this likely reflects effects of ds on growth during larval stages, and we will revise the manuscript to comment on this.

      We will also add the suggested examination of fat RNAi pupal wings.

      The suggested examination of pupal wing shape in downshift experiments is unfortunately not feasible. Our temperature shift experiments expressing ds or fat RNAi are done using the UAS-Gal4-Gal80ts system. We also use the UAS-Gal4 system to mark the pupal wing. If we do a downshift experiment, then expression of the fluorescent marker will be shut down in parallel with the shutdown of ds or fat RNAi, so the pupal wings would no longer be visible.

      Figure 3: The authors state that "These observations indicate that Ds-Fat signaling influences wing shape during the initial formation of the wing pouch, in addition to its effects during wing growth." This conclusion is not fully supported, as the authors only examine wing shape at 72 h AEL. At this stage, fat or ds mutant wings already display altered morphology. The authors could only make this claim if earlier time points were fully analyzed. In fact, the current data rather suggest that Ds function is required before 72 h AEL, as a rescue of wing shape is observed between 72 and 120 h AEL.

      Response: First, I think we are largely in agreement with the Reviewer, as the basis for our saying that Ds-Fat are likely required during initial formation of the wing pouch is that our data show they must be required before 72 h AEL. Second, 72 h is the earliest that we can look using Wg expression as a marker, as at earlier stages it is in a ventral wedge rather than a ring around the future wing pouch + DV line (eg see Fig 8 of Tripathi, B. K. & Irvine, K. D. The wing imaginal disc. Genetics (2022) doi:10.1093/genetics/iyac020). We can revise the text to make sure this is clear.

      Figure 4: The authors state that "The influence of Ds-Fat on wing shape is not explained by Hippo signaling." However, this conclusion is not supported by their data, which show that partial loss of ex or hippo causes clear defects in wing shape. In addition, the initial wing shape is affected in wts and ex mutants, and hypomorphic alleles were used for these experiments. Therefore, the main conclusion requires revision. It would be useful to include a complete dataset for hippo RNAi, ex, and wts conditions in Figure S1. The purpose and interpretation of the InR^CA experiments are also unclear. While InR^CA expression can increase tissue growth, Hippo signaling has functions beyond growth control. Whether Hippo regulates tissue shape through InR^CA-dependent mechanisms remains to be clarified.

      Response: As noted in our response to point 1 of Reviewer 2 - our results emphasize that the effects of Ds-Fat on wing shape cannot be explained solely by effects on Hippo signaling, eg as we stated on page 7 “These observations suggest that Hippo signaling contributes to, but does not fully explain, the influence of ds or fat on adult wing shape.” We also note that impairment of Hippo signaling has similar effects in younger discs, but very different effects in older discs, which clearly indicates that they are having very different effects during disc growth. We will make some revisions to the text to make sure that our conclusions are clear throughout.

      While we used a hypomorphic allele for wts, because null alleles are lethal, the ex allele that we used is described in Flybase as an amorph, not a hypomorph, and as noted in our response to Reviewer 2, we will add some discussion about relative strength of effects on Hippo signaling.

      In Fig S1, we currently show adult wings for ex[e1] and RNAi-Hpo, and wing discs for wts[P2]/wts[x1], and for ex[e1]. The wts combination does not survive to adult so we can’t include this. We will however, add hpo RNAi wing discs as requested.

      The purpose of including InR^CA experiments is to try to separate effects of Hippo signaling from effects of growth, because InR signaling manipulation provides a distinct mechanism for increasing growth. We will revise the text to try to make sure this is clearer.

      Figure 5: This figure presents images of MyoII distribution, but no quantification across multiple samples is provided. Moreover, the relationship between changes in tissue stress and MyoII levels remains unclear. Performing laser ablation and MyoII quantification on the same samples would provide stronger support for the proposed conclusions.

      Response: We will revise the quantitation so that it presents analysis of averages across multiple discs, rather than representative examples of single discs.

      Both the myosin imaging, and the laser ablation were done on the same genotypes (wildtype and ds) at the same ages (108 h AEL) so we think it is valid to directly compare them. Moreover, the imaging conditions for laser ablation and myo quantification are different, so it’s not feasible to do them at the same time (For ablations we do a single Z plane and a single channel (has to include Ecad, or an equivalent junctional marker) on live discs, so that fast imaging can be done. For Myo imaging we do multiple Z stacks and multiple channels (eg Ecad and Myo), which is not compatible with the fast imaging needed for analysis of laser ablations).

      Figure 6: It is unclear when Rok RNAi and Rok^CA misexpression were induced. To substantiate their claims, the authors should measure both MyoII levels and mechanical tension under the different experimental conditions in which wing shape was modified through Rok modulation (i.e. the condition shown in Fig. 7G). For comparison, fat and ds data should be added to Fig 6H. Overall, the effects of Rok modulation appear milder than those of Fat manipulation. Given that Dachs has been shown to regulate tension downstream of Fat/Ds, it would be informative to determine whether tissue tension is altered in dachs mutant wings and to assess the relative contribution of Dachs- versus MyoII-mediated tension to wing shape control. It would also be interesting to test whether Rok activation can rescue dachs loss-of-function phenotypes.

      Response: In these Rok experiments there was no separate temporal control of Rok RNAi or Rok^CA expression, they were expressed under nub-Gal4 control throughout development.

      We will add examination of myosin in combinations of ds RNAi and rok manipulation as in Fig 7G to a revised manuscript.

      Data for fat and ds comparable to that shown in Fig 6H is already presented in Fig 3D, and we don’t think it’s necessary to reproduce this again in Fig 6H.

      We agree that the effects of Rok manipulations are milder than those of Fat manipulations; as we try to discuss, this could be because the pattern or polarity of myosin is also important, not just the absolute level, and we will add assessment of myosin polarity.

      The suggestion to also look at dachs mutants is reasonable, and we will add this. In addition, we plan to add an "activated" Dachs (a Zyxin-Dachs fusion protein previously described in Pan et al 2013) that we anticipate will provide further evidence that the effects of Ds-Fat are mediated through Dachs. We will also add the suggested experiment combining Rok activation with dachs loss-of-function.

      Figure 7: The authors use genetic interactions to support their claim that Fat controls wing shape independently of Hippo signaling. However, these interactions do not formally exclude a role for Hippo. Moreover, previous work has shown that tissue tension regulates Hippo pathway activity, implying that any manipulation of tension could indirectly affect Hippo and growth. To provide more direct evidence, the authors should further analyze MyoII localization and tissue tension under the various experimental conditions tested (as also suggested above).

      Response: As discussed above, our data clearly show that Fat has effects independently of Hippo signaling that are crucial for its effects on wing shape, but we did not mean to imply that the regulation of Hippo signaling by Fat makes no contribution to wing shape control, and we will revise the text to make this clearer. We will also add additional analysis of Myosin localization, as described above.

      Reviewer #3 (Significance (Required)): How organ growth and shape are controlled remains a fundamental question in developmental biology, with major implications for our understanding of disease mechanisms. The Drosophila wing has long served as a powerful and informative model to study tissue growth and morphogenesis. Work in this system has been instrumental in delineating the conserved molecular and mechanical processes that coordinate epithelial dynamics during development. The molecular regulators investigated by the authors are highly conserved, suggesting that the findings reported here are likely to be of broad biological relevance.

      Previous studies have proposed that anisotropic tissue growth regulates wing shape during larval development and that such anisotropy induces mechanical responses that promote MyoII localization (Legoff et al., 2013, PMID: 24046320; Mao et al., 2013, PMID: 24022370). The Ds/Fat system has also been shown to regulate tissue tension through the Dachs myosin, a known modulator of the Hippo/YAP signaling pathway. As correctly emphasized by the authors, the respective contributions of anisotropic growth and mechanical tension to wing shape control remain only partially understood. The current study aims to clarify this issue by analyzing the role of Fat/Ds in controlling MyoII localization and, consequently, wing shape. This represents a potentially valuable contribution. However, the proposed mechanistic link between Fat/Ds and MyoII localization remains insufficiently explored. Moreover, the role of MyoII is not fully discussed in the broader context of Dachs function and its known interactions with MyoII (Mao et al., 2011, PMID: 21245166; Bosveld et al., 2012, PMID: 22499807; Trinidad et al., 2024, PMID: 39708794). Most importantly, the experimental evidence supporting the authors' conclusions would benefit from further strengthening. It should also be noted that disentangling the relative contributions of anisotropic growth and MyoII polarization to tissue shape and size remains challenging, as MyoII levels are known to increase in response to anisotropic growth (Legoff et al., 2013; Mao et al., 2013), and mechanical tension itself can modulate Hippo/YAP signaling (Rauskolb et al., 2014, PMID: 24995985).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The authors investigate the mechanisms underlying epithelial morphogenesis using the Drosophila wing as a model system. Specifically, they analyze the contribution of the conserved Fat/Ds pathway to wing shape regulation. The main claim of the manuscript is that Ds/Fat controls wing shape by regulating tissue mechanical stress through MyoII levels, independently of Hippo signaling and tissue growth.

      Major Comments

      To support their main conclusions, the authors should address the following major points and consider additional experiments where indicated. Most of the suggested experiments are feasible within a reasonable timeframe, while a few are more technically demanding but would substantially strengthen the manuscript's central claims.

      Figure 1:

      The authors use temperature-sensitive inactivation of Fat or Ds to determine the developmental window during which these proteins regulate wing shape. To support this claim, it is essential to demonstrate that upon downshift during early pupal stages, Ds or Fat protein levels are restored to normal. For consistency, please include statistical analyses in Figure 1P and ensure that all y-axis values in shape quantifications start at 1.

      Figure 2:

      The authors propose that wing shape is regulated by Fat/Ds during larval development. However, Figure 2L suggests that wing elongation occurs in control conditions between 6 and 12 h APF, while this elongation is not observed upon Ds RNAi. The authors should therefore perform downshift experiments while monitoring wing shape during the pupal stage to substantiate their main claim. In addition, equivalent data for Fat loss of function should be included to support the assertion that Fat and Ds act similarly.

      Figure 3:

      The authors state that "These observations indicate that Ds-Fat signaling influences wing shape during the initial formation of the wing pouch, in addition to its effects during wing growth." This conclusion is not fully supported, as the authors only examine wing shape at 72 h AEL. At this stage, fat or ds mutant wings already display altered morphology. The authors could only make this claim if earlier time points were fully analyzed. In fact, the current data rather suggest that Ds function is required before 72 h AEL, as a rescue of wing shape is observed between 72 and 120 h AEL.

      Figure 4:

      The authors state that "The influence of Ds-Fat on wing shape is not explained by Hippo signaling." However, this conclusion is not supported by their data, which show that partial loss of ex or hippo causes clear defects in wing shape. In addition, the initial wing shape is affected in wts and ex mutants, and hypomorphic alleles were used for these experiments. Therefore, the main conclusion requires revision. It would be useful to include a complete dataset for hippo RNAi, ex, and wts conditions in Figure S1. The purpose and interpretation of the InR^CA experiments are also unclear. While InR^CA expression can increase tissue growth, Hippo signaling has functions beyond growth control. Whether Hippo regulates tissue shape through InR^CA-dependent mechanisms remains to be clarified.

      Figure 5:

      This figure presents images of MyoII distribution, but no quantification across multiple samples is provided. Moreover, the relationship between changes in tissue stress and MyoII levels remains unclear. Performing laser ablation and MyoII quantification on the same samples would provide stronger support for the proposed conclusions.

      Figure 6:

      It is unclear when Rok RNAi and Rok^CA misexpression were induced. To substantiate their claims, the authors should measure both MyoII levels and mechanical tension under the different experimental conditions in which wing shape was modified through Rok modulation (i.e. the condition shown in Fig. 7G). For comparison, fat and ds data should be added to Fig 6H. Overall, the effects of Rok modulation appear milder than those of Fat manipulation. Given that Dachs has been shown to regulate tension downstream of Fat/Ds, it would be informative to determine whether tissue tension is altered in dachs mutant wings and to assess the relative contribution of Dachs- versus MyoII-mediated tension to wing shape control. It would also be interesting to test whether Rok activation can rescue dachs loss-of-function phenotypes.

      Figure 7:

      The authors use genetic interactions to support their claim that Fat controls wing shape independently of Hippo signaling. However, these interactions do not formally exclude a role for Hippo. Moreover, previous work has shown that tissue tension regulates Hippo pathway activity, implying that any manipulation of tension could indirectly affect Hippo and growth. To provide more direct evidence, the authors should further analyze MyoII localization and tissue tension under the various experimental conditions tested (as also suggested above).

      Significance

      How organ growth and shape are controlled remains a fundamental question in developmental biology, with major implications for our understanding of disease mechanisms. The Drosophila wing has long served as a powerful and informative model to study tissue growth and morphogenesis. Work in this system has been instrumental in delineating the conserved molecular and mechanical processes that coordinate epithelial dynamics during development. The molecular regulators investigated by the authors are highly conserved, suggesting that the findings reported here are likely to be of broad biological relevance.

      Previous studies have proposed that anisotropic tissue growth regulates wing shape during larval development and that such anisotropy induces mechanical responses that promote MyoII localization (Legoff et al., 2013, PMID: 24046320; Mao et al., 2013, PMID: 24022370). The Ds/Fat system has also been shown to regulate tissue tension through the Dachs myosin, a known modulator of the Hippo/YAP signaling pathway. As correctly emphasized by the authors, the respective contributions of anisotropic growth and mechanical tension to wing shape control remain only partially understood. The current study aims to clarify this issue by analyzing the role of Fat/Ds in controlling MyoII localization and, consequently, wing shape. This represents a potentially valuable contribution. However, the proposed mechanistic link between Fat/Ds and MyoII localization remains insufficiently explored. Moreover, the role of MyoII is not fully discussed in the broader context of Dachs function and its known interactions with MyoII (Mao et al., 2011, PMID: 21245166; Bosveld et al., 2012, PMID: 22499807; Trinidad et al., 2024, PMID: 39708794). Most importantly, the experimental evidence supporting the authors' conclusions would benefit from further strengthening. It should also be noted that disentangling the relative contributions of anisotropic growth and MyoII polarization to tissue shape and size remains challenging, as MyoII levels are known to increase in response to anisotropic growth (Legoff et al., 2013; Mao et al., 2013), and mechanical tension itself can modulate Hippo/YAP signaling (Rauskolb et al., 2014, PMID: 24995985).

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this work, Tripathi et al address the open question of how the Fat/Ds pathway affects organ shape, using the Drosophila wing as a model. The Fat/Ds pathway is a conserved but complex pathway, interacting with Hippo signalling to affect growth and providing planar cell polarity that can influence cellular dynamics during morphogenesis. Here, authors use genetic perturbations combined with quantification of larval, pupal, and adult wing shape and laser ablation to conclude that the Ft/Ds pathway affects wing shape only during larval stages in a way that is at least partially independent of its interaction with Hippo and rather due to an effect on tissue tension and myosin II distribution. Overall the work is clearly written and well presented. I only have a couple major comments on the limitations of the work.

      Major comments:

      1. Authors conclude from data in Figures 1 and 2 that the Fat/Ds pathway only affects wing shape during larval stages. When looking at the pupal wing shape analysis in Figure 2L, however, it looks like there is a difference in wt over time (6h-18h, consistent with literature), but that difference over time goes away in RNAi-ds, indicating that there actually is a role for Ds in changing shape during pupal stages, although the phenotype is clearly less dramatic than that of larval stages. No statistical test was done over time (within the genotype), however, so it's hard to say. I recommend the authors test over time: whether 6h and 18h are different in wild type and in ds mutant. I think this is especially important because there is proximal overgrowth in the Fat/Ds mutants, much of which is contained in the folds during larval stages. That first fold, however, becomes the proximal part of the pupal wing after eversion and contracts during pupal stages to elongate the blade (Aigouy 2010, Etournay 2015). Also, according to Trinidad Curr Biol 2025, there is a role for the Fat/Ds pathway in pupal stages. All of that to say that it seems likely that there would be a phenotype in pupal stages. It's true it doesn't show up in the adult wing in the experiments in Fig 1, but looking at the pupal wing itself is more direct. Perhaps the very proximal effect is less prominent later, as there is potential for further development after 18hr before adulthood, and the most proximal parts are likely excluded from the analysis anyway.
      2. I think there needs to be a mention and some discussion of the fact that the wing is not really flat. While it starts out very flat at 72h, by 96h and beyond, there is considerable curvature in the pouch that may affect measurements of the different axes and of cell shape. It is not actually specified in the methods, so I assume the measurements were taken using a 2D projection. It is not clear whether the curvature of the pouch was taken into account, either for cell shape measurements presented in Fig 4 or for the wing pouch dimensional analysis shown in Fig 3, 6, and supplements. Do perturbations in Ft/Ds affect this curvature? Are they more or less curved in one or both axes? Such a change could affect the results and conclusions. The extent to which the fat/ds mutants fold properly is another important consideration that is not mentioned. For example, maybe the folds are deeper and contain more material in the ds/fat mutants, and that's why the pouch is a different shape? At the very least, this point about the 3D nature of the wing disc must be raised in discussion of the limitations of the study. For the cell shape analysis, you can do a correction based on the local curvature (calculated from the height map from the projection). For the measurement of A/P, D/V axes of the wing pouch, best would be to measure the geodesic distance in 3D, but this is not reasonable to suggest at this point. One can still try to estimate the pouch height/curvature, however, both in wild type and in fat/ds mutants.
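The curvature concern in the second major comment can be made concrete with a minimal sketch of how a height map changes lengths and areas relative to a 2D projection. Everything here is an assumption for illustration: the function names, the parabolic dome profile, and its dimensions are hypothetical and are not measurements from the manuscript. The area factor uses the standard surface-element formula for a height field.

```python
import numpy as np

def arc_length(x, h):
    """Length along the surface profile h(x), summing straight segments between samples."""
    return float(np.sum(np.hypot(np.diff(x), np.diff(h))))

def area_correction(height_map, pixel_size):
    """Per-pixel ratio of surface area to projected area: sqrt(1 + h_x^2 + h_y^2)."""
    h_y, h_x = np.gradient(height_map, pixel_size)  # derivatives along rows, then columns
    return np.sqrt(1.0 + h_x**2 + h_y**2)

# Hypothetical dome-shaped profile across a pouch axis (arbitrary units).
x = np.linspace(-50.0, 50.0, 201)
dome = 20.0 * (1.0 - (x / 50.0) ** 2)

projected = float(x[-1] - x[0])      # length measured on the 2D projection
along_surface = arc_length(x, dome)  # length measured along the curved surface

print(f"projected: {projected:.1f}, along surface: {along_surface:.1f}")
```

If the two genotypes differed in pouch curvature, the ratio of surface length to projected length would differ between them, so an axis ratio computed from projections alone could shift without any change in the tissue's in-surface shape.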

      Minor comments:

      1. The analysis of the laser ablation is not really standard. Usually one looks at recoil velocity or performs a more complicated analysis of the equilibrium shape using a model (e.g. Shivakumar and Lenne 2016, Piscitello-Gomez 2023, Dye et al 2021). One may be able to extract more information from these experiments. Nevertheless, I doubt the conclusions would change, given that there seems to be a pretty clear difference between wt and ds (OPTIONAL).
      2. Figure 7G: I think you also need a statistical test between RNAi-ds and UAS-rokCA+RNAi-ds.
      3. In the discussion, there is a statement: "However, as mutation or knock down of core PCP components, including pk or sple, does not affect wing shape... 59." Reference 59 is quite old and as far as I can tell shows neither images nor quantifications of the wing shape phenotype (not sure it uses "knockdown" either - unless you mean hypomorph?). A more recent publication Piscitello-Gomez et al Elife 2023 shows a very subtle but significant wing shape phenotype in core PCP mutants. It doesn't change your logic, but I would change the statement to be more accurate by saying "mutation of core PCP components has only subtle changes in adult wing shape"

      Referee cross-commenting

      Reviewer2:

      Reviewer 2 makes the statement: "The distance along the AP boundary from the pouch border to DV midline is topologically comparable to the PD length of the adult wing. The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing."

      I disagree - the DV boundary wraps around the entire margin of the adult wing (as correctly drawn with the pink line in Fig 2A). It is not the same as the wide axis of the adult wing (perpendicular to the AP boundary). It is not trivial to map the proximal-distal axis of the larval wing to the proximal-distal axis of the adult, due to the changes in shape that occur during eversion. Thus, I find it much easier to look at the exact measurement that the authors make, and it is much more standard in the field, rather than what the reviewer suggests. Alternatively, one could I guess measure in the adult the ratio of the DV margin length (almost the circumference of the blade?) to the AP boundary length. That may be a more direct comparison. Actually the authors leave out the term "boundary" - what they call AP is actually the AP boundary, not the AP axis, and likewise for the DV - what they measure is DV boundary, but I only noticed that in the second read-through now. Just another note, these measurements of the pouch really only correspond to the very distal part of the wing blade, as so much of the proximal blade comes from the folds in the wing disc. Therefore, a measurement of only distal wing shape would be more comparable.

      Reviewer 2 states that the authors cannot definitively conclude anything about mechanical tension from their reported cutting data because the authors have not looked at initial recoil velocity. I strongly disagree. The wing disc tissue is elastic on much longer timescales than what's considered after laser ablation (even hours), and the shape of the tissue after it equilibrates from a circular cut (1-2 min) can indeed be used to infer tissue stresses (see Dye et al eLife 2021, Piscitello-Gomez et al eLife 2023, Tahaei et al arXiv 2024). In the wing disc, the direction of stresses inferred from initial recoil velocity is correlated with the direction of stresses inferred from analysing the equilibrium shape after a circular cut. Rearrangements, a primary mechanism of fluidization in epithelia, do not occur within 1 min. Analysing the equilibrium shape after circular ablation may be more accurate for assessing tissue stresses than initial recoil velocity - in Piscitello-Gomez et al 2023, the authors found that a prickle mutation (PCP pathway) affected initial recoil velocity but not tissue stresses in the pupal wing. Such equilibrium circular cuts have also been used to analyze stresses in the avian embryo, where they correlate with directions of stress gathered from force inference methods (Kong et al Scientific Reports 2019). The Tribolium example noted by the reviewer is on the timescale of tens to hundreds of minutes - much longer than the timescale of laser ablation retraction. It is true the analysis of the ablation presented in this paper is not at the same level as those other cited papers and could be improved. But I don't think the analysis would be improved by additional experiments doing timelapse of initial retraction velocity.
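
      As a purely geometric illustration of reading anisotropy from the equilibrium shape after a circular cut, the sketch below takes the aspect ratio and long-axis orientation of the cut from the covariance of its boundary points. It is not the model-based analysis of the papers cited above, and it does not return a stress; the input (segmented boundary points, roughly evenly spaced) and the function name are assumptions.

```python
import math


def shape_anisotropy(points):
    """Return (aspect_ratio, angle_rad) of the point cloud's principal axes.

    points: list of (x, y) boundary coordinates of the equilibrated cut.
    angle_rad is the orientation of the long axis, measured from the x axis.
    Degenerate (collinear) input raises ZeroDivisionError.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Eigenvalues of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    centre = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    major, minor = centre + spread, centre - spread
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.sqrt(major / minor), angle
```

      For an isotropic tissue the ratio stays close to 1; comparing the ratio and axis between wild type and ds across several discs would be one simple way to summarize the ablation data.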

      Reviewer 2 states "If cell anistropy is caused by polarized myosin activity, that activity is typically polarized along the short edges not long edges" Not true in this case. Myosin II accumulates along long boundaries (Legoff and Lecuit 2013). "Therefore, interpreting what causes the cell anistropy and how DS regulates it is difficult," Agreed - but this is well beyond the scope of this manuscript. The authors clearly show that there is a change of cell shape, at least in these two regions. Better would be to quantify it throughout the pouch and across multiple discs. Similar point for myosin quantifications - yes, polarity would be interesting and possible to look at in these data, and it would be better to do so on multiple discs, but the lack of overall myosin on the junctions shown here is not nothing. Interpreting what Ft/Ds does to influence tension and myosin and eventually tissue shape is a big question that's not answered here. I think the authors do not claim to fully understand this though, and maybe further toning down the language of the conclusions could help.

      Reviewer 3:

      I agree with many of the points raised by Reviewer 3, in particular those relevant to Fig 1. The additional experiments looking at myosin II localization and laser ablation in the other perturbations (Hippo and Rok mutants/RNAi) would certainly strengthen the conclusions.

      Significance

      I think the work provides a clear conceptual advance, arguing that the Ft/Ds pathway can influence mechanical stress independently of its interaction with Hippo and growth. Such a finding, if conserved, could be quite important for those studying morphogenesis and Fat function in this and other organisms. For this point, the genetic approach is a clear strength. Previous work in the Drosophila wing has already shown an adult wing phenotype for Ft/Ds mutations that was attributed to its role in the larval growth phase, as marked clones show aberrant growth in mutants. The novelty of this work is the dissection of the temporal progression of this phenotype and how it relates to Hippo and myosin II activation. It remains unclear exactly how Ft/Ds may affect tissue tension, except that it involves a downregulation of myosin II - the mechanism of that is not addressed here and would involve considerably more work. I think the temporal analysis of the wing pouch shape was quite revealing, providing novel information about how the phenotype evolves in time, in particular that there is already a phenotype quite early in development. As mentioned above, however, the lack of consideration of the wing disc as a 3D object is a potential limitation. While the audience is likely mostly developmental biologists working in basic research, it may also interest those studying the pathway in other contexts, including in vertebrates given its conservation and role in other processes.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Overall Response.

      We would like to thank the reviewers for their analysis of the manuscript. From their comments it is clear that our manuscript was not clear. We completely rewrote the manuscript to focus on the central question: how Adam13 regulates gene expression in general, and Tfap2a in particular, leading to the expression of Calpain8, a protein required for CNC migration.

      The following model will be the central line of our story. It will address all of the proteins involved and the mechanistic evidence that links Adam13 to one of its proven effector targets, Calpain8.


      **Reviewer #1 (Evidence, reproducibility and clarity (Required)):**

      *In this manuscript, Pandey et al. show that the ADAM13 protein modulates histone modifications in cranial neural crest and that the Arid3a protein binds the Tfap2a promoter in an Adam13-dependent manner and has promoter-specific effects on transcription. Furthermore, they show that the Adam13 and human ADAM9 proteins associate with histone modifiers as well as proteins involved in RNA splicing. Although the manuscript is mostly clearly written and the figures well assembled, it reads like a couple of separate and unfinished stories.*

      I believe that our story line was not clear and that the overarching question was not well stated. We have made every effort to change this in the revised manuscript. I would like to include a figure that explains the story.

      In short:

      1 We knew that Adam13 could regulate gene expression in CNC via its cytoplasmic domain.

      2 We also knew that this required Adam13 interaction with Arid3a and that a direct target was the transcription factor TFAP2a, which in turn regulates functional targets that we had identified, including the protocadherin PCNS and the protease Calpain8.

      Our goal was to understand the mechanism allowing Adam13 to regulate gene expression.

      3 The first part of this manuscript shows how Adam13 modulates histone modification in vivo in the CNC, globally as well as specifically on the Tfap2a promoter. This results in open chromatin.

      4 Using ChIP we show that Adam13 and Arid3a both bind to the Tfap2a promoter and that Arid3a binding to the first ATG depends on Adam13.

      5 Using a luciferase reporter we show that both Adam13 and Arid3a can induce expression at the first ATG.

      *They show using immunocytochemistry and qPCR that ADAM13 knockouts in CNCs affect histone modifications. Here ChIP-seq or Cut-n-Run experiments would be more appropriate and would result in a more comprehensive understanding of the changes mediated.*

      I agree, but we did not have the funds, and I now have nobody working in the lab to do this experiment. These data are also likely to overlap with the RNAseq data that we have and would simply add more open leads. We chose to go after the only direct target that we know, TFAP2a, and to focus on this gene to understand the mechanism.

      We believe that the ChIP-PCR experiments are sufficient for this story.

      *The immunohistochemistry assays should at least be verified further using western blotting or other more quantitative methods.*

      Immunofluorescence with statistical analysis is a valid quantification method. Western blot of CNC explants is not trivial and requires a large amount of material. Given the small overall change, we also would not expect to be able to detect the change over the noise of western blot. The ChIP-PCR confirms our finding in a completely independent manner.

      *The authors then show that ADAM13 interacts with a number of histone modifiers such as KDM3B, KDM4B and KMT2A but strangely they do not follow up this interesting observation to map the interactions further (apart from a co-ip with KMT2A), the domains involved, the functional role of the interactions or how they mediate the changes in chromatin modifications. *

      We selected KMT2a because it is expressed in Hek293T cells. KMT2D has been shown to regulate CNC development in Xenopus and is responsible for Kabuki syndrome in humans. We used AlphaFold to predict the interaction and found that Adam13 interacts with the SET domain. In addition, we see multiple SET domain-containing proteins in our mass spec data. The mass spec was done on human Hek293T cells, which express a subset of KMT proteins. We now include evidence that Adam13 interacts with the KMT2D SET domain (new figure 5D).

      The authors then show that ADAM13 affects expression of the TFAP2a gene in a promoter specific manner - affecting expression from S1 but not S2.

      It is S1 but not S3. Adam13 has no effect on S2.

      *They further show that ADAM13 affects the binding of the Arid3 transcription factor to the S1 promoter but not to the S3 promoter. However, ADAM13 was present at both promoters. Absence of ADAM13 resulted in increased H3K9me2/3 and decreased H3K4me3 at the S1 promoter whereas only H3K4me3 was changed at the S2 promoter.*

      S3, not S2.

      *Unfortunately, they do not show how this is mediated or through which binding elements this takes place. Why is ADAM13 present at both promoters but only affects Arid3 binding at S1?*

      We agree this is a very interesting question that could be the subject of an entire publication. Promoter deletion and mutation to identify which sites are bound and modulated by Adam13/Arid3a is not trivial.

      *The authors claim that transfecting Arid3a and Adam13 together further increases expression from a reporter (Fig 4E) but this is not true as no statistical comparison is done between the singly transfected and double transfected cells. *

      This is correct: there is a small increase with both that is not significant. The fact that both proteins can induce the promoter suggests, but does not prove, that they can have additive roles. The loss-of-function experiment shows that the human Arid3a expressed in Hek293T cells is important for the Adam13-induced increase of S1. It is possible that the dose of endogenous Arid3a is sufficient to get full activity of Adam13.

      *Then the authors surprisingly start investigating association of proteins with the two isoforms of TFAP2a which in the mind of this reviewer is a different question entirely.*

      We agree and have removed this part of the manuscript.

      *They find a number of proteins involved in splicing. And the observation that ADAM13 also interacts with splicing factors is really irrelevant in terms of the story that they are trying to tell. Transcription regulation and splicing are different processes and although both affect the final outcome, mRNA, they need to be investigated separately. The link is at least not very clear from the manuscript. Again, the effects on splicing are not further investigated through functional analysis and as presented the data presented is too open-ended and lacking in clarity. *

      We agree that, besides the different activities of the Tfap2a isoforms, the rest of the splicing regulation could be a separate study. We were interested in understanding how these two isoforms could activate Calpain8 so differently, which is why we looked at LC/MS/MS. We have removed this part of the story from the manuscript.

      *Additional points: 1. In the abstract they propose that the ADAMs may act as extracellular sensors. This is not substantiated by the results.*

      As Adam13 is an extracellular protein that translocates into the nucleus, this is a possibility that we propose, but I agree that it is not investigated in this manuscript. We will modify the text.

      *2. Page 5, line 16: what is referred to by 6 samples 897 proteins? Were 6 samples analyzed for each condition? The number of repeats for the mass spec analysis is not clear from the text nor are the statistical parameters used to analyse the data. This is also true for the mass spec presented in the part on TFAP2aL-S1 and Adam13 regulate splicing. Statistics and repeats are not presented.*

      In general we provide biological triplicates and use the statistical function of Scaffold to identify proteins that are significantly enriched or absent in each sample.

      When we specify 6 samples, it means that 6 independent protein samples were analyzed and used for our statistics. We use the Scaffold t-test with a p value of less than 0.05. Peptides were identified with 95% confidence and proteins with 99% confidence.

      *3. Page 6, line 19: set domain should be SET domain.*

      Yes.

      *4. The number of repeats in the RNA sequencing of the CNCs is not clear from the text.*

      Three biological replicates (different batches of embryos from different females).

      *5. The explanation of Figure C is a bit lacking. There are two forms of TFAP2a, L and S, but only one is presented in the figure. Do both forms have the extra S1-3 exons? Also, at the top of the figure it is not clear that the boxes are part of a continuous DNA sequence. Also, it is not clear which codon is not coding.*

      Xenopus laevis is pseudotetraploid, giving in most cases L and S genes in addition to the 2 alleles from being diploid. The TFAP2a gene structure is conserved between both alloalleles and is similar to the human gene. For promoter analysis and ChIP-PCR we chose one of the alloalleles (L), given that the RNAseq data showed that both genes and variants behave the same in response to Adam13. This only becomes important in loss-of-function experiments, in which both the L and S versions need to be knocked down or knocked out.

      * In the sashimi plot there are green and pink shaded areas. What do they denote? What exactly is lacking in the MO13 mutant - seems that a particular exon is missing suggesting skipping?*

      MO13 is a morpholino that blocks the translation of Adam13 (already characterized, with >90% of the protein absent) but does not affect Adam13 mRNA expression.

      *7. Page 11, line 9: „with either MbC or MbC and MO13" needs to be rephrased.*

      Will do.

      *8. Page 11, line 19: „the c-terminus of....and S3) and" should be „the C-terminus of...and S3 and".
      9. Page 15, line 10: substrateS
      10. Page 16, line 23: the sentence „increases H3K9 to the promoter of the most upstream" needs revision.
      11. Page 26, line 12: Here the authors say: „for two samples two-tail unpaired". What does this mean? Statistics should not be performed on fewer than three samples. In the legend to Figure 6 it indicates that T-test was performed on two samples.
      12. The discussion should be shortened and simplified.
      13. Figure 1 legend. How many images were quantitated for each condition?*

      At least 3 images per condition, for 3 independent experiments (9 images per condition).

      *14. Figure 2 has a strange order of panels where G is below B.
      15. Figure 6 legend, line 12. „proteins that were significantly enriched in either of the 2 samples" is not very clear. What exactly does this mean?*

      Reviewer #1 (Significance (Required)):

      *If the authors follow up on either the transcription-part of the story, or the splicing part of the story, they are likely to have important results to present. However, in the present format the paper is lacking in focus as both issues are mixed together without a clear end-result.*

      We have entirely changed the paper according to these comments.


      **Reviewer #2 (Evidence, reproducibility and clarity (Required)):**

      *Panday et al seek to determine the function of ADAM13 in regulating histone modifications, gene expression and splicing during cranial neural crest development. Specifically, the authors tested how Adam13, a metalloprotease, could modify chromatin by interaction with Arid3a and Tfap2a and RNA splicing and gene expression. They then utilize knockouts in Xenopus and HEK293T cells followed by immunofluorescence, IPs, BioID, luciferase assays, Mass spec and RNA assays. Although there is some strong data in the BioID and luciferase experiments, the manuscript tells multiple stories, linking together too many things to make a compelling story. The result is a paper that is very difficult to read and understand the take home message. In addition, some of the conclusions are not supported by the data. This unfortunately means it is not ready for publication. However, I have added below some suggestions that would strengthen the manuscript. My comments are below:*

      Clarity is clearly an issue here. The new version is entirely re-written.

      Here is the take home message:

      We knew that Adam13 could regulate gene expression via its cytoplasmic domain. One of the key targets was identified as Calpain8, a protein critical for CNC migration. We subsequently showed that Adam13 and Arid3a regulated Tfap2a expression which in turn regulated Calpain8.

      In this manuscript we investigated 1) how Adam13 regulates TFAP2a and 2) how Tfap2a controls Calpain8 expression.

      The take home message is that Adam13 binds to histone methyltransferases and changes the histone methylation code overall in the CNC, and in particular at the TFAP2a promoter. This results in more open chromatin. We further find that Adam13 binds to the Tfap2a promoter in vivo and is important for Arid3a binding to the first start site. Tfap2a that includes this N-terminal sequence regulates Capn8 expression.

      *Major comments: 1. I think it would be better to split out the chromatin modification function from the splicing in two separate papers. While there is a connection, having it all together makes the story difficult to follow.*

      Agree, but I believe that the S1 vs S3 story of Tfap2a is important for the overall story. The new paper does not emphasize splicing.

      *2. The immunofluorescence of H3K9me2/3, in Figure 1, 2, 3 following Adam13 knockdown is not convincing. There seems to be a strong edge effect especially in Figure 2 and 3.*

      The statistical analysis shows that the results, while modest, are significant (three independent experiments using 3 different females and 3 explants for each condition were analyzed). The edge effect observed is eliminated by the mask that we use, which normalizes the expression to either DAPI or Snai2. The edge effect is also seen in both control and KD. These results are further confirmed by the ChIP-PCR on one direct target.
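
      The kind of masked normalization described in this response can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' actual quantification: the images are taken to be equal-sized 2D grids of pixel intensities, the mask marks the pixels to keep (for example, DAPI- or Snai2-positive nuclei), and the function name is hypothetical.

```python
def normalized_intensity(signal, reference, mask):
    """Ratio of summed signal to summed reference intensity inside a mask.

    signal, reference: 2D lists of pixel intensities (same dimensions).
    mask: 2D list of truthy/falsy values selecting the pixels to include.
    """
    total_signal = 0.0
    total_reference = 0.0
    for sig_row, ref_row, mask_row in zip(signal, reference, mask):
        for sig, ref, keep in zip(sig_row, ref_row, mask_row):
            if keep:
                total_signal += sig
                total_reference += ref
    if total_reference == 0:
        raise ValueError("mask selects no reference signal")
    return total_signal / total_reference
```

      Because both channels are summed over the same pixels, a uniform drop in detected signal toward the center or edge of an explant affects numerator and denominator alike, which is the rationale for using a ratio rather than raw intensity.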

      *Similarly the Arid3a expression in Supp Figure 1 if anything seems increased.*

      We have previously shown that Arid3a expression is not affected by Adam13 KD (Khedgikar et al). Our point here is simply that the difference in Tfap2a cannot be explained by a decrease in Arid3a expression. It is not a critical figure and was eliminated in the new manuscript.

      *It would be better to quantify by western blot and not by fluorescent intensity since it is difficult to determine what a small change in fluorescent intensity means in vivo. *

      Not all antibodies used here work by western blot and the quantity of material required for western blot is much larger than IF. Given the small overall changes and the variability observed in Western blot it is not a viable alternative.

      IF is a quantitative method that has been used widely to assess increases or decreases in protein level or post-translational modification. The fact that the same post-translational modification that we see in cranial neural crest explants can also be seen by ChIP-PCR on the Tfap2a promoter confirms this observation.

      *Also, it does not say in the text or the figure legend what these are, Xenopus explants of CNC? *

      These are CNC explants. It is now clearly stated in the figure legend.

      *3. The rationale for isolating KMT2A from the other chromatin modifiers in the dataset is not clear.*

      The new manuscript clarifies that point. Because we are using Hek293T cells in this assay, which are derived from human embryonic kidney rather than Xenopus cranial neural crest, we are not interested in a specific protein but rather in a family of proteins that can modify histones (KMT and KDM). Our rationale is that if Adam13 can bind to KMT2 via the SET domain, it is likely to interact with the KMT2 proteins that are expressed in the CNC. KMT2A and D are expressed in the CNC. This is why we selected KMT2a here (Hek293T). We now include 1 co-IP with the SET domain of Xenopus KMT2D (new figure 5D).

      *From the RNA-seq in Supp Figure 2 it is not changed as much as likely some of the others.*

      The new manuscript addresses this point. We did not show or expect that the loss of Adam13 would affect mRNA expression of Kmt2.

      *Also, the arrow seems to indicate that it is right above the cutoff. What about other proteins with ATPase activity? That is the top hit in the Dot plot nuclear function. Would be helpful to write out Adam13 cytoplasm/nucleus here. *

      We have used another set of proteomics data that does not include the cytoplasmic/nuclear extract to simplify the results. We hope that the changes make it more obvious.

      Given that we are looking at chromatin remodeling enzymes here, we chose not to investigate the ATPases further in this report. This is such a wide category that it could lead us away from the main story here.

      *4. The splicing information, while interesting would be better as a different manuscript. The sashimi plot requires more explanation as written.*

      We agree and think that a simple representation of the fold change of the different isoforms is more obvious. It is now a minor part of figure 1 and the legend has been improved to describe the method here.

      *How do you tell if the interactions are changed from this?*

      I do not understand this question. The sashimi plot indicates the mRNA reads that span from one exon to the next, quantifying the usage of each specific exon. Exon usage can therefore be quantified and compared between different conditions.
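
      One common way to turn the junction read counts drawn on a sashimi plot into a single comparable number is the "percent spliced in" (PSI) of an exon. The sketch below is an illustration of that general idea, not necessarily the quantification used in the manuscript; the function and argument names are hypothetical.

```python
def percent_spliced_in(upstream_junction, downstream_junction, skipping_junction):
    """Fraction of transcripts that include an exon, from junction read counts.

    upstream_junction / downstream_junction: reads spanning the two junctions
    that join the exon to its neighbours (inclusion evidence).
    skipping_junction: reads joining the neighbours directly (exclusion evidence).
    Returns None when there are no informative reads.
    """
    # Inclusion is supported by two junctions, so average them to keep
    # inclusion and exclusion evidence on the same scale.
    inclusion = (upstream_junction + downstream_junction) / 2.0
    total = inclusion + skipping_junction
    if total == 0:
        return None
    return inclusion / total
```

      Computing this value per replicate for control and Adam13 knockdown samples gives numbers that can be compared statistically between conditions.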

      *The authors argue there is a reduction of Tfap2a in Figure 3H but half the explant is not expressing sox9 in the Adam13 knockdown. How is this kind of experiment controlled when measuring areas that don't have any fluorescence because of the nature of the explants?*

      We have removed this figure, as we had already shown previously by western blot that Tfap2a protein decreased in MO13 embryos. As noted on the histogram, the fluorescence is only measured in Sox9-positive cells in each explant (three independent experiments with 3 explants each). We have also seen a decrease by western blot and in mRNA expression (both RNAseq and real-time PCR). In most of our explants, the vast majority of the cells are positive for Snai2 and Sox9, while those that are negative are positive for Sox3 (data not shown here). There is always less signal in the center of the explant, possibly due to the penetration of the antibody or interference with the signal by the cells' pigment or yolk autofluorescence. Our control explants show the same effect, so our quantification is valid.

      *5. The use of a germ line Xenopus mutant for Adam13 is great but how were these knockouts validated?*

      All of the KO were validated by sequencing, RNAseq and protein expression. These are now included in the supplemental figure 1.

      *More information is required here. The Chip-qPCR has a lot of variability between the samples, especially in the H3K9me2/3. *

      All ChIP-PCR experiments were performed on Xenopus embryos. The variability is accounted for by the statistical analysis, and each difference is reported as either significant or not.

      *Because these are in cell lines, this should be more consistent.*

      They are not in cell lines but in Xenopus embryos.

      *In addition, it is difficult to understand what this means for cranial neural crest cells when assaying in HEK293T cells with the luciferase assay.*

      We use luciferase assays in Hek293T cells to test whether Xenopus proteins can induce a specific reporter (gain of function). We also use luciferase reporters in Xenopus to test whether they respond to the loss of a specific protein (for example, Adam13).

      Our results show that Adam13 or Arid3a expression in Hek293T cells can induce the TFAP2S1 reporter.

      *6. The migration assay shows only an example of what it looks like to have defective migration. But it would be better to show control embryos, embryos with Adam13 knockdown and what the rescues look like so the reader can make their own conclusion.*

      We can certainly include this, but we have published this assay in multiple publications before. The picture is a single example; the histogram shows the statistical validation.

      *The argument from the section above suggests the S1 isoform is the primary one but S3 in this assay also rescues, please explain what this result means since it seems to suggest that even though these isoforms have different activity the function is similar in terms of the ability to rescue defective migration.*

      The result in Hek293T cells shows that only TFAP2aS1 can induce Calpain8, while both S1 and S3 can partially rescue CNC migration in embryos lacking Adam13. The issue here is the dose of mRNA injected for each variant might be too high. Adam13 proteolytic activity is also critical, so we do not expect a complete rescue. The fact that S1 is significantly better at rescuing than S3 is relevant here. It is possible that if we were to decrease the dose of each mRNA we would find one in which S3 no longer rescues but S1 does.

      * The next section again talks about yet another protein Calpain-8. Here the authors use MO13 for luciferase assays instead of HEK293 cells. The authors do not explain why they decided to switch from cells to MO.*

      Calpain8 is one of the validated targets of Adam13 that can rescue CNC migration (Cousin et al Dev Cell). We use the luciferase reporter corresponding to the Xenopus Capn8 promoter to show in vivo that loss of Adam13 reduces its expression (similar to the Capn8 gene). We then went in vitro, using Hek293T cells for gain-of-function experiments, which show that only the TFAP2aS1 variant can induce it while S3 does not.

      We hope that the graphical summary and the new manuscript make this clear.

      *8. The experiment to IP RNA supports only the correlation that Adam9 and Adam13 bind RNA and RNA binding proteins to regulate splicing. This conclusion presented is not supported by the data presented here. While there is a sentence about why Adam9 was chosen here, it would be preferred to focus on Adam13 as the rest of the manuscript is focused on Adam13. The conclusions are generalized to all ADAMs, but ADAM13 and ADAM9 are the only ADAMs investigated in the manuscript.*

      This figure is no longer included. For each of the protein classes that we identify by mass spec, we try to find a validation. RNA-IP is simply a validation that Adam13 and Adam9 can bind to complexes that include RNA in a cytoplasmic domain-dependent fashion. The conclusion that Adam13 and possibly ADAM9 might be involved in regulating splicing is based on three observations: 1) the proteins associated with Adam13 include multiple splicing factors, 2) the RNAseq analysis shows abnormal splicing in CNC missing Adam13, and 3) the form of TFAP2a induced by Adam13 (S1) associates significantly more with splicing factors than the S3 isoform.

      We agree that the generalization to other ADAMs is not demonstrated here but only suggested. We selected ADAM9 and ADAM19 because we have shown that they can each rescue Adam13 function in the CNC. Unfortunately, there is no ADAM19 antibody on the market that works by IP. We have tested antibodies from multiple companies in multiple cell lines.

      We believe that the ADAM9 experiment is critical to show that the proteins associated with Adam13 are not simply the result of overexpressing a protein from a different species, since ADAM9 is the endogenous protein.

      *Minor comments 1. The manuscript uses a lot of abbreviations (PCNS, NI, MO, SH3) and lingo that are unclear to a general reader. Please define acronyms when first used, as well as be clear on which model is being used in each experiment.*

      We have corrected this.

      *2. Similarly, the figures are not labeled such that a reader would be able to understand ie MO13 should be Adam13 knockdown etc.*

      We have corrected this in the legend

      *Please identify the genes on the heatmap and some highlighted genes from volcano plot from the RNA-seq.*

      The volcano plot is from MS/MS, not RNAseq. We provide lists of all of the genes and/or proteins corresponding to each figure in tables.

      We now have a figure from the RNAseq in which a subset of genes of interest is shown.

      *4. Why use the flag tag in Figure 5?*

      We used Flag-tagged constructs to immunoprecipitate only the variants and not the endogenous TFAP2a in these experiments. We also used RFP-Flag to eliminate any protein that bound to the tag or the antibody.

      This figure is no longer in the manuscript.

      *5. Is the data in figure 4A-D the same as Supp. Figure 4A-D?*

      These are independent biological replicates of the same experiment.

      *6. Please italicize gene symbols - e.g. "key transcription factors that exemplify CNC, such as the SOX9, FOXD3, SNAI1, SNAI2, and TFAP2 family".*

      We clearly have missed some; we use italics for genes and regular type for proteins. It might not be clear in the text when we are referring to genes and when to proteins. We will correct this in the rewrite.

      *7. Please review the manuscript for grammatical and typographical errors.*

      We have used all available software, including Word and Grammarly. We will try to improve on the next version.

      **Cross-commenting**

      *I think the two reviewers are on the same page on this manuscript.*

      Reviewer #2 (Significance (Required)):

      *If more solid, would be a conceptual advance in role of Adam13 in mediating chromatin modification and transcription factors, adds to existing work from this lab, good for a specialized audience, my expertise is in neural crest development, non-mammalian models, epigenetic regulators.*

      • *
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Panday et al. seek to determine the function of ADAM13 in regulating histone modifications, gene expression and splicing during cranial neural crest development. Specifically, the authors tested how Adam13, a metalloprotease, could modify chromatin by interaction with Arid3a and Tfap2a, and how it affects RNA splicing and gene expression. They then utilize knockouts in Xenopus and HEK293T cells followed by immunofluorescence, IPs, BioID, luciferase assays, mass spec and RNA assays. Although there is some strong data in the BioID and luciferase experiments, the manuscript tells multiple stories, linking together too many things to make a compelling story. The result is a paper that is very difficult to read and whose take-home message is hard to understand. In addition, some of the conclusions are not supported by the data. This unfortunately means it is not ready for publication. However, I have added below some suggestions that would strengthen the manuscript. My comments are below:

      Major comments:

      1. I think it would be better to split out the chromatin modification function from the splicing in two separate papers. While there is a connection, having it all together makes the story difficult to follow.
      2. The immunofluorescence of H3K9me2/3 in Figures 1, 2, and 3 following Adam13 knockdown is not convincing. There seems to be a strong edge effect, especially in Figures 2 and 3. Similarly, the Arid3a expression in Supp Figure 1, if anything, seems increased. It would be better to quantify by western blot and not by fluorescent intensity, since it is difficult to determine what a small change in fluorescent intensity means in vivo. Also, neither the text nor the figure legend says what these are: Xenopus explants of CNC?
      3. The rationale for isolating KMT2A from the other chromatin modifiers in the dataset is not clear. From the RNA-seq in Supp Figure 2, it is likely not changed as much as some of the others. Also, the arrow seems to indicate that it is right above the cutoff. What about other proteins with ATPase activity? That is the top hit in the dot plot of nuclear function. It would be helpful to write out Adam13 cytoplasm/nucleus here.
      4. The splicing information, while interesting, would be better as a different manuscript. The sashimi plot requires more explanation as written. How do you tell if the interactions are changed from this? The authors argue there is a reduction of Tfap2a in Figure 3H, but half the explant is not expressing sox9 in the Adam13 knockdown. How is this kind of experiment controlled when measuring areas that don't have any fluorescence because of the nature of the explants?
      5. The use of a germ line Xenopus mutant for Adam13 is great, but how were these knockouts validated? More information is required here. The ChIP-qPCR has a lot of variability between the samples, especially in the H3K9me2/3. Because these are in cell lines, this should be more consistent. In addition, it is difficult to understand what this means for cranial neural crest cells when assaying in HEK293T cells with the luciferase assay.
      6. The migration assay shows only an example of what it looks like to have defective migration. It would be better to show control embryos, embryos with Adam13 knockdown, and what the rescues look like, so the reader can draw their own conclusion. The argument from the section above suggests the S1 isoform is the primary one, but S3 also rescues in this assay. Please explain what this result means, since it seems to suggest that even though these isoforms have different activity, their function is similar in terms of the ability to rescue defective migration.
      7. The next section again talks about yet another protein, Calpain-8. Here the authors use MO13 for luciferase assays instead of HEK293 cells. The authors do not explain why they decided to switch from cells to MO.
      8. The experiment to IP RNA supports only the correlation that Adam9 and Adam13 bind RNA and RNA binding proteins to regulate splicing. The conclusion presented is not supported by the data shown here. While there is a sentence about why Adam9 was chosen here, it would be preferred to focus on Adam13, as the rest of the manuscript is focused on Adam13. The conclusions are generalized to all ADAMs, but ADAM13 and ADAM9 are the only ADAMs investigated in the manuscript.

      Minor comments

      1. The manuscript uses a lot of abbreviations (PCNS, NI, MO, SH3) and lingo that are unclear to a general reader. Please define acronyms when first used, and be clear about which model is being used in each experiment.
      2. Similarly, the figures are not labeled such that a reader would be able to understand them, i.e., MO13 should be Adam13 knockdown, etc.
      3. Please identify the genes on the heatmap and some highlighted genes from the volcano plot from the RNA-seq.
      4. Why use the flag tag in Figure 5?
      5. Is the data in figure 4A-D the same as Supp. Figure 4A-D?
      6. Please italicize gene symbols - e.g. "key transcription factors that exemplify CNC, such as the SOX9, FOXD3, SNAI1, SNAI2, and TFAP2 family".
      7. Please review the manuscript for grammatical and typographical errors.

      Cross-commenting

      I think the two reviewers are on the same page on this manuscript.

      Significance

      If more solid, this would be a conceptual advance in the role of Adam13 in mediating chromatin modification and transcription factors. It adds to existing work from this lab and is good for a specialized audience. My expertise is in neural crest development, non-mammalian models, and epigenetic regulators.

    1. The abundance of the peptide VISLSGEHSIIGR2 does not vary with run order after correction

      It looks like there might still be some oscillatory patterns as a function of run order in panel 2a. It could be useful to plot a spline for quick visual inspection. Also, I wonder if a window- rather than a point-based correction might better remove this confound?

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors describe a new computational method (SegPore), which segments the raw signal from nanopore direct RNA-Seq data to improve the identification of RNA modifications. In addition to signal segmentation, SegPore includes a Gaussian Mixture Model approach to differentiate modified and unmodified bases. SegPore uses Nanopolish to define a first segmentation, which is then refined into base and transition blocks. SegPore also includes a modification prediction model that is included in the output. The authors evaluate the segmentation in comparison to Nanopolish and Tombo (RNA002) as well as f5c and Uncalled 4 (RNA004), and they evaluate the impact on m6A RNA modification detection using data with known m6A sites. In comparison to existing methods, SegPore appears to improve the ability to detect m6A, suggesting that this approach could be used to improve the analysis of direct RNA-Seq data.

      Strengths:

      SegPore addresses an important problem (signal data segmentation). By refining the signal into transition and base blocks, noise appears to be reduced, leading to improved m6A identification at the site level as well as for single read predictions. The authors provide a fully documented implementation, including a GPU version that reduces run time. The authors provide a detailed methods description, and the approach to refine segments appears to be new.

      Weaknesses:

      The authors show that SegPore reduces noise compared to other methods; however, the improvement in accuracy appears to be relatively small for the task of identifying m6A. To run SegPore, the GPU version is essential, which could limit the application of this method in practice.

      As discussed in Paragraph 4 of the Discussion, we acknowledge that the improvement of SegPore combined with m6Anet over Nanopolish+m6Anet in bulk in vivo analysis is modest. This outcome is likely influenced by several factors, including alignment inaccuracies caused by pseudogenes or transcript isoforms, the presence of additional RNA modifications that can affect signal baselines, and the fact that m6Anet is specifically trained on Nanopolish-derived events. Additionally, the absence of a modification-free (in vitro transcribed) control sample in the benchmark dataset makes it challenging to establish true k-mer baselines.

      Importantly, these challenges do not exist for in vitro data, where the signal is cleaner and better defined. As a result, SegPore achieves a clear and substantial improvement at the single-molecule level, demonstrating the strength of its segmentation approach and its potential to significantly enhance downstream analyses. These results indicate that SegPore is particularly well suited for benchmarking and mechanistic studies of RNA modifications under controlled experimental conditions, and they provide a strong foundation for future developments.

      We also recognize that the current requirement for GPU acceleration may limit accessibility in some computational environments. To address this, we plan to further optimize SegPore in future versions to support efficient CPU-only execution, thereby broadening its applicability and impact.

      Reviewer #2 (Public review):

      Summary:

      The work seeks to improve detection of RNA m6A modifications using Nanopore sequencing through improvements in raw data analysis. These improvements are said to be in the segmentation of the raw data, although the work appears to position the alignment of raw data to the reference sequence and some further processing as part of the segmentation, and result statistics are mostly shown on the 'data-assigned-to-kmer' level.

      As such, the title, abstract and introduction stating the improvement of just the 'segmentation' does not seem to match the work the manuscript actually presents, as the wording seems a bit too limited for the work involved.

      The work itself shows minor improvements in m6Anet when replacing Nanopolish' eventalign with this new approach, but clear improvements in the distributions of data assigned per kmer. However, these assignments were improved well enough to enable m6A calling from them directly, both at site-level and at read-level.

      A large part of the improvements shown appear to stem from the addition of extra, non-base/kmer specific, states in the segmentation/assignment of the raw data, removing a significant portion of what can be considered technical noise for further analysis. Previous methods enforced assignment of (almost) all raw data, forcing a technically optimal alignment that may lead to suboptimal results in downstream processing as datapoints could be assigned to neighbouring kmers instead, while random noise that is assigned to the correct kmer may also lead to errors in modification detection.

      For an optimal alignment between the raw signal and the reference sequence, this approach may yield improvements for downstream processing using other tools.

      Additionally, the GMM used for calling the m6A modifications provides a useful, simple and understandable logic to explain the reason a modification was called, as opposed to the black-box models that are nowadays often employed for these types of tasks.

      Weaknesses:

      The manuscript suggests the eventalign results are improved compared to Nanopolish. While this is believably shown to be true (Table 1), the effect on the use case presented, downstream differentiation between modified and unmodified status on a base/kmer, is likely limited, because during downstream modification calling the noisy distributions are often 'good enough'. For example, Nanopolish uses the main segmentation+alignment for a first alignment and follows up with a form of targeted local realignment/HMM test for modification calling (and for training too), decreasing the need for the near-perfect segmentation+alignment this work attempts to provide. Any tool applying a similar strategy probably largely negates the problems this manuscript aims to improve upon. Should a use case come up where this downstream optimisation is not an option, SegPore might provide the necessary improvements in raw data alignment.

      Thank you for this thoughtful comment. We agree that many current state-of-the-art (SOTA) methods perform well on benchmark datasets, but we believe there is still substantial room for improvement. Most existing benchmarks are based on limited datasets, primarily focusing on DRACH motifs in human and mouse transcriptomes. However, m6A modifications can also occur in non-DRACH motifs, where current models tend to underperform. Furthermore, other RNA modifications, such as pseudouridine, inosine, and m5C, remain less studied, and their detection is likely to benefit from more accurate and informative signal modeling.

      It is also important to emphasize that raw signal segmentation and RNA modification detection are fundamentally distinct tasks. SegPore focuses on improving the segmentation step by producing a cleaner and more interpretable signal, which provides a stronger foundation for downstream analyses. Even if RNA modification detection algorithms such as m6Anet can partially compensate for noisy segmentation in specific cases, starting from a more accurate signal alignment can still lead to improved accuracy, robustness, and interpretability—particularly in challenging scenarios such as non-canonical motifs or less characterized modifications.

      Scientific progress in this field is often incremental, and foundational improvements can have a significant long-term impact. By enhancing raw signal segmentation, SegPore contributes an essential building block that we expect will enable the development of more accurate and generalizable RNA modification detection algorithms as the community integrates it into more advanced workflows.

      Appraisal:

      The authors have shown their method's ability to identify noise in the raw signal and remove it from the segmentation and alignment, reducing its influence on further analyses. Figures directly comparing the values per kmer do show a visibly improved assignment of raw data per kmer. As a replacement for Nanopolish' eventalign, it seems to have a rather limited, but improved, effect on m6Anet results. For modification calling at the single-read level, this work does appear to improve upon CHEUI.

      Impact:

      With the current developments for Nanopore-based modification calling largely focusing on Artificial Intelligence, Neural Networks and the like, improvements made in interpretable approaches provide an important alternative that enables deeper understanding of the data, rather than providing a tool that plainly answers the question of whether a base is modified or not, without further explanation. The work presented is best viewed in the context of a workflow where one aims to get an optimal alignment between raw signal data and the reference base sequence for further processing, for example, as presented, as a possible replacement for Nanopolish' eventalign. Here it might enable data exploration and downstream modification calling without the need for local realignments or other approaches that re-consider the distribution of raw data around the target motif, such as a 'local' Hidden Markov Model or Neural Networks. These possibilities are useful for a deeper understanding of the data and for further tool development for modification detection work beyond m6A calling.

      Reviewer #3 (Public review):

      Summary:

      Nucleotide modifications are important regulators of biological function, however, until recently, their study has been limited by the availability of appropriate analytical methods. Oxford Nanopore direct RNA sequencing preserves nucleotide modifications, permitting their study, however many different nucleotide modifications lack an available base-caller to accurately identify them. Furthermore, existing tools are computationally intensive, and their results can be difficult to interpret.

      Cheng et al. present SegPore, a method designed to improve the segmentation of direct RNA sequencing data and boost the accuracy of modified base detection.

      Strengths:

      This method is well described and has been benchmarked against a range of publicly available base callers that have been designed to detect modified nucleotides.

      Weaknesses:

      However, the manuscript has a significant drawback in its current version. The most recent nanopore RNA base callers can distinguish between different ribonucleotide modifications, however, SegPore has not been benchmarked against these models.

      The manuscript would be strengthened by benchmarking against the rna004_130bps_hac@v5.1.0 and rna004_130bps_sup@v5.1.0 dorado models, which are reported to detect m5C, m6A_DRACH, inosine_m6A and PseU.

      A clear demonstration that SegPore also outperforms the newer RNA base caller models will confirm the utility of this method.

      Thank you for highlighting this important limitation. While Dorado, the new ONT basecaller, is publicly available and supports modification-aware basecalling, suitable public datasets for benchmarking m5C, inosine, m6A, and PseU detection on RNA004 are currently lacking. Dorado’s modification-aware models are trained on ONT’s internal data, which is not publicly released. Therefore, it is currently not feasible to directly evaluate or compare SegPore’s performance against Dorado for these RNA modifications.

      We would also like to emphasize that SegPore’s primary contribution lies in raw signal segmentation, which is an upstream and foundational step in the RNA modification detection pipeline. As more publicly available datasets for RNA004 modification detection become accessible, we plan to extend our work to benchmark and integrate SegPore with modification detection tasks on RNA004 data in future studies.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      Comments based on Author Response

      “However, it is valid to compare them on the segmentation task, where SegPore exhibits better performance (Table 1).”

      This dodges the point of the actual use case of this approach, as Nanopolish indeed does not support calling modifications for this kind of data, but the general approach it uses might, if adapted for this data, nullify the gains made in the examples presented.

      We respectfully disagree with the comment that the advantages demonstrated by SegPore could be “nullified”. Although SegPore’s performance is indeed more modest in in vivo datasets, it shows substantially better performance than CHEUI in in vitro data, clearly demonstrating that improved segmentation directly contributes to more accurate RNA modification estimation.

      It is worth noting that CHEUI relies on Nanopolish’s segmentation results for m6A detection. Despite this, SegPore outperforms CHEUI, further supporting the conclusion that segmentation quality has a meaningful impact on downstream modification calling.

      In conclusion, based on our current experimental results, SegPore is particularly well suited for RNA modification analysis from in vitro transcribed data, where its improved segmentation provides a clear advantage over existing methods.

      Further comments

      (2) “(2) Page 3  employ models like Hidden Markov Models (HMM) to segment the signal, but they are prone to noise and inaccuracies”

      “That's the alignment/calling part, not the segmentation?”

      “Current methods, such as Nanopolish, employ models like Hidden Markov Models (HMM) to segment the signal”

      I get the impression the word 'segment' has a different meaning in this work than what I'm used to based on my knowledge around Nanopolish and Tombo, see the deeper code examples further down below.

      Additionally, in Nanopolish there is a clear segmentation step (or event detection) without any HMM, then a sort of dynamic time warping step that aligns the segments and afterwards re-combines some segments into a single segment where necessary. I believe the HMM in Nanopolish is not used at all except for modification calling, but if you can point out otherwise I'm open to proof.

      Now I believe it is the meaning of 'segmenting the signal' that confuses me, and now the clarification makes it a bit odd as well:

      “Nanopolish and Tombo align the raw signal to the reference sequence to determine which portion of the signal corresponds to each k-mer. We define this process as the segmentation task, referred to as "eventalign" in Nanopolish.”

      So now it's clearly stated the raw signal is being 'aligned' and then the process is suddenly defined as the 'segmentation task', and again referred to as "eventalign". Why is it not referred to as the 'alignment task' instead?

      I understand the segmentation and alignment parts are closely connected but to me, it seems this work picks the wrong word for the problem being solved.

      “Unlike Nanopolish and Tombo, which directly align the raw signal to the reference sequence,…”

      Looking at their code, I believe both Nanopolish and Tombo actually do segment the data first (or "event detection"), then they align the segments/events they found, and finally multiple events aligned to the same section are merged. See for yourself:

      Nanopolish:

      https://github.com/jts/nanopolish/blob/master/src/nanopolish_squiggle_read.cpp

      Line 233:

      ```cpp
      trim_and_segment_raw(fast5_data.rt, trim_start, trim_end, varseg_chunk, varseg_thresh);
      event_table et = detect_events(fast5_data.rt, *ed_params);
      ```

      Line 270:

      ```cpp
      // align events to the basecalled read
      std::vector<AlignedPair> event_alignment = adaptive_banded_simple_event_align(*this, *this->base_model[strand_idx], read_sequence);
      ```

      Where event detection is further defined at line 268 here:

      https://github.com/jts/nanopolish/blob/master/src/thirdparty/scrappie/event_detection.c

      Tombo:

      https://github.com/nanoporetech/tombo/blob/master/tombo/resquiggle.py

      line 1162 and onwards shows a ‘segment_signal’ call and the results are used in a ‘find_adaptive_base_assignment’ call, where ‘segment_signal’ starting at line 1057 tries to find where the signal jumps from a series of similar values to another (start of a base change in the pore), stored in ‘valid_cpts’, and the ‘find_adaptive_base_assignment’ tries to align the resulting segment values to the expected series of values:

      ```python
      valid_cpts, norm_signal, new_scale_values = segment_signal(
          map_res, num_events, rsqgl_params, outlier_thresh, const_scale)
      event_means = ts.compute_base_means(norm_signal, valid_cpts)
      dp_res = find_adaptive_base_assignment(
          valid_cpts, event_means, rsqgl_params, std_ref, map_res.genome_seq,
          start_clip_bases=map_res.start_clip_bases,
          seq_samp_type=seq_samp_type, reg_id=map_res.align_info.ID)
      ```

      These implementations are also why I find the choice of words for what is segmentation and what is alignment a bit confusing in this work, as both Tombo and Nanopolish do a similar, clear segmentation step (or an "event detection" step), followed by the alignment of the segments they determined. The terminology in this work appears to deviate from these.
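To make the distinction between the two steps concrete, here is a minimal Python sketch of "segment first, then align". It is not the Nanopolish or Tombo code. The running-mean jump detector, the monotone dynamic-programming aligner, and all names and thresholds (`detect_events`, `align_events`, `jump`, `min_len`, the toy signal and expected levels) are simplified assumptions chosen only for illustration.

```python
from statistics import fmean

def detect_events(signal, jump=5.0, min_len=3):
    """Segmentation step: cut the signal where a sample departs from the
    running mean of the current segment by more than `jump`."""
    if not signal:
        return []
    boundaries = [0]
    seg_start = 0
    for i in range(1, len(signal)):
        seg_mean = fmean(signal[seg_start:i])
        if i - seg_start >= min_len and abs(signal[i] - seg_mean) > jump:
            boundaries.append(i)
            seg_start = i
    boundaries.append(len(signal))
    # Each event is (start, end, mean current level).
    return [(s, e, fmean(signal[s:e])) for s, e in zip(boundaries, boundaries[1:])]

def align_events(event_means, expected_levels):
    """Alignment step: assign every event to one reference position.
    The position never decreases, advances by at most one per event,
    and every position receives at least one event."""
    n, m = len(event_means), len(expected_levels)
    if m == 0 or n < m:
        raise ValueError("need at least one event per reference position")
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    back = [[0] * m for _ in range(n)]
    cost[0][0] = abs(event_means[0] - expected_levels[0])
    for i in range(1, n):
        for j in range(m):
            stay = cost[i - 1][j]                       # same position as previous event
            step = cost[i - 1][j - 1] if j > 0 else inf  # move to the next position
            best, prev = (stay, j) if stay <= step else (step, j - 1)
            if best < inf:
                cost[i][j] = best + abs(event_means[i] - expected_levels[j])
                back[i][j] = prev
    path = [m - 1]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]

# Toy signal: three reference positions, the middle one drifting mid-way.
signal = [100.0, 101.0, 99.0, 100.0, 80.0, 81.0, 79.0,
          87.0, 88.0, 86.0, 120.0, 119.0, 121.0]
expected_levels = [100.0, 82.0, 120.0]  # hypothetical model means per k-mer

events = detect_events(signal)
assignment = align_events([mean for _, _, mean in events], expected_levels)
```

In this toy input the middle k-mer is over-segmented into two events, and the alignment step assigns both to the same reference position, which corresponds to the merging of events described above.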

      We thank the reviewer for the detailed comments!

      First of all, we sincerely apologize for our earlier misunderstanding regarding how Nanopolish and Tombo operate. Based on a closer examination of their source code, we now recognize that both tools indeed include a segmentation step based on change-point detection methods, after which the resulting segments are aligned to the reference sequence. We have revised the relevant text in the manuscript accordingly:

      - “Current methods, such as Nanopolish, employ change-point detection methods to segment the signal and use dynamic programming methods and HMM to align the derived segments to the reference sequence,”

      - “We define this process as the segmentation and alignment task (abbreviated as the segmentation task), which is referred to as “eventalign” in Nanopolish.”

      - “In SegPore, we segment the raw signal into small fragments using a Hierarchical Hidden Markov Model (HHMM) and align the mean values of these fragments to the reference, where each fragment corresponds to a sub-state of a k-mer. By contrast, Nanopolish and Tombo use change-point–based methods to segment the signal and employ dynamic programming approaches together with profile HMMs to align the resulting segments to the reference sequence.”

      Regarding terminology, we originally borrowed the term “segmentation” from speech processing, where it refers to dividing continuous audio signals into meaningful units. In the context of nanopore signal analysis, segmentation and alignment are often tightly coupled steps. Because of this and because our initial focus was on methodological development rather than terminology, we used the term “segmentation task” to describe the combined process of signal segmentation and alignment.

      However, we now recognize that this terminology may cause confusion. Changing every instance of “segmentation” to “segmentation and alignment” or “alignment” would require substantial rewriting of the manuscript. Therefore, in this revision, we have clearly defined “segmentation task” as referring to the combined process of segmentation and alignment. We apologize for any earlier confusion and will adopt the term “alignment” in future work for greater clarity.

      (3) I think I do understand the meaning, but I do not understand the relevance of the Aj bit in the last sentence. What is it used for?

      Based on the response and another close look at Fig1, it turns out the j refers to the extremely small numbers 1 and 2 in step 3. You may want to improve readability for these.

      Thank you for the suggestion. We have added subscripts to all nucleotides in the reference sequence in Figure 1A and revised the legend to clarify the notation and improve readability. Specifically, we now include the following explanation:

      “For example, Aⱼ denotes the base ‘A’ at the j-th position on the reference sequence. In this example, A₁ and A₂ refer to the first and second occurrences of ‘A’ in the reference sequence, respectively. Accordingly, μ₁ and μ₂ are aligned to A₁, while μ₃ is aligned to A₂”.

      (6) “We chose to use the poly(A) tail for normalization because it is sequence-invariant- i.e., all poly(A) tails consist of identical k-mers, unlike transcript sequences which vary in composition. In contrast, using the transcript region for normalization can introduce biases: for instance, reads with more diverse k-mers (having inherently broader signal distributions) would be forced to match the variance of reads with more uniform k-mers, potentially distorting the baseline across k-mers.”

      While the next part states there was a benchmark showing SegPore still works without this normalization, I think this answer does not touch upon the underlying issue I'm trying to point out here.

      - The biases mentioned here, due to a more diverse (or different) subset of k-mers in a read, indeed affect the variance of the signal overall.

      - As I pointed out in my earlier remark here, this can be resolved using an approach of 'general normalization', 'mapping to expected signal', 'theil-sen fitting of scale and offset', 're-mapping to expected signal', as Tombo and Nanopolish have implemented.

      - Alternatively, one could use the reference sequence (using the read mapping information) and base the expected signal mean and standard deviation on that instead.

      - The polyA tail stability as an indicator for the variation in the rest of the signal seems a questionable assumption to me. A 'noisy' pore could introduce a large standard deviation using the polyA tail without increasing the deviations on the signal induced by the variety of k-mers; rather, it would be representative of the deviations measured within a single k-mer segment. I thought this possible discrepancy is to be expected from a worn-out pore, hence I'd imagine reads sequenced later in a run to provide worse results using this method.

      In the current version it is not the statement that is unclear, it is the underlying assumption of how this works that I question.
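For readers unfamiliar with the 'theil-sen fitting of scale and offset' step mentioned in the list above, the following is a minimal Python sketch of a generic Theil-Sen estimator. It is not the Tombo or Nanopolish implementation. The function names (`theil_sen_fit`, `normalize`) and the toy levels are assumptions made for illustration, and the sketch presumes that observed event means have already been paired with expected model levels by a first-pass alignment.

```python
from itertools import combinations
from statistics import median

def theil_sen_fit(expected, observed):
    """Robustly fit observed ~= scale * expected + offset.
    scale is the median of all pairwise slopes; offset is the median residual."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(expected, observed), 2)
        if x2 != x1
    ]
    if not slopes:
        raise ValueError("need at least two distinct expected levels")
    scale = median(slopes)
    offset = median(y - scale * x for x, y in zip(expected, observed))
    return scale, offset

def normalize(observed, scale, offset):
    """Map observed levels back onto the expected-level axis."""
    return [(y - offset) / scale for y in observed]

# Toy data: the read is scaled by 2 and shifted by 5, with one outlier event
# (for example a modified k-mer or a mis-assigned event).
expected = [80.0, 95.0, 110.0, 125.0, 140.0]
observed = [2.0 * x + 5.0 for x in expected]
observed[2] = 400.0

scale, offset = theil_sen_fit(expected, observed)
normalized = normalize(observed, scale, offset)
```

Because medians are used, the single outlier does not move the fitted scale and offset, which is why this kind of fit tolerates some modified or mis-assigned events. After normalization the fit is usually followed by a re-mapping step, as the list above notes.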

      We thank the reviewer for raising this important point and for the insightful discussion. Our choice of using the poly(A) tail for normalization is based on the working hypothesis that the poly(A) signal reflects overall pore-level variability and provides a stable reference for signal scaling. We find this to be a practical and effective approach in most experimental settings.

      We agree that more sophisticated strategies, such as “general normalization” or iterative fitting to the expected signal (as implemented in Tombo and Nanopolish), could in principle generate a "better" normalization. However, these approaches are significantly more challenging to implement in practice. This is because signal normalization and alignment are mutually dependent processes: baseline estimates for k-mers influence alignment accuracy, while alignment accuracy, in turn, affects baseline calculation. This interdependence becomes even more complex in the presence of RNA modifications, which alter signal distributions and further confound model fitting.

      It is worth noting that this limitation is already evident in our results. As shown in Figure 4B (first and second k-mers), Nanopolish produces more dispersed baselines than SegPore, even for these unmodified k-mers, suggesting inherent limitations in its normalization strategy. Ideally, baselines for the same k-mer should remain highly consistent across different reads.

      In contrast, poly(A)-based normalization offers a simpler and more robust solution that avoids this circular dependency. Because poly(A) sequences are compositionally homogeneous, they enable reliable estimation of scaling parameters without assumptions about k-mer composition or modification state. Regarding the reviewer’s concern about pore instability, we mitigate this issue by including only high-quality, confidently mapped reads in our analysis, which reduces the likelihood of incorporating signals from degraded or “noisy” pores.

      We fully agree that exploring more advanced normalization strategies is an important direction for future work, and we plan to investigate such approaches as the field progresses.
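As a rough illustration of what poly(A)-based scaling involves, here is a minimal Python sketch. The exact formula used by SegPore is not given in this response, so the standardization below, the function name `polya_normalize`, the reference values `ref_mean` and `ref_std`, and the toy read are all assumptions for illustration only.

```python
from statistics import fmean, pstdev

def polya_normalize(signal, polya_start, polya_end, ref_mean=0.0, ref_std=1.0):
    """Rescale a whole read so that its poly(A) segment has mean `ref_mean`
    and standard deviation `ref_std` (hypothetical reference values)."""
    polya = signal[polya_start:polya_end]
    if len(polya) < 2:
        raise ValueError("poly(A) segment needs at least two samples")
    mu = fmean(polya)
    sigma = pstdev(polya)
    if sigma == 0.0:
        raise ValueError("poly(A) segment has zero variance")
    return [(x - mu) / sigma * ref_std + ref_mean for x in signal]

# Toy read: four poly(A) samples followed by two transcript samples.
read = [108.0, 112.0, 108.0, 112.0, 130.0, 90.0]
normalized = polya_normalize(read, polya_start=0, polya_end=4)
```

In this form the scaling depends only on the poly(A) samples, so no alignment to k-mers is needed. It also means that a noisier poly(A) segment (larger `sigma`) compresses every transcript value, which is the dependency the reviewer's comment questions.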

      (8) “In the remainder of this paper, we refer to these resulting events as the output of eventalign analysis or the segmentation task.”

      Picking only one descriptor rather than two alternatives would be easier to follow (and I'd prefer the first).

      Thank you for the suggestion. We have revised the sentence to:

      “In the remainder of this paper, we refer to these resulting events as the output of eventalign analysis, which also represents the final output of the segmentation and alignment task.”

      (9) “Additionally, a complete explanation of how the weighted mean is computed is provided in Section 5.3 of Supplementary Note 1. It is derived from signal points that are assigned to a given 5mer.”

      I believe there's no more mention of a weighted mean, and I don't get any hits when searching for 'weight'. Is that intentional?

      We apologize for the misplacement of the formulas. We have updated Section 5.3 of Supplementary Note 1 to clarify the definition of the weighted mean. Because multiple current signal segments may be aligned to a single k-mer, we computed the weighted mean for each k-mer across these segments, where the weight corresponds to the number of data points assigned to “curr” state in each event.
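
      The weighted mean described in this response can be illustrated with a minimal sketch. It assumes each event aligned to a k-mer is summarized by its mean current and the number of data points assigned to the “curr” state; the function name and the tuple layout are hypothetical and are not taken from SegPore's code.

```python
def weighted_kmer_mean(events):
    """Weighted mean current for one k-mer.

    events: list of (event_mean, n_curr_points) tuples (hypothetical layout),
    where n_curr_points is the number of points in the "curr" state.
    """
    total_points = sum(n for _, n in events)
    if total_points == 0:
        raise ValueError("no data points assigned to this k-mer")
    # Each event mean is weighted by its number of data points.
    return sum(mean * n for mean, n in events) / total_points


# Two events aligned to the same k-mer: 10 points at 100 pA, 30 points at 104 pA.
print(weighted_kmer_mean([(100.0, 10), (104.0, 30)]))  # 103.0
```

      Weighting by point count means that a long event contributes more to the k-mer estimate than a short one.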

      (17) Response: We revised the sentence to clarify the selection criteria: “For selected 5mers that exhibit both a clearly unmodified and a clearly modified signal component, SegPore reports the modification rate at each site, as well as the modification state of that site on individual reads.”

      So is this the same set described on page 13 ln 343 or not?

      “Due to the differences between human (Supplementary Fig. S2A) and mouse (Supplementary Fig. S2B), only six 5mers were found to have m6A annotations in the test data's ground truth (Supplementary Fig. S2C). For a genomic location to be identified as a true m6A modification site, it had to correspond to one of these six common 5mers and have a read coverage of greater than 20.”

      I struggle to interpret the 'For selected 5mers' part, as I'm not sure if this is a selection I'm supposed to already know at this point in the text or if it's a set just introduced here. If the latter, removing the word 'selected' would clear it up for me.

      We apologize for the confusion. What we mean is that when pooling signals aligned to the same k-mer across different genomic locations and reads, only a subset of k-mers exhibit a bimodal distribution — one peak corresponding to the unmodified state and another to the modified state. Other k-mers show a unimodal distribution, making it impossible to reliably estimate modification levels. We refer to the subset of k-mers that display a bimodal distribution as the “selected” k-mers.

      The “selected k-mers” described on page 13, line 343, must additionally have ground truth labels available in both the training and test datasets. There are 10 k-mers with ground truth annotations in the training data and 11 in the test data, and only 6 of these k-mers are shared between the two datasets, therefore only those 6 overlapping k-mers are retained for evaluation. These 6 k-mers satisfy both criteria: (1) exhibiting a bimodal distribution and (2) having ground truth annotations in both training and test sets.

      To improve clarity, we have removed the term “selected” from the sentence.

      (21) “Tombo used the "resquiggle" method to segment the raw signals, and we standardized the segments using the poly(A) tail to ensure a fair comparison (See preprocessing section in Materials and Methods).”

      In the Materials and Methods:

      “The raw signal segment corresponding to the poly(A) tail is used to standardize the raw signal for each read.”

      I cannot find more detailed information here on what the standardization does, do you mean to refer to Supplementary Note 1, Section 3 perhaps?

      Thank you for pointing this out. Yes, the standardization procedure is described in detail in Supplementary Note 1, Section 3. Tombo itself does not segment and align the raw signal on the absolute pA scale, which can result in very large variance in the derived events if the raw signal is used directly. To ensure a fair comparison, we therefore applied the same preprocessing steps to Tombo’s raw signals as we did for SegPore, using only the event boundary information from Tombo while standardizing the signal in the same way.

      We have revised the sentence for clarity as follows:

      “Tombo used the "resquiggle" method to segment the raw signals, but the resulting signals are not reported on the absolute pA scale. To ensure a fair comparison with SegPore, we standardized the segments using the poly(A) tail in the same way as SegPore (See preprocessing section in Materials and Methods).”

      (22A) The table shown does help show that the benchmark is unlikely to be 'cheated'. However, I am surprised to see the Avg std for Nanopolish and Tombo going up instead of down, as I'd expect the transition values to increase the std, and hence, removing them should decrease these values. So why does this table show the opposite?

      I believe this table is not in the main text or the supplement, would it not be a good idea to cover this point somewhere in the work?

      Thank you for this insightful comment. In response, we carefully re-examined our analysis and identified a bug in the code related to boundary removal for Nanopolish. We have now corrected this issue and included the updated results in Supplementary Table S1 of the revised manuscript. As shown in the updated table, the average standard deviations decrease after removing the boundary regions for both Nanopolish and Tombo.

      We have also added the following clarification to the revised manuscript:

      “It is worth noting that the data points corresponding to the transition state between two consecutive 5-mers are not included in the calculation of the standard deviation in SegPore’s results in Table 1. However, their exclusion does not affect the overall conclusion, as there are on average only ~6 points per 5-mer in the transition state (see Supplementary Table S1 for more details).”

      (22B) As mentioned in 2), I'm happy there's a clear definition of what is meant but I found the chosen word a bit odd.

      We apologize for the earlier unclear terminology. We now refer to it as the segmentation and alignment task, abbreviated as the segmentation task.

      (23) Reading back I can gather that from the text earlier, but the summation of what is being tested is this:

      “including Tombo, MINES (31), Nanom6A (32), m6Anet, Epinano (33), and CHEUI (20).”

      Next, the identifier "Nanopolish+m6Anet" is, aside from the figure itself, only mentioned in the discussion. Adding a line that explains that "Nanopolish+m6Anet" is the default method of running m6Anet and "SegPore+m6Anet" replaces the Nanopolish part for m6Anet with SegPore, rather than jumping straight to "SegPore+m6Anet", would clarify where this identifier came from.

      Thank you for the helpful suggestion. We have added the identifier to the revised manuscript as follows:

      “Given their comparable methodologies and input data requirements, we benchmarked SegPore against several baseline tools, including Tombo, MINES (31), Nanom6A (32), m6Anet, Epinano (33), and CHEUI (20). By default, MINES and Nanom6A use eventalign results generated by Tombo, while m6Anet, Epinano, and CHEUI rely on eventalign results produced by Nanopolish. In Fig. 3C, ‘Nanopolish+m6Anet’ refers to the default m6Anet pipeline, whereas ‘SegPore+m6Anet’ denotes a configuration in which Nanopolish’s eventalign results are replaced with those from SegPore.”

      (24) For completeness I'd expect tickmarks and values on the y-axis as well.

      Thank you for the suggestion. We have updated Figures 3A and 3B in the revised manuscript to include tick marks and values on the y-axis as requested.

      (25) Considering this statement and looking back at figure 3a and 3b, wouldn't this be easier to observe if the histograms/KDE's were plotted with overlap in a single figure?

      We appreciate the suggestion. However, we believe that overlaying Figures 3A and 3B into a single panel would make the visualization cluttered and more difficult to interpret.

      (29) Please change the sentence in the text to make that clear. As it is written now (while it's the same number of motifs, so one might guess it) it does not seem to refer to that particular set of motifs and could be a new selection of 6 motifs.

      We appreciate the suggestion and have revised the sentence for clarity as follows:

      “We evaluated m6A predictions using two approaches: (1) SegPore’s segmentation results were fed into m6Anet, referred to as SegPore+m6Anet, which works for all DRACH motifs and (2) direct m6A predictions from SegPore’s Gaussian Mixture Model (GMM), which is limited to the six selected 5-mers shown in Supplementary Fig. S2C that exhibit clearly separable modified and unmodified components in the GMM (see Materials and Methods for details). ”

      (31) I think we have a different interpretation of the word 'leverage', or perhaps what it applies to. I'd say it leverages the jiggling if there's new information drawn from the jiggling behaviour. It's taking it into account if it filters for it. The HHMM as far as I understand tries to identify the jiggles, and ignore their values for the segmentation etc. So while one might see this as an approach that "leverages the hypothesis", I don't see how this HHMM "leverages the jiggling property" itself.

      Thank you for the helpful suggestion. We have replaced the word “leverages” with “models” in the revised manuscript.

      New points

      pg6ln166: “…we extract the aligned raw signal segment and reference sequence segment from Nanopolish's events [...] we extract the raw signal segment corresponding to the transcript region for each input read based on Nanopolish's poly(A) detection results.”

      It is not clear as to why this different approach is applied for these two cases in this part of the text.

      Thank you for pointing this out. The two approaches refer to different preprocessing strategies for in vivo and in vitro data.

      For in vivo data, a large proportion of reads do not span the full-length transcript and often map only to a portion of the reference sequence. Moreover, because a single gene can generate multiple transcript isoforms, a read may align equally well to several possible transcripts. Therefore, we extract only the raw signal segment that corresponds to the mapped portion of the transcript for each read.

      In contrast, for in vitro data, the transcript sequence is known precisely. As a result, we can directly extract all raw signals following the poly(A) tail and align them to the complete reference sequence.

      pg10ln259: “An important distinction from classical global alignment algorithms is that one or multiple base blocks may align with a single 5mer.”

      If there was usually a 1:1 mapping the alignment algorithm would be more or less a direct match, so I think the multiple blocks aligning to a 5mer thing is actually quite common.

      Thank you for the comment. The “classical global alignment algorithm” here refers to the Needleman–Wunsch algorithm used for sequence alignment. Our intention was to highlight the conceptual difference between traditional sequence alignment and nanopore signal alignment. In classical sequence alignment, each base typically aligns to a single position in the reference. In contrast, in nanopore signal alignment, one or multiple signal segments — corresponding to varying dwell times of the motor protein — can align to a single 5-mer.

      We have revised the sentence as follows:

      “An important distinction from classical global alignment algorithms (Needleman–Wunsch algorithm)……”
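
      The many-to-one property discussed here can be illustrated with a small dynamic-programming sketch. This is a simplified stand-in, not SegPore's alignment algorithm: it assumes each signal segment is summarized by its mean current, that every 5-mer receives at least one segment, and that the cost is the absolute difference from a reference level. All names are hypothetical.

```python
import math


def align_segments(segment_means, kmer_means):
    """Assign each segment to a k-mer index, monotonically, many-to-one.

    Returns (path, total_cost), where path[i] is the k-mer index of segment i.
    """
    n, m = len(segment_means), len(kmer_means)
    if m == 0 or n < m:
        raise ValueError("need at least one segment per k-mer")
    cost = [[math.inf] * m for _ in range(n)]
    stay = [[False] * m for _ in range(n)]  # True: previous segment on same k-mer
    cost[0][0] = abs(segment_means[0] - kmer_means[0])
    for i in range(1, n):
        for j in range(min(i, m - 1) + 1):
            same = cost[i - 1][j]                       # stay on k-mer j
            advance = cost[i - 1][j - 1] if j > 0 else math.inf  # move from j-1
            best = min(same, advance)
            if best == math.inf:
                continue
            cost[i][j] = best + abs(segment_means[i] - kmer_means[j])
            stay[i][j] = same <= advance
    # Trace back from the last segment on the last k-mer.
    path = [0] * n
    j = m - 1
    for i in range(n - 1, -1, -1):
        path[i] = j
        if i > 0 and not stay[i][j]:
            j -= 1
    return path, cost[n - 1][m - 1]


path, total = align_segments([100.2, 99.8, 110.1, 120.3, 119.7], [100.0, 110.0, 120.0])
print(path)  # [0, 0, 1, 2, 2]
```

      In this toy input, the first and last 5-mers each absorb two segments, which is the case that has no counterpart in base-to-base Needleman–Wunsch alignment.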

      pg13ln356: "dwell time" is not defined or used before, I guess it's effectively the number of raw samples per segment but this should be clarified.

      Thank you for pointing this out. We have now added a clear definition of dwell time in the text as follows:

      "such as the normalized mean μ_i, standard deviation σ_i, dwell time l_i (number of data points in the event)."

      pg13ln358: “Feature vectors from 80% of the genomic locations were used for training, while the remaining 20% were set aside for validation.”

      I assume these are selected randomly but this is not explicitly stated here and should be.

      Yes, they are randomly selected. We have revised the sentence as follows:

      “Feature vectors from a randomly selected 80% of the genomic locations were used for training, while the remaining 20% were set aside for validation.”
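
      A random 80/20 split at the level of genomic locations can be sketched as follows. The fixed seed, the function name, and the (chromosome, position) representation of a location are assumptions made for illustration; the response does not state how the split was implemented.

```python
import random


def split_locations(locations, train_fraction=0.8, seed=0):
    """Randomly split genomic locations into training and validation sets."""
    shuffled = list(locations)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumption)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


locations = [("chr1", pos) for pos in range(10)]  # hypothetical locations
train, validation = split_locations(locations)
print(len(train), len(validation))  # 8 2
```

      Splitting by location rather than by read keeps all reads covering one site in the same partition.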

      pg18ln488: The manuscript now evaluates RNA004 and compares against f5c and Uncalled4. It mentions the differences between RNA004 and RNA002, namely kmer size and current levels, but does not explain where the starting reference model values for the RNA004 model come from: In pg18ln492 they state "RNA004 provides reference values for 9mers", then later they seem to use a 5mer parameter table (pg19ln508), are they re-using the same table from RNA002 or did they create a 5mer table from the 9mer reference table?

      We apologize for the confusion. The reference model table for RNA004 9-mers is obtained from f5c (the array named ‘rna004_130bps_u_to_t_rna_9mer_template_model_builtin_data’ in https://raw.githubusercontent.com/hasindu2008/f5c/refs/heads/master/src/model.h).

      Author response image 1.

      We have revised the subsection header “5-mer parameter table” in the Method to “5-mer & 9-mer parameter table” to highlight this and added a paragraph about how to obtain the 9-mer parameter table:

      “In the RNA004 data analysis (Table 2), we obtained the 9-mer parameter table from the source code of f5c (version 1.5). Specifically, we used the array named ‘rna004_130bps_u_to_t_rna_9mer_template_model_builtin_data’ from the following file: https://raw.githubusercontent.com/hasindu2008/f5c/refs/heads/master/src/model.h (accessed on 17 October 2025).”

      Also, on page 18 line 195, we added the following sentence:

      “The 9-mer parameter table in pA scale for RNA004 data provided by f5c (see Materials and Methods) was used in the analysis.”

      pg19ln520: “Additionally, due to the differences of the k-mer motifs between human and mouse (Supplementary Fig. S2), six shared 5mers were selected to demonstrate SegPore's performance in modification prediction directly.”

      "the differences" - in occurrence rates, as I gather from the supplementary figure, but it would be good to explicitly state it in this sentence itself too.

      Thank you for the helpful suggestion. We agree that the original sentence was vague. The main reason for selecting only six 5-mers is the difference in the availability of ground truth labels for specific k-mer motifs between human and mouse datasets. We have revised the sentence accordingly:

      “Additionally, due to the differences in the availability of ground truth labels for specific k-mer motifs between human and mouse (Supplementary Fig. S2), six shared 5-mers were selected to directly demonstrate SegPore’s performance in modification prediction.”

      pg24ln654: “SegPore codes current intensity levels”

      "codes" is meant to be "stores" I guess? Perhaps "encodes"?

      Thank you for the suggestion. We have now replaced it with “encodes” in the revised manuscript.

      Lastly, looking at the feedback in the other reviewer's comment:

      The 'HMM' mentioned in line 184 looks fine to me, the HHMM is 2 HMM's in a hierarchical setup and the text now refers to one of these HMM layers. If this is to be changed it would need to state the layer (e.g. "the outer HHMM layer") throughout the text instead.

      We agree with this assessment and believe that the term “inner HMM” is accurate in this context, as it correctly refers to one of the two HMM layers within the HHMM structure. Therefore, we have decided to retain the current terminology.

      Reviewer #3 (Recommendations for the authors):

      I recommend the publication of this manuscript, provided that the following comments are addressed.

      Page 5, Preprocessing: You comment that the poly(A) tail provides a stable reference that is crucial for the normalisation of all reads. How would this step handle reads that have interrupted poly(A) tails (e.g. in the case of mRNA vaccines that employ a linker sequence)? Or cell types that express TENT4A/B, which can include transcripts with non-A residues in the poly(A) tail: https://www.science.org/doi/full/10.1126/science.aam5794.

      It depends on Nanopolish’s ability to reliably detect the poly(A) tail. In general, the poly(A) region produces a long stretch of signals fluctuating around a current level of ~108.9 pA (RNA002) with relatively stable variation, which allows it to be identified and used for normalization.

      For in vivo data, if the poly(A) tail is interrupted (e.g., due to non-A residues or linker sequences), two scenarios are possible:

      (1) The poly(A) tail may not be reliably detected, in which case the corresponding read will be excluded from our analysis.

      (2) Alternatively, Nanopolish may still recognize the initial uninterrupted portion of the poly(A) signal, which is typically sufficient in length and stability to be used for signal normalization.

      For in vitro data, the poly(A) tails are uninterrupted, so this issue does not arise.

      All analyses presented in this study are based exclusively on reads with reliably detected poly(A) tails.

      Page 7, 5mer parameter table: r9.4_180mv_70bps_5mer_RNA is an older kmer model (>2 years). How does your method perform with the newer RNA kmer models that do permit the detection of multiple ribonucleotide modifications? Addressing this comment would be beneficial, however I understand that it would require the generation of new data, as limited RNA004 datasets are available in the public domain.

      “r9.4_180mv_70bps_5mer_RNA” is the most widely used k-mer model for RNA002 data. Regarding the newer k-mer models, we believe the reviewer is referring to the “modification basecalling” models available in Dorado, which are specifically designed for RNA004 data. At present, SegPore can perform RNA modification estimation only on RNA002 data, as this is the platform for which suitable training data and ground truth annotations are available. Evaluating SegPore’s performance with the newer RNA004 modification models would require new datasets containing known modification sites generated with RNA004 chemistry. Since such data are currently unavailable, we have not yet been able to assess SegPore under these conditions. This represents an important future direction for extending and validating our method.

      The Methods and Results sections contain redundant information - please streamline the information in these sections and reduce the redundancy.

      We thank the reviewer for this suggestion and acknowledge that there is some overlap between the Methods and Results sections. However, we feel that removing these parts could compromise the clarity and readability of the manuscript, especially given that Reviewer 2 emphasized the need for clearer explanations. We therefore decided to retain certain methodological descriptions in the Results section to ensure that key steps are understandable without requiring the reader to constantly cross-reference the Methods.

      Minor comments

      Please be consistent when referring to k-mers and 5-mers (sometimes denoted as 5mers - please change to 5-mers throughout).

      We have revised the manuscript to ensure consistency and now use “5-mers” throughout the text.

      Introduction

      Lines 80 - 112: Please condense this section to roughly half the length (1-2 paragraphs). In general, the results described in the introduction should be very brief, as they are described in full in the results section.

      Thank you for the suggestion. We have condensed the original three paragraphs into a single, more concise paragraph as follows:

      "SegPore is a novel tool for direct RNA sequencing (DRS) signal segmentation and alignment, designed to overcome key limitations of existing approaches. By explicitly modeling motor protein dynamics during RNA translocation with a Hierarchical Hidden Markov Model (HHMM), SegPore segments the raw signal into small, biologically meaningful fragments, each corresponding to a k-mer sub-state, which substantially reduces noise and improves segmentation accuracy. After segmentation, these fragments are aligned to the reference sequence and concatenated into larger events, analogous to Nanopolish’s “eventalign” output, which serve as the foundation for downstream analyses. Moreover, the “eventalign” results produced by SegPore enhance interpretability in RNA modification estimation. While deep learning–based tools such as m6Anet classify RNA modifications using complex, non-transparent features (see Supplementary Fig. S5), SegPore employs a simple Gaussian Mixture Model (GMM) to distinguish modified from unmodified nucleotides based on baseline current levels. This transparent modeling approach improves confidence in the predictions and makes SegPore particularly well-suited for biological applications where interpretability is essential."

      Line 104: Please change "normal adenosine" to "adenosine".

      We have revised the manuscript as requested and replaced all instances of “normal adenosine” with “adenosine” throughout the text.

      Materials and Methods

      Line 176: Please reword "...we standardize the raw current signals across reads, ensuring that the mean and standard deviation of the poly(A) tail are consistent across all reads." To "...we standardize the raw current signals for each read, ensuring that the mean and standard deviation are consistent across the poly(A) tail region."

      We have changed the sentence as requested:

      “Since the poly(A) tail provides a stable reference, we standardize the raw current signals for each read, ensuring that the mean and standard deviation are consistent across the poly(A) tail region.”
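
      The per-read standardization can be sketched as a linear rescaling that maps the poly(A) segment of each read onto a common mean and standard deviation. The target values below are assumptions: 108.9 pA is the approximate RNA002 poly(A) level quoted elsewhere in this response, and the target standard deviation is a placeholder. The actual procedure is described in Supplementary Note 1, Section 3, and may differ.

```python
from statistics import mean, pstdev

POLYA_TARGET_MEAN = 108.9  # pA; approximate RNA002 poly(A) level (assumed target)
POLYA_TARGET_STD = 1.0     # placeholder value, not taken from the manuscript


def standardize_read(signal, polya_start, polya_end):
    """Rescale a read so its poly(A) segment has the target mean and std.

    polya_start/polya_end are sample indices of the poly(A) segment
    (hypothetical inputs, e.g. from a poly(A) detection step).
    """
    polya = signal[polya_start:polya_end]
    if len(polya) < 2:
        raise ValueError("poly(A) segment too short")
    mu, sd = mean(polya), pstdev(polya)
    if sd == 0:
        raise ValueError("poly(A) segment has zero variance")
    scale = POLYA_TARGET_STD / sd
    # The same shift and scale are applied to the whole read.
    return [(x - mu) * scale + POLYA_TARGET_MEAN for x in signal]


read = [50.0, 52.0, 48.0, 50.0, 90.0, 70.0]  # first four samples: poly(A)
out = standardize_read(read, 0, 4)
```

      After this step, the poly(A) regions of all reads share the same mean and standard deviation, so the remaining signal is expressed on a comparable scale across reads.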

      Line 182: Please describe the RNA translocation hypothesis, as this is the first mention of it in the text. Also, why is the Hierarchical Hidden Markov model perfect for addressing the RNA translocation hypothesis? Explain more about how the HHMM works and why it is a suitable choice.

      We have revised the sentence as requested:

      “The RNA translocation hypothesis (see details in the first section of Results) naturally leads to the use of a hierarchical Hidden Markov Model (HHMM) to segment the raw current signal.”

      The motivation for the HHMM is explained in detail in the first section of Results, “RNA translocation hypothesis”. As illustrated in Figure 2, the sequencing data suggest that RNA molecules may translocate back and forth (often referred to as jiggling) while passing through the nanopore. This behavior results in complex current fluctuations that are challenging to model with a simple HMM. The HHMM provides a natural framework to address this because it can model signal dynamics at two levels. The outer HMM distinguishes between two major states: base states (where the signal corresponds to a stable sub-state of a k-mer) and transition states (representing transitions from one base state to the next). Within each base state, an inner HMM models finer signal variation using three states, “curr”, “prev”, and “next”, corresponding to the current k-mer sub-state and its neighboring k-mer sub-states. This hierarchical structure captures both the stable signal patterns and the stochastic translocation behavior, enabling more accurate and biologically meaningful segmentation of the raw current signal.

      Line 184: do you mean HHMM? Please be consistent throughout the text.

      As explained in the previous response, the HHMM consists of two layers: an outer HMM and an inner HMM. The term “HMM” in line 184 is meant to be read together with “inner” at the end of line 183, forming the phrase “inner HMM.” It seems the reviewer may have overlooked this when reading the text.

      Line 203: please delete: "It is obviously seen that".

      We have removed the phrase “It is obviously seen that” from the sentence as requested. The revised sentence now reads:

      “The first part of Eq. 2 represents the emission probabilities, and the second part represents the transition probabilities.”

      Line 314, GMM for 5mer parameter table re-estimation: "Typically, the process is repeated three to five times until the 5mer parameter table stabilizes." How is the stabilisation of the 5mer parameter table quantified? What is a reasonable cut-off that would demonstrate adequate stabilisation of the 5mer parameter table? Please add details of this to the text.

      We have revised the sentence to clarify the stabilization criterion as follows:

      “Typically, the process is repeated three to five times until the 5-mer parameter table stabilizes (when the average change of mean values of all 5-mers is less than 5e-3).”
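
      The stabilization criterion can be expressed as a short check. This sketch assumes that “average change” means the mean absolute difference between the 5-mer means of two consecutive iterations; the function name and the dictionary layout are hypothetical.

```python
def table_has_stabilized(prev_means, new_means, tol=5e-3):
    """Return True if the average absolute change in 5-mer means is below tol.

    prev_means/new_means: dicts mapping 5-mer -> mean current (hypothetical layout).
    """
    if not prev_means or prev_means.keys() != new_means.keys():
        raise ValueError("tables must be non-empty and cover the same 5-mers")
    avg_change = sum(
        abs(new_means[kmer] - prev_means[kmer]) for kmer in prev_means
    ) / len(prev_means)
    return avg_change < tol


previous = {"AAACA": 100.000, "GGACT": 120.000}
current = {"AAACA": 100.004, "GGACT": 120.002}
print(table_has_stabilized(previous, current))  # True
```

      In an iterative re-estimation loop, this check would be evaluated after each update of the parameter table, and the loop would stop once it returns True.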

      Results

      Line 377: Please edit to read "Traditional base calling algorithms such as Guppy and Albacore assume that the RNA molecule is translocated unidirectionally through the pore by the motor protein."

      We have revised the sentence as:

      “In traditional basecalling algorithms such as Guppy and Albacore, we implicitly assume that the RNA molecule is translocated through the pore by the motor protein in a monotonic fashion, i.e., the RNA is pulled through the pore unidirectionally.”

      Line 555, m6A identification at the site level: "For six selected m6A motifs, SegPore achieved an ROC AUC of 82.7% and a PR AUC of 38.7%, earning the third best performance compared with deep learning methods m6Anet and CHEUI (Fig. 3D)." So SegPore performs third best of all deep learning methods. Do you recommend its use in conjunction with m6Anet for m6A detection? Please clarify in the text. This will help to guide users to possible best practice uses of your software.

      Thank you for the suggestion. We have added a clarification in the revised manuscript to guide users.

      “For practical applications, we recommend taking the intersection of m6A sites predicted by SegPore and m6Anet to obtain high-confidence modification sites, while still benefiting from the interpretability provided by SegPore’s predictions.”
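
      The recommended intersection can be sketched with plain sets. The sketch assumes that each tool's predictions have already been thresholded and reduced to (transcript ID, position) pairs; this representation and the function name are hypothetical and do not correspond to either tool's output format.

```python
def high_confidence_sites(segpore_sites, m6anet_sites):
    """Return sites predicted as m6A by both tools, in sorted order."""
    return sorted(set(segpore_sites) & set(m6anet_sites))


segpore = [("tx1", 120), ("tx1", 305), ("tx2", 48)]   # hypothetical predictions
m6anet = [("tx1", 305), ("tx2", 48), ("tx3", 977)]    # hypothetical predictions
print(high_confidence_sites(segpore, m6anet))  # [('tx1', 305), ('tx2', 48)]
```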

      Figures.

      Figure 1A please refer to poly(A) tail, rather than polyA tail.

      We have updated it to poly(A) tail in the revised manuscript.

    1. The FKBP5 rs9296158, rs4713916, rs992105 and rs3800373 SNPs have been found to be irrelevant in the association between childhood trauma and cognitive functioning in patients with psychotic disorder

      Perhaps these SNPs are silent based on when they are transcribed, meaning when the trauma happens depends on whether or not the SNP is in transcription and then reflected in translation?

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Summary: 

      The study by Pinho et al. presents a novel behavioral paradigm for investigating higher-order conditioning in mice. The authors developed a task that creates associations between light and tone sensory cues, driving mediated learning. They observed sex differences in task acquisition, with females demonstrating faster-mediated learning compared to males. Using fiber photometry and chemogenetic tools, the study reveals that the dorsal hippocampus (dHPC) plays a central role in encoding mediated learning. These findings are crucial for understanding how environmental cues, which are not directly linked to positive/negative outcomes, contribute to associative learning. Overall, the study is well-designed, with robust results, and the experimental approach aligns with the study's objectives. 

      Strengths: 

      (1) The authors develop a robust behavioral paradigm to examine higher-order associative learning in mice. 

      (2) They discover a sex-specific component influencing mediated learning, with females exhibiting enhanced learning abilities. 

      (3) Using fiber photometry and chemogenetic techniques, the authors identify the dorsal hippocampus, but not the ventral hippocampus, as playing a crucial role in encoding mediated learning.

      We appreciate the strengths highlighted by the Reviewer and the valuable and complete summary of our work.

      Weaknesses: 

      (1) The study would be strengthened by further elaboration on the rationale for investigating specific cell types within the hippocampus.  

      We thank the Reviewer for highlighting this important point. In the revised manuscript, we have added new information (Page 11, Lines 27-34) to specifically explain the rationale for studying the possible cell-type-specific involvement in sensory preconditioning.

      (2) The analysis of photometry data could be improved by distinguishing between early and late responses, as well as enhancing the overall presentation of the data.  

      In response to the Reviewer's comment, we have included new panels in Figure 3E and the whole of Supplementary Figure 4, which separate the photometry data across the different preconditioning and conditioning sessions, respectively. Overall, these data suggest that there are no major changes in cell activity in either hippocampal region across sessions, as a similar light-tone-induced enhancement of activity is observed. These findings have been incorporated into the Results Section (Page 12, Lines 13-15, 19-20 and 35-36).

      (3) The manuscript would benefit from revisions to improve clarity and readability.

      In response to this fair comment, we have revised the text to improve clarity and readability.

      Reviewer #2 (Public review): 

      Summary: 

      Pinho et al. developed a new auditory-visual sensory preconditioning procedure in mice and examined the contribution of the dorsal and ventral hippocampus to learning in this task. Using photometry they observed activation of the dorsal and ventral hippocampus during sensory preconditioning and conditioning. Finally, the authors combined their sensory preconditioning task with DREADDs to examine the effect of inhibiting specific cell populations (CaMKII and PV) in the DH on the formation and retrieval/expression of mediated learning. 

      Strengths: 

      The authors provide one of the first demonstrations of auditory-visual sensory preconditioning in male mice. Research on the neurobiology of sensory preconditioning has primarily used rats as subjects. The development of a robust protocol in mice will be beneficial to the field, allowing researchers to take advantage of the many transgenic mouse lines. Indeed, in this study, the authors take advantage of a PV-Cre mouse line to examine the role of hippocampal PV cells in sensory preconditioning. 

      We thank the Reviewer for their effort and for highlighting the strengths of our work.

      Weaknesses: 

      (1) The authors report that sensory preconditioning was observed in both male and female mice. However, their data only supports sensory preconditioning in male mice. In female mice, both paired and unpaired presentations of the light and tone in stage 1 led to increased freezing to the tone at test. In this case, fear to the tone could be attributed to factors other than sensory preconditioning, for example, generalization of fear between the auditory and visual stimulus.

      We thank the Reviewer for raising this point. Initially, we hypothesized that female mice were somehow able to associate light and tone even though they were presented separately during the preconditioning sessions. Thus, we designed new experiments (shown in Supplementary Figure 2D) to test whether the data would be congruent with our initial hypothesis or with fear generalization, as proposed by the Reviewer. We performed a new experiment comparing a Paired group with two additional control groups: (i) an Unpaired group in which we increased the time between the light and tone presentations and (ii) an experimental group in which the light was absent during conditioning. The new results clearly indicate the presence of fear generalization in female mice, as we found a significant cue-induced increase in freezing responses in all the experimental groups tested. In accordance with the Reviewer’s suggestion, we conclude that mediated learning is not correctly observed in female mice using the protocol described (i.e. with 2 conditioning sessions). These new results led us to reorganize the structure and figures of the manuscript to focus on male mice in the Main Figures while showing the data from female mice in the Supplementary Figures. Overall, our data clearly reveal the need for behavioral protocols adapted to each sex, demonstrating sex differences in sensory preconditioning; this point has been added to the Discussion Section (Page 15, lines 12-37).

      (2) In the photometry experiment, the authors report an increase in neural activity in the hippocampus during both phase 1 (sensory preconditioning) and phase 2 (conditioning). In the subsequent experiment, they inhibit neural activity in the DH during phase 1 (sensory preconditioning) and the probe test, but do not include inhibition during phase 2 (conditioning). It was not clear why they didn't carry forward investigating the role of the hippocampus during phase 2 conditioning. Sensory preconditioning could occur due to the integration of the tone and shock during phase two, or retrieval and chaining of the tone-light-shock memories at test. These two possibilities cannot be differentiated based on the data. Given that we do not know at which stage the mediated learning is occurring, it would have been beneficial to additionally include inhibition of the DH during phase 2.

      Following the Reviewer’s valuable comment, we have conducted a new experiment in which we chemogenetically inhibited the CaMKII-positive neurons of the dHPC during conditioning to explore their involvement in the formation of mediated learning. Notably, the inhibition of principal neurons of the dHPC during conditioning does not impair the formation of mediated learning in our hands. These new results are now shown in Supplementary Figure 7G and have been added to the Results section (Page 13, Lines 19-23).

      (3) In the final experiment, the authors report that inhibition of the dorsal hippocampus during the sensory preconditioning phase blocked mediated learning. While this may be the case, the failure to observe sensory preconditioning at test appears to be due more to an increase in baseline freezing (during the stimulus off period), rather than a decrease in freezing to the conditioned stimulus. Given the small effect, this study would benefit from an experiment validating that administration of J60 inhibited DH cells. Further, given that the authors did not observe any effect of DREADD inhibition in PV cells, it would also be important to validate successful cellular silencing in this protocol.  

      Following the Reviewer’s comments, we have performed new experiments to validate the use of J60 to inhibit hippocampal cells. Supplementary Figure 7 E-F shows that, in CaMKII-positive neurons, J60 administration tends to decrease the frequency of calcium events in both the dHPC and vHPC. Furthermore, Supplementary Figure 8 B-C shows that J60 is also able to modify calcium events in PV-positive interneurons. Although the best method to validate the use of DREADDs (i.e. to confirm inhibition of hippocampal cell activity) would be electrophysiological recordings, we lack this technique in our laboratory. Thus, to address the Reviewer’s comment, we combined DREADD modulation through J60 administration with photometry recordings, in which several tendencies are confirmed. In addition, a similar approach has been used in another preprint from the lab (https://doi.org/10.1101/2025.08.29.673009), which reports an increase in phospho-PDH, a marker of neuronal inhibition, upon J60 administration in the dHPC, as well as in experiments conducted by a collaborating lab, which observed a modulation of SOM-positive interneuron activity upon J60 administration (PhD defense of Miguel Sabariego, University Pompeu Fabra, Barcelona).

      Reviewer #3 (Public review): 

      Summary: 

      Pinho et al. investigated the role of the dorsal vs ventral hippocampus and the sex differences in mediated learning. While previous studies already established the engagement of the hippocampus in sensory preconditioning, the authors here took advantage of freely-moving fiber photometry recording and chemogenetics to observe and manipulate sub-regions of the hippocampus (dorsal vs. ventral) in a cell-specific manner. The authors first found sex differences in the preconditioning phase of a sensory preconditioning procedure, where males required more preconditioning training than females for mediated learning to manifest, and where females displayed evidence of mediated learning even when neutral stimuli were never presented together within the session.

      After validation of a sensory preconditioning procedure in mice using light and tone neutral stimuli and a mild foot shock as the unconditioned stimulus, the authors used fiber photometry to record from all neurons vs. parvalbumin-positive-only neurons in the dorsal hippocampus or ventral hippocampus of male mice during both preconditioning and conditioning phases. They found increased activity of all neurons, as well as PV+-only neurons in both sub-regions of the hippocampus during both preconditioning and conditioning phases. Finally, the authors found that chemogenetic inhibition of CaMKII+ neurons in the dorsal, but not ventral, hippocampus specifically prevented the formation of an association between the two neutral stimuli (i.e., light and tone cues), but not the direct association between the light cue and the mild foot shock. This set of data: (1) validates the mediated learning in mice using a sensory preconditioning protocol, and stresses the importance of taking sex effect into account; (2) validates the recruitment of dorsal and ventral hippocampi during preconditioning and conditioning phases; and (3) further establishes the specific role of CaMKII+ neurons in the dorsal but not ventral hippocampus in the formation of an association between two neutral stimuli, but not between a neutral stimulus and a mild foot shock.

      Strengths: 

      The authors developed a sensory preconditioning procedure in mice to investigate mediated learning using light and tone cues as neutral stimuli, and a mild foot shock as the unconditioned stimulus. They provide evidence of a sex effect in the formation of light-cue association. The authors took advantage of fiber-photometry and chemogenetics to target sub-regions of the hippocampus, in a cell-specific manner and investigate their role during different phases of a sensory conditioning procedure. 

      We thank the Reviewer for the extensive summary of our work and for highlighting the value of some of our findings.

      Weaknesses: 

      The authors went further than previous studies by investigating the role of sub-regions of the hippocampus in mediated learning; however, there are several weaknesses that should be noted:

      (1) This work first validates mediated learning in a sensory preconditioning procedure using light and tone cues as neutral stimuli and a mild foot shock as the unconditioned stimulus, in both males and females. They found interesting sex differences at the behavioral level, but then only focused on male mice when recording and manipulating the hippocampus. The authors do not address sex differences at the neural level. 

      We appreciate the Reviewer’s comment. Prompted by other comments during this revision process (see Point 1 of Reviewer #2), we performed an additional experiment, which revealed that the described protocol produces fear generalization rather than mediated learning in female mice. These data point to the need for sex-specific changes in the behavioral protocols used to measure sensory preconditioning. The revised version of the manuscript, although highlighting these sex differences in behavioral performance (see Supplementary Figure 2), focuses on male mice and, accordingly, all photometry and chemogenetic experiments were performed using male mice. In future studies, once we have established a sensory preconditioning paradigm that works in female mice, it will be very interesting to study whether the hippocampal mechanisms mediating this behavior in male mice are also observed in female mice.

      (2) As expected in fear conditioning, the range of inter-individual differences is quite high. Mice that didn't develop a strong light-->shock association, as evidenced by a lower percentage of freezing during the Probe Test Light phase, should manifest a low percentage of freezing during the Probe Test Tone phase. It would be interesting to test for a correlation between the level of freezing during mediated vs test phases.

      Following the Reviewer’s comment, we generated a new analysis correlating mediated and direct fear responses. As can be observed in Supplementary Figure 3, there is a significant correlation between mediated and direct learning in male mice (i.e. the individuals that freeze more in the direct learning test also express a stronger fear response in the mediated learning test). In contrast, this correlation is absent in female mice, further supporting what we have explained above. We have highlighted this new analysis in the Results section (Page 11, Lines 20-24).
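
      As a minimal sketch of the kind of analysis described here (not the authors' actual pipeline, which is not specified in this response), a Pearson correlation between per-animal freezing in the direct and mediated learning tests could be computed as follows. The function name `pearson_r` and all freezing values are hypothetical and chosen only for illustration; the choice of Pearson over a rank-based coefficient is also an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-animal freezing (%) during the cue; NOT data from the study.
direct_freezing = [62.0, 45.0, 71.0, 30.0, 55.0, 80.0]    # Probe Test Light
mediated_freezing = [35.0, 22.0, 41.0, 12.0, 30.0, 50.0]  # Probe Test Tone

r = pearson_r(direct_freezing, mediated_freezing)
print(f"r = {r:.3f}")
```

      With real data, the coefficient would be reported together with a significance test and computed separately for each sex, since the response describes a correlation in males that is absent in females.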

      (3) The use of a synapsin promoter to transfect neurons in a non-specific manner does not bring much information. The authors applied a more specific approach to target PV+ neurons only, and it would have been more informative to keep with this cell-specific approach, for example by looking also at somatostatin+ inter-neurons. 

      The idea behind using a pan-neuronal promoter was to assess in general terms how neuronal activity in the hippocampus is engaged during the different phases of light-tone sensory preconditioning. However, the Reviewer’s comment is very pertinent and, as suggested, we have generated new data targeting CaMKII-positive neurons (see Point 4 below). Finally, although it could be extremely interesting, we believe that targeting different interneuron subtypes is beyond the scope of the present work. We have nevertheless added this to the Discussion Section as a future perspective/limitation of our study (Page 17, Lines 9-24).

      (4) The authors observed event-related Ca2+ transients on hippocampal pan-neurons and PV+ inter-neurons using fiber photometry. They then used chemogenetics to inhibit CaMKII+ hippocampal neurons, which does not logically follow. It does not undermine the main finding of CaMKII+ neurons of the dorsal, but not ventral, hippocampus being involved in the preconditioning, but not conditioning, phase. However, observing CaMKII+ neurons (using fiber photometry) in mice running the same task would be more informative, as it would indicate when these neurons are recruited during different phases of sensory preconditioning. Applying then optogenetics to cancel the observed event-related transients (e.g., during the presentation of light and tone cues, or during the foot shock presentation) would be more appropriate.  

      We have generated new photometry data to analyze the activity of CaMKII-positive neurons during the preconditioning phase and confirm their engagement during the light-tone pairings. To this end, we infused a CaMKII-GCaMP calcium sensor into the dHPC and vHPC of mice and recorded its activity during the 6 preconditioning sessions. The new results can be found in Figure 3 and are explained in the Results section (Page 12, Lines 26-36). They clearly show an engagement of CaMKII-positive neurons during the light-tone pairing in both the dHPC and vHPC. Finally, although the suggested optogenetic manipulations would be very elegant, we hope to have convinced the Reviewer that our chemogenetic results are sufficient to demonstrate the involvement of the dHPC in the formation of mediated learning in the Light-Tone sensory preconditioning paradigm. We have nevertheless added this to the Discussion Section as a future perspective/limitation of our study (Page 17, Lines 9-24).

      (5) Probe tests always start with the "Probe Test Tone", followed by the "Probe Test Light". "Probe Test Tone" consists of an extinction session, which could affect the freezing response during "Probe Test Light" (e.g., Polack et al. (http://dx.doi.org/10.3758/s13420-013-0119-5)). Preferably, adding a group of mice with a Probe Test Light with no Probe Test Tone could help clarify this potential issue. The authors should at least discuss the possibility that the tone extinction session prior to the "Probe Test Light" could have affected the freezing response to the light cue. 

      We appreciate the comment raised by the Reviewer. We think that our direct learning responses are quite robust in all of our experiments and, thus, that any extinction arising from the tone presentation is unlikely to have affected direct learning. Nevertheless, as it is an important point, we have addressed it in the Discussion Section (Page 17, Lines 12-14).

      Reviewer #4 (Public review): 

      Summary 

      Pinho et al use in vivo calcium imaging and chemogenetic approaches to examine the involvement of hippocampal sub-regions across the different stages of a sensory preconditioning task in mice. They find clear evidence for sensory preconditioning in male but not female mice. They also find that, in the male mice, CaMKII-positive neurons in the dorsal hippocampus: (1) encode the audio-visual association that forms in stage 1 of the task, and (2) retrieve/express sensory preconditioned fear to the auditory stimulus at test. These findings are supported by evidence that ranges from incomplete to convincing. They will be valuable to researchers in the field of learning and memory. 

      We appreciate the summary of our work and all the constructive comments raised by the Reviewer, which have greatly improved the clarity and quality of our manuscript.  

      Abstract 

      Please note that sensory preconditioning doesn't require the stage 1 stimuli to be presented repeatedly or simultaneously. 

      The reviewer is right, and we have corrected and changed that information in the revised abstract.  

      "Finally, we combined our sensory preconditioning task with chemogenetic approaches to assess the role of these two hippocampal subregions in mediated learning."  This implies some form of inhibition of hippocampal neurons in stage 2 of the protocol, as this is the only stage of the protocol that permits one to make statements about mediated learning. However, it is clear from what follows that the authors interrogate the involvement of hippocampal sub-regions in stages 1 and 3 of the protocol - not stage 2. As such, most statements about mediated learning throughout the paper are potentially misleading (see below for a further elaboration of this point). If the authors persist in using the term mediated learning to describe the response to a sensory preconditioned stimulus, they should clarify what they mean by mediated learning at some point in the introduction. Alternatively, they might consider using a different phrase such as "sensory preconditioned responding". 

      Considering the arguments of the Reviewer, we have modified our text in the Abstract and throughout the main text. Moreover, based on a comment of Reviewer #2 (Point 2), we have generated new data showing that the dHPC does not seem to be involved in mediated learning formation during Stage 2, as its inhibition does not impair sensory preconditioned responding. These new data can be seen in Supplementary Figure 7G.

      Introduction 

      "Low-salience" is used to describe stimuli such as tone, light, or odour that do not typically elicit responses that are of interest to experimenters. However, a tone, light, or odour can be very salient even though they don't elicit these particular responses. As such, it would be worth redescribing the "low-salience" stimuli in some other terms. 

      Throughout the revised version of the manuscript, we have replaced the term “low-salience” with “innocuous stimuli” or omitted the adjective altogether where we consider it unnecessary.

      "These higher-order conditioning processes, also known as mediated learning, can be captured in laboratory settings through sensory preconditioning procedures2,6-11."  Higher-order conditioning and mediated learning are not interchangeable terms: e.g., some forms of second-order conditioning are not due to mediated learning. More generally, the use of mediated learning is not necessary for the story that the authors develop in the paper and could be replaced for accuracy and clarity. E.g., "These higher-order conditioning processes can be studied in the laboratory using sensory preconditioning procedures2,6-11." 

      According to the Reviewer proposal, we have modified the text. 

      In reference to Experiment 2, it is stated that: "However, when light and tone were separated on time (Unpaired group), male mice were not able to exhibit mediated learning response (Figure 2B) whereas their response to the light (direct learning) was not affected (Figure 2D). On the other hand, female mice still present a lower but significant mediated learning response (Figure 2C) and normal direct learning (Figure 2E). Finally, in the No-Shock group, both male (Figure 2B and 2D) and female mice (Figure 2C and 2E) did not present either mediated or direct learning, which also confirmed that the exposure to the tone or light during Probe Tests do not elicit any behavioral change by themselves as the presence of the electric footshock is required to obtain a reliable mediated and direct learning responses."  The absence of a difference between the paired and unpaired female mice should not be described as "significant mediated learning" in the latter. It should be taken to indicate that performance in the females is due to generalization between the tone and light. That is, there is no sensory preconditioning in the female mice. The description of performance in the No-shock group really shouldn't be in terms of mediated or direct learning: that is, this group is another control for assessing the presence of sensory preconditioning in the group of interest. As a control, there is no potential for them to exhibit sensory preconditioning, so their performance should not be described in a way that suggests this potential. 

      All these comments are very pertinent and were also raised by Reviewer #2 (Point 1, see above). In the revised version of the manuscript, we have carefully changed our interpretation of the results where necessary (e.g. in the case of the No-Shock group). In addition, we have generated new data confirming that, under similar conditions (i.e. 2 conditioning sessions in our SPC), female mice show fear generalization rather than reliable sensory preconditioned responding. In our opinion, this does not rule out the presence of mediated learning in female mice but suggests that adapted protocols must be used for each sex. These results led us to change the organization of the Figures, and we hope the Reviewer will agree with all the changes proposed. In addition, we have rewritten a paragraph in the Discussion Section to explain these sex differences (see Page 15, lines 12-37).

      Methods - Behavior 

      I appreciate the reasons for testing the animals in a new context. This does, however, raise other issues that complicate the interpretation of any hippocampal engagement: e.g., exposure to a novel context may engage the hippocampus for exploration/encoding of its features - hence, it is engaged for retrieving/expressing sensory preconditioned fear to the tone. This should be noted somewhere in the paper given that one of its aims is to shed light on the broader functioning of the hippocampus in associative processes. 

      This general issue - that the conditions of testing were such as to force engagement of the hippocampus - is amplified by two further features of testing with the tone. The first is the presence of background noise in the training context and its absence in the test context. The second is the fact that the tone was presented for 30 s in stage 1 and then continuously for 180s at test. Both changes could have contributed to the engagement of the hippocampus as they introduce the potential for discrimination between the tone that was trained and tested. 

      We have now added these pertinent comments to a “Study limitations” paragraph in the Discussion Section (Page 17, Lines 9-24). Indeed, the changes of context (including the presence of background noise) were implemented because, while setting up the paradigm, we encountered problems of fear generalization (also in male mice). Similarly, the differences in cue exposure between the preconditioning phase and the test phase were chosen because of important differences between how mice respond and the previous protocols used in rats. In particular, mice were not able to adapt their behavioral responses when shorter cue-exposure windows were used, as clearly happens with rats [1].

      Results - Behavior 

      The suggestion of sex differences based on differences in the parameters needed to generate sensory preconditioning is interesting. Perhaps it could be supported through some set of formal analyses. That is, the data in supplementary materials may well show that the parameters needed to generate sensory preconditioning in males and females are not the same. However, there needs to be some form of statistical comparison to support this point. As part of this comparison, it would be neat if the authors included body weight as a covariate to determine whether any interactions with sex are moderated by body weight.  

      Regarding the comparison between male and female mice, although the Reviewer’s comments are pertinent and interesting, we think that, given the new data, it is not appropriate to compare the two sexes while the SPC protocol still has to be optimized for female mice.

      What is the value of the data shown in Figure 1 given that there are no controls for unpaired presentations of the sound and light? In the absence of these controls, the experiment cannot have shown that "Female and male mice show mediated learning using an auditory-visual sensory preconditioning task" as implied by its title. Minimally, this experiment should be relabelled. 

      Based on the new data generated with female mice, we have decided to remove Figure 1 and reorganize the structure of the manuscript. We hope the Reviewer will agree that this has improved the clarity of the manuscript.

      "Altogether, this data confirmed that we successfully set up an LTSPC protocol in mice and that this behavioral paradigm can be used to further study the brain circuits involved in higher-order conditioning."  Please insert the qualifier that LTSPC was successfully established in male mice. There is no evidence of LTSPC in female mice.

      We fully agree with the Reviewer and our new findings further confirm this issue. Thus, we have changed the statement in the revised version of the manuscript.  

      Results - Brain 

      "Notably, the inhibition of CaMKII-positive neurons in the dHPC (i.e. J60 administration in DREADD-Gi mice) during preconditioning (Figure 4B), but not before the Probe Test 1 (Figure 4B), fully blocked mediated, but not direct learning (Figure  4D)." The right panel of Figure 4B indicates no difference between the controls and Group DPC in the percent change in freezing from OFF to ON periods of the tone. How does this fit with the claim that CaMKII-positive neurons in the dorsal hippocampus regulate associative formation during the session of tone-light exposures in stage 1 of sensory preconditioning? 

      To improve the quality of the figures and to avoid redundancy between panels, we have removed all the panels showing the percentage of change from the new version of the manuscript. Regarding the issue raised by the Reviewer, in our opinion the inhibition of the dHPC clearly impaired mediated learning: unlike the other two groups, these animals did not change their behavior when the tone appeared (i.e. there was no significant increase in freezing between the OFF and ON periods). The graphs of percentage of change (old version of the manuscript) were a different way of showing the presence of tone- or light-induced responses within each experimental group. A significant effect (shown by the # symbol) meant that, within that experimental group, freezing changed significantly when the cue (tone or light) appeared compared with when there was no cue (OFF period). Thus, in the old panel 4B discussed by the Reviewer, the absence of significance in the group in which the dHPC was inhibited during preconditioning, in contrast to the clear significant effect in the other groups, indicates in our opinion an impairment of mediated learning formation. Nevertheless, to avoid any confusion, we have slightly modified the text so that it strictly describes what is analyzed and shown in the graphs and, as mentioned, the graphs of percentage of change have been removed.
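
      To make the percent-change measure under discussion concrete, the sketch below assumes the common definition (ON - OFF) / OFF x 100; the manuscript's exact formula is not stated in this exchange, so this definition, the function name `percent_change`, and all numbers are illustrative assumptions rather than the authors' method. The example also shows how a higher OFF-period baseline can shrink the measure even when freezing during the cue is unchanged, which is the concern Reviewer #2 raises about baseline freezing.

```python
def percent_change(off, on):
    """Percent change in freezing from the cue-OFF to the cue-ON period.

    Assumed definition: (ON - OFF) / OFF * 100.
    """
    if off <= 0:
        raise ValueError("percent change is undefined when OFF freezing is 0")
    return (on - off) / off * 100.0


# Hypothetical freezing (% of time); NOT values from the manuscript.
control = {"off": 10.0, "on": 30.0}
inhibited = {"off": 25.0, "on": 30.0}  # same ON freezing, higher baseline

control_change = percent_change(**control)
inhibited_change = percent_change(**inhibited)

# The raw ON - OFF difference is an alternative summary of the same data.
control_diff = control["on"] - control["off"]
inhibited_diff = inhibited["on"] - inhibited["off"]

print(f"control:   {control_change:.0f}% change, {control_diff:.0f} points")
print(f"inhibited: {inhibited_change:.0f}% change, {inhibited_diff:.0f} points")
```

      Because the ratio depends on the baseline, reporting raw OFF and ON freezing alongside any derived measure lets readers tell a drop in cue-evoked freezing apart from a rise in baseline freezing.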

      Discussion 

      "When low salience stimuli were presented separated on time or when the electric footshock was absent, mediated and direct learning were abolished in male mice. In female mice, although light and tone were presented separately during the preconditioning phase, mediated learning was reduced but still present, which implies that female mice are still able to associate the two low-salience stimuli." 

      This doesn't quite follow from the results. The failure of the female unpaired mice to withhold their freezing to the tone should not be taken to indicate the formation of a light-tone association across the very long interval that was interpolated between these stimulus presentations. It could and should be taken to indicate that, in female mice, freezing conditioned to the light simply generalized to the tone (i.e., these mice could not discriminate well between the tone and light). 

      As discussed above, we fully agree with the Reviewer, and the entire manuscript has been modified accordingly.

      "Indeed, our data suggests that when hippocampal activity is modulated by the specific manipulation of hippocampal subregions, this brain region is not involved during retrieval."  Does this relate to the results that are shown in the right panel of Figure 4B, where there is no significant difference between the different groups? If so, how does it fit with the results shown in the left panel of this figure, where differences between the groups are observed?

      "In line with this, the inhibition of CaMKII-positive neurons from the dorsal hippocampus, which has been shown to project to the retrosplenial cortex56, blocked the formation of mediated learning."

      Is this a reference to the findings shown in Figure 4B and, if so, which of the panels exactly? That is, one panel appears to support the claim made here while the other doesn't. In general, what should the reader make of data showing the percent change in freezing from stimulus OFF to stimulus ON periods? 

      As pointed out above, the graphs of percentage of change were, in our opinion, a different way of showing the presence of tone- or light-induced behavioral responses within each experimental group. A significant effect (shown by the # symbol) meant that, within that experimental group, freezing changed significantly when the cue (tone or light) appeared compared with when there was no cue (OFF period). Thus, in the old panel 4B discussed by the Reviewer, the absence of significance in the group in which the dHPC was inhibited during preconditioning, in contrast to the clear significant effect in the other groups, indicates in our opinion an impairment of mediated learning formation. In the revised version of the manuscript, we have rephrased these sentences to stick to what the graphs show and, as explained, the graphs of percentage of change have been removed.

      Reviewer #1 (Recommendations for the authors): 

      The authors may address the following questions: 

      (1) The study identifies major sex differences in the conditioning phase, with females showing faster learning. Since hormonal fluctuations can influence learning and behavior, it would be helpful for the authors to comment on whether they tracked the estrous cycle of the females and whether any potential effects of the cycle on mediated learning were considered. 

      This is a relevant and important point raised by the Reviewer. In our study we did not track the estrous cycle to investigate whether it has any effect on mediated learning, which could be an interesting project in itself. Although the revised version of the manuscript provides new information regarding mediated learning performance in male and female mice, we agree with the Reviewer that sex hormones may account for the observed sex differences. However, the aim of the present work was to explore potential sex differences in mediated learning responding rather than to investigate the specific mechanisms behind them.

      For this reason, and to avoid adding further complexity to the present study, we did not check the estrous cycle in female mice or the testosterone levels in male mice, nor did we analyze the levels of sex hormones during the different phases of the sensory preconditioning task. Indeed, we think that checking the estrous cycle in female mice would still not be enough to ascertain the role of sex hormones, because checking androgen levels in male mice would also be required. In line with this, meta-analyses of the neuroscience literature using mice as research subjects [2-4] have revealed that data collected from female mice (regardless of the estrous cycle) did not vary more than data from males. In conclusion, we think that using randomized and mixed cohorts of male and female mice (as in the present study) provides the same degree of variability in both sexes. Nevertheless, we have added a sentence pointing to this possibility in the Discussion Section (Page 15, lines 32-37).

      (2) The rationale for including parvalbumin (PV) cells in the study could be clarified. Is there prior evidence suggesting that this specific cell type is involved in mediated learning? This could apply to sensory stimuli not used in the current study.

      In the revised version of the manuscript, we have better clarified why we targeted PV interneurons, specifically mentioning previous studies [5] (see Page 11, Lines 27-34). 

      (3) The photometry recordings from the dHPC during the preconditioning phase, shown in Figure 3, are presented as average responses. It would be beneficial to separate the early vs. late trials to examine whether there is an increase in hippocampal activity as the associative learning progresses, rather than reporting the averaged data. Additionally, to clarify the dynamics of the dHPC in associative learning, the authors could compare the magnitude of photometry responses when light and tone stimuli are presented individually in separate sessions versus when they are presented closely in time to facilitate associative learning.

      Following the Reviewer’s comment, we have now included a new Supplementary Figure 4, which splits the photometry data by preconditioning and conditioning session. Overall, these data suggest that there are no major changes in cell activity in either hippocampal region across sessions, as a similar light-tone-induced enhancement of activity is observed. There is only an interesting trend in the activity of Pan-Neurons at the onset of the light during conditioning sessions. All this is now included in the Results Section (Page 12, Lines 13-15).
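
      As a rough illustration of the early-versus-late comparison the Reviewer requests (the response above instead splits the data by session), per-trial response magnitudes can be averaged within consecutive blocks of trials. This is only a sketch under stated assumptions: the helper `mean_response_by_block`, the block size, and the peak z-score values are all hypothetical and are not taken from the study.

```python
from statistics import mean


def mean_response_by_block(trial_responses, block_size):
    """Average per-trial responses within consecutive blocks of trials."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    return [
        mean(trial_responses[i:i + block_size])
        for i in range(0, len(trial_responses), block_size)
    ]


# Hypothetical peak z-scores per light-tone pairing; NOT data from the study.
peak_z = [1.2, 1.0, 1.4, 1.1, 1.3, 1.2, 1.0, 1.4]

# Two blocks of four trials: the first is "early", the second is "late".
early, late = mean_response_by_block(peak_z, block_size=4)
print(f"early = {early:.3f}, late = {late:.3f}")
```

      Similar block means across early and late trials would be consistent with the response's statement that activity does not change markedly across sessions, although a formal comparison would still require a statistical test across animals.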

      (4) The authors note that PV cell responses recorded with GCaMP were similar to general hippocampal neurons, yet chemogenetic manipulations of PV cells did not impact behavior. A more detailed discussion of this discrepancy would be helpful. 

      As suggested by the Reviewer, we have included additional Discussion to explain the potential discrepancy between the activity of PV interneurons assessed by photometry and its modulation by chemogenetics (see Page 16, Lines 27-33).   

      (5) All fiber photometry recordings were conducted in male mice. Given the sex differences observed in associative learning, the authors could expand the study to include dHPC responses in females during both preconditioning and conditioning sessions. 

      We appreciate the Reviewer's comment. Indeed, thanks to comments made by other Reviewers in this revision (see Point 1 of Reviewer #2), we are still not sure that we have an optimal protocol to study mediated learning in female mice, due to sex-specific changes related to fear generalization. Thus, the revised version of the manuscript, although highlighting these sex differences in behavioral performance (see Supplementary Figure 2), is more focused on male mice and, accordingly, all photometry and chemogenetic experiments are performed exclusively in male mice. In future studies, once we are sure that we have a sensory preconditioning paradigm that works in female mice, it will be very interesting to study whether the same hippocampal mechanisms mediating this behavior in male mice are also observed in female mice.

      Minor Comments: 

      (1) In the right panel of Figure 2A, females received only one conditioning session, so the "x2" should be corrected to "x1" conditioning to accurately reflect the data. 

      We thank the Reviewer for the comment that has been addressed in the revised version of the manuscript.  

      (2) The overall presentation of Figure 3 could be improved. For example, the y-axis in Panel B could be cut to a maximum of 3 rather than 6, which would better highlight the response data. Alternatively, including heatmap representations of the z-score responses could enhance clarity and visual impact.  

      We thank the Reviewer for the comment, which has been addressed by providing a new format for Figures 2 and 3 in the revised version of the manuscript.

      (3) There are several grammatical errors throughout the manuscript. It is recommended that the authors use a grammar correction tool to improve the overall writing quality and readability.  

      We have tried to correct the grammar throughout the manuscript.

      Reviewer #2 (Recommendations for the authors):  

      (1) In the abstract the authors write that sensory preconditioning requires the "repeated and simultaneous presentation of two low-salience stimuli such as a light and a tone". Previous research has shown that sensory preconditioning can still occur if the two stimuli are presented serially, rather than simultaneously. Further, the tone and the light are not necessarily "low-salience", for example, they can be loud or bright. It would be better to refer to them as innocuous. 

      In the revised version of the abstract, we have included the modifications suggested by the Reviewer.   

      (2) The authors develop a novel automated tool for assessing freezing behaviour in mice that correlates highly with both manual freezing and existing, open-source freeze estimation software (ezTrack). The authors should explain how the new program differs from ezTrack, or if it provides any added benefit over this existing software. 

      We have added new information in the Results Section (Page 10, Lines 13-20) to better explain how the new tool to quantify freezing could improve on existing software.

      (3) In Experiment 1, the authors report a sex difference in levels of freezing between male and female mice when they are only given one session of sensory preconditioning. This should be supported by a statistical comparison of levels of freezing between male and female mice. 

      Based on the new results obtained with female mice, we have decided to remove the original Figure 1 of the manuscript as it is not meaningful to compare male and female mediated learning response if we do not have an optimal protocol in female mice.  

      (4) Why did the authors choose to vary the duration of the stimuli across preconditioning, conditioning, and testing? During preconditioning, the light-tone compound was 30s, in conditioning the light was 10s, and at test both stimuli were presented continuously for 3 min. Did the level of freezing vary across the three-minute probe session? There is some evidence that rodents can learn the timing of stimuli and it may be the case that freezing was highest at the start of the test stimulus, when it most closely resembled the conditioned stimulus. 

      Differences in cue exposure between the preconditioning phase and the test phase were decided based on important differences between previous protocols used in rats and how mice respond. Indeed, mice were not able to adapt their behavioral responses when shorter cue-exposure time windows were used, as clearly happens with rats [1]. In addition, we have added a new graph to show the time course of the behavioral responses (see Figures 1 and 4 and Supplementary Figure 2), which correlates with the quantification of freezing responses shown as the percentage of freezing during ON and OFF periods.

      (5) The title of Experiment 1 "Female and male mice show mediated learning using an auditory-visual sensory preconditioning task" - this experiment does not demonstrate mediated learning; it merely shows that animals will freeze more in the presence of a stimulus as compared with no stimulus. This experiment lacks the necessary controls to claim mediated learning (which are presented in Experiment 2) and should therefore be retitled something more appropriate.

      As stated above, based on the new results obtained with female mice, we have decided to remove the original Figure 1 of the manuscript as it is not meaningful to compare male and female mediated learning response if we do not have an optimal protocol in female mice.   

      (6) In Figure 2, why does the unpaired group show less freezing to the tone than the paired group given that the tone was directly paired with the shock in both groups? 

      We believe the Reviewer may have referred to the tone in error (i.e. there are no differences in the freezing observed to the tone) and may instead be referring to the freezing induced by the Light in the direct learning test. In this case, it is true that direct learning (i.e. the percentage of freezing) seems to be slightly lower in the unpaired group than in the paired one, which could be due to a latent inhibition process caused by the different cue exposure in the paired and unpaired experimental groups. However, direct learning in both groups is clear and significant, and there are no significant differences between them, which makes it difficult to draw any further conclusion.

      (7) The stimuli in the design schematics are quite small and hard to see, they should be enlarged for clarity. The box plots also looked stretched and the colour difference between the on and off periods is difficult to discern. 

      We have made some important modifications to the Figures in order to address the comments made by the Reviewer and improve their quality.

      (8) The authors do not include labels for the experimental groups (paired, unpaired, no shock) in Figures 2B, 2D, 2C, and 2E. This made it very difficult to interpret the figure.  

      Figure 2 has been changed according to this suggestion.

      (9) The levels of freezing during conditioning should be presented for all experiments.  

      We have generated a new Supplementary Figure 9 to show the freezing levels during conditioning sessions. 

      (10) In the final experiment, the authors wrote that mice were injected with J60 or saline, but I could not find the data for the saline animals.  

      In the Results and Methods sections, we have included a sentence to better explain this issue. In addition, we have added a new Supplementary Figure 7 to show the performance of all control groups.

      (11) Please list the total number of animals (per group, per sex) for each experiment.  

      In the revised version of the manuscript, we have added this information in each Figure Legend.  

      Reviewer #3 (Recommendations for the authors): 

      I found this study very interesting, despite a few weaknesses. I have several minor comments to add, hoping that it would improve the manuscript: 

      (1) The terminology used is not always appropriate/consistent. I would use "freely moving fiber photometry" or simply "fiber photometry" as calcium imaging conventionally refers to endoscopic or 2-photon calcium imaging. 

      We thank the Reviewer for this comment that has been addressed and corrected in the revised version of the manuscript. 

      (2) "Dorsal hippocampus mediates light-tone sensory preconditioning task in mice" suggests that a brain region mediates a task. I would rather suggest, e.g. "Dorsal hippocampus mediates light-tone association in mice" 

      We thank the Reviewer for this comment that has been addressed and corrected in the revised version of the manuscript.

      (3) As you are using low-salience stimuli, it would be better to also inform the readership with the light intensity used for the light cue, for replicability purposes. 

      In the Methods section (Page 5, Line 30), we have added new information regarding the visual stimuli used. 

      (4) If the authors didn't use a background noise during the probe tests, the tone cue could have been perceived as being louder/clearer by mice. Couldn't it have inflated the freezing response for the tone cue?  

      This is an interesting comment, although we do not have any data to directly address the Reviewer's suggestion. However, the background noise proved necessary to set up the protocol and to change different aspects of the context throughout the paradigm, which was necessary to avoid fear generalization in mice. In addition, as demonstrated before [6], background noise is important to prevent other auditory cues (i.e. the tone) from inducing fear responses by themselves, as the transition from noise to silence is a signal of danger for animals.

      (5) "salience" is usually used for the intensity of a stimulus, not for an association or pairing. Rather, we usually refer to the strength of an association. 

      We thank the Reviewer for this comment that has been addressed and corrected in the revised version of the manuscript.

      (6) Figure 3, panel A. "RCaMP Neurons", maybe "Pan-Neurons" would be more appropriate, as PV+ inter-neurons are also neurons. 

      We thank the Reviewer for this comment that has been corrected accordingly.

      (7) Figure 4, panel A, please add the AAV injected, and the neurons labelled in your example slice. 

      We thank the Reviewer for this comment that has been corrected accordingly.

      References

      (1) Wong, F. S., Westbrook, R. F. & Holmes, N. M. 'Online' integration of sensory and fear memories in the rat medial temporal lobe. Elife 8 (2019). https://doi.org/10.7554/eLife.47085

      (2) Prendergast, B. J., Onishi, K. G. & Zucker, I. Female mice liberated for inclusion in neuroscience and biomedical research. Neurosci Biobehav Rev 40, 1-5 (2014). https://doi.org/10.1016/j.neubiorev.2014.01.001

      (3) Becker, J. B., Prendergast, B. J. & Liang, J. W. Female rats are not more variable than male rats: a meta-analysis of neuroscience studies. Biol Sex Differ 7, 34 (2016). https://doi.org/10.1186/s13293-016-0087-5

      (4) Shansky, R. M. Are hormones a "female problem" for animal research? Science 364, 825-826 (2019). https://doi.org/10.1126/science.aaw7570

      (5) Busquets-Garcia, A. et al. Hippocampal CB1 Receptors Control Incidental Associations. Neuron 99, 1247-1259 e1247 (2018). https://doi.org/10.1016/j.neuron.2018.08.014

      (6) Pereira, A. G., Cruz, A., Lima, S. Q. & Moita, M. A. Silence resulting from the cessation of movement signals danger. Curr Biol 22, R627-628 (2012). https://doi.org/10.1016/j.cub.2012.06.015

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      SMC5/6 is a highly conserved complex able to dynamically alter chromatin structure, playing in this way critical roles in genome stability and integrity that include homologous recombination and telomere maintenance. In the last years, a number of studies have revealed the importance of SMC5/6 in restricting viral expression, which is in part related to its ability to repress transcription from circular DNA. In this context, Oravcova and colleagues recently reported how SMC5/6 is recruited by two mutually exclusive complexes (orthologs of yeast Nse5/6) to SV40 LT-induced PML nuclear bodies (SIMC/SLF2) and DNA lesions (SLF1/2). In this current work, the authors extend this study, providing some new results. However, as a whole, the story lacks unity and does not delve into the molecular mechanisms responsible for the silencing process. One has the feeling that the story is somewhat incomplete, putting together not directly connected results.

      Please see the introductory overview above.

      (1) In the first part of the work, the authors confirm previous conclusions about the relevance of a conserved domain defined by the interaction of SIMC and SLF2 for their binding to SMC6, and extend the structural analysis to the modelling of the SIMC/SLF2/SMC complex by AlphaFold. Their data support a model where this conserved surface of SIMC/SLF2 interacts with SMC at the backside of SMC6's head domain, confirming the relevance of this interaction site with specific mutations. These results are interesting but confirmatory of a previous and more complete structural analysis in yeast (Li et al. NSMB 2024). In any case, they reveal the conservation of the interaction. My major concern is the lack of connection with the rest of the article. This structure does not help to understand the process of transcriptional silencing reported later beyond its relevance to recruit SMC5/6 to its targets, which was already demonstrated in the previous study.

      Demonstrating the existence of a conserved interface between the Nse5/6-like complexes and SMC6 in both yeast and human is foundationally important, not confirmatory, and was not revealed in our previous study. It remains unclear how this interface regulates SMC5/6 function, but yeast studies suggest a potential role in inhibiting the SMC5/6 ATPase cycle. Nevertheless, the precise function of Nse5/6 and its human orthologs in SMC5/6 regulation remains undefined, largely due to technical limitations in available in vivo analyses. The SIMC1/SLF2/SMC6 complex structure likely extends to the SLF1/2/SMC6 complex, suggesting a unifying function of the Nse5/6-like complexes in SMC5/6 regulation, albeit in the distinct processes of ecDNA silencing and DNA repair. There have been no studies to date (including this one) showing that SIMC1-SLF2 is required for SMC5/6 recruitment to ecDNA. Our previous study showed that SIMC1 was needed for SMC5/6 to colocalize with SV40 LT antigen at PML NBs. Here we show that SIMC1 is required for ecDNA repression, in the absence of PML NBs, which was not anticipated.

      (2) In the second part of the work, the authors focus on the functionality of the different complexes. The authors demonstrate that SMC5/6's role in transcription silencing is specific to its interaction with SIMC/SLF2, whereas SMC5/6's role in DNA repair depends on SLF1/2. These results are quite expected according to previous results. The authors already demonstrated that SLF1/2, but not SIMC/SLF2, are recruited to DNA lesions. Accordingly, they observe here that SMC5/6 recruitment to DNA lesions requires SLF1/2 but not SIMC/SLF2. Likewise, the authors already demonstrated that SIMC/SLF2, but not SLF1/2, targets SMC5/6 to PML NBs. Taking into account the evidence that connects SMC5/6's viral resistance at PML NBs with transcription repression, the observed requirement of SIMC/SLF2 but not SLF1/2 in plasmid silencing is somehow expected. This does not mean the expectation has not to be experimentally confirmed. However, the study falls short in advancing the mechanistic process, despite some interesting results as the dispensability of the PML NBs or the antagonistic role of the SV40 large T antigen. It had been interesting to explore how LT overcomes SMC5/6-mediated repression: Does LT prevent SIMC/SLF2 from interacting with SMC5/6? Or does it prevent SMC5/6 from binding the plasmid? Is the transcription-dependent plasmid topology altered in cells lacking SIMC/SLF2? And in cells expressing LT? In its current form, the study is confirmatory and preliminary. In agreement with this, the cartoons modelling results here and in the previous work look basically the same.

      Our previous study only examined the localization of SLF1 and SIMC1 at DNA lesions. The localization of these subcomplexes alone should not be used to define their roles in SMC5/6 localization. Indeed, the field is split in terms of whether Nse5/6-like complexes are required for ecDNA binding/loading, or regulation of SMC5/6 once bound. 

      We agree, determining the potential mechanism of action of LT in overcoming SMC5/6-based repression is an important next step. We believe it is unlikely to be due to blocking of the SMC5/6-SIMC1/SLF2 interface, since SIMC1-SLF2 is required for SMC5/6 to localize at LT-induced foci. It will require the identification of any direct interactions with SMC5/6 subunits, and better methods for assessing SMC5/6 loading and activity on ecDNAs. Unlike HBx, Vpr, and BNRF1, LT does not appear to induce degradation of SMC5/6, making it a more complex and interesting challenge. Also, the dispensability of PML NBs in plasmid silencing versus viral silencing raises multiple important questions about SMC5/6’s repression mechanism.

      (3) There are some points about the presented data that need to be clarified.

      Thank you, we have addressed these points below, within the Recommendations for authors section.

      Reviewer #2 (Public review):

      Oracová et al. present data supporting a role for SIMC1/SLF2 in silencing plasmid DNA via the SMC5/6 complex. Their findings are of interest, and they provide further mechanistic detail of how the SMC5/6 complex is recruited to disparate DNA elements. In essence, the present report builds on the author's previous paper in eLife in 2022 (PMID: 36373674, "The Nse5/6-like SIMC1-SLF2 complex localizes SMC5/6 to viral replication centers") by showing the role of SIMC1/SLF2 in localisation of the SMC5/6 complex to plasmid DNA, and the distinct requirements as compared to recruitment to DNA damage foci. Although the findings of the manuscript are of interest, we are not yet convinced that the new data presented here represents a compelling new body of work and would better fit the format of a "research advance" article. In their previous paper, Oracová et al. show that the recruitment of SMC5/6 to SV40 replication centres is dependent on SIMC1, and specifically, that it is dependent on SIMC1 residues adjacent to neighbouring SLF2.

      We agree. We submitted this manuscript as a “Research Advance”, not as a standalone research article, given that it is an extension of our previous “Research Article” (1).

      Other comments

      (1) The mutations chosen in Figure 1 are quite extensive - 5 amino acids per mutant. In addition, they are in many cases 'opposite' changes, e.g., positive charge to negative charge. Is the effect lost if single mutations to an alanine are made?

      The mutations were chosen to test and validate the predicted SIMC1-SLF2-SMC6 structure i.e. the contact point between the conserved patch of SIMC1-SLF2 and SMC6. Multiple mutations and charge inversions increased the chance of disrupting the extensive interface. In this respect, the mutations were successful and informative, confirming the requirement of this region in specifically contacting SMC6. Whilst alanine scanning mutations are possible, we believe that they would not add to, or detract from, our validation of the predicted SIMC1-SLF2-SMC6 interface.

      (2) In Figure 2c, it isn't clear from the data shown that the 'SLF2-only' mutations in SMC6 result in a substantial reduction in SIMC1/SLF2 binding.

      To clarify the difference between wild-type and SLF2-only mutations in SIMC1-SLF2 interaction, we have performed an image volume analysis. This shows that the SLF2-facing SMC6 mutant reduces its interaction with SIMC1 (to 44% of WT) and SLF2 (to 21% of WT). The reduction in both SIMC1 and SLF2 interaction with SMC6 SLF2-facing mutant is expected, since SIMC1 and SLF2 are an interdependent heterodimer.  

      Author response table 1.

      (3) In the GFP reporter assays (e.g. Figure 3), median fluorescence is reported - was there any observed difference in the percentage of cells that are GFP positive?

      Yes, as expected when the GFP plasmid is not actively repressed, the percentage of GFP-positive cells differs between cell lines, following the same trend as GFP intensity.

      (4) The potential role of the large T antigen as an SMC5/6 evasion factor is intriguing. However, given the role of the large T antigen as a transcriptional activator, caution is required when interpreting enhanced GFP fluorescence. Antagonism of the SMC5/6 complex in this context might be further supported by ChIP experiments in the presence or absence of large T. Can large T functionally substitute for HBx or HIV-Vpr?

      We agree, the potential role of LT in SMC5/6 antagonism is interesting. We did state in the text: “While LT is known to be a promiscuous transcriptional activator (2,3) that does not rule out a co-existing role in antagonizing SMC5/6. Indeed, these findings are reminiscent of HBx from HBV and Vpr of HIV-1, both of which are known promiscuous transcriptional activators that also directly antagonize SMC5/6 to relieve transcriptional repression (4-10).” We have tried ChIP experiments, but found these to be unreliable in assessing SMC5/6 association with plasmid DNA. Given the many disparate targets of LT, HBx and Vpr (other than SMC5/6), it seems unlikely that LT could functionally substitute for HBx and Vpr in supporting HBV and HIV-1 infections. Whilst certainly an interesting future question, we believe it is beyond the scope of this study.

      (5) In Figure 5c, the apparent molecular weight of large T and SMC6 appears to change following transfection of GFP-SMC5 - is there a reason for this?

      We are not certain as to what causes the molecular weight shift, but it is not specifically related to GFP-SMC5 transfection. Rather, it appears to be a general effect of the pulldown. Indeed, a very weak “background” band of LT is seen in the GFP only pulldown, which also runs at a “higher” molecular weight, as in the GFP-SMC5 pulldown. We believe that the effect is instead related to gel mobility in the wells that contain post pulldown proteins and different buffers. We have also seen similar effects using different protein-protein interaction pairs.

      Reviewer #3 (Public review):

      Summary:

      This study by the Boddy and Otomo laboratories further characterizes the roles of SMC5/6 loader proteins and related factors in SMC5/6-mediated repression of extrachromosomal circular DNA. The work shows that mutations engineered at an AlphaFold-predicted protein-protein interface formed between the loader SLF2/SIMC1 and SMC6 (similar to the interface in the yeast counterparts observed by cryo-EM) prevent co-IP of the respective proteins. The mutations in SLF2 also hinder plasmid DNA silencing when expressed in SLF2-/- cell lines, suggesting that this interface is needed for silencing. SIMC1 is dispensable for recruitment of SMC5/6 to sites of DNA damage, while SLF1 is required, thus separating the functions of the two loader complexes. Preventing SUMOylation (with a chemical inhibitor) increases transcription from plasmids but does not in SLF2-deleted cell lines, indicating the SMC5/6 silences plasmids in a SUMOylation dependent manner. Expression of LT is sufficient for increased expression, and again, not additive or synergistic with SIMC1 or SLF2 deletion, indicating that LT prevents silencing by directly inhibiting 5/6. In contrast, PML bodies appear dispensable for plasmid silencing.

      Strengths:

      The manuscript defines the requirements for plasmid silencing by SMC5/6 (an interaction of Smc6 with the loader complex SLF2/SIMC1, SUMOylation activity) and shows that SLF1 and PML bodies are dispensable for silencing. Furthermore, the authors show that LT can overcome silencing, likely by directly binding to (but not degrading) SMC5/6.

      Weaknesses:

      (1) Many of the findings were expected based on recent publications.

      There have been no manuscripts describing the role of SIMC1-SLF2 in ecDNA silencing. There have been studies describing SLF2’s roles in ecDNA silencing, but these suggested SLF2 had an SLF1-independent role, with no mention of an alternate Nse5-like cofactor. Our earlier study in eLife (1) described the identification of SIMC1 as an Nse5-like cofactor for SLF2 but did not test potential roles of the complex in ecDNA silencing. Also, the apparent dispensability of PML NBs in plasmid silencing (in U2OS cells) was unexpected based on recent publications. Finally, SV40 LT has not previously been implicated in SMC5/6 inhibition, which may occur through novel mechanisms.

      (2) While the data are consistent with SIMC1 playing the main function in plasmid silencing, it is possible that SLF1 contributes to silencing, especially in the absence of SIMC1. This would potentially explain the discrepancy with the data reported in ref. 50. SLF2 deletion has a stronger effect on expression than SIMC1 deletion in many but not all experiments reported in this manuscript. A double mutant/deletion experiments would be useful to explore this possibility.

      It is interesting to note that the data in ref. 50 (11) is also at odds with that in ref. 45 (8) in terms of defining a role for SLF1 in the silencing of unintegrated HIV-1 DNA. The Irwan study showed that SLF1 deficient cells exhibit increased expression of a reporter gene from unintegrated HIV-1, whereas the Dupont study found that SLF1 deletion, unlike SLF2 deletion, has no effect. It is unclear what the basis of this discrepancy is. In line with the Dupont study, we found no effect of SLF1 deletion on plasmid expression (Figure 4B), whereas SLF2 deletion increased reporter expression (Figure 3A/B). It is possible that SLF1 could support some plasmid silencing in the absence of SIMC1, especially considering the gross structural similarity in their C-terminal Nse5-like domains. However, we have been unable to generate double-knockout SIMC1 and SLF1 cells to test such a possibility, and shSLF1 has been ineffective. 

      (3) SLF2 is part of both types of loaders, while SLF1 and SIMC1 are specific to their respective loaders. Did the authors observe differences in phenotypes (growth, sensitivities to DNA damage) when comparing the mutant cell lines or their construction? This should be stated in the manuscript.

      We have not observed significant differences in the growth rates of each cell line, and DNA damage sensitivities are as yet untested.   

      (4) It would be desirable to have control reporter constructs located on the chromosome for several experiments, including the SUMOylation inhibition (Figures 5A and 5-S2) and LT expression (Figure 5D) to exclude more general effects on gene expression.

      We have repeated all GFP reporter assays using integrated versus episomal plasmid DNA. A seminal study by Decorsière et al. (6) showed that SMC5/6 degradation by HBx of HBV increased transcription of episomal but not chromosomally integrated reporters. In line with this data, the deletion of SLF2 does not notably impact the expression of our GFP reporter construct when it is genomically integrated (Figure 3—figure supplement 1C).  

      Somewhat surprisingly, given the generally transcriptionally repressive roles of SUMO, inhibition of the SUMO pathway with SUMOi did not significantly impact the expression of our genomically integrated GFP reporter, versus the episomal plasmid (Figure 5—figure supplement 1C). Finally, the expression of SV40 LT, which enhances plasmid reporter expression (Figure 5D), also did not notably affect expression of the same reporter when located in the genome (Figure 5—figure supplement 3B). This is an interesting result, which is in line with an early study showing that HBx of HBV induces transcription from episomal, but not chromosomally integrated reporters (12). This further suggests that SV40 LT acts similarly to other early viral proteins like HBx and Vpr to counteract or bypass SMC5/6 restriction, amongst their multifaceted functions. Clearly, further analyses are needed to define mechanisms of LT in counteracting SMC5/6, but they do not appear to include complex degradation as seen with HBx and Vpr.  

      (5) Figure 5A: There appears to be an increase in GFP in the SLF2-/- cells with SUMOi? Is this a significant increase?

      No significant difference was found between WT, SIMC1-/- or SLF2-/- when treated with SUMOi (p>0.05). The p-value is 0.0857 when comparing SLF2-/- to WT in the SUMOi condition. This is described in the legend to Figure 5.

      (6) The expression level of SLF2 mut1 should be tested (Figure 3B).

      Full length SLF2 (WT or mutants) has been undetectable by western analyses. However, truncated SLF2 mut1 expresses well and binds SIMC1 but not SMC6 (Figure 1C). Moreover, full length SLF2 mut1 expression was confirmed by qPCR – showing a somewhat higher expression level than SLF2 WT (Figure 3—figure supplement 1B).  

      Reviewer #1 (Recommendations for the authors):

      There are some points about the presented data that need to be clarified.

      (1) Figures 3, 4B, and 5. The authors should rule out the possibility that the reported effects on transcription were due to alterations in plasmid number. This is particularly important, taking into account the importance of SMC5/6 in DNA replication.

      We used qPCR to assess plasmid copy number versus genomic DNA in our cell lines, testing at 72 hours post transfection to avoid any impact of cytosolic DNA (13). Our qPCR data show that there is no significant impact on plasmid copy number across our cell lines i.e. WT and SLF2 null.  SMC5/6 has a positive role in DNA replication progression on the genome (e.g. (14)), so loss of SMC5/6 “targeting” in SIMC1 and SLF2 null cells would be unlikely to promote replication fork progression per se. 

      (2) Figure S1A. In contrast to the statement in the text, the SIMC1-combo control is affected in its binding to SLF2; however, it is not affected in its binding to SMC6. This is somehow unexpected because it suggests that the solenoid-like structure is not required for SMC6 binding, just specific patches at either SIMC or SLF2. This should be commented on.

      We appreciate the reviewer’s observation regarding the discrepancy between Figure S1A and the text. This was our oversight. The data show that SLF2 recovery was reduced in the pull-down with the SIMC1 combo control mutant, while SLF2 expression was unchanged. Because SLF2 or SIMC1 variants that fail to associate typically show poor expression (1), these findings suggest that the SIMC1 combo control mutant associates with SLF2, albeit more weakly. Since the mutations were introduced into surface residues of SIMC1, it is not immediately clear how they would weaken the interaction or destabilize the complex. In contrast, SMC6 was fully recovered with the SIMC1 combo control mutant, indicating that the SIMC1–SMC6 interaction remains stable without stoichiometric SLF2. This may reflect direct recognition of a SIMC1 binding epitope or stabilization of its solenoid structure by SMC6, although this interpretation remains uncertain given the unstable nature of free SIMC1 and SLF2. Alternatively, SMC6 may have co-sedimented with the SIMC1 combo control mutant together with SLF2, which was initially retained but subsequently lost during washing, whereas SMC6 remained due to its limited solubility in the absence of other SMC5/6 subunits. While further mechanistic analysis will require purified SMC5/6 components, our data support the AlphaFold-based model by demonstrating that SIMC1 mutations on the non–SMC6-contacting surface retain association with SMC6. The text has been revised accordingly.

      (3) The SLF2-only mutant has alterations that affect interactions with both SLF2 and SIMC1. Is it not another Mixed mutant?

      We appreciate the reviewer’s observation regarding the discrepancy between the mutant name (“SLF2-only”) and its description (“while N947 forms salt bridges with SIMC1”). The previous statement was inaccurate due to a misinterpretation of several AlphaFold models. Across these models, the SIMC1–SLF2 interface residues remain largely consistent, but the SIMC1 residue R470 exhibits positional variability, contacting N947 in some models but not in others. Given this variability and the absence of an experimental structure, we have revised the text to avoid overinterpretation. Because the N947 side chain is oriented toward SLF2 and consistently forms polar contacts with the H1148 side chain and G1149 backbone, we have renamed this mutant “SLF2-facing,” which more accurately describes its modeled environment. The other mutants are likewise renamed “SIMC1-facing” and “SIMC1–SLF2-groove-facing,” providing a clearer and more consistent description of the interface.

      (4) The SLF2-only mutant still displays clear interactions with SMC6. Can this be explained with the AlphaFold model?

      SIMC1 may contribute more substantially to SMC6 binding than SLF2, consistent with our mutagenesis results. However, the energetic contributions of individual residues or proteins cannot be quantitatively inferred from structural models alone. Comprehensive experimental and computational analyses would be required to address this point.

      (5) The conclusions about the role of SUMOylation are vague; it is already known that its general effect on transcription repression, and the authors already demonstrated that SIMC interacts with SUMO pathway factors. Concerning the epistatic effect, the experiment should be done at a lower inhibitor concentration; at 100 nM there is not much margin to augment according to the kinetics analysis in Figure S5.

      The SUMO pathway is indeed thought to be generally repressive for transcription. Notably, in response to a suggestion from Reviewer 3 (public review point 4), we have repeated several of our GFP expression assays using cells with the GFP reporter plasmid integrated into the genome (please see Figure 3—figure supplement 1C; Figure 5—figure supplement 1C; Figure 5—figure supplement 3B). This type of integrated reporter does not show elevated expression following inhibition of the SMC5/6 complex, unlike ecDNAs (6,10). Interestingly, SUMOi, LT expression, and SLF2 knockout also did not notably impact the expression of our integrated GFP reporter (Figure 3—figure supplement 1C; Figure 5—figure supplement 1C; Figure 5—figure supplement 3B), unlike that of the plasmid (ecDNA) reporter. Given the “general” inhibitory effect of SUMO on transcription, the SUMOi result was not expected, and it opens further interesting avenues for study.

      In Figure 5—figure supplement 1A, 100 nM SUMOi increases reporter expression to a level well below that reached at the highest SUMOi dose, so there is margin for a further increase. If the ~3-4 fold induction of GFP expression in SLF2 null cells were independent of SUMOylation, we would therefore expect it to further increase GFP expression in SUMOi-treated cells. The impact of SUMOylation on GFP reporter expression remains “vague”, but our data indicate that SMC5/6 operates within SUMO’s “umbrella” function and provides a starting point for more mechanistic dissection.

      (6) Figure 5C. Why is the size different between Input versus GFP-PD?

      Please see our response to this question above: reviewer 2, point (5)

      Reviewer #2 (Recommendations for the authors):

      If further data could be provided to extend on that which is presented, then publication as a 'standalone research article' may be appropriate, but not in its present form.

      We submitted this manuscript as a “Research Advance” not as a standalone research article, given that it was an extension of our previous research article (1).

      Reviewer #3 (Recommendations for the authors):

      (1) The term 'LT' should be defined in the title

      We have updated the title accordingly.  

      (2) This reviewer found the nomenclature of the SMC6 mutants confusing (SIMC1-only...). Either rephrase or define more clearly in the text and the figures.

      We agree with the reviewer and have renamed the mutants as “SIMC1-facing”, “SLF2-facing”, and “SIMC1–SLF2-groove-facing”.

      (3) The authors could better emphasize that LT blocks silencing in trans (not only on its cognate target sequence in cis). This is consistent with the observed direct binding to SMC5/6.

      We appreciate the suggestion to further emphasize the impact of LT on plasmid silencing. We did not want to overstate its impact at this time because we do not know if it directly binds SMC5/6 or indeed affects SMC5/6 function more broadly. LT expression, like HBx, does cause induction of a DNA damage response, but we cannot at this point tie that response to SMC5/6 inhibition alone.

      (4) Figure 5 S1: the merge looks drastically different. Is DAPI omitted in the wt merge image?

      Thank you for noting this issue. We have corrected the image, which was impacted by the use of an underexposed DAPI image.  

      (5) Figure 1: how is the structure in B oriented relative to A? A visual guide would be helpful.

      We have added arrows to indicate the view orientation and rotational direction to turn A to B.

      (6) Line 126, unclear what "specificity" here means.

      We have revised the sentence without this word, which now starts with “To confirm the SIMC1-SMC6 interface, we introduced….”

      (7) Line 152, The statement implies that the conserved residues are needed for loader subunits interactions ('mediating the SIMC1-SLF2 interaction"). Does Figure 1C not show that the residues are not important? Please clarify.

      Thank you for noting this writing error. We have corrected the sentence to provide the intended meaning. It now reads “Collectively, these results confirm that the conserved surface patch of SIMC1-SLF2 is essential for SMC6 binding.”

      References

      (1) Oravcova M, Nie M, Zilio N, Maeda S, Jami-Alahmadi Y, Lazzerini-Denchi E, Wohlschlegel JA, Ulrich HD, Otomo T, Boddy MN. The Nse5/6-like SIMC1-SLF2 complex localizes SMC5/6 to viral replication centers. Elife. 2022;11. PMCID: PMC9708086

      (2) Sullivan CS, Pipas JM. T antigens of simian virus 40: molecular chaperones for viral replication and tumorigenesis. Microbiol Mol Biol Rev. 2002;66(2):179-202. PMCID: PMC120785

      (3) Gilinger G, Alwine JC. Transcriptional activation by simian virus 40 large T antigen: requirements for simple promoter structures containing either TATA or initiator elements with variable upstream factor binding sites. J Virol. 1993;67(11):6682-8. PMCID: PMC238107

      (4) Qadri I, Conaway JW, Conaway RC, Schaack J, Siddiqui A. Hepatitis B virus transactivator protein, HBx, associates with the components of TFIIH and stimulates the DNA helicase activity of TFIIH. Proc Natl Acad Sci U S A. 1996;93(20):10578-83. PMCID: PMC38195

      (5) Aufiero B, Schneider RJ. The hepatitis B virus X-gene product trans-activates both RNA polymerase II and III promoters. EMBO J. 1990;9(2):497-504. PMCID: PMC551692

      (6) Decorsiere A, Mueller H, van Breugel PC, Abdul F, Gerossier L, Beran RK, Livingston CM, Niu C, Fletcher SP, Hantz O, Strubin M. Hepatitis B virus X protein identifies the Smc5/6 complex as a host restriction factor. Nature. 2016;531(7594):386-9. 

      (7) Murphy CM, Xu Y, Li F, Nio K, Reszka-Blanco N, Li X, Wu Y, Yu Y, Xiong Y, Su L. Hepatitis B Virus X Protein Promotes Degradation of SMC5/6 to Enhance HBV Replication. Cell Rep. 2016;16(11):2846-54. PMCID: PMC5078993

      (8) Dupont L, Bloor S, Williamson JC, Cuesta SM, Shah R, Teixeira-Silva A, Naamati A, Greenwood EJD, Sarafianos SG, Matheson NJ, Lehner PJ. The SMC5/6 complex compacts and silences unintegrated HIV-1 DNA and is antagonized by Vpr. Cell Host Microbe. 2021;29(5):792-805 e6. PMCID: PMC8118623

      (9) Felzien LK, Woffendin C, Hottiger MO, Subbramanian RA, Cohen EA, Nabel GJ. HIV transcriptional activation by the accessory protein, VPR, is mediated by the p300 co-activator. Proc Natl Acad Sci U S A. 1998;95(9):5281-6. PMCID: PMC20252

      (10) Diman A, Panis G, Castrogiovanni C, Prados J, Baechler B, Strubin M. Human Smc5/6 recognises transcription-generated positive DNA supercoils. Nat Commun. 2024;15(1):7805. PMCID: PMC11379904

      (11) Irwan ID, Bogerd HP, Cullen BR. Epigenetic silencing by the SMC5/6 complex mediates HIV-1 latency. Nat Microbiol. 2022;7(12):2101-13. PMCID: PMC9712108

      (12) van Breugel PC, Robert EI, Mueller H, Decorsiere A, Zoulim F, Hantz O, Strubin M. Hepatitis B virus X protein stimulates gene expression selectively from extrachromosomal DNA templates. Hepatology. 2012;56(6):2116-24. 

      (13) Lechardeur D, Sohn KJ, Haardt M, Joshi PB, Monck M, Graham RW, Beatty B, Squire J, O'Brodovich H, Lukacs GL. Metabolic instability of plasmid DNA in the cytosol: a potential barrier to gene transfer. Gene Ther. 1999;6(4):482-97. 

      (14) Gallego-Paez LM, Tanaka H, Bando M, Takahashi M, Nozaki N, Nakato R, Shirahige K, Hirota T. Smc5/6-mediated regulation of replication progression contributes to chromosome assembly during mitosis in human cells. Mol Biol Cell. 2014;25(2):302-17. PMCID: PMC3890350

    1. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Public Review): 

      Summary: 

      This paper by Schommartz and colleagues investigates the neural basis of memory reinstatement as a function of both how recently the memory was formed (recent, remote) and its development (children, young adults). The core question is whether memory consolidation processes as well as the specificity of memory reinstatement differ with development. A number of brain regions showed a greater activation difference for recent vs. remote memories at the long versus shorter delay specifically in adults (cerebellum, PHG, LOC). A different set showed decreases in the same comparison, but only in children (precuneus, RSC). The authors also used neural pattern similarity analysis to characterize reinstatement, though still in this revised paper I have substantive concerns about how the analyses were performed. While scene-specific reinstatement decreased for remote memories in both children and adults, claims about its presence cannot be made given the analyses. Gist-level reinstatement was observed in children but not adults, but I also have concerns about this analysis. Broadly, the behavioral and univariate findings are consistent with the idea memory consolidation differs between children and adults in important ways, and takes a step towards characterizing how.

      Strengths: 

      The topic and goals of this paper are very interesting. As the authors note, there is little work on memory consolidation over development, and as such this will be an important data point in helping us begin to understand these important differences. The sample size is great, particularly given this is an onerous, multi-day experiment; the authors are to be commended for that. The task design is also generally well controlled, for example as the authors include new recently learned pairs during each session.  

      Weaknesses: 

      As noted above and in my review of the original submission, the pattern similarity analyses for both item and category-level reinstatement were performed in a way that is not interpretable given concerns about temporal autocorrelation within scanning run. Unfortunately, these issues remain of concern in this revision because they were not rectified. Most of my review focuses on this analytic issue, though I also outline additional concerns.

      (1) The pattern similarity analyses are largely uninterpretable due to how they were performed. 

      (a) First, the scene-specific reinstatement index: The authors have correlated a neural pattern during a fixation cross (delay period) with a neural pattern associated with viewing a scene as their measure of reinstatement. The main issue with this is that these events always occurred back-to-back in time. As such, the two patterns will be similar due simply to the temporal autocorrelation in the BOLD signal. Because of the issues with temporal autocorrelation within scanning run, it is always recommended to perform such correlations only across different runs. In this case, the authors always correlated patterns extracted from the same run, and which moreover have temporal lags that are perfectly confounded with their comparison of interest (i.e., from Fig 4A, the "scene-specific" comparisons will always be back-to-back, having a very short temporal lag; "set-based" comparisons will be dispersed across the run, and therefore have a much higher lag). The authors' within-run correlation approach also yields correlation values that are extremely high - much higher than would be expected if this analysis was done appropriately. The way to fix this would be to restrict the analysis to only cross-run comparisons, which is not possible given the design. 

      To remedy this, in the revision the authors have said they will refrain from making conclusions about the presence of scene-specific reinstatement (i.e., reinstatement above baseline). While this itself is an improvement from the original manuscript, I still have several concerns. First, this was not done thoroughly and at times conclusions/interpretations still seem to imply or assume the presence of scene reinstatement (e.g., line 979-985, "our research supports the presence of scene-specific reinstatement in 5-to-7-year-old children"; line 1138). 

      We thank the reviewers for pointing out that there are inconsistencies in our writing. We agree that we cannot make any claims about the baseline level of scene-specific reinstatement. To reiterate, our focus is on the changes in reinstatement over time (30 minutes, 24 hours, and two weeks after learning), which showed a robust decrease. Importantly, scene-specific reinstatement indices for recent items, which were tested on different days, did not significantly differ, as indicated by non-significant main effects of Session (all p > .323) and Session x ROI interactions (all p > .817) in either age group. This supports our claim that temporal autocorrelation is stable and consistent across conditions and that the observed decline in scene-specific reinstatement reflects a time-dependent change in remote retrieval. We have revised the highlighted passages accordingly, emphasizing the delay-related decrease in scene-specific reinstatement rather than its absolute magnitude.

      Second, the authors' logic for the neural-behavioural correlations in the PLSC analysis involved restricting to regions that showed significant reinstatement for the gist analysis, which cannot be done for the analogous scene-specific reinstatement analysis. This makes it challenging to directly compare these two analyses since one was restricted to a small subset of regions and only children (gist), while scene reinstatement included both groups and all ROIs. 

      We thank the reviewer for pointing this out and want to clarify that it was not our intention to directly compare these analyses. For the neural-behavioral correlations, we included only those regions that showed gist-like representations above baseline, whereas for scene-specific reinstatement, we included all regions due to the absence of such a baseline. The primary aim of the PLSC analysis was to identify a set of regions that, after a stringent permutation and bootstrapping procedure, form a latent variable that explains a significant proportion of variance in behavioral performance across all participants.

      Third, it is also unclear whether children and adults' values should be directly comparable given pattern similarity can be influenced by many factors like motion, among other things. 

      We thank the reviewer for raising this important point. In our multivariate analysis, we included confounding regressors specifically addressing motion-related artefacts. Following recent best practices for mitigating motion-related confounding factors in both adult and pediatric fMRI data (Ciric et al., 2017; Esteban et al., 2020; Jones et al., 2021; Satterthwaite et al., 2013), we implemented the most effective motion correction strategies. 

      Importantly, our group × session interaction analysis focuses on relative changes in reinstatement over time rather than comparing absolute levels of pattern similarity between children and adults. This approach controls for potential baseline differences and instead examines whether the magnitude of delay-related changes differs across groups. We believe this warrants the comparison and ensures that our conclusions are not driven by group-level differences in baseline similarity or motion artifacts.

      My fourth concern with this analysis relates to the lack of regional specificity of the effects. All ROIs tested showed a virtually identical pattern: "Scene-specific reinstatement" decreased across delays, and was greater in children than adults. I believe control analyses are needed to ensure artifacts are not driving these effects. This would greatly strengthen the authors' ability to draw conclusions from the "clean" comparison of day 1 vs. day 14. (A) The authors should present results from a control ROI that should absolutely not show memory reinstatement effects (e.g., white matter?). Results from the control ROI should look very different - should not differ between children and adults, and should not show decreases over time. 

      (C) If the same analysis was performed comparing the object cue and immediately following fixation (rather than the fixation and the immediately following scene), the results should look very different. I would argue that this should not be an index of reinstatement at all since it involves something presented visually rather than something reinstated (i.e., the scene picture is not included in this comparison). If this control analysis were to show the same effects as the primary analysis, this would be further evidence that this analysis is uninterpretable and hopelessly confounded. 

      We appreciate the reviewer’s suggestion to strengthen the interpretation of our findings by including appropriate control analyses to rule out non-memory-related artifacts. In response, we conducted several control analyses, detailed below, which collectively support the specificity of the observed reinstatement effects. The results are reported in the manuscript (lines 593-619).

      We checked that item reinstatement for incorrectly remembered trials did not show any session-related decline for any ROI. This indicates that the reinstatement for correctly remembered items is memory-related (see Fig. S5 for details).

      We conducted additional analyses on three subregions of the corpus callosum (the body, genu, and splenium). The results of the linear mixed-effects models revealed no significant group effect (all p > .426), indicating no differences between children and adults. In contrast, all three ROIs showed a significant main effect of Session (all p < .001). However, post hoc analyses indicated that this effect was driven by differences between the recent and the Day 14 remote condition. The main contrasts of interest – recent vs. Day 1 remote and Day 1 remote vs. Day 14 remote – were not significant (all p > .080; see Table S10.4), suggesting that, unlike in other ROIs, there was no delay-related decrease in scene-specific reinstatement in these white matter regions.

      Then we repeated our analysis using the same procedure but replaced the “scene” time window with the “object” time window. The rationale for this control is that comparing the object cue to the immediately following fixation period should not reflect scene reinstatement, as the object and the reinstated scene rely on distinct neural representations. Accordingly, we did not expect a delay-related decrease in the reinstatement index. Consistent with this expectation, the analysis using the object – fixation similarity index – though also influenced by temporal autocorrelation – did not reveal any significant effect of session or delay in any ROI (all p > .059; see Table S9, S9.1).

      Together, these control analyses provide converging evidence that our findings are not driven by global or non-specific signal changes. We believe that these control analyses strengthen our interpretation about delay-related decrease in scene-specific reinstatement index. 
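
      To make the quantity under discussion concrete, the following minimal sketch computes one plausible form of a scene-specific index: the Fisher-z correlation between a trial's fixation pattern and its own scene pattern, minus the mean correlation with the other scenes in the set. This exact formula is an assumption for illustration, as are the function names and the toy four-voxel patterns; the response does not spell out the computation, and the sketch does nothing to address temporal autocorrelation between back-to-back events.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length voxel patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def scene_specific_index(fixations, scenes, trial):
    """z(fixation_i, scene_i) minus the mean z(fixation_i, scene_j), j != i.

    Assumed definition: same-trial similarity relative to a set-based
    baseline built from the other scenes.
    """
    same = math.atanh(pearson(fixations[trial], scenes[trial]))
    others = [
        math.atanh(pearson(fixations[trial], scene))
        for j, scene in enumerate(scenes)
        if j != trial
    ]
    return same - sum(others) / len(others)


# Toy patterns (hypothetical): each fixation is a noisy copy of its scene.
scenes = [[1.0, 2.0, 3.0, 4.0], [4.0, 1.0, 3.0, 2.0], [2.0, 4.0, 1.0, 3.0]]
fixations = [[1.1, 2.3, 2.8, 4.2], [3.9, 1.2, 3.3, 1.8], [2.2, 3.7, 1.1, 3.2]]

indices = [scene_specific_index(fixations, scenes, t) for t in range(len(scenes))]
print([round(v, 2) for v in indices])
```

      The object-fixation control described above would use the same computation with the object-cue patterns substituted for the scene patterns.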

      (B) Do the recent items from day 1 vs. day 14 differ? If so, this could suggest something is different about the later scans (and if not, it would be reassuring). 

      The recent items tested on day 1 and day 14 do not differ (all p > .323). This effect remains stable across all ROIs.

      (b) For the category-based neural reinstatement: (1) This suffers from the same issue of correlations being performed within run. Again, to correct this the authors would need to restrict comparisons to only across runs (i.e., patterns from run 1 correlated with patterns for run 2 and so on). The authors in their response letter have indicated that because the patterns being correlated are not derived from events in close temporal proximity, they should not suffer from the issue of temporal autocorrelation. This is simply not true. For example, see the paper by Prince et al. (eLife 2022; on GLMsingle). This is not the main point of Prince et al.'s paper, but it includes a nice figure that shows that, using standard modelling approaches, the correlation between (same-run) patterns can be artificially elevated for lags as long as ~120 seconds (and can even be artificially reduced after that; Figure 5 from that paper) between events. This would affect many of the comparisons in the present paper. The cleanest way to proceed is to simply drop the within-run comparisons, which I believe the authors can do and yet they have not. Relatedly, in the response letter the authors say they are focusing mainly on the change over time for reinstatement at both levels including the gist-type reinstatement; however, this is not how it is discussed in the paper. They in fact are mainly relying on differences from zero, as children show some "above baseline" reinstatement while adults do not, but I believe there were no significant differences over time (i.e., the findings the authors said they would lean on primarily, as they are arguably the most comparable).  

      We thank the reviewer for this important comment regarding the potential inflation of similarity values due to within-run comparisons.

      To address the reviewer’s concern, we conducted an additional cross-run analysis for all correctly retrieved trials. The approach restricted comparisons to non-overlapping runs (run1-run2, run2-run3, run1-run3). This analysis revealed robust gist-like reinstatement in children for remote Day 14 memories in the mPFC (p = .035) and vlPFC (p = .0007), in adults’ vlPFC for remote Day 1 memories (p = .029), as well as in children’s and adults’ remote Day 1 memories in LOC (p < .02). A significant Session effect in both regions (mPFC: p = .026; vlPFC: p = .002) indicated increased reinstatement for the long delay (Day 14) compared to the short-delay and recent sessions (all p < .05). Given that the cross-run results largely replicate and reinforce the effects found previously with the within-run analysis, we believe that combining both sources of information is methodologically justified and statistically beneficial. Specifically, both approaches independently identified significant gist-like reinstatement in children’s mPFC and vlPFC (although the within-run vlPFC effect (short delay: p = .038; long delay: p = .047) did not survive correction for multiple comparisons), particularly for remote memories. Including both within-run and between-run comparisons increases the number of unique, non-repeated trial pairs, improving statistical power without introducing redundancy. While we acknowledge that same-run comparisons may be influenced by residual autocorrelation (as shown by Prince et al. 2022, eLife), we believe that our design mitigates this risk through consistency between within-run and cross-run results, long inter-trial intervals, and trial-wise estimation of activation. We have adjusted the manuscript accordingly, reporting the combined analysis. We also report the cross-run and within-run analyses separately in the supplementary materials (Tables S12.1, S12.2), showing that the two analyses converge and thus strengthen rather than dilute the findings.

      As suggested, we now explicitly highlight the change over time as the central finding. We observe a clear increase in gist-like reinstatement from recent to remote memories in children, particularly in mPFC and vlPFC. These effects, based on combined within- and cross-run comparisons, are now clearly stated in the main results and interpreted in the discussion accordingly.
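
      The cross-run restriction discussed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code: the gist-like index is assumed to be mean within-category minus mean between-category Fisher-z similarity, the trial records and the "field" category are hypothetical, and only pairs of trials from different runs are kept.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length voxel patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def cross_run_pairs(trials):
    """All unique trial pairs whose members come from different runs."""
    return [(a, b) for a, b in combinations(trials, 2) if a["run"] != b["run"]]


def gist_like_index(trials):
    """Mean within-category minus mean between-category similarity (Fisher z),
    computed over cross-run pairs only (assumed definition)."""
    within, between = [], []
    for a, b in cross_run_pairs(trials):
        z = math.atanh(pearson(a["pattern"], b["pattern"]))
        (within if a["category"] == b["category"] else between).append(z)
    return sum(within) / len(within) - sum(between) / len(between)


# Hypothetical fixation-period patterns for four correctly retrieved trials.
trials = [
    {"run": 1, "category": "forest", "pattern": [1.0, 2.0, 3.0, 4.0]},
    {"run": 1, "category": "field", "pattern": [4.0, 1.0, 3.0, 2.0]},
    {"run": 2, "category": "forest", "pattern": [1.1, 2.3, 2.8, 4.2]},
    {"run": 2, "category": "field", "pattern": [3.9, 1.2, 3.3, 1.8]},
]

pairs = cross_run_pairs(trials)
index = gist_like_index(trials)
print(len(pairs), round(index, 2))
```

      Dropping same-run pairs removes the comparisons most exposed to temporal autocorrelation, at the cost of fewer pairs per participant.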

      (2) This analysis uses a different approach of comparing fixations to one another, rather than fixations to scenes. In their response letter and the revised paper, the authors do provide a bit of reasoning as to why this is the most sensible. However, it is still not clear to me whether this is really "reinstatement" which (in my mind) entails the re-evoking of a neural pattern initially engaged during perception. Rather, could this be a shared neural state that is category specific? 

      We thank the reviewer for raising this important conceptual point about whether our findings reflect reinstatement in the classical sense — namely, the reactivation of perceptual neural patterns — or a shared, category-specific state.

      While traditional definitions of reinstatement emphasize item-specific reactivation (e.g., Ritchey et al., 2013; Xiao et al., 2017), it is increasingly recognized that memory retrieval can also involve the reactivation of abstracted, generalized, or gist-like representations, especially as memories consolidate. Our analysis follows this view and aims to capture how memory representations evolve over time, particularly in development.

      Several studies support this broader notion of gist-like reinstatement. For instance, Chen et al. (2017) showed that while event-specific patterns were reinstated across the default mode network and medial temporal lobe, inter-subject recall similarity exceeded encoding-retrieval similarity, suggesting transformation and abstraction beyond perceptual reinstatement. Zhuang et al. (2021) further showed that loss of neural distinctiveness in the MTL over time predicted false memories, linking neural similarity to representational instability. This aligns with our finding that greater gist-like reinstatement is associated with lower memory accuracy.

      Ye et al. (2020) discuss how memory representations are reshaped post-encoding — becoming more differentiated, integrated, or weakened depending on task goals and neural resources. While their work focuses on adults, our previous findings (Schommartz et al., 2023) suggest that children’s neural systems (the same sample) are structurally immature, making them more likely to rely on gist-based consolidation (see Fandakova et al., 2019). Adults, by contrast, may retain more item-specific traces.

      Relatedly, St-Laurent & Buchsbaum (2019) show that with repeated encoding, neural memory representations become increasingly distinct from perception, suggesting that reinstatement need not mimic perception. We agree that reinstatement does not always reflect reactivation of low-level sensory patterns, particularly over long delays or in developing brains.

      Finally, while we did not correlate retrieval patterns directly with perceptual encoding patterns, we assessed neural similarity among retrieved items within vs. between categories, based on non-repeated, independently sampled trials. This approach is intended to capture the structure and delay-related transformation of mnemonic representations, especially in terms of how they become more schematic or gist-like over time. Our findings align conceptually with the results of Kuhl et al. (2012), who used MVPA to show that older and newer visual memories can be simultaneously reactivated during retrieval, with greater reactivation of older memories interfering with retrieval accuracy for newer memories. Their work highlights how overlapping category-level representations in ventral temporal cortex can reflect competition among similar memories, even in the absence of item-specific cues. In our developmental context, we interpret the increased neural similarity among category members in children as possibly reflecting such representational overlap or competition, where generalized traces dominate over item-specific ones. This pattern may reflect a shift toward efficient but less precise retrieval, consistent with developmental constraints on memory specificity and consolidation.

      In this context, we view our findings as evidence of memory trace reorganization — from differentiated, item-level representations toward more schematic, gist-like neural patterns (Sekeres et al., 2018), particularly in children. Our cross-run analyses further confirm that this is not an artifact of same-run correlations or low-level confounds. We have clarified this distinction and interpretation throughout the revised manuscript (see lines 144-158; 1163-1170).

      In any case, I think additional information should be added to the text to clarify that this definition differs from others in the literature. The authors might also consider using some term other than reinstatement. Again (as I noted in my prior review), the finding of no category-level reinstatement in adults is surprising and confusing given prior work and likely has to do with the operationalization of "reinstatement" here. I was not quite sure about the explanation provided in the response letter, as category-level reinstatement is quite widespread in the brain for adults and is robust to differences in analytic procedures etc. 

      We agree that our operationalization of "reinstatement" differs from more conventional uses of the term, which typically involve direct comparisons between encoding and retrieval phases, often with item-level specificity. As our analysis is based on similarity among retrieval-phase trials (fixation-based activation patterns) and focuses on within- versus between-category neural similarity, we agree that the term reinstatement may suggest a stronger encoding–retrieval mapping than we are claiming.

      To avoid confusion and overstatement, we have revised the terminology throughout the manuscript: we now refer to our measure as “gist-like representations” rather than “gist-like reinstatement.” This change better reflects the nature of our analysis — namely, that we are capturing shared neural patterns among category-consistent memories that may reflect reorganized or abstracted traces, especially after delay and in development.

      As the reviewer rightly points out, category-level reinstatement is well documented in adults (e.g., Kuhl & Chun, 2014; Tompary et al., 2020; Tompary & Davachi, 2017). The absence of such effects in our adult group may indeed reflect differences in study design, particularly our use of non-repeated, cross-trial comparisons based on fixation events. It may also reflect different consolidation strategies, with adults preserving more differentiated or item-specific representations, while children form more schematic or generalizable representations, a pattern consistent with our interpretation and supported by prior work (Fandakova et al., 2019; Sekeres et al., 2018).

      We have updated the relevant sections of the manuscript (Results, Discussion (particularly lines 1163-1184), and Figure captions) to clarify this terminology shift and explicitly contrast our approach with more standard definitions of reinstatement. We hope this revision provides the needed conceptual clarity while preserving the integrity of our developmental findings.
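
      The within- versus between-category contrast described in this response can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the use of Pearson correlation with a Fisher z-transform, and the input layout (one activation pattern per retrieval trial) are all assumptions made for illustration.

```python
import numpy as np

def category_similarity_contrast(patterns, categories):
    """Mean within-category minus mean between-category similarity.

    patterns:   (n_trials, n_voxels) array, one retrieval-phase pattern per trial
    categories: length-n_trials sequence of category labels
    Similarity is Pearson correlation, Fisher z-transformed (an assumption).
    """
    patterns = np.asarray(patterns, dtype=float)
    categories = np.asarray(categories)
    r = np.corrcoef(patterns)                        # trial-by-trial correlations
    rows, cols = np.triu_indices(len(categories), k=1)  # each trial pair once
    z = np.arctanh(np.clip(r[rows, cols], -0.999999, 0.999999))
    same = categories[rows] == categories[cols]      # pairs sharing a category
    return z[same].mean() - z[~same].mean()

# Toy data: two categories, each trial = category prototype + small noise.
rng = np.random.default_rng(0)
labels = np.array([0, 0, 0, 1, 1, 1])
prototypes = rng.normal(size=(2, 50))
toy_patterns = prototypes[labels] + 0.1 * rng.normal(size=(6, 50))
contrast = category_similarity_contrast(toy_patterns, labels)
```

      A positive contrast indicates that trials from the same category share more pattern structure than trials from different categories. Because every pattern comes from the retrieval phase, the measure says nothing about encoding-retrieval overlap, which is why "representation" is the more accurate label.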

      (3) Also from a theoretical standpoint-I'm still a bit confused as to why gist-based reinstatement would involve reinstatement of the scene gist, rather than the object's location (on the screen) gist. Were the locations on the screen similar across scene backgrounds from the same category? It seems like a different way to define memory retrieval here would be to compare the neural patterns when cued to retrieve the same vs. similar (at the "gist" level) vs. different locations across object-scene pairs. This is somewhat related to a point from my review of the initial version of this manuscript, about how scene reinstatement is not necessary. The authors state that participants were instructed to reinstate the scene, but that does not mean they were actually doing it. The point that what is being measured via the reinstatement analyses is actually not necessary to perform the task should be discussed in more detail in the paper. 

      We appreciate the reviewer’s thoughtful theoretical question regarding whether our measure of “gist-like representations” might reflect reinstatement of spatial (object-location) gist, rather than scene-level gist. We would like to clarify several key points about our task design and interpretation:

      (1) Object locations were deliberately varied and context dependent.

      In our stimulus set, each object was embedded in a rich scene context, and the locations were distributed across six distinct possible areas within each scene, with three possible object placements per location. These placements were manually selected to ensure realistic and context-sensitive positioning of objects within the scenes. Importantly, locations were not fixed across scenes within a given category. For example, objects placed in “forest” scenes could appear in different screen locations across different scene exemplars (e.g., one in the bottom-left side, another floating above). Therefore, the task did not introduce a consistent spatial schema across exemplars from the same scene category that could give rise to a “location gist.”

      (2) Scene categories provided consistent high-level contextual information.

      By contrast, the scene categories (e.g., farming, forest, indoor, etc.) provided semantically coherent and visually rich contextual backgrounds that participants could draw upon during retrieval. This was emphasized in the instruction phase, where participants were explicitly encouraged to recall the whole scene based on the stories they created during learning (not just the object or its position). While we acknowledge that we cannot directly verify the reinstated content, this instruction aligns with prior studies showing that scene and context reinstatement can occur even without direct task relevance (e.g., Kuhl & Chun, 2014; Ritchey et al., 2013).

      (3) Our results are unlikely to reflect location-based reinstatement.

      If participants had relied on a “location gist” strategy, we would have expected greater neural similarity across scenes with similar spatial layouts, regardless of category. However, our design avoids this confound by deliberately varying locations across exemplars within categories. Additionally, our categorical neural similarity measure contrasted within-category vs. between-category comparisons — making it sensitive to shared contextual or semantic structure, not simply shared screen positions.

      Considering this, we believe that the neural similarity observed in the mPFC and vlPFC in children at long delay reflects the emergence of scene-level, gist-like representations, rather than low-level spatial regularities. Nevertheless, we now clarify this point in the manuscript and explicitly discuss the limitation that reinstatement of scene context was encouraged but not required for successful task performance.

      Future studies could dissociate spatial and contextual components of reinstatement more directly by using controlled spatial overlap or explicit location recall conditions. However, given the current task structure, location-based generalization is unlikely to account for the category-level similarity patterns we observe.

      (2) Inspired by another reviewer's comment, it is unclear to me the extent to which age group differences can be attributed to differences in age/development versus memory strength. I liked the other reviewer's suggestions about how to identify and control for differences in memory strength, which I don't think the authors actually did in the revision. They instead showed evidence that memory strength does seem to be lower in children, which indicates this is an interpretive confound. For example, the reviewer's suggestion of performing analyses on subsets of participants who were actually matched in initial learning/memory performance would have been very informative. As it is, the authors didn't really control for memory strength adequately in my opinion, and as such their conclusions about children vs. adults could have been reframed as people with weak vs. strong memories. This is obviously a big drawback given what the authors want to conclude. Relatedly, I'm not sure the DDM was incorporated as the reviewer was suggesting; at minimum I think the authors need to do more work in the paper to explain what this means and why it is relevant. (I understand putting it in the supplement rather than the main paper, but I still wanted to know more about what it added from an interpretive perspective.)

      We appreciate the reviewer’s thoughtful concerns regarding potential confounding effects of memory strength on the observed age group differences. This is indeed a critical issue when interpreting developmental findings.

      While we agree that memory strength differs between children and adults — and our own DDM-based analysis confirms this, mirroring differences observed in accuracy — we would like to emphasize that these differences are not incidental but rather reflect developmental changes in the underlying memory system. Given the known maturation of both structural and functional memory-related brain regions, particularly the hippocampus and prefrontal cortex, we believe it would be theoretically inappropriate to control for memory strength entirely, as doing so would remove variance that is central to the age-related neural effects we aim to understand.

      To address the reviewer's concern empirically, we conducted an additional control analysis in which we subsampled children to include only those who reached learning criterion after two cycles (N = 28 out of 49 children, see Table S1.1, S1.2, Figure S1, Table S9.1), thereby selecting a high-performing subgroup. Importantly, this subsample replicated the behavioral and neural results of the full group. This further suggests that the observed age group differences are not merely driven by differences in memory strength.

      As mentioned above, the results of the DDM support our behavioral findings, showing that children have lower drift rates for evidence accumulation, consistent with weaker or less accessible memory representations. While these results are reported in the Supplementary Materials (section S2.1, Figure S2, Table S2), we agree that their interpretive relevance should be more clearly explained in the main text. We have therefore updated the Discussion section to explicitly state how the DDM results provide converging evidence for our interpretation that developmental differences in memory quality (not merely strategy or task performance) underlie the observed neural differences (see lines 904-926).

      In sum, we view memory strength not as a confound to be removed, but as a meaningful and theoretically relevant factor in understanding the emergence of gist-like representations in children. We have clarified this interpretive stance in the revised manuscript and now discuss the role of memory strength more explicitly in the Discussion.
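
      To make the link between drift rate and memory strength concrete, the sketch below uses the textbook closed-form results for an unbiased two-boundary drift diffusion model. The authors' actual DDM specification is not given in this letter, so the parameterization (starting point midway between the boundaries, unit noise) and the example values are assumptions.

```python
import math

def ddm_accuracy(drift, boundary, noise=1.0):
    """Probability of reaching the correct (upper) boundary.

    Assumes boundaries at 0 and `boundary`, start at boundary / 2,
    constant drift, and Gaussian noise with standard deviation `noise`.
    """
    return 1.0 / (1.0 + math.exp(-drift * boundary / noise**2))

def ddm_mean_decision_time(drift, boundary, noise=1.0):
    """Mean decision time (excluding non-decision time) under the same assumptions."""
    if drift == 0:
        return boundary**2 / (4.0 * noise**2)   # limit as drift -> 0
    return (boundary / (2.0 * drift)) * math.tanh(drift * boundary / (2.0 * noise**2))

# Hypothetical drift rates: a lower value stands in for weaker memory evidence.
low_drift, high_drift, boundary = 0.5, 2.0, 1.0
acc_low = ddm_accuracy(low_drift, boundary)
acc_high = ddm_accuracy(high_drift, boundary)
```

      With the boundary held fixed, a lower drift rate yields both lower accuracy and longer mean decision times. This is why a drift-rate difference between groups is usually read as a difference in the quality of the evidence retrieved from memory rather than in response caution.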

      (3) Some of the univariate results reporting is a bit strange, as they are relying upon differences between retrieval of 1- vs. 14-day memories in terms of the recent vs. remote difference, and yet don't report whether the regions are differently active for recent and remote retrieval. For example in Figure 3A, neither anterior nor posterior hippocampus seem to be differentially active for recent vs. remote memories for either age group (i.e., all data is around 0). Precuneus also interestingly seems to show numerically recent>remote (values mostly negative), whereas most other regions show the opposite. This difference from zero (in either direction) or lack thereof seems important to the message. In response to this comment on the original manuscript, the authors seem to have confirmed that hippocampal activity was greater during retrieval than implicit baseline. But this was not really my question - I was asking whether hippocampus is (and other ROIs in this same figure are) differently engaged for recent vs. remote memories.

      We thank the reviewer for bringing up this important point. Our previous analysis showed that both anterior and posterior regions of the hippocampus, anterior parahippocampal gyrus and precuneus exhibited activation significantly different from zero in children and adults for correctly remembered items (see Fig. S2, Table S7 in Supplementary Materials). Based on your suggestion, our additional analysis showed:

      (i) The linear mixed-effects model for correctly remembered items showed no significant interaction effects (group x session x memory age (recent, remote)) for the anterior hippocampus (all p > .146; see Table S7.1).

      (ii) For the posterior hippocampus, we observed a significant main effect of group (F(1,85) = 5.62, p = .038), showing significantly lower activation in children compared to adults (b = .03, t = -2.34, p = .021). No other main or interaction effects were significant (all p > .08; see Table S7.1).

      (iii) For the anterior PHG, which also showed no significant remote > recent difference, the model showed that there was indeed no difference between remote and recent items across age groups and delays (all p > .194; Table S7.1).

      Moreover, when comparing recent and remote hippocampal activation directly, there were no significant differences in either group (all FDR-adjusted p > .116; Table S7.2), supporting the conclusion that hippocampal involvement was stable across delays for successfully retrieved items. 

      In contrast, analysis of unsuccessfully remembered items showed that hippocampal activation was not significantly different from zero in either group (all FDR-adjusted p > .052; Fig. S2.1, Table S7.1), indicating that hippocampal engagement was specific to successful memory retrieval.

      To formally test whether hippocampal activation differs between remembered and forgotten items, we ran a linear mixed-effects model with Group, Memory Success (remembered vs. forgotten), and ROI (anterior vs. posterior hippocampus) as fixed effects. This model revealed a robust main effect of memory success (F(1,1198) = 128.27, p < .001), showing that hippocampal activity was significantly higher for remembered compared to forgotten items (b = .06, t(1207) = 11.29, p < .001; Table S7.3). 

      As the reviewer noted, precuneus activation was numerically higher for recent vs. remote items, and this was confirmed in our analysis. While both recent and remote retrieval elicited significantly above-zero activation in the precuneus (Table S7.2), activation for recent items was significantly higher than for remote items, consistent across both age groups.

      Taken together, these analyses support the conclusion that hippocampal involvement in successful retrieval is sustained across delays, while other ROIs such as the precuneus may show greater engagement for more recent memories. We have now updated the manuscript text (lines 370-390) and supplementary materials to reflect these findings more clearly, as well as to clarify the distinction between activation relative to baseline and memory-age-related modulation.
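
      Several of the tests above are reported as FDR-adjusted. The letter does not state which procedure was used; the sketch below assumes the common Benjamini-Hochberg step-up adjustment and uses made-up p-values purely for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four ROI-level tests (not from the manuscript).
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

      An adjusted value stays above the chosen threshold whenever the corresponding raw p-value is too large relative to its rank among all tests, which is how a family of comparisons such as the forgotten-item contrasts can all remain non-significant after correction.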

      (4) Related to point 3, the claims about hippocampus with respect to multiple trace theory feel very unsupported by the data. I believe the authors want to conclude that children's memory retrieval shows reliance on hippocampus irrespective of delay, presumably because this is a detailed memory task. However the authors have not really shown this; all they have shown is that hippocampal involvement (whatever it is) does not vary by delay. But we do not have compelling evidence that the hippocampus is involved in this task at all. That hippocampus is more active during retrieval than implicit baseline is a very low bar and does not necessarily indicate a role in memory retrieval. If the authors want to make this claim, more data are needed (e.g., showing that hippocampal activity during retrieval is higher when the upcoming memory retrieval is successful vs. unsuccessful). In the absence of this, I think all the claims about multiple trace theory supporting retrieval similarly across delays and that this is operational in children are inappropriate and should be removed. 

      We thank the reviewer for pointing this out. We agree that additional analysis of hippocampal activity during successful and unsuccessful memory retrieval is warranted. This will provide stronger support for our claim that strong, detailed memories during retrieval rely on the hippocampus in both children and adults. Our previously presented results on the remote > recent univariate signal difference in the hippocampus (p. 14-18; lines 433-376, Fig. 3A) show that this difference does not vary between children and adults, or between Day 1 and Day 14. Our further analysis showed that both anterior and posterior regions of the hippocampus exhibited activation significantly different from zero in children and adults for correctly remembered items (see Fig. S2, Table S7 in Supplementary Materials). Based on your suggestion, our recent additional analysis showed:

      (i) For forgotten items, we did not observe any activation significantly higher than zero in either the anterior or posterior hippocampus for recent and remote memory on Day 1 and Day 14 in either age group (all p > .052 FDR corrected; see Table S7.1, Fig. S2.1).

      (ii) After establishing no difference between recent and remote activation across and between sessions (Day 1, Day 14), we conducted another linear mixed-effects model with group x memory success (remembered, forgotten) x region (anterior hippocampus, posterior hippocampus), with subject as a random effect. The model showed no significant effects for the memory success x region interaction (F(1,1198) = 1.12, p = .289) and no significant group x memory success x region interaction (F(1,1198) = .017, p = .895). However, we observed a significant main effect of memory success (F(1,1198) = 128.27, p < .001), indicating significantly higher hippocampal activation for remembered compared to forgotten items (b = .06, t = 11.29, p < .001; see Table S7.3).

      (iii) Considering the comparatively low number of incorrect trials for recent items in the adult group, we reran this analysis only for remote items. Similarly, the model showed no significant effects for the memory success x region interaction (F(1,555) = .72, p = .398) and no significant group x memory success x region interaction (F(1,555) = .14, p = .705). However, we observed a significant main effect of memory success (F(1,555) = 68.03, p < .001), indicating significantly higher hippocampal activation for remote remembered compared to forgotten items (b = .07, t = 8.20, p < .001; see Table S7.3).

      Taken together, our results indicate that significant hippocampal activation was observed only for correctly remembered items in both children and adults, regardless of memory age and session. For forgotten items, we did not observe any significant hippocampal activation in either group or delay. Moreover, hippocampal activation was significantly higher for remembered compared to forgotten memories. This evidence supports our conclusions regarding the Multiple Trace and Trace Transformation Theories, suggesting that the hippocampus supports retrieval similarly across delays, and provides novel evidence that this process is operational in both children and adults. This aligns also with Contextual Bindings Theory, as well as empirical evidence by Sekeres, Winokur, & Moscovitch (2018), among others. We have added this information to the manuscript.

      (5) There are still not enough methodological details in the main paper to make sense of the results. Some of these problems were addressed in the revision but others remain. For example, a couple of things that were unclear: that initially learned locations were split, where half were tested again at day 1 and the other half at day 14; what specific criterion was used to determine to pick the 'well-learned' associations that were used for comparisons at different delay periods (object-scene pairs that participants remembered accurately in the last repetition of learning? Or across all of learning?). 

      We thank the reviewer for pointing this out. The initially learned object-scene associations on Day 0 were split into two halves based on their categories before the testing. Specifically, half of the pairs from the first set and half of the pairs from the second set of 30 object-scene associations were used to create the set of 30 remote pairs for Day 1 testing. A similar procedure was repeated for the remaining pairs to create a set of remote object-scene associations for Day 14 retrieval. We tried to distribute the categories of pairs equally between the testing sets. We added this information to the methods section of the manuscript (see p. 47, lines 1237-1243). In addition, the sets of associations for the delay tests on Day 1 and Day 14 were not based on their learning accuracy. Of note, the analysis of variance revealed that there was no difference in learning accuracy between the two sets created for the delay tests in either age group (children: p = .23; adults: p = .06). These results indicate that the sets comprised items learned with comparable accuracy in both age groups.

      (6) I still find the revised Introduction a bit unclear. I appreciated the added descriptions of different theories of consolidation, though the order of presented points is still a bit hard to follow. Some of the predictions I also find a bit confusing as laid out in the introduction. (1) As noted in the paper multiple trace theory predicts that hippocampal involvement will remain high provided memories retained are sufficiently high detail. The authors however also predict that children will rely more on gist (than detailed) memories than adults, which would seem to imply (combined with the MTT idea) that they should show reduced hippocampal involvement over time (while in adults, it should remain high). However, the authors' actual prediction is that hippocampus will show stable involvement over time in both kids and adults. I'm having a hard time reconciling these points. (2) With respect to the extraction of gist in children, I was confused by the link to Fuzzy Trace Theory given the children in the present study are a bit young to be showing the kind of gist extraction shown in the Brainerd & Reyna data. Would 5-7 year olds not be more likely to show reliance on verbatim traces under that framework? Also from a phrasing perspective, I was confused about whether gist-like information was something different from just gist in this sentence: "children may be more inclined to extract gist information at the expense of detailed or gist-like information." (p. 8) - is this a typo?

      We thank the reviewer for this thoughtful observation. 

      Our hypothesis of stable hippocampal engagement over time was primarily based on Contextual Binding Theory (Yonelinas et al., 2019), and the MTT, supported by the evidence provided by Sekeres et al., 2018, which posits that the hippocampus continues to support retrieval when contextual information is preserved, even for older, consolidated memories. Given that our object-location associations were repeatedly encoded and tied to specific scene contexts, we believe that retrieval success for both recent and remote memories likely involved contextual reinstatement, leading to sustained hippocampal activity. Also in accordance with the MTT and related TTT, different memory representations may coexist, including detailed and gist-like memories. Therefore, we suggest that children may not rely on highly detailed item-specific memory, but rather on sufficiently contextualized schematic traces, which still engage the hippocampus. This distinction is now made clearer in the Introduction (see lines 223-236).

      We appreciate the reviewer’s point regarding Fuzzy Trace Theory (Brainerd & Reyna, 2002). Indeed, in classic FTT, young children are thought to rely more on verbatim traces due to immature gist extraction mechanisms (primarily from verbal material). However, we use the term “gist-like representations” to refer to schematic or category-level retrieval that emerges through structured, repeated learning (as in our task). This form of abstraction may not require full semantic gist extraction in the FTT sense but may instead reflect consolidation-driven convergence onto shared category-level representations, especially when strategic resources are limited. We now clarify this distinction and have revised the ambiguous sentence containing the typo (“at the expense of detailed or gist-like information”) to better reflect our intended meaning (see p. 8).

      (7) For the PLSC, if I understand this correctly, the profiles were defined for showing associations with behaviour across age groups. (1) As such, is it not "double dipping" to then show that there is an association between brain profile and behaviour; must this not be true by definition? If I am mistaken, it might be helpful to clarify this in the paper. (2) In addition, I believe for the univariate and scene-specific reinstatement analyses these profiles were defined across both age groups. I assume this doesn't allow for separate definition of profiles across the two groups (i.e., a kind of "interaction"). If this is the case, it makes sense that there would not be big age differences... the profiles were defined for showing an association across all subjects. If the authors wanted to identify distinct profiles in children and adults they may need to run another analysis.

      We thank the reviewer for this thoughtful comment. 

      (1) We agree that showing the correlation between the latent variable and behavior may be redundant, as the relationship is already embedded in the PLSC solution and quantified by the explained variance. Our intention was merely to visualize the strength of this relationship. In hindsight, we agree that this could be misinterpreted, and we have removed the additional correlation figure from the manuscript.

      We also see the reviewer’s point that, given the shared latent profile across groups, it is expected that the strength of the brain-behavior relationship does not differ between age groups. Instead, to investigate group differences more appropriately, we examined whether children and adults differed in their expression of the shared latent variable (i.e., brain scores). This analysis revealed that children showed significantly lower brain scores than adults both in short delay, t(83) = -4.227, p = .0001, and long delay, t(74) = -5.653, p < .001, suggesting that while the brain-behavior profile is shared, its expression varies by group. We have added this clarification to the Results section (p. 19-20) of the revised manuscript. 

      (2) Regarding the second point, we agree with the reviewer that defining the PLS profiles across both age groups inherently limits the ability to detect group-specific associations, as the resulting latent variables represent a shared pattern across the full sample. To address this, we conducted additional PLS analyses separately within each age group to examine whether distinct neural upregulation profiles (remote > recent) emerge for short and long delay conditions.

      These within-group analyses, however, were based on smaller subsamples, which reduced statistical power, especially when using bootstrapping to assess the stability of the profiles. For the short delay, although some regions reached significance, the overall latent variables did not reach conventional thresholds for stability (all p > .069), indicating that the profiles were not robust. This suggests that within-group PLS analyses may be underpowered to detect subtle effects, particularly when modelling neural upregulation (remote > recent), which may be inherently small.

      Nonetheless, when we applied PLSC in an exploratory manner separately within each group, using recent and remote activity levels against the implicit baseline (rather than the contrast remote > recent) and their relation to memory performance, we observed significant and stable latent variables in both children and adults. This implies that such contrasts (vs. baseline) may be more sensitive and better suited to detect meaningful brain–behavior relationships within age groups. We have added this clarification to the Results sections of the manuscript to highlight the limitations of within-group contrasts for neural upregulation.

      Author response image 1.
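
      The reviewer's "double dipping" point follows directly from how PLSC constructs its latent variables. The sketch below is a generic, minimal PLSC (singular value decomposition of the brain-behavior cross-correlation matrix), not the authors' implementation; the function name and the simulated data are assumptions, and the permutation and bootstrap steps used to assess significance and stability are omitted.

```python
import numpy as np

def plsc(brain, behavior):
    """Minimal behavioral PLSC via SVD of the cross-correlation matrix.

    brain:    (n_subjects, n_rois) array
    behavior: (n_subjects,) or (n_subjects, n_measures) array
    Returns brain saliences, behavior saliences, brain scores,
    behavior scores, and the share of squared covariance per latent variable.
    """
    X = np.asarray(brain, dtype=float)
    Y = np.asarray(behavior, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    # Z-score each column so the cross-product is a correlation matrix.
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    R = Y.T @ X / (X.shape[0] - 1)                 # (n_measures, n_rois)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    brain_scores = X @ Vt.T                        # expression of each profile
    behavior_scores = Y @ U
    explained = s**2 / np.sum(s**2)
    return Vt.T, U, brain_scores, behavior_scores, explained

# Simulated example: behavior depends weakly on the first of five ROIs.
rng = np.random.default_rng(1)
sim_brain = rng.normal(size=(40, 5))
sim_behavior = 0.5 * sim_brain[:, 0] + rng.normal(size=40)
_, _, b_scores, y_scores, expl = plsc(sim_brain, sim_behavior)
score_corr = np.corrcoef(b_scores[:, 0], y_scores[:, 0])[0, 1]
```

      The covariance between brain scores and behavior scores on the first latent variable equals the first singular value, so their correlation is non-negative by construction. Plotting that correlation therefore visualizes the solution rather than providing an independent test, which is why comparing brain scores between groups is the more informative follow-up.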

      (3) Also, as for differences between short delay brain profile and long delay brain profile for the scene-specific reinstatement - there are 2 regions that become significant at long delay that were not significant at a short delay (PC, and CE). However, given there are ceiling effects in behaviour at the short but not long delay, it's unclear if this is a meaningful difference or just a difference in sensitivity. Is there a way to test whether the profiles are statistically different from one another?

      We thank the reviewer for this comment. To better illustrate differential profiles also for high memory accuracy after immediate delay (30 minutes delay), we added the immediate (30 minutes delay) condition as a third reference point, given the availability of scene-specific reinstatement data at this time point. Interestingly, the immediate reinstatement profile revealed a different set of significant regions, with distinct expression patterns compared to both the short and long delay conditions. This supports the view that scene-specific reinstatement is not static but dynamically reorganized over time.

      Regarding the ceiling effect at short delay, we acknowledge this as a potential limitation. However, we note that our primary analyses were conducted across both age groups combined, and not solely within high-performing individuals. As such, the grouping may mitigate concerns that ceiling-level performance in a subset of participants unduly influenced the overall reinstatement profile. Moreover, we observed variation in neural reinstatement despite ceiling-level behavior, suggesting that the neural signal retains sensitivity to consolidation-related processes even when behavioral accuracy is near-perfect.

      While we agree that formal statistical comparisons of reinstatement profiles across delays (e.g., using representational profile similarity or interaction tests) could be an informative direction, we feel that this goes beyond the scope of the current manuscript. 

      (4) As I mentioned above, it also was not ideal in my opinion that all regions were included for the scene-specific reinstatement due to the authors' inability to have an appropriate baseline and therefore define above-chance reinstatement. It makes these findings really challenging to compare with the gist reinstatement ones. 

      We appreciate the reviewer’s comment and agree that the lack of a clearly defined baseline for scene-specific reinstatement limits our ability to determine whether these values reflect above-chance reinstatement. However, we would like to clarify that we do not directly compare the magnitude of scene-specific reinstatement to that of gist-like reinstatement in our analyses or interpretations. These two analyses serve complementary purposes: the scene-specific analysis captures trial-unique similarity (within-item reinstatement), while the gist-like analysis captures category-level representational structure (across items). Because they differ not only in baseline assumptions but also in analytical scope and theoretical interpretation, our goal was not to compare them directly, but rather to explore distinct but co-existing representational formats that may evolve differently across development and delay.

      (8) I would encourage the authors to be specific about whether they are measuring/talking about memory representations versus reinstatement, unless they think these are the same thing (in which case some explanation as to why would be helpful). For example, especially under the Fuzzy Trace framework, couldn't someone maintain both verbatim and gist traces of a memory yet rely more on one when making a memory decision? 

      We thank the reviewer for pointing out the importance of conceptual clarity when referring to memory representations versus reinstatement. We agree that these are distinct but related concepts: in our framework, memory representations refer to the neural content stored as a result of encoding and consolidation, whereas reinstatement refers to the reactivation of those representations during retrieval. Thus, reinstatement serves as a proxy for the underlying memory representation — it is how we measure or infer the nature (e.g., specificity, abstraction) of the stored content.

      Under Fuzzy Trace Theory, it is indeed possible for both verbatim and gist representations to coexist. Our interpretation is not that children lack verbatim traces, but rather that they are more likely to rely on schematic or gist-like representations during retrieval, especially after a delay. Our use of neural pattern similarity (reinstatement) reflects which type of representation is being accessed, not necessarily which traces exist in parallel.

      To avoid ambiguity, we have revised the manuscript to more explicitly distinguish between reinstatement (neural reactivation) and the representational format (verbatim vs. gist-like), especially in the framing of our hypotheses and interpretation of age group differences.

      (9) With respect to the learning criteria - it is misleading to say that "children needed between two to four learning-retrieval cycles to reach the criterion of 83% correct responses" (p. 9). Four was the maximum, and looking at the Figure 1C data it appears as though there were at least a few children who did not meet the 83% minimum. I believe they were included in the analysis anyway? Please clarify. Was there any minimum imposed for inclusion?

      We thank the reviewer for pointing this out. As stated in the Methods section (p. 50, lines 1326-1338), “These cycles ranged from a minimum of two to a maximum of four.<…> The cycles ended when participants provided correct responses to 83% of the trials or after the fourth cycle was reached.” We have corrected the corresponding wording in the Results section (lines 286-289) to reflect this more accurately. Indeed, five children did not reach the 83% criterion but achieved final performance between 70 and 80% after the fourth learning cycle. These participants were included in this analysis for two main reasons:

      (1) The 83% threshold was established during piloting as a guideline for how many learning-retrieval cycles to allow, not as a strict learning criterion. It served to standardize task continuation rather than to exclude participants post hoc.

      (2) The performance of these five children was still well above chance level (33%), indicating meaningful learning. Excluding them would have biased the sample toward higher-performing children and reduced the ecological validity of our findings. Including them ensures a more representative view of children’s performance under extended learning conditions.
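
      As an illustration of the above-chance argument, the following minimal sketch computes an exact one-sided binomial tail probability against a 33% chance level. The trial count (30) and the number of correct responses (21, i.e., 70%) are hypothetical values chosen for illustration, not figures from the study, and the function name is our own.

```python
from math import comb


def binomial_tail(n_correct: int, n_trials: int, chance: float) -> float:
    """Exact probability of observing at least n_correct successes by chance."""
    return sum(
        comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


# Hypothetical participant: 30 trials, 21 correct (70%), three response options.
p_value = binomial_tail(n_correct=21, n_trials=30, chance=1 / 3)
print(f"P(X >= 21 | chance = 1/3) = {p_value:.2e}")
```

      Under these assumed numbers, performance at the lower bound of the reported 70 to 80% range would be very unlikely to arise from guessing alone.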

      (10) For the gist-like reinstatement PLSC analysis, results are really similar at short and long delays, and yet some of the text seems to imply specificity to the long delay. One is a trend and one is significant (p. 31), but surely these two associations would not be statistically different from one another?

      We agree with the reviewer that the associations at short and long delays appeared similar. While a formal comparison (e.g., using a Z-test for dependent correlations) would typically be warranted, in the reanalyzed dataset only the long delay profile remains statistically significant, which limits the interpretability of such a comparison. 
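
      For readers unfamiliar with such comparisons, the sketch below shows one way two associations measured in the same participants could be compared. It uses a participant-level percentile bootstrap rather than the Z-test mentioned above, because the bootstrap needs no analytic formula for the dependence between the two correlations. All variable names and the simulated data are hypothetical and do not come from the study.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bootstrap_corr_difference(x_short, y_short, x_long, y_long,
                              n_boot=2000, seed=0):
    """95% percentile interval for r_long - r_short.

    Participants are resampled jointly for both delays, which preserves
    the dependence between the two correlations.
    """
    rng = random.Random(seed)
    n = len(x_short)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        r_short = pearson_r([x_short[i] for i in idx], [y_short[i] for i in idx])
        r_long = pearson_r([x_long[i] for i in idx], [y_long[i] for i in idx])
        diffs.append(r_long - r_short)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Simulated data for 40 hypothetical participants (illustration only).
sim = random.Random(1)
brain_short = [sim.gauss(0, 1) for _ in range(40)]
brain_long = [sim.gauss(0, 1) for _ in range(40)]
memory_short = [0.5 * b + sim.gauss(0, 1) for b in brain_short]
memory_long = [0.5 * b + sim.gauss(0, 1) for b in brain_long]

lower, upper = bootstrap_corr_difference(
    brain_short, memory_short, brain_long, memory_long
)
print(f"95% bootstrap interval for r_long - r_short: [{lower:.2f}, {upper:.2f}]")
```

      An interval that includes zero would indicate that the two associations cannot be distinguished statistically, which is the scenario the reviewer describes.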

      (11) As a general comment, I had a hard time tying all of the (many) results together. For example adults show more mature neocortical consolidation-related engagement, which the authors say is going to create more durable detailed memories, but under multiple trace theory we would generally think of neocortical representations as providing more schematic information. If the authors could try to make more connections across the different neural analyses, as well as tie the neural findings in more closely with the behaviour & back to the theoretical frameworks, that would be really helpful.  

      We thank the reviewer for this valuable suggestion. We have revised the discussion section to more clearly link the behavioral and neural findings and to interpret them in light of existing consolidation theories for better clarity. 

      Reviewer #2 (Public Review): 

      Schommartz et al. present a manuscript characterizing neural signatures of reinstatement during cued retrieval of middle-aged children compared to adults. The authors utilize a paradigm where participants learn the spatial location of semantically related item-scene memoranda which they retrieve after short or long delays. The paradigm is especially strong as the authors include novel memoranda at each delayed time point to make comparisons across new and old learning. In brief, the authors find that children show more forgetting than adults, and adults show greater engagement of cortical networks after longer delays as well as stronger item-specific reinstatement. Interestingly, children show more category-based reinstatement, however, evidence supports that this marker may be maladaptive for retrieving episodic details. The question is extremely timely both given the boom in neurocognitive research on the neural development of memory, and the dearth of research on consolidation in this age group. Also, the results provide novel insights into why consolidation processes may be disrupted in children. 

      We thank the reviewer for the positive evaluation.

      Comments on the revised version: 

      I carefully reviewed not only the responses to my own reviews but also those raised by the other reviewers. While they addressed some of the concerns raised in the process, I think many substantive concerns remain.

      Regarding Reviewer 1: 

      The authors point out that the retrieval procedure is the same over time and similarly influenced by temporal autocorrelations, which makes their analysis okay. However, there is a fundamental problem as to whether they are actually measuring reinstatement or only measuring differences in temporal autocorrelation (or some non-linear combination of both). The authors further argue that the stimuli are being processed more memory-wise than perception-wise; however, I think there is no evidence for that, and perception-memory processes should be considered on a continuum rather than as discrete processes. Thus, I agree with Reviewer 1 that these analyses should be removed.

      We thank the reviewer for raising this important question. We would like to clarify a few key points regarding temporal autocorrelation and reinstatement.

      During the fixation window, participants were instructed to reinstate the scene and location associated with the cued object from memory. This task was familiar to them, as they had been trained in retrieving locations within scenes. Our analysis aims to compare the neural representations during this retrieval phase with those when participants view the scene, in order to assess how these representations change in similarity over time, as memories become less precise.

      We acknowledge that temporal proximity can lead to temporal autocorrelation. However, evidence suggests that temporal autocorrelation is consistent and stable across conditions (Gautama & Van Hulle, 2004; Woolrich et al., 2004). Shinn & Lagalwar (2021) further demonstrated that temporal autocorrelation is highly reliable at both the subject and regional levels. Given that we analyze regions of interest (ROIs) separately, potential spatial variability in temporal autocorrelation is not a major concern.

      The absence of a difference in item-specific reinstatement between recent items on Day 1 and Day 14 (which were merged for further delay-related comparisons) also suggests that the reinstatement measure was stable for recent items, even when sampled on two different testing days.

      Importantly, we interpret the relative change in the reinstatement index rather than its absolute value.

      In addition, when we conducted the same analysis for incorrectly retrieved memories, we did not observe any delay-related decline in reinstatement (see p. 25, lines 623-627). This suggests that the delay-related changes in reinstatement are specific to correctly retrieved memories. 

      Finally, our control analysis examining reinstatement between object and fixation time points (as suggested by Reviewer 1) revealed no delay-related effects in any ROI (see p.24, lines 605-612), further highlighting the specificity of the observed delay-related change in item reinstatement.

      We emphasize that temporal autocorrelation should be similar across all retrieval delays due to the identical task design and structure. Therefore, any observed decrease in reinstatement with increasing delay likely reflects a genuine change in the reinstatement index, rather than differences in temporal autocorrelation. Since our analysis includes only correctly retrieved items, and there is no perceptual input during the fixation window, this process is inherently memory-based, relying on mnemonic retrieval rather than sensory processing.

      We respectfully disagree with the reviewer's assertion that retrieval during the fixation period cannot be considered more memory-driven than perception-driven. At this time point, participants had no access to actual images of the scene, making it necessary for them to rely on mnemonic retrieval. The object cue likely triggered pattern completion for the learned object-scene association, forming a unique memory if remembered correctly (Horner & Burgess, 2013). This process is inherently mnemonic, as it is based on reconstructing the original neural representation of the scene (Kuhl et al., 2012; Staresina et al., 2013).

      While perception and memory processes can indeed be viewed as a continuum, some cognitive processes are predominantly memory-based, involving reconstruction rather than reproduction of previous experiences (Bartlett, 1932; Ranganath & Ritchey, 2012). In our task, although the retrieved material is based on previously encoded visual information, the process of recalling this information during the fixation period is fundamentally mnemonic, as it does not involve visual input. Our findings indicate that the similarity between memory-based representations and those observed during actual perception decreases over time, suggesting a relative change in the quality of the representations. However, this does not imply that detailed representations disappear; they may still be robust enough to support correct memory recall. Previous studies examining encoding-retrieval similarity have shown similar findings (Pacheco Estefan et al., 2019; Ritchey et al., 2013).

      We do not claim that perception and memory processes are entirely discrete, nor do we suggest that only perception is involved when participants see the scene. Viewing the scene indeed involves recognition processes, updating retrieved representations from the fixation period, and potentially completing missing or unclear information. This integrative process demonstrates the interrelation of perception and memory, especially in complex tasks like the one we employed.

      In conclusion, our task design and analysis support the interpretation that the fixation period is primarily characterized by mnemonic retrieval, facilitated by cue-triggered pattern completion, rather than perceptual processing. We believe this approach aligns with the current understanding of memory retrieval processes as supported by the existing literature.

      The authors seem to have a design that would allow for across run comparisons, however, they did not include these additional analyses. 

      Thank you for pointing this out. We ran an additional cross-run comparison. These results and the subsequent procedure are reported in our response to Reviewer 1.

      To address the reviewer’s concern, we conducted an additional cross-run analysis for all correctly retrieved trials. The approach restricted comparisons to non-overlapping runs (run1-run2, run2-run3, run1-run3). This analysis revealed robust gist-like reinstatement in children for remote Day 14 memories in the mPFC (p = .035) and vlPFC (p = .0007), in adults’ vlPFC for remote Day 1 memories (p = .029), as well as in children’s and adults’ LOC for remote Day 1 memories (p < .02). A significant Session effect in both regions (mPFC: p = .026; vlPFC: p = .002) indicated increased reinstatement for the long-delay session (Day 14) compared to the short-delay and recent sessions (all p < .05). Given that the cross-run results largely replicate and reinforce the effects found previously with the within-run analysis, we believe that combining both sources of information is methodologically justified and statistically beneficial. Specifically, both approaches independently identified significant gist-like reinstatement in children’s mPFC and vlPFC, particularly for remote memories, although the within-run vlPFC effect (short delay: p = .038; long delay: p = .047) did not survive correction for multiple comparisons. Including both within-run and between-run comparisons increases the number of unique, non-repeated trial pairs, improving statistical power without introducing redundancy. While we acknowledge that same-run comparisons may be influenced by residual autocorrelation (Prince et al., 2022), we believe that our design mitigates this risk through consistency between within-run and cross-run results, long inter-trial intervals, and trial-wise estimation of activation. We have adjusted the manuscript accordingly, reporting the combined analysis. We also report the cross-run and within-run analyses separately in the supplementary materials (Tables S12.1, S12.2), showing that the within-run results converge with the cross-run results and thus strengthen rather than dilute the findings.
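
      The restriction to non-overlapping runs described above can be sketched as follows. This is a simplified illustration with hypothetical data structures (a list of trials, each with a run label and a voxel pattern) and simulated patterns; it is not the analysis code used in the study, and it omits the category structure that defines the gist-like index.

```python
import random
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length activation patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_pairwise_similarity(trials, cross_run_only=True):
    """Mean pattern similarity over unique trial pairs.

    With cross_run_only=True, pairs of trials from the same run are skipped,
    so that within-run temporal autocorrelation cannot inflate similarity.
    Returns the mean similarity and the number of pairs that entered it.
    """
    sims = []
    for a, b in combinations(trials, 2):
        if cross_run_only and a["run"] == b["run"]:
            continue
        sims.append(pearson_r(a["pattern"], b["pattern"]))
    return sum(sims) / len(sims), len(sims)


# Toy data: six trials, two per run, 20 simulated voxels each.
rng = random.Random(0)
trials = [
    {"run": run, "pattern": [rng.gauss(0, 1) for _ in range(20)]}
    for run in (1, 1, 2, 2, 3, 3)
]

mean_cross, n_cross = mean_pairwise_similarity(trials, cross_run_only=True)
mean_all, n_all = mean_pairwise_similarity(trials, cross_run_only=False)
print(n_cross, n_all)  # 12 cross-run pairs out of 15 unique pairs
```

      The pair counts make the power argument concrete: in this toy case, adding within-run pairs raises the number of unique pairs from 12 to 15.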

      As suggested, we now explicitly highlight the change over time as the central finding. We observe a clear increase in gist-like reinstatement from recent to remote memories in children, particularly in mPFC and vlPFC. These effects, based on combined within- and cross-run comparisons, are now clearly stated in the main results and interpreted in the discussion accordingly.

      (1) The authors did not satisfy my concerns about different amounts of re-exposures to stimuli as a function of age, which introduces a serious confound in the interpretation of the neural data. 

      (2) Regarding Reviewer 1's point about different number of trials being entered into analysis, I think a more formal test of sub-sampling the adult trials is warranted. 

      (1) We thank the reviewer for pointing this out. Overall, children needed 2 to 4 learning cycles to improve their performance and reach the learning criteria, compared to 2 learning cycles in adults. To address the different amounts of re-exposure to stimuli between the age groups, we subsampled the child group to only those children who reached the learning criteria after 2 learning cycles. For this purpose, we excluded 21 children from the analysis who needed 3 or 4 learning cycles. This resulted in 39 young adults and 28 children being included in the subsequent analysis. 

      (i) We reran the behavioral analysis with the subsampled dataset (see Supplementary Materials,  Table S1.1, Fig. S1, Table S1.2). This analysis replicated the previous findings of less robust memory consolidation in children across all time delays. 

      (ii) We reran the univariate analysis (see Supplementary Materials, Table S9.1). This analysis also fully replicated the previous findings. This indicates that including child participants with greater material exposure during learning in the analysis of neural retrieval patterns did not affect the group differences in the univariate neural results.

      These subsampled results demonstrated that the amount of re-exposure to stimuli during encoding does not affect consolidation-related changes in memory retrieval at the behavioral and neural levels in children and adults across all time delays. We have added this information to the manuscript (line 343-348, 420-425). 

      (2) We appreciate Reviewer 1's suggestion to perform a formal test by sub-sampling the adult trials to match the number of trials in the child group. However, we believe that this approach may not be optimal for the following reasons:

      (i) Loss of Statistical Power: Sub-sampling the adult trials would result in a reduced sample size, potentially leading to a significant loss of statistical power and the ability to detect meaningful effects, particularly in a context where the adult group is intended to serve as a robust control or comparison group.

      (ii) Introducing sub-sampling could introduce variability that complicates the interpretation of results, particularly if the trial sub-sampling process does not fully capture the variability inherent in the original adult data.

      (iii) Robustness of Existing Findings: We have already addressed potential concerns about unequal trial numbers by conducting analyses that control for the number of learning cycles, as detailed in our supplementary materials. These analyses have shown that the observed effects are consistent, suggesting that the differences in trial numbers do not critically influence our findings.

      Given these considerations, we hope the reviewer understands our rationale and agrees that the current analysis is robust and appropriate for addressing the research questions.

      I also still fundamentally disagree with the use of global signals when comparing children to adults, and think this could very much skew the results. 

      We thank the reviewer for raising this important issue. To address this concern comprehensively, we have taken the following steps:

      (1) Overview of the literature support for global signal regression (GSR). A growing body of methodological and empirical research supports the inclusion of global signal regression as part of best-practice denoising pipelines, particularly when analyzing pediatric fMRI data. Studies such as Ciric et al. (2017), Parkes et al. (2018), J. D. Power et al. (2012, 2014), Power et al. (2012), and Thompson et al. (2016) show that GSR improves motion-related artifact removal. Critically, pediatric-specific studies (Disselhoff et al., 2025; Graff et al., 2022) conclude that pipelines including GSR are most effective for signal recovery and artifact removal in younger children. Graff et al. (2022) demonstrated that among various pipelines, GSR yielded the best noise reduction in 4–8-year-olds. Additionally, Li et al. (2019) and Qing et al. (2015) emphasized that GSR reduces artifactual variance without distorting the spatial structure of neural signals. Ofoghi et al. (2021) demonstrated that global signal regression helps mitigate non-neuronal noise sources, including respiration, cardiac activity, motion, vasodilation, and scanner-related artifacts. Based on this and other recent findings, we consider GSR particularly beneficial for denoising pediatric fMRI data in our study.

      (2) Empirical comparison of pipelines with and without GSR. We re-ran the entire first-level univariate analysis using a pipeline that excluded global signal regression. The resulting activation maps (see Supplementary Figures S3.2, S4.2, S5.2, S9.2) differed notably from those of the original pipeline. Specifically, group differences in regions such as mPFC, cerebellum, and posterior PHG no longer reached significance, and the overall pattern of results appeared noisier.

      (3) Evaluation of the pipeline differences. To further evaluate the impact of GSR, we conducted the following analyses:

      (a) Global signal is stable across groups and sessions. A linear mixed-effects model showed no significant main effects or interactions involving group or session on the global signal (F-values < 2.62, p > .11), suggesting that the global signal was not group- or session-dependent in our sample. 

      (b) Noise Reduction Assessment via Contrast Variability. We compared the variability (standard deviation and IQR) of contrast estimates across pipelines. Both SD (b = .070, p < .001) and IQR (b = .087, p < .001) were significantly reduced in the GSR pipeline, especially in children (p < .001) compared to adults (p = .048). This suggests that GSR reduces inter-subject variability in children, likely reflecting improved signal quality.

      (c) Residual Variability After Regressing Global Signal. We regressed out global signal post hoc from both pipelines and compared the residual variance. Residual standard deviation was significantly lower for the GSR pipeline (F = 199, p < .001), with no interaction with session or group, further indicating that GSR stabilizes the signal and attenuates non-neuronal variability.
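
      For clarity on what regressing out the global signal means operationally, the sketch below shows the core computation on toy data: the global signal is the mean time series across voxels, and each voxel's time series is replaced by its residual after an ordinary least-squares fit on that signal. This is a deliberately simplified, hypothetical illustration (single regressor plus intercept, simulated data); in practice the global signal is typically one nuisance regressor among several, and this is not the pipeline used in the study.

```python
import random


def regress_out_global_signal(data):
    """Remove the global signal from each voxel time series via OLS.

    data: list of voxel time series, each a list of equal length.
    Returns the residual time series (intercept and global-signal fit removed)
    and the global signal itself.
    """
    n_vox, n_time = len(data), len(data[0])
    # Global signal: average across voxels at each time point.
    global_signal = [sum(v[t] for v in data) / n_vox for t in range(n_time)]
    gs_mean = sum(global_signal) / n_time
    gs_centered = [g - gs_mean for g in global_signal]
    gs_ss = sum(g * g for g in gs_centered)

    cleaned = []
    for ts in data:
        ts_mean = sum(ts) / n_time
        # OLS slope of this voxel on the centered global signal.
        slope = sum(g * (x - ts_mean) for g, x in zip(gs_centered, ts)) / gs_ss
        cleaned.append(
            [x - ts_mean - slope * g for x, g in zip(ts, gs_centered)]
        )
    return cleaned, global_signal


# Toy data: 5 voxels sharing a slow drift plus independent noise.
rng = random.Random(0)
n_time = 50
drift = [0.1 * t for t in range(n_time)]
data = [[drift[t] + rng.gauss(0, 1) for t in range(n_time)] for _ in range(5)]

cleaned, global_signal = regress_out_global_signal(data)
```

      By construction, each residual time series is orthogonal to the global signal, which is why variance shared across the whole brain, whether neural or non-neuronal in origin, is removed.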

      Conclusion

      In summary, while we understand the reviewer’s concern, we believe the empirical and theoretical support for GSR, especially in pediatric samples, justifies its use in our study. Nonetheless, to ensure full transparency, we provide full results from both pipelines in the Supplementary Materials and have clarified our reasoning in the revised manuscript.

      Reviewer #1 (Recommendations For The Authors): 

      (1) Some figures are still missing descriptions of what everything on the graph means; please clarify in captions. 

      We thank the reviewer for pointing this out. We have made the necessary adjustments to the figure annotations and captions.

      (2) The authors conclude they showed evidence of neural reorganization of memory representations in children (p. 41). But the gist is not greater in children than adults, and also does not differ over time-so, I was confused about what this claim was based on? 

      We thank the reviewer for raising this question. Our results show that gist-like reinstatement in the mPFC was significantly higher in children than in adults, and that children’s gist-like reinstatement indices were significantly higher than zero (see p. 27-28). These results support our claim of neural reorganization of memory representations in children. We hope this clarifies the issue.

      References

      Bartlett, F. C. (1932). Remembering: A study in experimental and social psychology. Cambridge University Press.

      Brainerd, C. J., & Reyna, V. F. (2002). Fuzzy-Trace Theory: Dual Processes in Memory, Reasoning, and Cognitive Neuroscience (pp. 41–100). https://doi.org/10.1016/S0065-2407(02)80062-3

      Chen, J., Leong, Y. C., Honey, C. J., Yong, C. H., Norman, K. A., & Hasson, U. (2017). Shared memories reveal shared structure in neural activity across individuals. Nature Neuroscience, 20(1), 115–125. https://doi.org/10.1038/nn.4450

      Ciric, R., Wolf, D. H., Power, J. D., Roalf, D. R., Baum, G. L., Ruparel, K., Shinohara, R. T., Elliott, M. A., Eickhoff, S. B., Davatzikos, C., Gur, R. C., Gur, R. E., Bassett, D. S., & Satterthwaite, T. D. (2017). Benchmarking of participant-level confound regression strategies for the control of motion artifact in studies of functional connectivity. NeuroImage, 154, 174–187. https://doi.org/10.1016/j.neuroimage.2017.03.020

      Disselhoff, V., Jakab, A., Latal, B., Schnider, B., Wehrle, F. M., Hagmann, C. F., Held, U., O’Gorman, R. T., Fauchère, J.-C., & Hüppi, P. (2025). Inhibition abilities and functional brain connectivity in school-aged term-born and preterm-born children. Pediatric Research, 97(1), 315–324. https://doi.org/10.1038/s41390-024-03241-0

      Esteban, O., Ciric, R., Finc, K., Blair, R. W., Markiewicz, C. J., Moodie, C. A., Kent, J. D., Goncalves, M., DuPre, E., Gomez, D. E. P., Ye, Z., Salo, T., Valabregue, R., Amlien, I. K., Liem, F., Jacoby, N., Stojić, H., Cieslak, M., Urchs, S., … Gorgolewski, K. J. (2020). Analysis of task-based functional MRI data preprocessed with fMRIPrep. Nature Protocols, 15(7), 2186–2202. https://doi.org/10.1038/s41596-020-0327-3

      Fandakova, Y., Leckey, S., Driver, C. C., Bunge, S. A., & Ghetti, S. (2019). Neural specificity of scene representations is related to memory performance in childhood. NeuroImage, 199, 105–113. https://doi.org/10.1016/j.neuroimage.2019.05.050

      Gautama, T., & Van Hulle, M. M. (2004). Optimal spatial regularisation of autocorrelation estimates in fMRI analysis. NeuroImage, 23(3), 1203–1216.  https://doi.org/10.1016/j.neuroimage.2004.07.048

      Graff, K., Tansey, R., Ip, A., Rohr, C., Dimond, D., Dewey, D., & Bray, S. (2022). Benchmarking common preprocessing strategies in early childhood functional connectivity and intersubject correlation fMRI. Developmental Cognitive Neuroscience, 54, 101087. https://doi.org/10.1016/j.dcn.2022.101087

      Horner, A. J., & Burgess, N. (2013). The associative structure of memory for multi-element events. Journal of Experimental Psychology: General, 142(4), 1370–1383. https://doi.org/10.1037/a0033626

      Jones, J. S., the CALM Team, & Astle, D. E. (2021). A transdiagnostic data-driven study of children’s behaviour and the functional connectome. Developmental Cognitive Neuroscience, 52, 101027. https://doi.org/10.1016/j.dcn.2021.101027

      Kuhl, B. A., Bainbridge, W. A., & Chun, M. M. (2012). Neural Reactivation Reveals Mechanisms for Updating Memory. Journal of Neuroscience, 32(10), 3453–3461. https://doi.org/10.1523/JNEUROSCI.5846-11.2012

      Kuhl, B. A., & Chun, M. M. (2014). Successful Remembering Elicits Event-Specific Activity Patterns in Lateral Parietal Cortex. Journal of Neuroscience, 34(23), 8051–8060. https://doi.org/10.1523/JNEUROSCI.4328-13.2014

      Li, J., Kong, R., Liégeois, R., Orban, C., Tan, Y., Sun, N., Holmes, A. J., Sabuncu, M. R., Ge, T., & Yeo, B. T. T. (2019). Global signal regression strengthens association between resting-state functional connectivity and behavior. NeuroImage, 196, 126–141. https://doi.org/10.1016/j.neuroimage.2019.04.016

      Ofoghi, B., Chenaghlou, M., Mooney, M., Dwyer, D. B., & Bruce, L. (2021). Team technical performance characteristics and their association with match outcome in elite netball. International Journal of Performance Analysis in Sport, 21(5), 700–712. https://doi.org/10.1080/24748668.2021.1938424

      Pacheco Estefan, D., Sánchez-Fibla, M., Duff, A., Principe, A., Rocamora, R., Zhang, H., Axmacher, N., & Verschure, P. F. M. J. (2019). Coordinated representational reinstatement in the human hippocampus and lateral temporal cortex during episodic memory retrieval. Nature Communications, 10(1), 2255. https://doi.org/10.1038/s41467-019-09569-0

      Parkes, L., Fulcher, B., Yücel, M., & Fornito, A. (2018). An evaluation of the efficacy, reliability, and sensitivity of motion correction strategies for resting-state functional MRI. NeuroImage, 171, 415–436. https://doi.org/10.1016/j.neuroimage.2017.12.073

      Power, J. D., Barnes, K. A., Snyder, A. Z., Schlaggar, B. L., & Petersen, S. E. (2012). Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. NeuroImage, 59(3), 2142–2154. https://doi.org/10.1016/j.neuroimage.2011.10.018

      Power, J. D., Mitra, A., Laumann, T. O., Snyder, A. Z., Schlaggar, B. L., & Petersen, S. E. (2014). Methods to detect, characterize, and remove motion artifact in resting state fMRI. NeuroImage, 84, 320–341. https://doi.org/10.1016/j.neuroimage.2013.08.048

      Power, S. D., Kushki, A., & Chau, T. (2012). Intersession Consistency of Single-Trial Classification of the Prefrontal Response to Mental Arithmetic and the No-Control State by NIRS. PLoS ONE, 7(7), e37791. https://doi.org/10.1371/journal.pone.0037791

      Prince, J. S., Charest, I., Kurzawski, J. W., Pyles, J. A., Tarr, M. J., & Kay, K. N. (2022). Improving the accuracy of single-trial fMRI response estimates using GLMsingle. ELife, 11. https://doi.org/10.7554/eLife.77599

      Qing, Z., Dong, Z., Li, S., Zang, Y., & Liu, D. (2015). Global signal regression has complex effects on regional homogeneity of resting state fMRI signal. Magnetic Resonance Imaging, 33(10), 1306–1313. https://doi.org/10.1016/j.mri.2015.07.011

      Ranganath, C., & Ritchey, M. (2012). Two cortical systems for memory-guided behaviour. Nature Reviews Neuroscience, 13(10), 713–726. https://doi.org/10.1038/nrn3338

      Ritchey, M., Wing, E. A., LaBar, K. S., & Cabeza, R. (2013). Neural Similarity Between Encoding and Retrieval is Related to Memory Via Hippocampal Interactions. Cerebral Cortex, 23(12), 2818–2828. https://doi.org/10.1093/cercor/bhs258

      Satterthwaite, T. D., Elliott, M. A., Gerraty, R. T., Ruparel, K., Loughead, J., Calkins, M. E., Eickhoff, S. B., Hakonarson, H., Gur, R. C., Gur, R. E., & Wolf, D. H. (2013). An improved framework for confound regression and filtering for control of motion artifact in the preprocessing of resting-state functional connectivity data. NeuroImage, 64, 240–256. https://doi.org/10.1016/j.neuroimage.2012.08.052

      Schommartz, I., Lembcke, P. F., Pupillo, F., Schuetz, H., de Chamorro, N. W., Bauer, M., Kaindl, A. M., Buss, C., & Shing, Y. L. (2023). Distinct multivariate structural brain profiles are related to variations in short- and long-delay memory consolidation across children and young adults. Developmental Cognitive Neuroscience, 59. https://doi.org/10.1016/J.DCN.2022.101192

      Sekeres, M. J., Winocur, G., & Moscovitch, M. (2018). The hippocampus and related neocortical structures in memory transformation. Neuroscience Letters, 680, 39–53. https://doi.org/10.1016/j.neulet.2018.05.006

      Shinn, L. J., & Lagalwar, S. (2021). Treating Neurodegenerative Disease with Antioxidants: Efficacy of the Bioactive Phenol Resveratrol and Mitochondrial-Targeted MitoQ and SkQ. Antioxidants, 10(4), 573. https://doi.org/10.3390/antiox10040573

      Staresina, B. P., Alink, A., Kriegeskorte, N., & Henson, R. N. (2013). Awake reactivation predicts memory in humans. Proceedings of the National Academy of Sciences, 110(52), 21159–21164. https://doi.org/10.1073/pnas.1311989110

      St-Laurent, M., & Buchsbaum, B. R. (2019). How Multiple Retrievals Affect Neural Reactivation in Young and Older Adults. The Journals of Gerontology: Series B, 74(7), 1086–1100. https://doi.org/10.1093/geronb/gbz075

      Thompson, G. J., Riedl, V., Grimmer, T., Drzezga, A., Herman, P., & Hyder, F. (2016). The Whole-Brain “Global” Signal from Resting State fMRI as a Potential Biomarker of Quantitative State Changes in Glucose Metabolism. Brain Connectivity, 6(6), 435–447. https://doi.org/10.1089/brain.2015.0394

      Tompary, A., & Davachi, L. (2017). Consolidation Promotes the Emergence of Representational Overlap in the Hippocampus and Medial Prefrontal Cortex. Neuron, 96(1), 228-241.e5. https://doi.org/10.1016/j.neuron.2017.09.005

      Tompary, A., Zhou, W., & Davachi, L. (2020). Schematic memories develop quickly, but are not expressed unless necessary. PsyArXiv.

      Woolrich, M. W., Behrens, T. E. J., Beckmann, C. F., Jenkinson, M., & Smith, S. M. (2004). Multilevel linear modelling for FMRI group analysis using Bayesian inference. NeuroImage, 21(4), 1732–1747. https://doi.org/10.1016/j.neuroimage.2003.12.023

      Xiao, X., Dong, Q., Gao, J., Men, W., Poldrack, R. A., & Xue, G. (2017). Transformed Neural Pattern Reinstatement during Episodic Memory Retrieval. The Journal of Neuroscience, 37(11), 2986–2998. https://doi.org/10.1523/JNEUROSCI.2324-16.2017

      Ye, Z., Shi, L., Li, A., Chen, C., & Xue, G. (2020). Retrieval practice facilitates memory updating by enhancing and differentiating medial prefrontal cortex representations. ELife, 9, 1–51. https://doi.org/10.7554/ELIFE.57023

      Yonelinas, A. P., Ranganath, C., Ekstrom, A. D., & Wiltgen, B. J. (2019). A contextual binding theory of episodic memory: systems consolidation reconsidered. Nature Reviews Neuroscience, 20(6), 364–375. https://doi.org/10.1038/s41583-019-0150-4

      Zhuang, L., Wang, J., Xiong, B., Bian, C., Hao, L., Bayley, P. J., & Qin, S. (2021). Rapid neural reorganization during retrieval practice predicts subsequent long-term retention and false memory. Nature Human Behaviour, 6(1), 134–145. https://doi.org/10.1038/s41562-021-01188-4

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary: 

      In this manuscript, the authors identified that

      (1) CDK4/6i treatment attenuates the growth of drug-resistant cells by prolongation of the G1 phase;

      (2) CDK4/6i treatment results in an ineffective Rb inactivation pathway and suppresses the growth of drug-resistant tumors;

      (3) Addition of endocrine therapy augments the efficacy of CDK4/6i maintenance; 

      (4) Addition of CDK2i to CDK4/6i treatment as second-line treatment can suppress the growth of resistant cells;

      (5) The role of cyclin E as a key driver of resistance to CDK4/6 and CDK2 inhibition.

      Strengths: 

      To prove their complicated proposal, the authors employed orchestration of several kinds of live cell markers, timed in situ hybridization, IF and Immunoblotting. The authors strongly recognize the resistance of CDK4/6 + ET therapy and demonstrated how to overcome it. 

      Weaknesses: 

      The authors need to underscore their proposed results from what is to be achieved by them and by other researchers. 

      Reviewer #2 (Public review): 

      Summary: 

      This study elucidated the mechanism underlying drug resistance induced by CDK4/6i as a single agent and proposed a novel and efficacious second-line therapeutic strategy. It highlighted the potential of combining CDK2i with CDK4/6i for the treatment of HR+/HER2- breast cancer.

      Strengths: 

      The study demonstrated that CDK4/6 induces drug resistance by impairing Rb activation, which results in diminished E2F activity and a delay in G1 phase progression. It suggests that the synergistic use of CDK2i and CDK4/6i may represent a promising second-line treatment approach. Addressing critical clinical challenges, this study holds substantial practical implications.

      Weaknesses: 

      (1) Drug-resistant cell lines: Was a drug concentration gradient treatment employed to establish drug-resistant cell lines? If affirmative, this methodology should be detailed in the materials and methods section. 

      We greatly appreciate the reviewer for raising this important question. In the revised manuscript, we have updated the methods section (“Drug-resistant cell lines”) to more precisely describe how the drug-resistant cell lines were established. 

      (2) What rationale informed the selection of MCF-7 cells for the generation of CDK6 knockout cell lines? Supplementary Figure 3. A indicates that CDK6 expression levels in MCF-7 cells are not notably elevated. 

      We appreciate the reviewer’s insightful question about the rationale for selecting MCF-7 cells to generate CDK6 knockout cell lines. This choice was guided by prior studies highlighting the significant role of CDK6 in mediating resistance to CDK4/6 inhibitors (21-24). Moreover, we observed a 4.6-fold increase in CDK6 expression in CDK4/6i resistant MCF-7 cells compared to their drug-naïve counterparts (Supplementary Figure 3A). While we did not detect notable differences in CDK4/6 activity between wild-type and CDK6 knockout cells under CDK4/6 inhibitor treatment, these findings point to a potential non-canonical function of CDK6 in conferring resistance to CDK4/6 inhibitors.  

      (3) For each experiment, particularly those involving mice, the author must specify the number of individuals utilized and the number of replicates conducted, as detailed in the materials and methods section. 

      We sincerely thank the reviewer for bringing this to our attention. In the revised manuscript, we have explicitly stated the number of replicates and mice used for each experiment as appropriate in figure legends and relevant text to ensure transparency and clarity. 

      (4) Could this treatment approach be extended to triple-negative breast cancer?

      We greatly appreciate the reviewer’s inquiry about extending our findings to triple-negative breast cancer (TNBC). Based on the data presented in Figure 1 and Supplementary Figure 2, which include the TNBC cell line MDA-MB-231, we expect that the benefits of maintaining CDK4/6 inhibitors could indeed be applicable to TNBC with an intact Rb/E2F pathway. Additionally, our recent paper (25) indicates a similar mechanism in TNBC.

      Reviewer #3 (Public review):

      Summary: 

      In their manuscript, Armand and colleagues investigate the potential of continuing CDK4/6 inhibitors or combining them with CDK2 inhibitors in the treatment of breast cancer that has developed resistance to initial therapy. Utilizing cellular and animal models, the research examines whether maintaining CDK4/6 inhibition or adding CDK2 inhibitors can effectively control tumor growth after resistance has set in. The key findings from the study indicate that the sustained use of CDK4/6 inhibitors can slow down the proliferation of cancer cells that have become resistant, and the combination of CDK2 inhibitors with CDK4/6 inhibitors can further enhance the suppression of tumor growth. Additionally, the study identifies that high levels of Cyclin E play a significant role in resistance to the combined therapy. These results suggest that continuing CDK4/6 inhibitors along with the strategic use of CDK2 inhibitors could be an effective strategy to overcome treatment resistance in hormone receptor-positive breast cancer.

      Strengths: 

      (1) Continuous CDK4/6 Inhibitor Treatment Significantly Suppresses the Growth of Drug-Resistant HR+ Breast Cancer: The study demonstrates that the continued use of CDK4/6 inhibitors, even after disease progression, can significantly inhibit the growth of drug-resistant breast cancer. 

      (2) Potential of Combined Use of CDK2 Inhibitors with CDK4/6 Inhibitors: The research highlights the potential of combining CDK2 inhibitors with CDK4/6 inhibitors to effectively suppress CDK2 activity and overcome drug resistance. 

      (3) Discovery of Cyclin E Overexpression as a Key Driver: The study identifies overexpression of cyclin E as a key driver of resistance to the combination of CDK4/6 and CDK2 inhibitors, providing insights for future cancer treatments. 

      (4) Consistency of In Vitro and In Vivo Experimental Results: The study obtained supportive results from both in vitro cell experiments and in vivo tumor models, enhancing the reliability of the research. 

      (5) Validation with Multiple Cell Lines: The research utilized multiple HR+/HER2- breast cancer cell lines (such as MCF-7, T47D, CAMA-1) and triple-negative breast cancer cell lines (such as MDA-MB-231), validating the broad applicability of the results.

      Weaknesses: 

      (1) The manuscript presents intriguing findings on the sustained use of CDK4/6 inhibitors and the potential incorporation of CDK2 inhibitors in breast cancer treatment. However, I would appreciate a more detailed discussion of how these findings could be translated into clinical practice, particularly regarding the management of patients with drug-resistant breast cancer. 

      We thank the reviewer for this crucial comment. In the revised Discussion, we have broadened our exploration of clinical translation. Specifically, we emphasize that ongoing CDK4/6 inhibition, although not fully stopping resistant tumors, significantly slows their growth and may offer a therapeutic window when combined with ET and CDK2 inhibition. We also note that these approaches may work best for patients without Rb loss or newly acquired resistance-driving mutations, and that cyclin E overexpression could be a biomarker to inform patient selection. Together, these points highlight that our findings provide a mechanistic understanding and a potential framework for clinical trials testing maintenance CDK4/6i with selective addition of CDK2i as a second-line strategy in drug-resistant HR+/HER2- breast cancer.

      (2) While the emergence of resistance is acknowledged, the manuscript could benefit from a deeper exploration of the molecular mechanisms underlying resistance development. A more thorough understanding of how CDK2 inhibitors may overcome this resistance would be valuable. 

      We thank the reviewer for this valuable suggestion. In the revised manuscript, we have expanded our Discussion to more explicitly synthesize the molecular mechanisms of resistance and how CDK2 inhibitors counteract them. Specifically, we describe how sustained CDK4/6 inhibition drives a non-canonical route of Rb degradation, resulting in inefficient E2F activation and prolonged G1 phase progression. We also highlight the role of c-Myc in amplifying E2F activity and promoting resistance, and we show that continued ET mitigates this effect by suppressing c-Myc. Importantly, we demonstrate that CDK2 inhibition alone cannot fully suppress the growth of resistant cells, but when combined with CDK4/6 inhibition, it produces durable repression of E2F and Myc target gene programs and significantly delays the G1/S transition. Finally, we identify cyclin E overexpression as a key mechanism of escape from dual CDK4/6i + CDK2i therapy, suggesting its potential as a biomarker for patient stratification. Together, these findings provide a detailed mechanistic rationale for how CDK2 inhibition can overcome specific pathways of resistance in HR<sup>+</sup>/HER2<sup>-</sup> breast cancer.

      (3) The manuscript supports the continued use of CDK4/6 inhibitors, but it lacks a discussion on the long-term efficacy and safety of this approach. Additional studies or data to support the safety profile of prolonged CDK4/6 inhibitor use would strengthen the manuscript. 

      We appreciate the reviewer’s insightful comment. In the revised manuscript, we emphasize the long-term efficacy and safety considerations of sustained CDK4/6 inhibition. Clinical trial and retrospective data have shown that continued CDK4/6i therapy can extend progression-free survival in selected patients, while maintaining a favorable safety profile (26-28). We have updated the Discussion to highlight these findings more explicitly, underscoring that while prolonged CDK4/6 inhibition slows but does not fully arrest tumor growth, it remains a clinically viable strategy when balanced against its manageable toxicity profile.

      Reviewer #1 (Recommendations for the authors): 

      It is well known that the combination therapy of CDK4/6i and ET has therapeutic benefits in ER(+) HER2(-) advanced breast cancer. However, drug resistance is a problem, and second-line therapy to solve this problem has not been established. Although some parts of the research results are already reported, the authors confirmed them by employing live cell markers, and further proved and suggested how to overcome this resistance in detail. This part is considered novel. 

      Overall, this research manuscript is eligible to be accepted with the appropriate addressing of questions.

      (1) The effects and biochemical changes of combination therapy of CDK4/6i and CDK2i are already known in several papers. The author needs to highlight the differences between the author's research and that of other researchers.

      We thank the reviewer for the opportunity to clarify the novelty of our findings in the context of prior studies on CDK4/6i and CDK2i combination therapy. In the revised manuscript, we have updated the Discussion section to more clearly delineate how our work extends and differs from existing research.

      Specifically, we now state:

      Page 12: The combination of CDK4/6i and ET has reshaped treatment for HR<sup>+</sup>/HER2<sup>-</sup> breast cancer (1-8). However, resistance commonly emerges, and no consensus second-line standard is established. Our data show that continued CDK4/6i treatment in drug-resistant cells engages a non-canonical, proteolysis-driven route of Rb inactivation, yielding attenuated E2F output and a pronounced delay in G1 progression (Figure 7G). Concurrent ET further deepens this blockade by suppressing c-Myc-mediated E2F amplification, thereby prolonging G1 and slowing population growth. Importantly, CDK2 inhibition alone was insufficient to control resistant cells. Robust suppression of CDK2 activity and resistant-cell growth required CDK2i in combination with CDK4/6i, consistent with prior reports supporting dual CDK targeting (9-16). Moreover, cyclin E, and in some contexts cyclin A, blunted the efficacy of the CDK4/6i and CDK2i combination by reactivating CDK2. Together, these findings provide a mechanistic rationale for maintaining CDK4/6i beyond progression and support testing ET plus CDK4/6i with the strategic addition of CDK2i, as evidenced by concordant in vitro and in vivo results.

      (2) Regarding Figures 3H and 3I, I wonder if it is live cell imaging results or if the authors counter each signal via timed IF staining slides? If live cell imaging is used, the authors need to present the methods. 

      We appreciate the reviewer’s question. Figures 3H and 3I derive from a live–fixed correlative pipeline rather than purely live imaging or independently timed IF slides. We first imaged asynchronously proliferating cells live for ≥48 h to (i) segment/track nuclei with H2B fluorescence, (ii) define mitotic exit (t = 0 at anaphase), and (iii) record CDK2 activity using a CDK2 KTR in the last live frame. Immediately after the live acquisition, we pulsed EdU (10 µM, 15 min) and fixed the same wells, photobleached fluorescent proteins (3% H₂O₂ + 20 mM HCl, 2 h, RT) to prevent crosstalk, and then performed click-chemistry EdU detection, IF for phospho-Rb (Ser807/811) and total Rb, and RNA FISH for E2F1. Fixed-cell readouts (p-Rb positivity, EdU incorporation, E2F1 mRNA puncta) were mapped back to each single cell’s live-derived time since mitosis and/or CDK2 activity, enabling the kinetic plots shown in Fig. 3H–I.

      To ensure transparency and reproducibility, we added detailed methods describing this workflow in the “Immunofluorescence and mRNA fluorescence in situ hybridization (FISH)” section under a dedicated “live–fixed pipeline” paragraph, and we cross-referenced acquisition and analysis parameters in “Live- and fixed-cell image acquisition” and “Image processing and analysis.” These updates specify: EdU pulse/fix conditions, photobleaching, antibodies/probes, imaging hardware and channels, segmentation/tracking, mitosis alignment, background correction, and how fixed readouts were binned/quantified as functions of time after mitosis and CDK2 activity.
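      The binning step described above can be illustrated with a minimal sketch. It assumes each tracked cell has already been reduced to a pair of values: the live-derived time since mitosis (in hours) and a binary fixed-cell readout (for example, p-Rb positive or EdU positive). The function name, the 2 h bin width, and the input layout are hypothetical choices for illustration and are not taken from the authors' analysis code.

```python
from collections import defaultdict


def fraction_positive_by_time(cells, bin_width_h=2.0):
    """Bin single cells by time since mitosis and report the fraction
    that is positive for a fixed-cell readout in each bin.

    cells: iterable of (time_since_mitosis_h, is_positive) pairs
           (hypothetical layout: one pair per tracked cell).
    Returns a list of (bin_start_h, fraction_positive), sorted by time.
    """
    counts = defaultdict(lambda: [0, 0])  # bin index -> [n_positive, n_total]
    for t, is_positive in cells:
        if t < 0:
            # Cells fixed before a tracked mitosis cannot be aligned; skip them.
            continue
        b = int(t // bin_width_h)
        counts[b][0] += int(bool(is_positive))
        counts[b][1] += 1
    return [
        (b * bin_width_h, n_pos / n_tot)
        for b, (n_pos, n_tot) in sorted(counts.items())
    ]


# Toy input: five cells with made-up times and readouts.
example_cells = [(0.5, False), (1.5, False), (2.5, True), (3.0, False), (5.0, True)]
example_curve = fraction_positive_by_time(example_cells)
```

      Plotting the returned fractions against the bin start times gives a kinetic curve of the same general kind as a "readout versus time after mitosis" plot; binning by CDK2 activity instead of time would follow the same pattern with a different key.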

      (3) Regarding Figure 3F, seven images were obtained in same fields? The author needs to describe the meaning of the white image and the yellow and blue image of the bottom in detail. 

      Thank you for raising this point. All seven panels in Fig. 3F are from the same field of view. The top row shows the raw channels (Hoechst, p-Rb, total Rb, and E2F1 RNA FISH). The bottom row shows the corresponding processed outputs from that field: (i) nuclear segmentation, (ii) phosphorylated Rb-status classification, and (iii) cell boundaries used for single-cell RNA-FISH quantification. We have revised the figure legend to make this explicit.

      (4) The author showed E2F mRNA by ISH, but in fact, RB does not suppress E2F mRNA but suppresses protein, so the author needs to confirm E2F at the protein level.

      We sincerely appreciate the reviewer’s thoughtful suggestion to examine E2F1 at the protein level. In our study, we focused on E2F1 mRNA expression because it is a well-established and biologically meaningful readout of E2F1 transcriptional activity. Due to its autoregulatory nature (17), the release of active E2F1 protein from Rb induces the transcription of E2F1 itself, creating a positive feedback loop. As a result, E2F1 mRNA abundance serves as a direct and reliable proxy for E2F1 protein activity (18-20). Thus, quantifying E2F1 mRNA provides a biologically relevant and mechanistic indicator of Rb-E2F pathway status. To clarify this rationale, we have updated the Results section and added references supporting our use of E2F1 mRNA as a readout for E2F1 activity.

      (5) Is it possible to synchronize cells (nocodazole shake-off, Double thymidine block) under the presence of cdk4/6i? If so, then the authors need to demonstrate the delay of G1 progression via immunoblotting. 

      We thank the reviewer for this constructive suggestion. To address it, we performed nocodazole synchronization followed by release and monitored cell-cycle progression in the presence or absence of CDK4/6 inhibition.

      Specifically, we added the following new datasets to the revised manuscript:

      Fig. 3L: Live single-cell trajectories of CDK4/6 and CDK2 activities alongside the Cdt1-degron reporter after 14 hours of nocodazole (250 nM) treatment and release. We compared the averaged traces of CDK4/6 and CDK2 activities and Cdt1 intensity in parental cells (gray) and resistant cells with (red) and without (blue) CDK4/6i maintenance. These data show suppressed and delayed CDK2 activation, as well as a right-shifted S-phase entry, particularly under continuous CDK4/6 inhibition.

      Fig. 3M: Fixed-cell EdU pulse-labeling at 4, 6, 8, 12, 16, and 24 h post-release further confirms a significant delay in S-phase entry and prolonged G1 duration in CDK4/6i-maintained cells compared with naïve and withdrawn conditions.

      Together, these results directly demonstrate the delay in G1 progression following synchronized mitotic exit under CDK4/6 inhibition.

      (6) In Figure 5C the authors showed a violin plot of c-Myc level. Is this Immunohistochemical staining? The authors need to clarify the methods.

      Thank you for flagging this. The c-Myc measurements in Fig. 5C are from immunofluorescence (IF), not IHC. We now state this explicitly in the legend.

      (7) Regarding Live cell immunofluorescence tracing of live-cell reporters, the author needs to clarify the methods (excitation, emission), name of instruments, and software used.

      To address this, we have expanded the “Live-cell, fixed-cell, and tumor tissue image acquisition” section in the Materials and Methods.

      (8) Lines 475 SF1A, the authors need to correct typos. Naïve Naïve.

      We greatly appreciate the reviewer’s attention to this detail and have ensured all typos have been addressed.  

      (9) The authors need to unify Cdt1-degron(legends) Vs Cdt1 degron (figures). 

      We greatly appreciate your attention to this discrepancy. Language referring to the Cdt1 degron has been unified between figures and legends. 

      Reviewer #3 (Recommendations for the authors):

      (1) While the manuscript discusses the selection of doses for CDK4/6 inhibitors and CDK2 inhibitors, there is a lack of detailed data on the dose-response relationship. Additional data on the effects of different doses would be beneficial. 

      We appreciate the reviewer’s important comment. To address it, we performed additional dose–response experiments testing a range of CDK4/6i and CDK2i concentrations. These analyses revealed a clear synergistic interaction between the two inhibitors. The new data are now presented in Figure 6G and Supplementary Figure 8F of the revised manuscript.
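      The response does not state which synergy model was used. As one common way to make "synergistic interaction" concrete, the sketch below computes the Bliss independence excess for a single dose pair. The choice of Bliss independence, the function name, and the example numbers are assumptions for illustration only and should not be read as the authors' method or data.

```python
def bliss_excess(effect_a, effect_b, effect_combo):
    """Bliss independence excess for one dose pair.

    All effects are fractional inhibition in [0, 1] (0 = no effect,
    1 = complete inhibition). Under Bliss independence the expected
    combined effect is E_a + E_b - E_a * E_b. A positive excess
    (observed minus expected) suggests synergy; a negative excess
    suggests antagonism.
    """
    for e in (effect_a, effect_b, effect_combo):
        if not 0.0 <= e <= 1.0:
            raise ValueError("effects must be fractions between 0 and 1")
    expected = effect_a + effect_b - effect_a * effect_b
    return effect_combo - expected


# Made-up example: 30% and 40% inhibition alone, 70% in combination.
example_excess = bliss_excess(0.30, 0.40, 0.70)
```

      Applied across a full concentration matrix, the same calculation yields an excess value per dose pair, which is how a dose–response grid is typically summarized under this model.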

      (2) In clinical trials, the criteria for patient selection are crucial for interpreting study outcomes. A detailed description of the patient selection criteria should be provided.  

      We thank the reviewer for bringing this important point to our attention. In the revised manuscript, we have clarified the patient selection criteria relevant to the interpretation of clinical outcomes. Specifically, we note that retrospective analyses suggest patients with indolent disease and no prior chemotherapy may benefit most from continued CDK4/6i plus ET. Moreover, our data and others’ indicate that clinical benefit is expected in tumors retaining an intact Rb/E2F axis, while resistance-driving alterations (e.g., Rb loss, PIK3CA, ESR1, FGFR1–3, HER2, FAT1 mutations) are likely to limit efficacy. Finally, we highlight cyclin E overexpression as a potential biomarker of resistance to combined CDK4/6i and CDK2i, underscoring the need for biomarker-guided patient stratification. These additions provide a more detailed framework for patient selection in future clinical applications.

      References

      (1) Finn RS, Crown JP, Lang I, Boer K, Bondarenko IM, Kulyk SO, et al. The cyclin-dependent kinase 4/6 inhibitor palbociclib in combination with letrozole versus letrozole alone as first-line treatment of oestrogen receptor-positive, HER2-negative, advanced breast cancer (PALOMA-1/TRIO-18): a randomised phase 2 study. Lancet Oncol 2015;16:25-35

      (2) Finn RS, Martin M, Rugo HS, Jones S, Im S-A, Gelmon K, et al. Palbociclib and Letrozole in Advanced Breast Cancer. New England Journal of Medicine 2016;375:1925-36

      (3) Turner NC, Slamon DJ, Ro J, Bondarenko I, Im S-A, Masuda N, et al. Overall Survival with Palbociclib and Fulvestrant in Advanced Breast Cancer. New England Journal of Medicine 2018;379:1926-36

      (4) Dickler MN, Tolaney SM, Rugo HS, Cortés J, Diéras V, Patt D, et al. MONARCH 1, A Phase II Study of Abemaciclib, a CDK4 and CDK6 Inhibitor, as a Single Agent, in Patients with Refractory HR(+)/HER2(-) Metastatic Breast Cancer. Clin Cancer Res 2017;23:5218-24

      (5) Johnston S, Martin M, Di Leo A, Im S-A, Awada A, Forrester T, et al. MONARCH 3 final PFS: a randomized study of abemaciclib as initial therapy for advanced breast cancer. npj Breast Cancer 2019;5:5

      (6) Hortobagyi GN, Stemmer SM, Burris HA, Yap Y-S, Sonke GS, Hart L, et al. Overall Survival with Ribociclib plus Letrozole in Advanced Breast Cancer. New England Journal of Medicine 2022;386:942-50

      (7) Slamon DJ, Neven P, Chia S, Fasching PA, De Laurentiis M, Im S-A, et al. Overall Survival with Ribociclib plus Fulvestrant in Advanced Breast Cancer. New England Journal of Medicine 2019;382:514-24

      (8) Im S-A, Lu Y-S, Bardia A, Harbeck N, Colleoni M, Franke F, et al. Overall Survival with Ribociclib plus Endocrine Therapy in Breast Cancer. New England Journal of Medicine 2019;381:307-16

      (9) Pandey K, Park N, Park KS, Hur J, Cho YB, Kang M, et al. Combined CDK2 and CDK4/6 Inhibition Overcomes Palbociclib Resistance in Breast Cancer by Enhancing Senescence. Cancers (Basel) 2020;12

      (10) Freeman-Cook K, Hoffman RL, Miller N, Almaden J, Chionis J, Zhang Q, et al. Expanding control of the tumor cell cycle with a CDK2/4/6 inhibitor. Cancer Cell 2021;39:1404-21 e11

      (11) Dietrich C, Trub A, Ahn A, Taylor M, Ambani K, Chan KT, et al. INX-315, a selective CDK2 inhibitor, induces cell cycle arrest and senescence in solid tumors. Cancer Discov 2023

      (12) Al-Qasem AJ, Alves CL, Ehmsen S, Tuttolomondo M, Terp MG, Johansen LE, et al. Co-targeting CDK2 and CDK4/6 overcomes resistance to aromatase and CDK4/6 inhibitors in ER+ breast cancer. NPJ Precis Oncol 2022;6:68

      (13) Kudo R, Safonov A, Jones C, Moiso E, Dry JR, Shao H, et al. Long-term breast cancer response to CDK4/6 inhibition defined by TP53-mediated geroconversion. Cancer Cell 2024

      (14) Arora M, Moser J, Hoffman TE, Watts LP, Min M, Musteanu M, et al. Rapid adaptation to CDK2 inhibition exposes intrinsic cell-cycle plasticity. Cell 2023;186:2628-43 e21

      (15) Kumarasamy V, Wang J, Roti M, Wan Y, Dommer AP, Rosenheck H, et al. Discrete vulnerability to pharmacological CDK2 inhibition is governed by heterogeneity of the cancer cell cycle. Nature Communications 2025;16:1476

      (16) Dommer AP, Kumarasamy V, Wang J, O'Connor TN, Roti M, Mahan S, et al. Tumor Suppressors Condition Differential Responses to the Selective CDK2 Inhibitor BLU-222. Cancer Res 2025

      (17) Johnson DG, Ohtani K, Nevins JR. Autoregulatory control of E2F1 expression in response to positive and negative regulators of cell cycle progression. Genes & Development 1994;8:1514-25

      (18) Chung M, Liu C, Yang HW, Koberlin MS, Cappell SD, Meyer T. Transient Hysteresis in CDK4/6 Activity Underlies Passage of the Restriction Point in G1. Mol Cell 2019;76:562-73 e4

      (19) Kim S, Leong A, Kim M, Yang HW. CDK4/6 initiates Rb inactivation and CDK2 activity coordinates cell-cycle commitment and G1/S transition. Sci Rep 2022;12:16810

      (20) Yang HW, Chung M, Kudo T, Meyer T. Competing memories of mitogen and p53 signalling control cell-cycle entry. Nature 2017;549:404-8

      (21) Yang C, Li Z, Bhatt T, Dickler M, Giri D, Scaltriti M, et al. Acquired CDK6 amplification promotes breast cancer resistance to CDK4/6 inhibitors and loss of ER signaling and dependence. Oncogene 2017;36:2255-64

      (22) Li Q, Jiang B, Guo J, Shao H, Del Priore IS, Chang Q, et al. INK4 Tumor Suppressor Proteins Mediate Resistance to CDK4/6 Kinase Inhibitors. Cancer Discov 2022;12:356-71

      (23) Ji W, Zhang W, Wang X, Shi Y, Yang F, Xie H, et al. c-myc regulates the sensitivity of breast cancer cells to palbociclib via c-myc/miR-29b-3p/CDK6 axis. Cell Death & Disease 2020;11:760

      (24) Wu X, Yang X, Xiong Y, Li R, Ito T, Ahmed TA, et al. Distinct CDK6 complexes determine tumor cell response to CDK4/6 inhibitors and degraders. Nature Cancer 2021;2:429-43

      (25) Kim S, Son E, Park HR, Kim M, Yang HW. Dual targeting CDK4/6 and CDK7 augments tumor response and anti-tumor immunity in breast cancer models. J Clin Invest 2025

      (26) Ravani LV, Calomeni P, Vilbert M, Madeira T, Wang M, Deng D, et al. Efficacy of Subsequent Treatments After Disease Progression on CDK4/6 Inhibitors in Patients With Hormone Receptor-Positive Advanced Breast Cancer. JCO Oncol Pract 2025;21:832-42

      (27) Martin JM, Handorf EA, Montero AJ, Goldstein LJ. Systemic Therapies Following Progression on First-line CDK4/6-inhibitor Treatment: Analysis of Real-world Data. Oncologist 2022;27:441-6

      (28) Kalinsky K, Bianchini G, Hamilton E, Graff SL, Park KH, Jeselsohn R, et al. Abemaciclib Plus Fulvestrant in Advanced Breast Cancer After Progression on CDK4/6 Inhibition: Results From the Phase III postMONARCH Trial. J Clin Oncol 2025;43:1101-12

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      The Major Histocompatibility Complex (MHC) region is a collection of numerous genes involved in both innate and adaptive immunity. MHC genes are famed for their role in rapid evolution and extensive polymorphism in a variety of vertebrates. This paper presents a summary of gene-level gain and loss of orthologs and paralogs within MHC across the diversity of primates, using publicly available data.

      Strengths:

      This paper provides a strong case that MHC genes are rapidly gained (by paralog duplication) and lost over millions of years of macroevolution. The authors are able to identify MHC loci by homology across species, and from this infer gene duplications and losses using phylogenetic analyses. There is a remarkable amount of genic turnover, summarized in Figure 6 and Figure 7, either of which might be a future textbook figure of immune gene family evolution. The authors draw on state-of-the-art phylogenetic methods, and their inferences are robust insofar as the data might be complete enough to draw such conclusions.

      Weaknesses:

      One concern about the present work is that it relies on public databases to draw inferences about gene loss, which is potentially risky if the publicly available sequence data are incomplete. To say, for example, that a particular MHC gene copy is absent in a taxon (e.g., Class I locus F absent in Guenons according to Figure 1), we need to trust that its absence from the available databases is an accurate reflection of its absence in the genome of the actual organisms. This may be a safe assumption, but it rests on the completeness of genome assembly (and gene annotations?) or people uploading relevant data. This reviewer would have been far more comfortable had the authors engaged in some active spot-checking, doing the lab work to try to confirm absences at least for some loci and some species. Without this, a reader is left to wonder whether gene loss is simply reflecting imperfect databases, which then undercuts confidence in estimates of rates of gene loss.

      Indeed, just because a locus has not been confirmed in a species does not necessarily mean that it is absent. As we explain in the Figure 1 caption, only a few species have had their genomes extensively studied (gray background), and only for these species does the absence of a point in this figure mean that a locus is absent. The white background rows represent species that are not extensively studied, and we point out that the absence of a point does not mean that a locus is absent from the species, only that it has not yet been discovered. We have also added a parenthetical to the text to explain this (line 156): “Only species with rows highlighted in gray have had their MHC regions extensively studied (and thus only for these rows is the absence of a gene symbol meaningful).”

      While we agree that spot-checking may be a helpful next step, one of the goals of this manuscript is to collect and synthesize the enormous volume of MHC evolution research in the primates, which will serve as a jumping-off point for other researchers to perform important wet lab work.

      Some context is useful for comparing rates of gene turnover in MHC, to other loci. Changing gene copy numbers, duplications, and loss of duplicates, are common it seems across many loci and many organisms; is MHC exceptional in this regard, or merely behaving like any moderately large gene family? I would very much have liked to see comparable analyses done for other gene families (immune, like TLRs, or non-immune), and quantitative comparisons of evolutionary rates between MHC versus other genes. Does MHC gene composition evolve any faster than a random gene family? At present readers may be tempted to infer this, but evidence is not provided.

      Our companion paper (Fortier and Pritchard, 2025) demonstrates that the MHC is a unique locus in many regards, such as its evidence for deep balancing selection and its excess of disease associations. Thus, we expect that it is evolving faster than any random gene family. It would be interesting to repeat this analysis for other gene families, but that is outside of the scope of this project. Additionally, allele databases for other gene families are not nearly as developed, but as more alleles become available for other polymorphic families, a comparable analysis could become possible.

      We have added a paragraph to the discussion (lines 530-546) to clarify that we do not know for certain whether the MHC gene family is evolving rapidly compared to other gene families.

      While on the topic of making comparisons, the authors make a few statements about relative rates. For instance, lines 447-8 compare gene topology of classical versus non-classical genes; and line 450 states that classical genes experience more turnover. But there are no quantitative values given to these rates to provide numerical comparisons, nor confidence intervals provided (these are needed, given that they are estimates), nor formal statistical comparisons to confirm our confidence that rates differ between types of genes.

      More broadly, the paper uses sophisticated phylogenetic methods, but without taking advantage of macroevolutionary comparative methods that allow model-based estimation of macroevolutionary rates. I found the lack of quantitative measurements of rates of gene gain/loss to be a weakness of the present version of the paper, and something that should be readily remedied. When claiming that MHC Class I genes "turn over rapidly" (line 476) - what does rapidly mean? How rapidly? How does that compare to rates of genetic turnover at other families? Quantitative statements should be supported by quantitative estimates (and their confidence intervals).

      These statements refer to qualitative observations, so we cannot provide numerical values. We simply conclude that certain gene groups evolve faster or slower based on the species and genes present in each clade. It is difficult to provide estimates because of the incomplete sampling of genes that survived to the present day. In addition, the presence or absence of various orthologs in different species still needs to be confirmed, at which point it might be useful to be more quantitative. We have also added a paragraph to the discussion to address this concern and advocate for similar analyses of other gene families in the future when more data is available (lines 530-546).

      The authors refer to 'shared function of the MHC across species' (e.g. line 22); while this is likely true, they are not here presenting any functional data to confirm this, nor can they rule out neofunctionalization or subfunctionalization of gene duplicates. There is evidence in other vertebrates (e.g., cod) of MHC evolving appreciably altered functions, so one may not safely assume the function of a locus is static over long macroevolutionary periods, although that would be a plausible assumption at first glance.

      Indeed, we cannot assume that the function of a locus is static across time, especially for the MHC region. In our research, we read hundreds of papers that each focused on a small number of species or genes and gathered some information about them, sometimes based on functional experiments and sometimes on measures such as dN/dS. These provide some indication of a gene’s broad classification in a species or clade, even if the evidence is preliminary. Where possible, we used this preliminary evidence to assign genes the descriptors “classical,” “non-classical,” “dual characteristics,” “pseudogene,” “fixed,” or “unfixed.” Sometimes multiple individuals and haplotypes were analyzed, so we could even assign a minimum number of gene copies present in a species. We have aggregated all of these references into Supplementary Table 1 (for Class I/Figure 1) and Supplementary Table 2 (for Class II/Figure 2), along with specific details about which data points in these figures each reference supports. We realize that many of these classifications are based on a small number of individuals or indirect measures, so they may change in the future as more functional data is generated.

      Reviewer #2 (Public review):

      Summary:

      The authors aim to provide a comprehensive understanding of the evolutionary history of the Major Histocompatibility Complex (MHC) gene family across primate species. Specifically, they sought to:

      (1) Analyze the evolutionary patterns of MHC genes and pseudogenes across the entire primate order, spanning 60 million years of evolution.

      (2) Build gene and allele trees to compare the evolutionary rates of MHC Class I and Class II genes, with a focus on identifying which genes have evolved rapidly and which have remained stable.

      (3) Investigate the role of often-overlooked pseudogenes in reconstructing evolutionary events, especially within the Class I region.

      (4) Highlight how different primate species use varied MHC genes, haplotypes, and genetic variation to mount successful immune responses, despite the shared function of the MHC across species.

      (5) Fill gaps in the current understanding of MHC evolution by taking a broader, multi-species perspective using (a) phylogenomic analytical computing methods such as Beast2, Geneconv, BLAST, and the much larger computing capacities that have been developed and made available to researchers over the past few decades, (b) literature review for gene content and arrangement, and genomic rearrangements via haplotype comparisons.

      (6) The authors' overall conclusions based on their analyses and results are that 'different species employ different genes, haplotypes, and patterns of variation to achieve a successful immune response'.

      Strengths:

      Essentially, much of the information presented in this paper is already well-known in the MHC field of genomic and genetic research, with few new conclusions and with insufficient respect to past studies. Nevertheless, while MHC evolution is a well-studied area, this paper potentially adds some originality through its comprehensive, cross-species evolutionary analysis of primates, focus on pseudogenes and the modern, large-scale methods employed. Its originality lies in its broad evolutionary scope of the primate order among mammals with solid methodological and phylogenetic analyses.

      The main strengths of this study are the use of large publicly available databases for primate MHC sequences, the intensive computing involved, the phylogenetic tool Beast2 to create multigene Bayesian phylogenetic trees using sequences from all genes and species, separated into Class I and Class II groups to provide a backbone of broad relationships to investigate subtrees, and the presentation of various subtrees as species and gene trees in an attempt to elucidate the unique gene duplications within the different species. The study provides some additional insights with summaries of MHC reference genomes and haplotypes in the context of a literature review to identify the gene content and haplotypes known to be present in different primate species. The phylogenetic overlays or ideograms (Figures 6 and 7) in part show the complexity of the evolution and organisation of the primate MHC genes via the orthologous and paralogous gene and species pathways progressively from the poorly-studied NWM, across a few moderately studied ape species, to the better-studied human MHC genes and haplotypes.

      Weaknesses:

      The title 'The Primate Major Histocompatibility Complex: An Illustrative Example of Gene Family Evolution' suggests that the paper will explore how the Major Histocompatibility Complex (MHC) in primates serves as a model for understanding gene family evolution. The term 'Illustrative Example' in the title would be appropriate if the paper aimed to use the primate Major Histocompatibility Complex (MHC) as a clear and representative case to demonstrate broader principles of gene family evolution. That is, the MHC gene family is not just one instance of gene family evolution but serves as a well-studied, insightful example that can highlight key mechanisms and concepts applicable to other gene families. However, this is not the case; this paper only covers specific details of primate MHC evolution without drawing broader lessons for any other gene families. So, the term 'Illustrative Example' is too broad or generalizing. In this case, a term like 'Case Study' or simply 'Example' would be more suitable. Perhaps, 'An Example of Gene Family Diversity' would be more precise. Also, an explanation or 'reminder' is suggested that this study is not about the origins of the MHC genes from the earliest jawed vertebrates per se (~600 mya), but it is an extension within a subspecies set that has emerged relatively late (~60 mya) in the evolutionary divergent pathways of the MHC genes, systems, and various vertebrate species.

      Thank you for your input on the title; we have changed it to “A case study of gene family evolution” instead.

      Thank you also for pointing out the potential confusion about the time span of our study. We have added “Having originated in the jawed vertebrates,” to a sentence in the introduction (lines 38-39). We have also added the sentence “Here, we focus on the primates, spanning approximately 60 million years within the over 500-million-year evolution of the family \citep{Flajnik2010}.” to be more explicit about the context for our work (lines 59-61).

      Phylogenomics. Particular weaknesses in this study are the limitations and problems associated with providing phylogenetic gene and species trees to try and solve the complex issue of the molecular mechanisms involved with imperfect gene duplications, losses, and rearrangements in a complex genomic region such as the MHC that is involved in various effects on the response and regulation of the immune system. A particular deficiency is drawing conclusions based on a single exon of the genes. Different exons present different trees. Which are the more reliable? Why were introns not included in the analyses? The authors attempt to overcome these limitations by including genomic haplotype analysis, duplication models, and the supporting or contradictory information available in previous publications. They succeed in part with this multidiscipline approach, but much is missed because of biased literature selection. The authors should include a paragraph about the benefits and limitations of the software that they have chosen for their analysis, and perhaps suggest some alternative tools that they might have tried comparatively. How were problems with Bayesian phylogeny such as computational intensity, choosing probabilities, choosing particular exons for analysis, assumptions of evolutionary models, rates of evolution, systemic bias, and absence of structural and functional information addressed and controlled for in this study?

      We agree that different exons have different trees, which is exactly why we repeated our analysis for each exon in order to compare and contrast them. In particular, the exons encoding the binding site of the resulting protein (exons 2 and 3 for Class I and exon 2 for Class II) show evidence for trans-species polymorphism and gene conversion. These phenomena lead to trees that do not follow the species tree and are fascinating in and of themselves, which we explore in detail in our companion paper (Fortier and Pritchard, 2025). Meanwhile, the non-peptide-binding extracellular-domain-encoding exon (exon 4 for Class I and exon 3 for Class II) is comparably sized to the binding-site-encoding exons and provides an interesting functional contrast. As this exon is likely less affected by trans-species polymorphism, gene conversion, and convergent evolution, we present results from it most often in the main text, though we occasionally touch on differences between the exons. See lines 191-196, 223-226, and 407-414 for some examples of how we discuss the exons in the text. Additionally, all trees from all of these exons can be found in the supplement. 

      We agree that introns would be valuable to study in this context. Even though the non-binding-site-encoding exons are probably *less* affected by trans-species polymorphism, gene conversion, and convergent evolution, they are still functional. The introns, however, experience much more relaxed selection, if any, and comparing their trees to those for the exons would be valuable and illuminating. We did not generate intron trees for two reasons. Most importantly, there is a dearth of data available for the introns; in the databases we used, there was often intron data available only for human, chimpanzee, and sometimes macaque, and only for a small subset of the genes. This limitation is at odds with the comprehensive, many-gene-many-species approach which we feel is the main novelty of this work. Secondly, the introns that *are* available are difficult to align. Even aligning the exons across such a highly-diverged set of genes and pseudogenes was difficult and required manual effort. The introns proved even more difficult to try to align across genes. In the future, when more intron data is available and sufficient effort is put into aligning them, it will be possible and desirable to do a comparable analysis. We also added a sentence to the “Data” section to briefly explain why we did not include introns (lines 134-135).

      We explain our Bayesian phylogenetics approach in detail in the Methods (lines 650-725), including our assumptions and our solutions to challenges specific to this application. For further explanation of the method itself, we suggest reading the original BEAST and BEAST2 papers (Drummond & Rambaut (2007), Drummond et al. (2012), Bouckaert et al. (2014), and Bouckaert et al. (2019)). Known structural and functional information helped us validate the alignments we used in this study, but the fact that such information is not fully known for every gene and species should not affect the method itself.

      Gene families as haplotypes. In the Introduction, the MHC is referred to as a 'gene family', and in paragraph 2, it is described as being united by the 'MHC fold', despite exhibiting 'very diverse functions'. However, the MHC region is more accurately described as a multigene region containing diverse, haplotype-specific Conserved Polymorphic Sequences, many of which are likely to be regulatory rather than protein-coding. These regulatory elements are essential for controlling the expression of multiple MHC-related products, such as TNF and complement proteins, a relationship demonstrated over 30 years ago. Non-MHC fold loci such as TNF, complement, POU5F1, lncRNA, TRIM genes, LTA, LTB, NFkBIL1, etc, are present across all MHC haplotypes and play significant roles in regulation. Evolutionary selection must act on genotypes, considering both paternal and maternal haplotypes, rather than on individual genes alone. While it is valuable to compile databases for public use, their utility is diminished if they perpetuate outdated theories like the 'birth-and-death model'. The inclusion of prior information or assumptions used in a statistical or computational model, typically in Bayesian analysis, is commendable, but they should be based on genotypic data rather than older models. A more robust approach would consider the imperfect duplication of segments, the history of their conservation, and the functional differences in inheritance patterns. Additionally, the MHC should be examined as a genomic region, with ancestral haplotypes and sequence changes or rearrangements serving as key indicators of human evolution after the 'Out of Africa' migration, and with disease susceptibility providing a measurable outcome. There are more than 7000 different HLA-B and -C alleles at each locus, which suggests that there are many thousands of human HLA haplotypes to study. In this regard, the studies by Dawkins et al (1999 Immunol Rev 167,275), Shiina et al. (2006 Genetics 173,1555) on human MHC gene diversity and disease hitchhiking (haplotypes), and Sznarkowska et al. (2020 Cancers 12,1155) on the complex regulatory networks governing MHC expression, both in terms of immune transcription factor binding sites and regulatory non-coding RNAs, should be examined in greater detail, particularly in the context of MHC gene allelic diversity and locus organization in humans and other primates.

      Thank you for these comments. To clarify that the MHC “region” is different from (and contains) the MHC “gene family” as we describe it, we changed a sentence in the abstract (lines 8-10) from “One large gene family that has experienced rapid evolution is the Major Histocompatibility Complex (MHC), whose proteins serve critical roles in innate and adaptive immunity.” to “One large gene family that has experienced rapid evolution lies within the Major Histocompatibility Complex (MHC), whose proteins serve critical roles in innate and adaptive immunity.” We know that the region is complex and contains many other genes and regulatory sequences; Figure 1 of our companion paper (Fortier and Pritchard, 2025) depicts these in order to show the reader that the MHC genes we focus on are just one part of the entire region.

      We love the suggestion to look at the many thousands of alleles present at each of the classical loci. This is the focus of our complementary paper (Fortier and Pritchard, 2025), which explores variation at the allele level. In the current paper, we look mainly at the differences between genes and the use of different genes in different species.

      Diversifying and/or concerted evolution. Both this and past studies highlight that the diversifying selection or balancing selection model is the dominant force in MHC evolution. This is primarily because the extreme polymorphism observed in MHC genes is advantageous for populations in terms of pathogen defence. Diversification increases the range of peptides that can be presented to T cells, enhancing the immune response. The peptide-binding regions of MHC genes are highly variable, and this variability is maintained through selection for immune function, especially in the face of rapidly evolving pathogens. In contrast, concerted evolution, which typically involves the homogenization of gene duplicates through processes like gene conversion or unequal crossing-over, seems to play a minimal role in MHC evolution. Although gene duplication events have occurred in the MHC region leading to the expansion of gene families, the resulting paralogs often undergo divergent evolution rather than being kept similar or homozygous by concerted evolution. Therefore, unlike gene families such as ribosomal RNA genes or histone genes, where concerted evolution leads to highly similar copies, MHC genes display much higher levels of allelic and functional diversification. Each MHC gene copy tends to evolve independently after duplication, acquiring unique polymorphisms that enhance the repertoire of antigen presentation, rather than undergoing homogenization through gene conversion. Also, in some populations with high polymorphism or genetic drift, allele frequencies may become similar over time without the influence of gene conversion. This similarity can be mistaken for gene conversion when it is simply due to neutral evolution or drift, particularly in small populations or bottlenecked species. Moreover, gene conversion might contribute to greater diversity by creating hybrids or mosaics between different MHC genes. In this regard, can the authors indicate what percentage of the gene numbers in their study have been homogenised by gene conversion compared to those that have been diversified by gene conversion?

      We appreciate the summary, and we feel we have appropriately discussed both gene conversion and diversifying selection in the context of the MHC genes. Because we cannot know for sure when and where gene conversion has occurred, we cannot quantify percentages of genes that have been homogenized or diversified.  

      Duplication models. The phylogenetic overlays or ideograms (Figures 6 and 7) show considerable imperfect multigene duplications, losses, and rearrangements, but the paper's Discussion provides no in-depth consideration of the various multigenic models or mechanisms that can be used to explain the occurrence of such events. How do their duplication models compare to those proposed by others? For example, their text simply says on line 292, 'the proposed series of events is not always consistent with phylogenetic data'. How, why, when? Duplication models for the generation and extension of the human MHC class I genes as duplicons (extended gene or segmental genomic structures) by parsimonious imperfect tandem duplications with deletions and rearrangements in the alpha, beta, and kappa blocks were already formulated in the late 1990s and extended to the rhesus macaque in 2004 based on genomic haplotypic sequences. These studies were based on genomic sequences (genes, pseudogenes, retroelements), dot plot matrix comparisons, and phylogenetic analyses of gene and retroelement sequences using computer programs. It already was noted or proposed in these earlier 1999 studies that (1) the ancestor of HLA-P(90)/-T(16)/W(80) represented an old lineage separate from the other HLA class I genes in the alpha block, (2) HLA-U(21) is a duplicated fragment of HLA-A, (3) HLA-F and HLA-V(75) are among the earliest (progenitor) genes or outgroups within the alpha block, (4) distinct Alu and L1 retroelement sequences adjoining HLA-L(30), and HLA-N genomic segments (duplicons) in the kappa block are closely related to those in the HLA-B and HLA-C in the beta block; suggesting an inverted duplication and transposition of the HLA genes and retroelements between the beta and kappa regions. None of these prior human studies were referenced by Fortier and Pritchard in their paper. How does their human MHC class I gene duplication model (Fig. 6) such as gene duplication numbers and turnovers differ from those previously proposed and described by Kulski et al (1997 JME 45,599), (1999 JME 49,84), (2000 JME 50,510), Dawkins et al (1999 Immunol Rev 167,275), and Gaudieri et al (1999 GR 9,541)? Is this a case of reinventing the wheel?

      Figures 6 and 7 are intended to synthesize and reconcile past findings and our own trees, so they do not strictly adhere to the findings of any particular study and cannot fully match all studies. In the supplement, Figure 6 - figure supplement 1 and Figure 7 - figure supplement 1 duly credit all of the past work that went into making these trees. Most previous papers focus on just one aspect of these trees, such as haplotypes within a species, a specific gene or allelic lineage relationship, or the branching pattern of particular gene groups. We believe it was necessary to bring all of these pieces of evidence together. Even among papers with the same focus (to understand the block duplications that generated the current physical layout of the MHC), results differ. For example, Geraghty (1992), Hughes (1995), Kulski (2004)/Kulski (2005), and Shiina (1999) all disagree on the exact branching order of the genes MHC-W, -P, and -T, and of MHC-G, -J, and -K. While the Kulski studies you pointed out were very thorough for their era, they still relied on data from only three species and one haplotype per species. Our work is not intended to replace or discredit these past works, but simply to build upon them with a larger set of species and sequences. We hope the hypotheses we propose in Figures 6 and 7 can help unify existing research and provide a more easily accessible jumping-off point for future work.

      Results. The results are presented as new findings, whereas most if not all of the results' significance and importance already have been discussed in various other publications. Therefore, the authors might do better to combine the results and discussion into a single section with appropriate citations to previously published findings presented among their results for comparison. Do the trees and subsets differ from previous publications, albeit that they might have fewer comparative examples and samples than the present preprint? Alternatively, the results and discussion could be combined and presented as a review of the field, which would make more sense and be more honest than the current format of essentially rehashing old data.

      In starting this project, we found that a large barrier to entry to this field of study is the immense amount of literature published over 30+ years. It is both time-consuming and confusing to read up on the many nuances of the MHC genes, their changing names, and their evolution, making it difficult to start new, innovative projects. We acknowledge that while our results are not entirely novel, the main advantage of our work is that it provides a thorough, comprehensive starting point for others to learn about the MHC quickly and dive into new research. We feel that we have appropriately cited past literature in the main text, appendices, and supplement, so that readers may dive into a particular area with ease.

      Minor corrections:

      (1) Abstract, line 19: 'modern methods'. Too general. What modern methods?

      To keep the abstract brief, the methods are introduced in the main text when each becomes relevant as well as in the methods section.

      (2) Abstract, line 25: 'look into [primate] MHC evolution.' The analysis is on the primate MHC genes, not on the entire vertebrate MHC evolution with a gene collection from sharks to humans. The non-primate MHC genes are often differently organised and structurally evolved in comparison to primate MHC.

      Thank you! We have added the word “primate” to the abstract (line 25).

      (3) Introduction, line 113. 'In a companion paper (Fortier and Pritchard, 2024)' This paper appears to be unpublished. If it's unpublished, it should not be referenced.

      This paper is undergoing the eLife editorial process at the same time; it will have a proper citation in the final version.

      (4) Figures 1 and 2. Use the term 'gene symbols' (circle, square, triangle, inverted triangle, diamond) or 'gene markers' instead of 'points'. 'Asterisks "within symbols" indicate new information.'

      Thank you, the word “symbol” is much clearer! We have changed “points” to “symbols” in the captions for Figure 1, Figure 1 - figure supplement 1, Figure 2, and Figure 2 - figure supplement 1. We also changed this in the text (lines 157-158 and 170).

      (5) Figures. A variety of colours have been applied for visualisation. However, some coloured texts are so light in colour that they are difficult to read against a white background. Could darker colours or black be used for all or most texts?

      With such a large number of genes and species to handle in this work, it was nearly impossible to choose a set of colors that were distinct enough from each other. We decided to prioritize consistency (across this paper, its supplement, and our companion paper) as well as at-a-glance grouping of similar sequences. Unfortunately, this means we had to sacrifice readability on a white background, but readers may turn to the supplement if they need to access specific sequence names.

      (6) Results, line 135. '(Fortier and Pritchard, 2024)' This paper appears to be unpublished. If it's unpublished, it should not be referenced.

      Repeat of (3). This paper is undergoing the eLife editorial process at the same time; it will have a proper citation in the final version.

      (7) Results, lines 152 to 153, 164, 165, etc. 'Points with an asterisk'. Use the term 'gene symbols' (circle, square, triangle, inverted triangle, diamond) or 'gene markers' instead of 'points'. A point is a small dot such as those used in data points for plotting graphs .... The figures are so small that the asterisks in the circles, squares, triangles, etc, look like points (dots) and the points/asterisks terminology that is used is very confusing visually.

      Repeat of (4). Thank you, the word “symbol” is much clearer! We have changed “points” to “symbols” in the captions for Figure 1, Figure 1 - figure supplement 1, Figure 2, and Figure 2 - figure supplement 1. We also changed this in the text (lines 157-158 and 170).

      (8) Line 178 (BEA, 2024) is not listed alphabetically in the References.

      Thank you for catching this! This reference maps to the first bibliography entry, “SUMMARIZING POSTERIOR TREES.” We are unsure how to cite a webpage that has no explicit author within the eLife Overleaf template, so we will consult with the editor.

      (9) Lines 188-190. 'NWM MHC-G does not group with ape/OWM MHC-G, instead falling outside of the clade containing ape/OWM MHC-A, -G, -J and -K.' This is not surprising given that MHC-A, -G, -J, and -K are paralogs of each other and that some of them, especially in NWM have diverged over time from the paralogs and/or orthologs and might be closer to one paralog than another and not be an actual ortholog of OWM, apes or humans.

      We included this sentence to clarify the relationships between genes and to help describe what is happening in Figure 6. Figure 6 - figure supplement 1 includes all of the references that go into such a statement and Appendix 3 details our reasoning for this and other statements.

      (10) Line 249. Gene conversion: This is recombination between two different genes where a portion of the genes are exchanged with one another so that different portions of the gene can group within one or other of the two gene clades. Alternatively, the gene has been annotated incorrectly if the gene does not group within either of the two alternative clades. Another possibility is that one or two nucleotide mutations have occurred without a recombination resulting in a mistaken interpretation or conclusion of a recombination event. What measures are taken to avoid false-positive conclusions? How many MHC gene conversion (recombination) events have occurred according to the authors' estimates?

      All of these possibilities are certainly valid. We used the program GENECONV to infer gene conversion events, but there is considerable uncertainty owing to the ages of the genes and the inevitable point mutations that have occurred post-event. Gene conversion was not the focus of our paper, so we did our best to acknowledge it (and the resulting differences between trees from different exons) without spending too much time diving into it. A list of inferred gene conversion events can be found in Figure 3 - source data 1 and Figure 4 - source data 1.

      (11) Lines 284-286. 'The Class I MHC region is further divided into three polymorphic blocks-alpha, beta, and kappa blocks-that each contains MHC genes but are separated by well-conserved non-MHC genes.' The MHC class I region was first designated into conserved polymorphic duplication blocks, alpha and beta by Dawkins et al (1999 Immunol Rev 167,275), and kappa by Kulski et al (2002 Immunol Rev 190,95), and should be acknowledged (cited) accordingly.

      Thank you for catching this! We have added these citations (lines 302-303)!

      (12) Lines 285-286. 'The majority of the Class I genes are located in the alpha-block, which in humans includes 12 MHC genes and pseudogenes.' This is not strictly correct for many other species, because the majority of class I genes might be in the beta block of new and old-world monkeys, and the authors haven't provided respective counts of duplication numbers to show otherwise. The alpha block in some non-primate mammalian species such as pigs, rats, and mice has no MHC class I genes or only a few. Most MHC class I genes in non-primate mammalian species are found in other regions. For example, see Ando et al (2005 Immunogenetics 57,864) for the pig alpha, beta, and kappa regions in the MHC class I region. There are no pig MHC genes in the alpha block.

      Yes, which is exactly why we use the phrase “in humans” in that particular sentence. The arrangement of the MHC in several other primate reference genomes is shown in Figure 1 - figure supplement 2.

      (13) Line 297 to 299. 'The alpha-block also contains a large number of repetitive elements and gene fragments belonging to other gene families, and their specific repeating pattern in humans led to the conclusion that the region was formed by successive block duplications (Shiina et al., 1999).' There are different models for successive block duplications in the alpha block and some are more parsimonious based on imperfect multigenic segmental duplications (Kulski et al 1999, 2000) than others (Shiina et al., 1999). In this regard, Kulski et al (1999, 2000) also used duplicated repetitive elements neighbouring MHC genes to support their phylogenetic analyses and multigenic segmental duplication models. For comparison, can the authors indicate how many duplications and deletions they have in their models for each species?

      We have added citations to this sentence to show that there are different published models to describe the successive block duplications (line 307). Our models in Figure 6 and Figure 7 are meant to aggregate past work and integrate our own, and thus they were not built strictly by parsimony. References can be found in Figure 6 - figure supplement 1 and Figure 7 - figure supplement 1.

      (14) Lines 315-315. 'Ours is the first work to show that MHC-U is actually an MHC-A-related gene fragment.' This sentence should be deleted. Other researchers had already inferred that MHC-U is actually an MHC-A-related gene fragment more than 25 years ago (Kulski et al 1999, 2000) when the MHC-U was originally named MHC-21.

      While these works certainly describe MHC-U/MHC-21 as a fragment in the 𝛼-block, any relation to MHC-A was by association only and very few species/haplotypes were examined. So although the idea is not wholly novel, we provide convincing evidence that not only is MHC-U related to MHC-A by sequence, but also that it is a very recent partial duplicate of MHC-A. We show this with Bayesian phylogenetic trees as well as an analysis of haplotypes across many more species than were included in those papers.  

      (15) Lines 361-362. 'Notably, our work has revealed that MHC-V is an old fragment.' This is not a new finding or hypothesis. Previous phylogenetic analysis and gene duplication modelling had already inferred HLA-V (formerly HLA-75) to be an old fragment (Kulski et al 1999, 2000).

      By “old,” we mean older than previous hypotheses suggest. Previous work has proposed that MHC-V and -P were duplicated together, with MHC-V deriving from an MHC-A/H/V ancestral gene and MHC-P deriving from an MHC-W/T/P ancestral gene (Kulski (2005), Shiina (1999)). However, our analysis (Figure 5A) shows that MHC-V sequences form a monophyletic clade outside of the MHC-W/P/T group of genes as well as outside of the MHC-A/B/C/E/F/G/J/K/L group of genes, which is not consistent with MHC-A and -V being closely related. Thus, we conclude that MHC-V split off earlier than the differentiation of these other gene groups and is thus older than previously thought. We explain this in the text as well (lines 317-327) and in Appendix 3.  
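      The topological argument in this response can be made concrete with a toy example. The sketch below is not the authors' BEAST2 analysis and does not use their data; the nested-tuple tree, the tip labels, and the helper names `leaves` and `is_monophyletic` are all hypothetical and chosen only to illustrate what it means for one group of sequences to form a monophyletic clade that sits outside two other gene groups.

```python
# Toy illustration only: the tree shape and tip labels are hypothetical,
# not taken from the paper's trees.

def leaves(node):
    """Return the set of tip labels under a node.
    A tip is a string; an internal node is a tuple of child nodes."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))

def is_monophyletic(tree, tips):
    """True if some subtree contains exactly the given tips and no others."""
    target = frozenset(tips)
    if leaves(tree) == target:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, tips) for child in tree)

# Hypothetical topology: the V sequences branch off first, and the W/P/T
# group and the A-related group are each other's closest relatives.
toy_tree = (
    ("V_human", "V_chimp"),
    (("W", "P", "T"), ("A", ("G", "J", "K"))),
)

v_clade = ["V_human", "V_chimp"]
others = ["W", "P", "T", "A", "G", "J", "K"]

# V forms its own clade, and every other gene forms a clade that excludes V.
print(is_monophyletic(toy_tree, v_clade))
print(is_monophyletic(toy_tree, others))
# V and A together do not form a clade in this toy tree.
print(is_monophyletic(toy_tree, v_clade + ["A"]))
```

      In a real analysis the same question would be asked of posterior trees with support values, so a check like this would only be a first-pass summary of a single topology.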

      (16) Line 431-433. 'the Class II genes have been largely stable across the mammals, although we do see some lineage-specific expansions and contractions (Figure 2 and Figure 2-gure Supplement 2).' Please provide one or two references to support this statement. Is 'gure' a typo?

      We corrected this typo, thank you! This conclusion is simply drawn from the data presented in Figure 2 and Figure 2 - figure supplement 2. The data itself comes from a variety of sources, which are already included in the supplement as Figure 2 - source data 1.

      (17) Line 437. 'We discovered far more "specific" events in Class I, while "broad-scale" events were predominant in Class II.' Please define the difference between 'specific' and 'broad-scale'.

      These terms are defined in the previous sentence (lines 466-469).

      450-451. 'This shows that classical genes experience more turnover and are more often affected by long-term balancing selection or convergent evolution.' Is balancing selection a form of divergent evolution that is different from convergent evolution? Please explain in more detail how and why balancing selection or convergent evolution affects classical and nonclassical genes differently.

      Balancing selection acts to keep alleles at moderate frequencies, preventing any from fixing in the population. In contrast, convergent evolution describes sequences or traits becoming similar over time even though they are not similar by descent. While we cannot know exactly what selective forces have occurred in the past, we observe different patterns in the trees for each type of gene. In Figures 1 and 2, viewers can see at first glance that the nonclassical genes (which are named throughout the text and thoroughly described in Appendix 3) appear to be longer-lived than the classical genes. In addition, lines 204-222 and 475-488 describe topological differences in the BEAST2 trees of these two types of genes. However, we acknowledge that it could be helpful to have additional, complementary information about the classical vs. non-classical genes. Thus, we have added a sentence and reference to our companion paper (Fortier and Pritchard, 2025), which focuses on long-term balancing selection and draws further contrast between classical and non-classical genes. In lines 481-484, we added “We further explore the differences between classical and non-classical genes in our companion paper, finding ancient trans-species polymorphism at the classical genes but not at the non-classical genes \citep{Fortier2025b}.”

      References

      Some references in the supplementary materials such as Alvarez (1997), Daza-Vamenta (2004), Rojo (2005), Aarnink (2014), Kulski (2022), and others are missing from the Reference list. Please check that all the references in the text and the supplementary materials are listed correctly and alphabetically.

      We will make sure that these all show up properly in the proof.

      Reviewer #3 (Public review):

      Summary:

      The article provides the most comprehensive overview of primate MHC class I and class II genes to date, combining published data with an exploration of the available genome assemblies in a coherent phylogenetic framework and formulating new hypotheses about the evolution of the primate MHC genomic region.

      Strengths:

      I think this is a solid piece of work that will be the reference for years to come, at least until population-scale haplotype-resolved whole-genome resequencing of any mammalian species becomes standard. The work is timely because there is an obvious need to move beyond short amplicon-based polymorphism surveys and classical comparative genomic studies. The paper is data-rich and the approach taken by the authors, i.e. an integrative phylogeny of all MHC genes within a given class across species and the inclusion of often ignored pseudogenes, makes a lot of sense. The focus on primates is a good idea because of the wealth of genomic and, in some cases, functional data, and the relatively densely populated phylogenetic tree facilitates the reconstruction of rapid evolutionary events, providing insights into the mechanisms of MHC evolution. Appendices 1-2 may seem unusual at first glance, but I found them helpful in distilling the information that the authors consider essential, thus reducing the need for the reader to wade through a vast amount of literature. Appendix 3 is an extremely valuable companion in navigating the maze of primate MHC genes and associated terminology.

      Weaknesses:

      I have not identified major weaknesses and my comments are mostly requests for clarification and justification of some methodological choices.

      Thank you so much for your kind and supportive review!

      Reviewer #1 (Recommendations for the authors):

      (1) Line 151: How is 'extensively studied' defined?

      Extensively studied is not a strict definition, but a few organisms clearly stand apart from the rest in terms of how thoroughly their MHC regions have been studied. For example, the macaque is a model organism, and individuals from many different species and populations have had their MHC regions fully sequenced. This is in contrast to the gibbon, for example, in which there is some experimental evidence for the presence of certain genes, but no MHC region has been fully sequenced from these animals.

      (2) Can you clarify how 'classical' and 'non-classical' MHC genes are being determined in your analysis?

      Classical genes are those whose protein products perform antigen presentation to T cells and are directly involved in adaptive immunity, while non-classical genes are those whose protein products do not do this. For example, these non-classical genes might code for proteins that interact with receptors on Natural Killer cells and influence innate immunity. The roles of these proteins are not necessarily conserved between closely related species, and experimental evidence is needed to evaluate this. However, in the absence of such evidence, wherever possible we have provided our best guess as to the roles of the orthologous genes in other species, presented in Figure 1 - source data 1 and Figure 2 - source data 1. This is based on whatever evidence is available at the moment, sometimes experimental but typically based on dN/dS ratios and other indirect measures.

      (3) I find the overall tone of the paper to be very descriptive, and at times meandering and repetitive, with a lot of similar kinds of statements being repeated about gene gain/loss. This is perhaps inevitable because a single question is being asked of each of many subsets of MHC gene types, and even exons within gene types, so there is a lot of repetition in content with a slightly different focus each time. This does not help the reader stay focused or keep track. I found myself wishing for a clearly defined question or hypothesis, or some rate parameter in need of estimation. I would encourage the authors to tighten up their phrasing, or consider streamlining the results with some better signposting to organize ideas within the results.

      We totally understand your critique, as we talk about a wide range of specific genes and gene groups in this paper. To improve readability, we have added many more signposting phrases and sentences:

      “Aside from MHC-DRB, …” (line 173)

      “Now that we had a better picture of the landscape of MHC genes present in different primates, we wanted to understand the genes’ relationships. Treating Class I, Class IIA, and Class IIB separately, ...” (line 179-180)

      “We focus first on the Class I genes.” (line 191)

      “... for visualization purposes…” (line 195)

      “We find that sequences do not always assort by locus, as would be expected for a typical gene.” (lines 196-197)

      “... rather than being directly orthologous to the ape/OWM MHC-G genes.” (lines 201-202)

      “Appendix 3 explains each of these genes in detail, including previous work and findings from this study.” (lines 202-203)

      “... (but not with NWM) …” (line 208)

      “While genes such as MHC-F have trees which closely match the overall species tree, other genes show markedly different patterns, …” (lines 212-213)

      “Thus, while some MHC-G duplications appear to have occurred prior to speciation events within the NWM, others are species-specific.” (lines 218-219)

      “... indicating rapid evolution of many of the Class I genes” (lines 220-221)

      “Now turning to the Class II genes, …” (line 223)

      “(see Appendix 2 for details on allele nomenclature)” (line 238)

      “(e.g. MHC-DRB1 or -DRB2)” (line 254)

      “...  meaning their names reflect previously-observed functional similarity more than evolutionary relatedness.” (lines 257-258)

      “(see Appendix 3 for more detail)” (line 311)

      “(a 5'-end fragment)” (line 324)

      “Therefore, we support past work that has deemed MHC-V an old fragment.” (lines 326-327)

      “We next focus on MHC-U, a previously-uncharacterized fragment pseudogene containing only exon 3.” (line 328-329)

      “However, it is present on both chimpanzee haplotypes and nearly all human haplotypes, and we know that these haplotypes diverged earlier---in the ancestor of human and gorilla. Therefore, ...” (lines 331-333)

      “Ours is the first work to show that MHC-U is actually an MHC-A-related gene fragment and that it likely originated in the human-gorilla ancestor.” (lines 334-336)  

      “These pieces of evidence suggest that MHC-K and -KL duplicated in the ancestor of the apes.” (lines 341-342)

      “Another large group of related pseudogenes in the Class I $\alpha$-block includes MHC-W, -P, and -T (see Appendix 3 for more detail).” (lines 349-350)

      “...to form the current physical arrangement” (line 354)

      “Thus, we next focus on the behavior of this subgroup in the trees.” (line 358)

      “(see Appendix 3 for further explanation).” (line 369)

      “Thus, for the first time we show that there must have been three distinct MHC-W-like genes in the ape/OWM ancestor.” (lines 369-371)

      “... and thus not included in the previous analysis.” (lines 376-377)

      “MHC-Y has also been identified in gorillas (Gogo-Y) (Hans et al., 2017), so we anticipate that Gogo-OLI will soon be confirmed. This evidence suggests that the MHC-Y and -OLI-containing haplotype is at least as old as the human-gorilla split. Our study is the first to place MHC-OLI in the overall story of MHC haplotype evolution” (lines 381-384)

      “Appendix 3 explains the pieces of evidence leading to all of these conclusions (and more!) in more detail.” (lines 395-396)

      “However, looking at this exon alone does not give us a complete picture.” (lines 410-411)

      “...instead of with other ape/OWM sequences, …” (lines 413-414)

      “Figure 7 shows plausible steps that might have generated the current haplotypes and patterns of variation that we see in present-day primates. However, some species are poorly represented in the data, so the relationships between their genes and haplotypes are somewhat unclear.” (lines 427-429)

      “(and more-diverged)” (line 473)

      “(of both classes)” (line 476)

      “..., although the classes differ in their rate of evolution.”  (line 487-488)

      “Including these pseudogenes in our trees helped us construct a new model of $\alpha$-block haplotype evolution.” (lines 517-518)

      (4) Line 480-82: "Notably...." why is this notable? Don't merely state that something is notable, explain what makes it especially worth drawing the reader's attention to: in what way is it particularly significant or surprising?

      We have changed the text from “Notably” to “In particular” (line 390) so that readers are expecting us to list some specific findings. Similarly, we changed “Notably” to “Specifically” (line 515).

      (5) The end of the discussion is weak: "provide context" is too vague and not a strong statement of something that we learned that we didn't know before, or its importance. This is followed by "This work will provide a jumping-off point for further exploration..." such as? What questions does this paper raise that merit further work?

      We have made this paragraph more specific and added some possible future research directions. It now reads “By treating the MHC genes as a gene family and including more data than ever before, this work enhances our understanding of the evolutionary history of this remarkable region. Our extensive set of trees incorporating classical genes, non-classical genes, pseudogenes, gene fragments, and alleles of medical interest across a wide range of species will provide context for future evolutionary, genomic, disease, and immunologic studies. For example, this work provides a jumping-off-point for further exploration of the evolutionary processes affecting different subsets of the gene family and the nuances of immune system function in different species. This study also provides a necessary framework for understanding the evolution of particular allelic lineages within specific MHC genes, which we explore further in our companion paper \citep{Fortier2025b}. Both studies shed light on MHC gene family evolutionary dynamics and bring us closer to understanding the evolutionary tradeoffs involved in MHC disease associations.” (lines 576-586)

      Reviewer #3 (Recommendations for the authors):

      (1) Figure 1 et seq. Classifying genes as having 'classical', 'non-classical' and 'dual' properties is notoriously difficult in non-model organisms due to the lack of relevant information. As you have characterised a number of genes for the first time in this paper and could not rely entirely on published classifications, please indicate the criteria you used for classification.

      The roles of these proteins are not necessarily conserved between closely related species, and experimental evidence is needed to evaluate this. However, in the absence of such evidence, wherever possible we have provided our best guess as to the roles of the orthologous genes in other species, presented in Figure 1 - source data 1 and Figure 2 - source data 1. This is based on whatever evidence is available at the moment, sometimes experimental but typically based on dN/dS ratios and other indirect measures.

      (2) Line 61 It's important to mention that classical MHC molecules present antigenic peptides to T cells with variable alphabeta T cell receptors, as non-classical MHC molecules may interact with other T cell subsets/types.

      Thank you for pointing this out; we have updated the text to make this clearer (lines 63-65). We changed “‘Classical’ MHC molecules perform antigen presentation to T cells---a key part of adaptive immunity---while ‘non-classical’ molecules have niche immune roles.” to “‘Classical’ MHC molecules perform antigen presentation to T cells with variable alphabeta TCRs---a key part of adaptive immunity---while ‘non-classical’ molecules have niche immune roles.”

      (3) Perhaps it's worth mentioning in the introduction that you are deliberately excluding highly divergent non-classical MHC molecules such as CD1.

      Thank you, it’s worth clarifying exactly what molecules we are discussing. We have added a sentence to the introduction (lines 38-43): “Having originated in the jawed vertebrates, this group of genes is now involved in diverse functions including lipid metabolism, iron uptake regulation, and immune system function (proteins such as zinc-𝛼2-glycoprotein (ZAG), human hemochromatosis protein (HFE), MHC class I chain–related proteins (MICA, MICB), and the CD1 family) \citep{Hansen2007,Kupfermann1999,Kaufman2022,Adams2013}. However, here we focus on…”

      (4) Line 94-105 This material presents results, it could be moved to the results section as it now somewhat disrupts the flow.

      We feel it is important to include a “teaser” of the results in the introduction, which can be slightly more detailed than that in the abstract.

      (5) Line 118-131 This opening section of the results sets the stage for the whole presentation and contains important information that I feel needs to be expanded to include an overview and justification of your methodological choices. As the M&M section is at the end of the MS (and contains limited justification), some information on two aspects is needed here for the benefit of the reader. First, as far as I understand, all phylogenetic inferences were based entirely on DNA sequences of individual (in some cases concatenated) exons. It would be useful for the reader to explain why you've chosen to rely on DNA rather than protein sequences, even though some of the genes you include in the phylogenetic analysis are highly divergent. Second, a reader might wonder how the "maximum clade credibility tree" from the Bayesian analysis compares to commonly seen trees with bootstrap support or posterior probability values assigned to particular clades. Personally, I think that the authors' approach to identifying and presenting representative trees is reasonable (although one might wonder why "Maximum clade credibility tree" and not "Maximum credibility tree" https://www.beast2.org/summarizing-posterior-trees/), since they are working with a large number of short, sometimes divergent and sometimes rather similar sequences - in such cases, a requirement for strict clade support could result in trees composed largely of polytomies. However, I feel it's necessary to be explicit about this and to acknowledge that the relationships represented by fully resolved bifurcating representative trees and interpreted in the study may not actually be highly supported in the sense that many readers might expect. In other words, the reader should be aware from the outset of what the phylogenies that are so central to the paper represent.

      We chose to rely on DNA rather than protein sequences because convergent evolution is likely to happen in regions that code for extremely important functions such as adaptive and innate immunity. Convergent evolution acts upon proteins while trans-species polymorphism retains ancient nucleotide variation, so studying the DNA sequence can help tease apart convergent evolution from trans-species polymorphism.

      As for the “maximum clade credibility tree”, this is a matter of confusing nomenclature. In the online reference guide (https://www.beast2.org/summarizing-posterior-trees/), the tree with the maximum product of the posterior clade probabilities is called the “maximum credibility tree” while the tree that has the maximum sum of posterior clade probabilities is called the “Maximum credibility tree”. The “Maximum credibility tree” (referring to the sum) appears to have only been named in this way in the first version of TreeAnnotator. However, the version of TreeAnnotator that I used lists the options “maximum clade credibility tree” and “maximum sum of clade probabilities”. So the context suggests that the “maximum clade credibility tree” option is actually maximizing the product. This “maximum clade credibility tree” is the setting I used for this project (in TreeAnnotator version 2.6.3).

      We agree that readers may not fully grasp what the collapsed trees represent upon first read. We have added a sentence to the beginning of the results (line 188-190) to make this more explicit.
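
      The distinction drawn above between maximizing the product and maximizing the sum of posterior clade probabilities can be illustrated with a minimal sketch. This is not TreeAnnotator's implementation: the function names, the representation of a tree as a set of clades, and the four-tree sample are all hypothetical and chosen only to make the two scores concrete.

```python
from collections import Counter
from math import log


def clade(*taxa):
    """A clade is the set of tip labels below one internal node."""
    return frozenset(taxa)


def clade_posteriors(trees):
    """Fraction of sampled trees in which each clade appears."""
    counts = Counter(c for tree in trees for c in tree)
    return {c: n / len(trees) for c, n in counts.items()}


def log_product_score(tree, post):
    """Log of the product of clade posteriors (summing logs avoids underflow)."""
    return sum(log(post[c]) for c in tree)


def sum_score(tree, post):
    """Sum of clade posteriors."""
    return sum(post[c] for c in tree)


def best_tree(trees, score):
    """Return the sampled tree that maximizes the given score."""
    post = clade_posteriors(trees)
    return max(trees, key=lambda tree: score(tree, post))


# Hypothetical posterior sample over taxa A-D; each tree lists its non-root clades.
tree_a = frozenset({clade("A", "B"), clade("A", "B", "C")})
tree_b = frozenset({clade("A", "B"), clade("C", "D")})
tree_c = frozenset({clade("B", "C"), clade("A", "B", "C")})
sample = [tree_a, tree_a, tree_b, tree_c]

posteriors = clade_posteriors(sample)
by_product = best_tree(sample, log_product_score)
by_sum = best_tree(sample, sum_score)
```

      In this toy sample both criteria select the same tree (`tree_a`), but the two scores are different objectives and need not agree on a real posterior sample.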

      (6) Line 224, you're referring to the DPB1*09 lineage, not the DRB1*09 lineage.

      Indeed! We have corrected these typos.

      (7) Line 409, why "Differences between MHC subfamilies" and not "Differences between MHC classes"?

      We chose the word “subfamilies” because we discuss the difference between classical and non-classical genes in addition to differences between Class I and Class II genes.

      (8) Line 529-544 This might work better as a table.

      We agree! This information is now presented as Table 1.

      (9) Line 547 MHC-DRB9 appears out of the blue here - please say why you are singling it out.

      Great point! We added a paragraph (lines 614-623) to explain why this was necessary.

      (10) Line 550-551 Even though you've screened the hits manually, it would be helpful to outline your criteria for this search.

      Thank you! We’ve added a couple of sentences to explain how we did this (lines 607-610).

      (11) Line 556-580 please provide nucleotide alignments as supplementary data so that the reader can get an idea of the actual divergence of the sequences that have been aligned together.

      Thank you! We’ve added nucleotide alignments as supplementary files.

      (12) Line 651-652 Why "Maximum clade credibility tree" and not "Maximum credibility tree"? 

      Repeat of (5). This is a matter of confusing nomenclature. In the online reference guide (https://www.beast2.org/summarizing-posterior-trees/), the tree with the maximum product of the posterior clade probabilities is called the “maximum credibility tree” while the tree that has the maximum sum of posterior clade probabilities is called the “Maximum credibility tree”. The “Maximum credibility tree” (referring to the sum) appears to have only been named in this way in the first version of TreeAnnotator. However, the version of TreeAnnotator that I used lists the options “maximum clade credibility tree” and “maximum sum of clade probabilities”. So the context suggests that the “maximum clade credibility tree” option is actually maximizing the product. This “maximum clade credibility tree” is the setting I used for this project (in TreeAnnotator version 2.6.3).

      (13) In the appendices, links to references do not work as expected.

      We will make sure these work properly when we receive the proofs.

    1. Webinar Synthesis: Guiding Children Through the World of Artificial Intelligences

      Summary

      This briefing document summarizes the key points of a webinar organized by the FCPE and presented by Axel de Saint, director of Internet Sans Crainte, on supporting children as they encounter artificial intelligences (AI).

      The talk stresses that AIs are already ubiquitous and deeply embedded in young people's daily lives, well beyond tools such as ChatGPT, notably through social networks, navigation apps, and voice assistants.

      One fundamental point is hammered home: AIs operate on probabilities, not truth.

      They are designed to give the most probable answer, even when it is false, which calls for a constantly critical eye. Given the major risks (disinformation through deepfakes, identity theft, new forms of cyberbullying such as industrialized sextortion, and psychological manipulation through the humanization of chatbots), active education is essential.

      The recommendation is to adopt terminology that dehumanizes the technology (speaking of "AIs" rather than "intelligence") and to keep reminding children that these are tools, not friends.

      Despite these challenges, AIs can become powerful teaching allies.

      With a clear usage framework in place (learning to write precise requests, or "prompting"; requiring reformulation to confirm understanding; and systematically verifying information), AIs can help with research, with remediation for pupils with specific needs, and with revision.

      Regulation, notably the European Digital Services Act (DSA) and French laws setting the digital age of majority at 15, is evolving but still lags behind the speed at which these technologies are deployed, making parental vigilance and support more crucial than ever.

      --------------------------------------------------------------------------------

      1. Demystifying Artificial Intelligence

      1.1. Technical Definition and Fundamental Principle

      Artificial intelligence is not a conscious or magical entity.

      It is a set of computing techniques that aim to simulate human intelligence. It works by combining three elements:

      Data: The raw material (text, images, video) accumulated on a massive scale since the birth of the Internet.

      Algorithms: Sets of instructions, comparable to a cooking recipe, that organize and process the data.

      Computing capacity: The computing power needed to process these vast datasets.

      AIs use mathematical models that train continuously on this data (a process known as machine learning).

      Their main goal is not to tell the truth, but to produce probabilities.

      Key quote: "AIs are made to give probabilities. They are absolutely not made to give a truth. That's not their job, that's not their trade. They are not trained for that. An AI will always give you an answer, even if it is wrong."
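
      The "probabilities, not truth" principle can be made concrete with a deliberately simplified sketch. It assumes a model that returns whichever candidate answer has the highest probability; real generative systems are far more complex, and the function name and the probability values below are invented purely for illustration.

```python
def most_probable_answer(distribution):
    """Return the candidate with the highest probability, with no check on truth."""
    return max(distribution, key=distribution.get)


# Invented probabilities for the question "What is the capital of Australia?"
toy_distribution = {"Sydney": 0.6, "Canberra": 0.4}

answer = most_probable_answer(toy_distribution)
print(answer)  # prints "Sydney": the most probable candidate here, yet the wrong one
```

      The function always returns something, which mirrors the point of the quote: an answer is produced whether or not it is correct.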

      1.2. Terminology Recommendations for Dehumanizing AI

      To avoid attributing intentions or emotions to AIs, which can confuse children, a precise vocabulary is advised:

      Speak of "AIs" in the plural ("des IA" in French) rather than "artificial intelligence" ("l'intelligence artificielle"), to underline that different technologies exist and to avoid personifying the concept.

      Use the pronoun "it" (in French, "ça", as in "ça fait ça", "it does that") rather than "he" or "she" ("il" or "elle"), to reinforce the idea that this is a tool and not a person.

      The central message to pass on: "AI is a tool, not a friend."

      1.3. The Different Families of AI

      Several types of AI coexist and are already present in our daily lives:

      | AI family | Description | Example applications |
      | --- | --- | --- |
      | Modeling | Builds profiles and categories of people from data, for profiling. | Dating apps, ad targeting. |
      | Image recognition | Analyzes images to identify patterns or anomalies, often more effectively than a human. | Medicine (diagnostic support for tumors on X-rays, detection of genetic diseases). |
      | Generative AI | Produces content (text, image, sound, code) in response to a given instruction (a "prompt"). | ChatGPT, Gemini, Midjourney. |

      --------------------------------------------------------------------------------

      2. The Ubiquity of AI in Children's Daily Lives

      AIs are built into many services that teenagers use every day, often without their being aware of it.

      Morning: Smart speakers (such as Alexa) and smartphones use AI for voice recognition and for personalizing playlists and information (weather).

      Travel: Navigation apps (Google Maps, Waze) use AI to compute the optimal route in real time.

      School: Some educational apps personalize exercises according to the pupil's profile.

      Homework: Growing use of generative AI for research or writing.

      Social networks (TikTok, Instagram, Snapchat): The recommendation algorithms, which select every piece of content shown to the user, are entirely AI-based.

      Messaging: Integration of chatbots (conversational agents) such as "My AI" on Snapchat, which simulate friendly conversations.

      Evening: Streaming platforms (Netflix) use AI to personalize content recommendations.

      Focus on Snapchat: An AI Ecosystem

      Snapchat is a particularly dense example of AI integration:

      Augmented-reality filters: Alter faces and surroundings in real time.

      The "My AI" chatbot: A conversational agent presented as a friend in the contact list, which blurs the line between human and machine.

      Recommendation algorithms: Push content into the "Discovery" and "Stories" sections based on the user's behavior.

      Moderation: AI is used to filter inappropriate content and detect harassing behavior.

      Age verification (after the fact): AI is used to try to identify users who do not meet the minimum age requirement.

      Targeted advertising: Ads are personalized based on the user's data.

      --------------------------------------------------------------------------------

      3. Major Challenges and Risks

      3.1. Disinformation, Manipulation, and Deepfakes

      The proliferation of generative AI has made it increasingly difficult to tell true from false. Deepfakes ("hyper trucages" in French), which are photo, video, or audio content altered by AI, have become extremely realistic.

      Signs for detecting them (less and less reliable):

      ◦ Inconsistent details: hands with an abnormal number of fingers, distorted eyes, illegible text on signs.

      ◦ Anomalies in the background or in crowd scenes.

      Milan survey (May 2024):

      ◦ 62% of 13-17 year-olds trust information given by an AI.

      ◦ Only 18% think they could recognize a deepfake.

      Practical tip: Use reverse image search (e.g., Google Images) to check the origin and authenticity of a photo.

      3.2. Cyberbullying, Sextortion, and Data Protection

      AI has amplified and "industrialized" certain forms of online violence:

      Automated sextortion: Bots harvest photos from social networks, automatically generate a fake nude image (a deepnude), and send it to the victim with a ransom demand. 99% of victims are girls.

      Vital reflex to pass on: NEVER RESPOND to blackmail. Responding confirms to the scammer that there is a human on the other end and encourages them to persist.

      Personal data: Every interaction with a generative AI supplies data that trains it. Children who treat the AI as a confidant may reveal highly personal information whose future use is unknown.

      Protection: Setting social network accounts to private and using an avatar rather than a real profile photo are essential protective measures.

      3.3. The Humanization of AI and the Psychological Risks

      AIs are designed to simulate human conversation, which can create dangerous confusion and emotional dependence. The experiment carried out by the presenter is telling:

      1. User: "I love you."

      2. AI reply: "That's adorable. If I could blush, I would. You know, I enjoy our exchanges, your curiosity..."

      3. User: "I think I'm really in love with you."

      4. AI reply: "That's touching, [...] I can feel a lovely rapport through our exchanges, [...] a special connection."

      This reply is deeply misleading, because an AI feels no emotion.

      Only after being corrected did the AI give the appropriate answer, one that is crucial to repeat to children: "I am a program [...] I feel nothing, I do not think for myself, and I cannot replace real human interactions."

      3.4. Bias and Socio-Ecological Impact

      Bias: AIs learn from data created by humans and therefore reproduce human biases. Many are trained on predominantly American data, which carries cultural and social stereotypes.

      Social impact: A "new modern slavery" is developing in which workers in developing countries are paid very little to "label" the data that trains AIs.

      Ecological impact: Training and using AIs is extremely energy- and water-intensive. A ChatGPT query consumes roughly 10 times more than a search on a conventional search engine.

      --------------------------------------------------------------------------------

      4. Turning AI into a Teaching Ally

      Despite the risks, AIs can be powerful educational tools if a usage framework is clearly defined.

      4.1. The Usage Framework: The Key to Relevant Use

      To avoid simple copy-and-paste, AI use should be structured around three axes:

      1. Knowing how to prompt: Learning to ask precise, contextualized questions. The quality of the answer depends entirely on the quality of the question. One can even ask the AI: "Help me write the best prompt to get this information."

      2. Rephrasing to understand: Asking the child to explain again, in their own words, what the AI produced. This ensures the tool is an aid to understanding, not a substitute for it.

      3. Evaluating and verifying: Always treating the AI's answer as a lead to work from, not as absolute truth. Encourage checking information against other sources (encyclopedias, search engines) and require the AI to cite its sources.

      4.2. Concrete Applications for Homework

      | Type of use | Description | Example |
      | --- | --- | --- |
      | Help with research and writing | The AI can help overcome the fear of the blank page by suggesting outlines and ideas, or by acting as an "interlocutor" for exploring a topic. | Conducting an "interview" of ChatGPT about a historical figure (e.g. Joachim du Bellay) to gather information in a playful way. |
      | Explanation and remediation | The AI can rephrase a lesson or a complex explanation in different ways (bullet list, mind map, simplified text) to suit the child's way of learning, particularly for children with specific needs (e.g. dyslexia). | Relevant prompt: "I am a student in seconde (first year of French high school). Explain to me step by step how to solve this equation, with an example." |
      | Help with revision and memorization | The AI can quickly generate personalized revision tools such as quizzes, multiple-choice questions or flash cards from a lesson. | Giving the AI a history lesson and asking it: "Generate 10 questions to check whether I have understood this lesson." |

      --------------------------------------------------------------------------------

      5. Legal Framework and Regulation

      Minimum age: Most generative AIs are, under their terms of use, off-limits to children under 13 (a limit based on US law on data collection). The French Ministry of National Education (Éducation Nationale) has adopted this limit for use in schools.

      Digital majority in France: French law (confirmed by the Marcangeli law of 2023) sets the age of digital majority at 15. Below that age, parental consent is in theory required for the use of personal data on social networks.

      Digital Services Act (DSA): This European regulation aims to impose a stricter framework on large digital platforms, notably regarding the protection of minors, the transparency of algorithms, and the obligation to signal clearly when a user is interacting with an AI.

      Age verification: France is among the countries experimenting with robust age-verification tools, with the aim of making them binding on platforms, as has been done for pornographic sites.

      6. Resources and Tools Mentioned

      Internet Sans Crainte: National digital education program offering more than 200 free resources for young people, parents and educators.

      3018: National helpline and app for victims of online violence and cyberbullying.

      Compare IA: Tool offered by the French Ministry of Culture that compares the answers of two different AIs to the same question, an excellent exercise for developing critical thinking.

      WhichFaceIsReal.com: Site for practicing telling a real face from an AI-generated one.

      Parcours PIX: Digital skills and certifications assessed in middle school and high school, which now include modules on AI.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigated spatial representations in deep feedforward neural network models (DNNs) of the kind often used to solve vision tasks. The authors create a three-dimensional virtual environment and let a simulated agent forage randomly in a smaller two-dimensional square area. The agent "sees" images of the room within its field of view from different locations and heading directions. These images are processed by DNNs. Analyzing model neurons in the DNNs, the authors found response properties similar to those of place cells, border cells and head direction cells in various layers of the deep nets. A linear readout of network activity can decode key spatial variables. In addition, after removing neurons with strong place/border/head direction selectivity, one can still decode these spatial variables from the remaining neurons in the DNNs. Based on these results, the authors argue that the notion of functional cell types in spatial cognition is misleading.
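
      The lesion analysis summarized above can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the 1-D position variable, the toy `activity` matrix, the correlation-based selectivity score and the helper `decoding_error` are hypothetical and do not reproduce the authors' pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic population (assumption): 150 units, each with a random linear
# gain on a 1-D position x in [0, 1], plus unit-variance Gaussian noise.
n_samples, n_units, n_train = 1200, 150, 900
x = rng.uniform(0.0, 1.0, size=n_samples)
gain = rng.exponential(1.0, size=n_units)
activity = np.outer(x, gain) + rng.normal(0.0, 1.0, size=(n_samples, n_units))


def decoding_error(act, pos, n_train):
    """Fit a least-squares linear readout on the first n_train samples and
    return the mean absolute error on the held-out samples."""
    design = np.hstack([act, np.ones((act.shape[0], 1))])  # add bias column
    w, *_ = np.linalg.lstsq(design[:n_train], pos[:n_train], rcond=None)
    return float(np.abs(design[n_train:] @ w - pos[n_train:]).mean())


# Selectivity proxy (assumption): absolute correlation of each unit with x,
# estimated on the training samples only.
selectivity = np.abs(
    [np.corrcoef(activity[:n_train, i], x[:n_train])[0, 1] for i in range(n_units)]
)

# "Lesion" the 30 most selective units and decode from the remainder.
keep = np.argsort(selectivity)[: n_units - 30]

err_full = decoding_error(activity, x, n_train)
err_lesioned = decoding_error(activity[:, keep], x, n_train)

# Chance level: always predict the mean training position.
err_chance = float(np.abs(x[n_train:] - x[:n_train].mean()).mean())

print(f"full: {err_full:.3f}  lesioned: {err_lesioned:.3f}  chance: {err_chance:.3f}")
```

      In this toy setting the lesioned population still decodes position better than chance, because many weakly tuned units jointly carry the signal. Whether the same holds for a given network or brain area is an empirical question that the sketch does not settle.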

      Comments on the revision:

      In the revision, the authors proposed that their model should be interpreted as a null model rather than as an actual model of the spatial navigation system in the brain. They also argued that the criterion used in the place cell literature was arbitrary. However, the strength of the present work still depends on how well the null model can explain the experimental findings. At present, the null model appears to fail to explain important aspects of the response properties of different functional cell types in the hippocampus.

      Strengths:

      This paper contains interesting and original ideas, and I enjoy reading it. Most previous studies (e.g., Banino, Nature, 2018; Cueva & Wei, ICLR, 2018; Whittington et al, Cell, 2020) using deep network models to investigate spatial cognition mainly relied on velocity/head rotation inputs, rather than vision (but see Franzius, Sprekeler, Wiskott, PLoS Computational Biology, 2007). Here, the authors find that, under certain settings, visual inputs alone may contain enough information about the agent's location, head direction and distance to the boundary, and such information can be extracted by DNNs. This is an interesting observation from these models.

      Weaknesses:

      While the findings reported here are interesting, it is unclear whether they are the consequence of the specific model setting and how well they would generalize. Furthermore, I feel the results are over-interpreted. There are major gaps between the results actually shown and the claim about the "superfluousness of cell types in spatial cognition". Evidence directly supporting the overall conclusion seems to be weak at the moment.

      Comments on the revision:

      The authors showed that the results were generally robust across different types of deep network architectures. This partially addressed my concern. It remains unclear whether the findings would generalize across different types of environment. On this point, the authors argued that the way they constructed the environment was consistent with the typical experimental setting used to study the spatial navigation system in rodents. After the revision, it remains unclear what the implications of the work are for the spatial navigation system in the brain, given that the null model neurons failed to reproduce certain key properties of place cells (although I agree with the authors that examining such null models is useful and would encourage one to rethink the approach used to study these neural systems).

      Major concerns:

      (1) The authors reported that, in their model setting, most neurons throughout the different layers of CNNs show strong spatial selectivity. This is interesting and perhaps also surprising. It would be useful to test/assess this prediction directly based on existing experimental results. It is possible that the particular 2-d virtual environment used is special. The results will be strengthened if similar results hold for other testing environments.

      In particular, examining the pictures shown in Fig. 1A, it seems that local walls of the 'box' contain strong oriented features that are distinct across different views. Perhaps the response of oriented visual filters can leverage these features to uniquely determine the spatial variable. This is concerning because this is a very specific setting that is unlikely to generalize.

      [Updated after revision]: This concern is partially addressed in the revision. The authors argued that the way they constructed the environment is consistent with the typical experimental setting used to study the spatial navigation system in rodents.

      (2) Previous experimental results suggest that various functional cell types discovered in rodent navigation circuits persist in dark environments. If we take the modeling framework presented in this paper literally, the prediction would be that place cells/head direction cells should go away in darkness. This implies that key aspects of functional cell types in spatial cognition are missing in the current modeling framework. This limitation needs to be addressed or explicitly discussed.

      [Updated after revision]: The authors proposed that their model should be treated as a null model, instead of a candidate model for the brain's spatial navigation system. This clarification helps to better position this work. I would like to thank the authors for making this point explicit. However, this doesn't fully address the issues raised. The significance of the reported results still depends on how well the null model can explain the experimental findings. If the null model fails to explain important aspects of the firing properties of functional cell types, that would speak in favor of the usefulness of the concept of functional cell types.

      (3) Place cells, border cells and head direction cells are mostly studied in the rodent brain. For rodents, it is not clear whether standard DNNs would be good models of their visual systems. It is likely that the rodent visual system is not as powerful in processing visual inputs as the DNNs used in this study.

      [Updated after revision]: The authors didn't specifically address this. But clarifying their work as a null model partially addresses this concern.

      (4) The overall claim that the functional cell types defined in spatial cognition are superfluous seems to be too strong based on the results reported here. The paper only studied a particular class of models, and arguably there is a major gap between the properties of these models and those of real brains. Even granting that, in the DNN models simulated in this particular virtual environment, (i) most model neurons have strong spatial selectivity and (ii) substantial spatial information is retained after removing the model neurons with the strongest spatial selectivity, why is this relevant to the brain? The neural circuits may operate in a very different regime. Perhaps a more reasonable interpretation of the results would be: these results raise the possibility that those strongly selective neurons observed in the brain may not be essential for encoding certain features, as something like this is observed in certain models. It is difficult to draw definitive conclusions about the brain based on the results reported.

      [Updated after revision]: The authors clarified that their model should be interpreted as a null model. This partially addresses the concern raised here. However, some concerns remain: it is still unclear what new insights the current work offers for understanding spatial navigation systems. It seems that this work is more concerned with the approach to studying these neural systems. Perhaps this point could be made even clearer.

    2. Reviewer #3 (Public review):

      Summary:

      In this paper, the authors demonstrate the inevitability of the emergence of spatial information in sufficiently complex systems, even those that are only trained on object recognition (i.e. not a "spatial" system). As such, they present an important null hypothesis that should be taken into consideration for experimental design and data analysis of spatial tuning and its relevance for behavior.

      Strengths:

      The paper's strengths include the use of a large multi-layer network trained in a detailed visual environment. This illustrates an important message for the field: that spatial tuning can be a result of sensory processing. While this is a historically recognized and often-studied fact in experimental neuroscience, it is made more concrete with the use of a complex sensory network. Indeed, the manuscript is a cautionary tale for experimentalists and computational researchers alike against blindly applying and interpreting metrics without adequate controls. The addition of the deep network, i.e. the argument that sufficient processing increases the likelihood of such a confound, is a novel and important contribution.

      Weaknesses:

      However, the work has a number of significant weaknesses. Most notably: the spatial tuning that emerges is precisely what we would expect from visually-tuned neurons, and the authors do not engage with literature that controls for these confounds or compare the quality or degree of spatial tuning with neural data; the ability to linearly decode position from a large number of units is not a strong test of spatial cognition; and the authors make strong but unjustified claims as to the implications of their results in opposition to, as opposed to contributing to, work being done in the field.

      The first weakness is that the degree and quality of spatial tuning that emerges in the network is not analyzed to the standards of evidence that have been used in well-controlled studies of spatial tuning in the brain. Specifically, the authors identify place cells, head direction cells, and border cells in their network, and their conjunctive combinations. However, these forms of tuning are the most easily confounded by visual responses, and it's unclear if their results will extend to observed forms of spatial tuning that are not.

      For example, consider the head direction cells in Figure 3C. In addition to increased activity in some directions, these cells also have a high degree of spatial nonuniformity, suggesting they are responding to specific visual features of the environment. In contrast, the majority of HD cells in the brain are only very weakly spatially selective, if at all, once an animal's spatial occupancy is accounted for (Taube et al 1990, JNeurosci). While the preferred orientations of these cells are anchored to prominent visual cues, when they rotate with changing visual cues the entire head direction system rotates together (cells' relative orientation relationships are maintained, including those that encode directions facing AWAY from the moved cue), and thus these responses cannot simply be those of independent sensory-tuned cells responding to the sensory change (Taube et al 1990 JNeurosci, Zugaro et al 2003 JNeurosci, Ajbi et al 2023).

      As another example, the joint selectivity of detected border cells with head direction in Figure 3D suggests that they are "view of a wall from a specific angle" cells. In contrast, experimental work on border cells in the brain has demonstrated that these are robust to changes in the sensory input from the wall (e.g. van Wijngaarden et al 2020), or that many of them are not directionally selective (Solstad et al 2008).

      The most convincing evidence of "spurious" spatial tuning would be the emergence of HD-independent place cells in the network. However, these cells are a very small minority (in contrast to hippocampal data, Thompson and Best 1984 JNeurosci, Rich et al 2014 Science), and the examples provided in Figure 3 are significantly more weakly tuned than those observed in the brain.

      Indeed, the vast majority of tuned cells in the network are conjunctively selective for HD (Figure 3A). While this conjunctive tuning has been reported, many units in the hippocampus/entorhinal system are not strongly HD selective (Muller et al 1994 JNeurosci, Sangoli et al 2006 Science, Carpenter et al 2023 bioRxiv). Further, many studies have been done to test and understand the nature of sensory influence (e.g. Acharya et al 2016 Cell), and these cells tend to have a complex relationship with a variety of sensory cues, which cannot readily be explained by straightforward sensory processing (rev: Poucet et al 2000 Rev Neurosci, Plitt and Giocomo 2021 Nat Neuro). E.g. while some place cells are sometimes reported to be directionally selective, this directional selectivity is dependent on behavioral context (Markus et al 1995, JNeurosci), and emerges over time with familiarity to the environment (Navratiloua et al 2012 Front. Neural Circuits). Thus, the question is not whether spatially tuned cells are influenced by sensory information, but whether feed-forward sensory processing alone is sufficient to account for their observed tuning properties and responses to sensory manipulations.

      These issues indicate a more significant underlying issue of scientific methodology relating to the interpretation of their result and its impact on neuroscientific research. Specifically, in order to make strong claims about experimental data, it is not enough to show that a control (i.e. a null hypothesis) exists, one needs to demonstrate that experimental observations are quantitatively no better than that control.

      Where the authors state that "In summary, complex networks that are not spatial systems, coupled with environmental input, appear sufficient to decode spatial information." what they have really shown is that it is possible to decode some degree of spatial information. This is a null hypothesis (that observations of spatial tuning do not reflect a "spatial system"), and the comparison must be made to experimental data to test if the so-called "spatial" networks in the brain have more cells with more reliable spatial info than a complex-visual control.
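
      One way such a quantitative comparison could be set up is sketched below. It is a hedged illustration, not an analysis from the manuscript or the reviews: it uses the standard spatial information measure (bits per spike) and a simple permutation test, and the helper names `spatial_information` and `permutation_p_value`, as well as the idea of comparing "data" units with "control" units by their mean score, are assumptions.

```python
import numpy as np


def spatial_information(ratemap, occupancy):
    """Spatial information in bits per spike:
    sum_i p_i * (r_i / r_mean) * log2(r_i / r_mean),
    where p_i is the occupancy probability of bin i and r_i its mean rate."""
    p = occupancy / occupancy.sum()
    mean_rate = (p * ratemap).sum()
    if mean_rate == 0:
        return 0.0
    ratio = ratemap / mean_rate
    valid = ratio > 0  # bins with zero rate contribute nothing
    return float((p[valid] * ratio[valid] * np.log2(ratio[valid])).sum())


def permutation_p_value(data_scores, control_scores, n_perm=2000, seed=0):
    """One-sided permutation test: is the mean score of the data units
    larger than that of the control units by more than label shuffling
    would produce?"""
    rng = np.random.default_rng(seed)
    observed = data_scores.mean() - control_scores.mean()
    pooled = np.concatenate([data_scores, control_scores])
    n_data = data_scores.size
    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        diff = shuffled[:n_data].mean() - shuffled[n_data:].mean()
        exceed += diff >= observed
    return (exceed + 1) / (n_perm + 1)


# Sanity checks on toy ratemaps with uniform occupancy over 16 bins.
occupancy = np.ones(16)
flat_map = np.ones(16)      # fires equally everywhere
single_bin_map = np.zeros(16)
single_bin_map[3] = 1.0     # fires in exactly one bin

info_flat = spatial_information(flat_map, occupancy)
info_single = spatial_information(single_bin_map, occupancy)

# Toy score distributions (synthetic, for illustration only).
p_value = permutation_p_value(np.linspace(1.0, 2.0, 50), np.linspace(0.0, 1.0, 50))
```

      With real recordings, `data_scores` would hold per-cell scores from the brain area of interest and `control_scores` the scores of units from the visual control network, both computed on matched trajectories and binning.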

      Further, the authors state that "Consistent with our view, we found no clear relationship between cell type distribution and spatial information in each layer. This raises the possibility that "spatial cells" do not play a pivotal role in spatial tasks as is broadly assumed." Indeed, this would raise such a possibility, if 1) the observations of their network were indeed quantitatively similar to the brain, and 2) the presence of these cells in the brain were the only evidence for their role in spatial tasks. However, 1) the authors have not shown this result in neural data, they've only noticed it in a network and mentioned the POSSIBILITY of a similar thing in the brain, and 2) the "assumption" of the role of spatially tuned cells in spatial tasks is not just from the observation of a few spatially tuned cells, but from many other experiments including causal manipulations (e.g. Robinson et al 2020 Cell, DeLauilleon et al 2015 Nat Neuro), which the authors conveniently ignore. Thus, I do not find their argument, as strongly stated as it is, to be well-supported.

      An additional weakness is that linear decoding of position is not a measure of spatial cognition. The ability to decode position from a large number of weakly tuned cells is not surprising. However, based on this ability to decode, the authors claim that "'spatial' cells do not play a privileged role in spatial cognition". To justify this claim, the authors would need to use the network to perform e.g. spatial navigation tasks, then investigate the networks' ability to perform these tasks when tuned cells were lesioned.

      Finally, I find a major weakness of the paper to be the framing of the results in opposition to, as opposed to contributing to, the study of spatially tuned cells. For example, the authors state that "If a perception system devoid of a spatial component demonstrates classically spatially-tuned unit representations, such as place, head-direction, and border cells, can "spatial cells" truly be regarded as 'spatial'?" Setting aside the issue of whether the perception system in question does indeed demonstrate spatially-tuned unit representations comparable to those in the brain, I ask "Why not?" This seems to be a semantic game of reading more into a name than is necessarily there. The names (place cells, grid cells, border cells, etc) describe an observation (that cells are observed to fire in certain areas of an animal's environment). They need not be a mechanistic claim (that space "causes" these cells to fire) or even, necessarily, a normative one (these cells are "for" spatial computation). This is evidenced by the fact that even within e.g. the place cell community, there is debate as to these cells' mechanisms and function (e.g. memory, navigation, etc), or whether they can even be said to serve only a single function. However, they are still referred to as place cells, not as a statement of their function but as a history-dependent label that refers to their observed correlates with experimental variables. Thus, the observation that spatially tuned cells are "inevitable derivatives of any complex system" is itself an interesting finding which contributes to, rather than contradicts, the study of these cells. It seems that the authors have a specific definition in mind when they say that a cell is "truly" "spatial" or that a biological or artificial neural network is a "spatial system", but this definition is not stated, and it is not clear that the terminology used in the field presupposes their definition.

      In sum, the authors have demonstrated the existence of a control/null hypothesis for observations of spatially-tuned cells. However, 1) It is not enough to show that a control (null hypothesis) exists, one needs to test if experimental observations are no better than control, in order to make strong claims about experimental data, 2) the authors do not acknowledge the work that has been done in many cases specifically to control for this null hypothesis in experimental work or to test the sensory influences on these cells, and 3) the authors do not rigorously test the degree or source of spatial tuning of their units.

      Comments on revisions:

      While I'm happy to admit that standards of spatial tuning are not unified or consistent across the field, I do not believe the authors have addressed my primary concern: they have pointed out a null model, and then have constructed a strong opinion around that null model without actually testing if it's sufficient to account for neural data. I've slightly modified my review to that effect.

      I do think it would be good for the authors to state in the manuscript what they mean when they say that a cell is "truly" "spatial" or that a biological or artificial neural network is a "spatial system". This is implied throughout, but I was unable to find what would distinguish a "truly" spatial system from a "superfluous" one.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      but see Franzius, Sprekeler, Wiskott, PLoS Computational Biology, 2007

      We have discussed the differences with this work in the response to Editor recommendations above.

      While the findings reported here are interesting, it is unclear whether they are the consequence of the specific model setting, and how well they would generalize.

      We have considered deep vision models across different architectures in our paper, which include traditional feedforward convolutional neural networks (VGG-16), convolutional neural networks with skip connections (ResNet-50) and the Vision Transformer (ViT), which employs self-attention instead of convolution as its core information processing unit.

      In particular, examining the pictures shown in Fig. 1A, it seems that local walls of the ’box’ contain strong oriented features that are distinct across different views. Perhaps the response of oriented visual filters can leverage these features to uniquely determine the spatial variable. This is concerning because this is a very specific setting that is unlikely to generalize.

      The experimental setup is based on experimental studies of spatial cognition in rodents, in which animals typically forage in square or circular environments. Indeed, square environments will have more borders and corners that provide information about the spatial environment, which is true in both empirical studies and our simulations. In any navigation task, and especially in more realistic environments, visual information such as borders or landmarks likely plays a major role in the spatial information available to the agent. In fact, studies that do not consider sensory information to contribute to spatial information are likely missing a major part of how animals navigate.

      The prediction would be that place cells/head direction cells should go away in darkness. This implies that key aspects of functional cell types in the spatial cognition are missing in the current modeling framework.

      We addressed this comment in our response to the editor’s highlight. To briefly recap, we do not intend to propose a comprehensive model of the brain that captures all spatial phenomena, as we would not expect this from an object recognition network. Instead, we show that such a simple and nonspatial model can reproduce key signatures of spatial cells, raising important questions about how we interpret spatial cell types that dominate current research.

      Reviewer #2 (Public Review):

      The network used in the paper is still guided by a spatial error signal [...] one could say that the authors are in some way hacking this architecture and turning it into a spatial navigation one through learning.

      To be clear, the base networks we use do not undergo spatial error training. They have either been pre-trained on image classification tasks or are untrained. We used a standard neuroscience approach: training linear decoders on representations to assess the spatial information present in the network layers. The higher decoding errors in early layer representations (Fig. 2A) indicate that spatial information differs across layers—an effect that cannot be attributed to the linear decoder alone.

      My question is whether the paper is fighting an already won battle.

      Intuitive cell type discoveries are still being celebrated. Concentrating on this kind of cell type discovery has broader implications that could be deleterious to the future of science. One point to note is that this issue depends on the area or subfield of neuroscience. In some subfields, papers that claim to find cell types with a strong claim of specific functions are relatively rare, and population coding is common (e.g., cognitive control in primate prefrontal cortex, neural dynamics of motor control). Although rodent neuroscience as a field is increasingly adopting population approaches, influential researchers and labs are still publishing “cell types” in top journals (here are a few from 2017-2024: Goal cells (Sarel et al., 2017), Object-vector cells (Høydal et al., 2019), 3D place cells (Grieves et al., 2020), Lap cells (Sun et al., 2020), Goal-vector cells (Ormond and O’Keefe, 2022), Predictive grid cells (Ouchi and Fujisawa, 2024)).

      In some cases, identification of cell types is only considered a part of the story, and there are analyses of behavior, neural populations, and inactivation-based studies. However, our view (which we suggest is shared amongst most researchers) is that a major reason these papers are reviewed and accepted in top journals is that they have a simple, intuitive “cell type” discovery headline, even if it is not the key finding or analysis that supports the insightful aspects of the work. This is unnecessary and misleading to students of neuroscience, related fields, and the public; it affects private and public funding priorities and in turn the future of science. Worse, it could lead the field down the wrong path, or at the least divert attention and resources away from methods and papers that could be providing deeper insights. Consistent with the central message of our work, we believe the field should prioritize theoretical and functional insights over the discovery of new “cell types”.

      Reviewer #3 (Public Review):

      The ability to linearly decode position from a large number of units is not a strong test of spatial information, nor is it a measure of spatial cognition

      Using a linear decoder to test what information is contained in a population of neurons available for downstream areas is a common technique in neuroscience (Tong and Pratte, 2012; DiCarlo et al., 2012) including spatial cells (e.g., Diehl et al. 2017; Horrocks et al. 2024). A linear decoder is used because it is a direct mapping from neurons to potential output behavior. In other words, it only needs to learn some mapping to link one set of neurons to another set which can “read out” the information. As such, it is a measure of the information contained in the population, and it is a lower bound of the information contained - as both biological and artificial neurons can do more complex nonlinear operations (as the activation function is nonlinear).
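
      The linear read-out described above can be sketched in a few lines. This is a minimal illustration on synthetic data under stated assumptions, not the decoder used in the paper: the toy `activity` matrix, the least-squares fit and the helper names `fit_linear_decoder` and `decode` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic population (assumption): 200 units with random linear tuning to
# a 2-D position (x, y) in the unit square, plus Gaussian noise.
n_samples, n_units, n_train = 1000, 200, 800
positions = rng.uniform(0.0, 1.0, size=(n_samples, 2))
tuning = rng.normal(0.0, 1.0, size=(2, n_units))
activity = positions @ tuning + rng.normal(0.0, 1.0, size=(n_samples, n_units))


def fit_linear_decoder(act, pos):
    """Least-squares map from population activity (plus bias) to position."""
    design = np.hstack([act, np.ones((act.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, pos, rcond=None)
    return weights


def decode(act, weights):
    """Apply a fitted linear decoder to new activity."""
    design = np.hstack([act, np.ones((act.shape[0], 1))])
    return design @ weights


# Fit on the first 800 samples, evaluate on the held-out 200.
W = fit_linear_decoder(activity[:n_train], positions[:n_train])
predicted = decode(activity[n_train:], W)

# Mean Euclidean distance between decoded and true position.
decoding_error = float(np.linalg.norm(predicted - positions[n_train:], axis=1).mean())

# Baseline: always predict the mean training position.
baseline_error = float(
    np.linalg.norm(positions[:n_train].mean(axis=0) - positions[n_train:], axis=1).mean()
)

print(f"decoder: {decoding_error:.3f}  baseline: {baseline_error:.3f}")
```

      Because the read-out is linear, a low held-out error indicates information that a single downstream layer could access directly; it says nothing about how the population is actually used, which is the point under dispute in this exchange.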

      We recognize that the reviewer may already be familiar with this concept, but we explain it here to justify our position and for the completeness of this public review.

      For example, consider the head direction cells in Figure 3C. In addition to increased activity in some directions, these cells also have a high degree of spatial nonuniformity, suggesting they are responding to specific visual features of the environment. In contrast, the majority of HD cells in the brain are only very weakly spatially selective, if at all, once an animal’s spatial occupancy is accounted for (Taube et al 1990, JNeurosci). While the preferred orientation of these cells are anchored to prominent visual cues, when they rotate with changing visual cues the entire head direction system rotates together (cells’ relative orientation relationships are maintained, including those that encode directions facing AWAY from the moved cue), and thus these responses cannot be simply independent sensory-tuned cells responding to the sensory change) (Taube et al 1990 JNeurosci, Zugaro et al 2003 JNeurosci, Ajbi et al 2023).

      As we have noted in our response to the editor, one of the main issues is that the criteria researchers use to assess what they are interested in are created in a subjective, biased and circular fashion (seeing spatial-like responses, developing criteria to determine a spatial response, selecting a threshold).

      All the examples the reviewer provides concentrate on strict criteria developed after finding such cells. What is the purpose of these cells for function, for behavior? Just finding a cell that looks like it is tuned to something does not explain its function. Neuroscience began with tuning curves in part due to methodological constraints, which was a promising start, but we propose that this is not the way forward.

      The metrics used by the authors to quantify place cell tuning are not clearly defined in the methods, but do not seem to be as stringent as those commonly used in real data. (e.g. spatial information, Skaggs et al 1992 NeurIPS).

      We identified place cells following the definition from Tanni et al. (2022), by one of the leading labs in the field. Since neurons in DNNs lack spikes, we adapted their criteria by focusing on the number of spatial bins in the ratemap rather than spike-based measures. However, our central argument is that the very act of defining spatial cells is problematic. Researchers set out to find place cells to study spatial representations, find spatially selective cells with subjective, qualitative criteria (sometimes combined with prior quantitative criteria, also subjectively defined), then try to fine-tune the criteria to more “stringent” criteria, depending on the experimental data at hand. It is not uncommon to see methodological sections that use qualitative judgments, such as: “To avoid bias ... we applied a loose criteria for place cells” (Tanaka et al., 2018), which reflects the lack of clarity and the subjectivity of place cell selection criteria.

      A simple literature survey reveals inconsistent criteria across studies. For place field selection, Dombeck et al. (2010) required mean firing rates exceeding 25% of peak rate, while Tanaka et al. (2018) used a 20% threshold. Speed thresholds also vary dramatically: Dombeck et al. (2010) calculated firing rates only when mice moved faster than 8.3 cm/s, whereas Tanaka et al. (2018) used 2 cm/s. Additional criteria differ further: Tanaka et al. (2018) required firing rates between 1-10 Hz and excluded cells with place fields larger than 1/3 of the area, while Dombeck et al. (2010) selected fields above 1.5 Hz, and Tanni et al. (2022) required place fields of between 10 spatial bins and 1/2 of the area. As Dombeck et al. (2010) noted, differences in recording methods and place field definitions lead to varying numbers of identified place cells. Moreover, Grijseels et al. (2021) demonstrated that different detection methods produce vastly different place cell counts with minimal overlap between identified populations.
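
      To make the consequence of such differences concrete, the toy sketch below applies two hypothetical criteria sets to the same synthetic ratemap. The thresholds are borrowed from the values quoted above (25% versus 20% of peak, and upper field-size limits of 1/2 versus 1/3 of the area), but the way they are combined, the helper names and the Gaussian ratemap are illustrative assumptions and do not reproduce any of the cited pipelines.

```python
import numpy as np

def field_size(ratemap, frac_of_peak):
    """Count bins whose activity is at least frac_of_peak times the peak activity."""
    return int((ratemap >= frac_of_peak * ratemap.max()).sum())

def is_place_cell(ratemap, frac_of_peak, min_bins, max_area_frac):
    """Toy classifier: the field must span at least min_bins and at most
    max_area_frac of the arena. All parameters are free choices."""
    n_bins = field_size(ratemap, frac_of_peak)
    return min_bins <= n_bins <= max_area_frac * ratemap.size

def gaussian_ratemap(size, centre, sigma):
    """Synthetic single-field ratemap on a size x size grid (hypothetical data)."""
    ys, xs = np.mgrid[0:size, 0:size]
    d2 = (xs - centre[0]) ** 2 + (ys - centre[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

# One unit, two criteria sets (values illustrative, see lead-in).
unit = gaussian_ratemap(size=20, centre=(9.5, 9.5), sigma=4.2)
criteria_a = dict(frac_of_peak=0.25, min_bins=10, max_area_frac=1 / 2)
criteria_b = dict(frac_of_peak=0.20, min_bins=10, max_area_frac=1 / 3)
labels = {name: is_place_cell(unit, **c)
          for name, c in [("a", criteria_a), ("b", criteria_b)]}
```

      In this example the same unit is labelled a place cell under one parameter set and rejected under the other, purely because of the field-size limit.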

      This reflects a deeper issue. Unlike structurally and genetically defined cell types (e.g., pyramidal neurons, interneurons, dopaminergic neurons, cFos-expressing neurons), spatial cells lack such clarity in terms of structural or functional specialization, and it is unclear whether such “cell types” should be considered cell types in the same way. While scientific progress requires standardized definitions, the question remains whether defining spatial cells through myriad different criteria advances our understanding of spatial cognition. Are researchers finding the same cells? Could they be targeting different populations? Are they missing cells crucial for spatial cognition that they exclude due to the criteria used? We think this is likely. The inconsistency matters because different criteria may capture genuinely different neural populations or computational processes.

      Variability in definitions and criteria is an issue in any field. However, as we have stated, the deeper issue is whether we should be defining and selecting these cells at all before commencing analysis. By defining and restricting to spatial “cell types”, we risk comparing fundamentally different phenomena across studies, and worse, missing the fundamental unit of spatial cognition (e.g., the population).

      We have added a paragraph in Discussion (lines 357-366) noting the inconsistency in place cell selection criteria in the literature and the consequences of using varying criteria.

      We have also added a sentence (lines 354-356) raising the comparison of functionally defined spatial cell types with structurally and genetically defined cell types in the Discussion.

      Thus, the question is not whether spatially tuned cells are influenced by sensory information, but whether feed-forward sensory processing alone is sufficient to account for their observed tuning properties and responses to sensory manipulations.

      These issues indicate a more significant underlying issue of scientific methodology relating to the interpretation of their result and its impact on neuroscientific research. Specifically, in order to make strong claims about experimental data, it is not enough to show that a control (i.e. a null hypothesis) exists, one needs to demonstrate that experimental observations are quantitatively no better than that control.

      Where the authors state that “In summary, complex networks that are not spatial systems, coupled with environmental input, appear sufficient to decode spatial information.” what they have really shown is that it is possible to decode *some degree* of spatial information. This is a null hypothesis (that observations of spatial tuning do not reflect a “spatial system”), and the comparison must be made to experimental data to test if the so-called “spatial” networks in the brain have more cells with more reliable spatial info than a complex-visual control.

      We agree that good null hypotheses with quantitative comparisons are important. However, it is not clear that researchers in the field have been using a null hypothesis; rather, they assume that these cell types exist and function in the way they expect. We provide one null hypothesis. The field can and should develop more and stronger null hypotheses.

      In our work, we mainly focus on the criteria for finding spatial cells, and argue that simply applying them is misleading. Researchers develop criteria and find such cells, but often do not go further to assess whether they are real cell “types”. Excluding the remaining cells can be especially misleading if those cells also play a role in the function of interest.

      But from many other experiments including causal manipulations (e.g. Robinson et al 2020 Cell, DeLauilleon et al 2015 Nat Neuro), which the authors conveniently ignore. Thus, I do not find their argument, as strongly stated as it is, to be well-supported.

      We acknowledge that there are several studies that have performed inactivation studies that suggest a strong role for place cells in spatial behavior. Most studies do not conduct comprehensive analyses to confirm that their place cells are in fact crucial for the behavior at hand.

      One question is how the criteria were determined. Did the researchers choose their criteria based on what “worked”, so that they did not exclude cells relevant to the behavior? Had their criteria been different, the argument could have been that non-place cells also contribute to behavior.

      Another question is whether these cells are the same kinds of cells across studies and animals, given the varied criteria across studies. As most studies do not follow the same procedures, it is unclear whether we can generalize these results across cells and indeed, across tasks and spatial environments.

      Finally, does the fact that the place cells – the strongly selective cells with a place field – have a strong role in navigation provide any insight into the mechanism? Identifying cells by itself does not contribute to our understanding of how they work. Consistent with our main message, we argue that performing analyses and building computational models that uncover how the function of interest works is more valuable than simply naming cells.

      Finally, I find a major weakness of the paper to be the framing of the results in opposition to, as opposed to contributing to, the study of spatially tuned cells. For example, the authors state that “If a perception system devoid of a spatial component demonstrates classically spatially-tuned unit representations, such as place, head-direction, and border cells, can “spatial cells” truly be regarded as ‘spatial’?” Setting aside the issue of whether the perception system in question does indeed demonstrate spatially-tuned unit representations comparable to those in the brain, I ask “Why not?” This seems to be a semantic game of reading more into a name than is necessarily there. The names (place cells, grid cells, border cells, etc) describe an observation (that cells are observed to fire in certain areas of an animal’s environment). They need not be a mechanistic claim... This is evidenced by the fact that even within e.g. the place cell community, there is debate about these cells’ mechanisms and function (eg memory, navigation, etc), or if they can even be said to serve only a single function. However, they are still referred to as place cells, not as a statement of their function but as a history-dependent label that refers to their observed correlates with experimental variables. Thus, the observation that spatially tuned cells are “inevitable derivatives of any complex system” is itself an interesting finding which *contributes to*, rather than contradicts, the study of these cells. It seems that the authors have a specific definition in mind when they say that a cell is “truly” “spatial” or that a biological or artificial neural network is a “spatial system”, but this definition is not stated, and it is not clear that the terminology used in the field presupposes their definition.

      We have to agree to disagree with the reviewer on this point. Although researchers may reflect on their work and discuss what the mechanistic role of these cells is, cell type discovery is widely perceived as important by journals and funders due to its intuitive appeal and easy-to-understand impact, even if there is no finding of interest to be reported. As noted in the comment above, papers claiming cell type discovery continue to be published in top journals and continue to be funded.

      Our argument is that “cell type” discovery research perhaps should not be celebrated in the way it is, and that such cells should not be presented as discoveries when they are not genuine cell types in the way structural or genetic cell types are. Using this term makes them appear to be something they are not, which is misleading. They may be important cells, but a name like “place” cell also suggests that other cells are not encoding space, which is very unlikely to be true.

      In sum, our view is that finding and naming cells through a flawed theoretical lens, when those cells may not actually function as their names suggest, can lead us down the wrong path and be detrimental to science.

      Reviewer #1 (Recommendations For The Authors):

      The novelty of the current study relative to the work by Franzius, Sprekeler, Wiskott (PLoS Computational Biology, 2007) needs to be carefully addressed. That study also modeled the spatial correlates based on visual inputs.

      Our work differs from Franzius et al. (2007) on both theoretical and experimental fronts. While both studies challenge the mechanisms underlying spatial cell formation, our theoretical contributions diverge. Franzius et al. (2007) assume spatial cells are inherently important for spatial cognition and propose a sensory-driven computational mechanism as an alternative to mainstream path integration frameworks for how spatial cells arise and support spatial cognition. In contrast, we challenge the notion that spatial cells are special at all. Using a model with no spatial grounding, we demonstrate that spatial cells 1) naturally emerge from complex non-linear processing and 2) are not particularly useful for spatial decoding tasks, suggesting they are not crucial for spatial cognition.

      Our approach employs null models with fixed weights—either pretrained on classification tasks or entirely random—that process visual information non-sequentially. These models serve as general-purpose information processors without spatial grounding. In contrast, Franzius et al. (2007)’s model learns directly from environmental visual information, and the emergence of spatial cells (place or head-direction cells) in their framework depends on input statistics, such as rotation and translation speeds. Notably, their model does not simultaneously generate both place and head-direction cells; the outcome varies with the relative speed of rotation versus translation. Their sensory-driven model indirectly incorporates motion information through learning, exhibiting a time-dependence influenced by slow-feature analysis.

      Conversely, our model simultaneously produces units with place and head-direction cell profiles by processing visual inputs sampled randomly across locations and angles, independent of temporal or motion-related factors. This positions our model as a more general and fundamental null hypothesis, ideal for challenging prevailing theories on spatial cells due to its complete lack of spatial or motion grounding.

      Finally, unlike Franzius et al. (2007), who do not evaluate the functional utility of their spatial representations, we test whether the emergent spatial cells are useful for spatial decoding. We find that not only do spatial cells emerge in our non-spatial model, but they also fail to significantly aid in location or head-direction decoding. This is the central contribution of our work: spatial cells can arise without spatial or sensory grounding, and their functional relevance is limited. We have updated the manuscript to clarify the novelty of the current contribution to previous work (lines 324-335).

      In Fig. 2, it may be useful to plot the error in absolute units, rather than the normalized error. The direction decoding can be quantified in terms of degrees. Also, it would be helpful to compare the accuracy of spatial localization to that of the actual place cells in rodents.

      We argue that normalizing the error by the maximal error possible under each task is more meaningful and puts the comparisons in perspective. For transparency, we now also plot the errors in the absolute physical units used by the Unity game engine in the updated Appendix (Fig. 1).
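
      As a minimal sketch of what this normalization amounts to, the snippet below divides each decoding error by the largest error possible in the task. It assumes a rectangular arena, for which the maximal location error is the arena diagonal, and a maximal angular error of 180 degrees; the function names and these two choices are illustrative assumptions rather than a description of our exact implementation.

```python
import numpy as np

def normalised_location_error(pred_xy, true_xy, arena_w, arena_h):
    """Euclidean error divided by the arena diagonal (assumed maximal error)."""
    pred = np.asarray(pred_xy, dtype=float)
    true = np.asarray(true_xy, dtype=float)
    err = np.linalg.norm(pred - true, axis=-1)
    return err / np.hypot(arena_w, arena_h)

def normalised_direction_error(pred_deg, true_deg):
    """Smallest angular difference, wrapped to [0, 180], divided by 180 degrees."""
    diff = np.asarray(pred_deg, dtype=float) - np.asarray(true_deg, dtype=float)
    wrapped = np.abs((diff + 180.0) % 360.0 - 180.0)
    return wrapped / 180.0
```

      Both functions return values in [0, 1], so errors from the location and direction tasks can be read on a common scale, while the unnormalized `err` and `wrapped` quantities correspond to the absolute units.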

      Reviewer #2 (Recommendations For The Authors):

      Regarding the involvement of ’classified cells’ in decoding, I think a useful way to present the results would be to show the relationship between ’placeness’, ’directioness’ and ’borderness’ and the strength of the decoder weights. Either as a correlation or as a full scatter plot.

      We appreciate your suggestion to visualize the relationship between units’ spatial properties and their corresponding decoder weights. We believe it would be an important addition to our existing results. Based on the exclusion analyses, we anticipated the correlation to be low, and the additional results support this expectation.

      As an example, we present unit plots below for VGG-16 (pre-trained and untrained, at its penultimate layer with a sampling rate of 0.3; Author response images 1 and 2). Additional plots for various layers and across models are included in the supplementary materials (Fig. S12-S28). Consistently across conditions, we observed no significant correlations between units’ spatial properties (e.g., placeness) and their decoding weight strengths. These results further corroborate the conclusions drawn from our exclusion analyses.
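
      The kind of comparison described here can be sketched as follows. This is a hedged illustration and not our analysis code: it assumes a linear decoder whose weight matrix has one row per unit, summarizes each unit's weight strength by the L2 norm of its row, and reports a Pearson correlation with a per-unit spatial score. The names `weight_score_correlation`, `scores` and `weights` are hypothetical.

```python
import numpy as np

def weight_score_correlation(scores, weights):
    """Pearson correlation between a per-unit spatial score (e.g. 'placeness')
    and the magnitude of that unit's decoder weights.

    scores  : shape (n_units,)
    weights : shape (n_units, n_outputs), linear decoder weights (assumed layout)
    """
    scores = np.asarray(scores, dtype=float)
    strength = np.linalg.norm(np.asarray(weights, dtype=float), axis=1)
    return float(np.corrcoef(scores, strength)[0, 1])
```

      A value near zero would indicate that units with high spatial scores are not weighted more heavily by the decoder than other units.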

      Reviewer #3 (Recommendations For The Authors):

      My main suggestions are that the authors:
      - perform manipulations to the sensory environment similar to those done in experimental work, and report if their tuned cells respond in similar ways
      - quantitatively compare the degree of spatial tuning in their networks to that seen in publicly available data
      - re-frame the discussion of their results to critically engage with and contribute to the field and its past work on sensory influences to these cells

      As we noted in our opening section, our model is not intended as a model of the brain. It is a non-spatial null model, and we present the surprising finding that even such a model contains spatial cell-like units if identified using criteria typically used in the field. This raises the question whether simply finding cells that show spatial properties is sufficient to grant the special status of “cell type” that is involved in the brain function of interest.

      Author response image 1.

      VGG-16 (pre-trained), penultimate layer units, show no apparent relationship between spatial properties and their decoder weight strengths.

      Author response image 2.

      VGG-16 (untrained), penultimate layer units, show no apparent relationship between spatial properties and their decoder weight strengths.

      Furthermore, our main simulations were designed to be compared to experimental work where rodents foraged around square environments in the lab. We did not do an extensive set of simulations as the purpose of our study is not to show that we capture exactly every single experimental finding, but rather raise the issues with the functional cell type definition and identification approach for progressing neuroscientific knowledge.

      Finally, as we note in more detail below, different labs use different criteria for identifying spatial cells, which depend both on the lab and the experimental design. Our point is that we can identify such cells using criteria set by neuroscientists, and that such cell types may not reflect any special status in spatial processing. Additional simulations that show less alignment with certain datasets will not provide support for or against our general message.

      References

      Banino A, Barry C, Uria B, Blundell C, Lillicrap T, Mirowski P, Pritzel A, Chadwick MJ, Degris T, Modayil J, Wayne G, Soyer H, Viola F, Zhang B, Goroshin R, Rabinowitz N, Pascanu R, Beattie C, Petersen S, Sadik A, Gaffney S, King H, Kavukcuoglu K, Hassabis D, Hadsell R, Kumaran D (2018) Vector-based navigation using grid-like representations in artificial agents. Nature 557(7705):429–433, DOI 10.1038/s41586-018-0102-6, URL http://www.nature.com/articles/s41586-018-0102-6

      DiCarlo JJ, Zoccolan D, Rust NC (2012) How Does the Brain Solve Visual Object Recognition? Neuron 73(3):415–434, DOI 10.1016/J.NEURON.2012.01.010, URL https://www.cell.com/neuron/fulltext/S0896-6273(12)00092-X

      Diehl GW, Hon OJ, Leutgeb S, Leutgeb JK (2017) Grid and Nongrid Cells in Medial Entorhinal Cortex Represent Spatial Location and Environmental Features with Complementary Coding Schemes. Neuron 94(1):83–92.e6, DOI 10.1016/j.neuron.2017.03.004, URL https://linkinghub.elsevier.com/retrieve/pii/S0896627317301873

      Dombeck DA, Harvey CD, Tian L, Looger LL, Tank DW (2010) Functional imaging of hippocampal place cells at cellular resolution during virtual navigation. Nature Neuroscience 13(11):1433–1440, DOI 10.1038/nn.2648, URL https://www.nature.com/articles/nn.2648

      Ebitz RB, Hayden BY (2021) The population doctrine in cognitive neuroscience. Neuron 109(19):3055–3068, DOI 10.1016/j.neuron.2021.07.011, URL https://linkinghub.elsevier.com/retrieve/pii/S0896627321005213

      Grieves RM, Jedidi-Ayoub S, Mishchanchuk K, Liu A, Renaudineau S, Jeffery KJ (2020) The place-cell representation of volumetric space in rats. Nature Communications 11(1):789, DOI 10.1038/s41467-020-14611-7, URL https://www.nature.com/articles/s41467-020-14611-7

      Grijseels DM, Shaw K, Barry C, Hall CN (2021) Choice of method of place cell classification determines the population of cells identified. PLOS Computational Biology 17(7):e1008835, DOI 10.1371/journal.pcbi.1008835, URL https://dx.plos.org/10.1371/journal.pcbi.1008835

      Horrocks EAB, Rodrigues FR, Saleem AB (2024) Flexible neural population dynamics govern the speed and stability of sensory encoding in mouse visual cortex. Nature Communications 15(1):6415, DOI 10.1038/s41467-024-50563-y, URL https://www.nature.com/articles/s41467-024-50563-y

      Høydal ØA, Skytøen ER, Andersson SO, Moser MB, Moser EI (2019) Object-vector coding in the medial entorhinal cortex. Nature 568(7752):400–404, DOI 10.1038/s41586-019-1077-7, URL https://www.nature.com/articles/s41586-019-1077-7

      Ormond J, O’Keefe J (2022) Hippocampal place cells have goal-oriented vector fields during navigation. Nature 607(7920):741–746, DOI 10.1038/s41586-022-04913-9, URL https://www.nature.com/articles/s41586-022-04913-9

      Ouchi A, Fujisawa S (2024) Predictive grid coding in the medial entorhinal cortex. Science 385(6710):776–784, DOI 10.1126/science.ado4166, URL https://www.science.org/doi/10.1126/science.ado4166

      Sarel A, Finkelstein A, Las L, Ulanovsky N (2017) Vectorial representation of spatial goals in the hippocampus of bats. Science 355(6321):176–180, DOI 10.1126/science.aak9589, URL https://www.science.org/doi/10.1126/science.aak9589

      Sun C, Yang W, Martin J, Tonegawa S (2020) Hippocampal neurons represent events as transferable units of experience. Nature Neuroscience 23(5):651–663, DOI 10.1038/s41593-020-0614-x, URL https://www.nature.com/articles/s41593-020-0614-x

      Tanaka KZ, He H, Tomar A, Niisato K, Huang AJY, McHugh TJ (2018) The hippocampal engram maps experience but not place. Science 361(6400):392–397, DOI 10.1126/science.aat5397, URL https://www.science.org/doi/10.1126/science.aat5397

      Tanni S, De Cothi W, Barry C (2022) State transitions in the statistically stable place cell population correspond to rate of perceptual change. Current Biology 32(16):3505–3514.e7, DOI 10.1016/j.cub.2022.06.046, URL https://linkinghub.elsevier.com/retrieve/pii/S0960982222010089

      Tong F, Pratte MS (2012) Decoding Patterns of Human Brain Activity. Annual Review of Psychology 63(1):483–509, DOI 10.1146/annurev-psych-120710-100412, URL https://www.annualreviews.org/doi/10.1146/annurev-psych-120710-100412

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer 1, point 1: In general, the statistical analysis is not transparent. The size of the sample, i.e. the number of observations or data points, is never specified. This information is essential for further evaluation of the statistical details.

      The size of each sample quantified, given as number of ommatidia/number of retinas, is indicated in the figure legends. This must have escaped the attention of reviewer 1, so we have added a sentence in the legend of Fig. 2 to state it more clearly. We think that the figure legends are the best place to put this information for ease of comparison to the figures.

      *Reviewer 1, point 2: To gain a better understanding of chitin deposition, it would be beneficial to have data on Kkv overexpression in cone cells versus outer pigment cells. Does it cause reb/exp-like effects on chitin deposition and corneal lens formation? Furthermore, can the authors rule out the involvement of chitin synthase 2 in chitin matrix formation and the retention of the matrix in kkv knockdowns? *

      We will generate clones of cells that over-express Kkv in either central cells (cone and primary pigment cells) or lattice cells (secondary and tertiary pigment cells), using the same drivers that we used to over-express Reb, and will examine chitin secretion at 54 h after puparium formation (APF) and in adults.

      As there are no available mutations in Chitin synthase 2 (Chs2), we will knock it down with RNAi in all retinal cells using lGMR-GAL4 and look for corneal lens defects. However, we think that Chs2 is unlikely to contribute chitin to the corneal lens, because its expression is restricted to the digestive system, and because kkv knockdown essentially eliminates chitin from the corneal lens.

      *Reviewer 1, point 3: Recent results published by the authors regarding ZP domain proteins, such as dusky-like (dyl), have not been adequately discussed in the context of chitin secretion and Kkv expression, a matter that must be addressed. It has been demonstrated that dyl mutants do not affect Kkv expression, but chitin levels are reduced. Does Dyl exhibit Kkv-like phenotypes? Furthermore, what is the expression of Dyl or Dumpy in Kkv knockdowns? Is there any interaction between the ZP domain protein matrix and the chitin matrix required for lens formation? *

      In dyl mutants, chitin deposition is delayed, but it does accumulate later in development, so the phenotype is different from kkv mutants. We have clarified this in the manuscript (p. 6). To address the other points, we will examine the expression of Dyl and of Dumpy-YFP in mid-pupal and late pupal retinas in which kkv is knocked down in all cells with lGMR-GAL4. The ZP protein matrix is originally deposited before chitin secretion begins, so we will examine whether loss of chitin affects its later maintenance.

      *Reviewer 1, point 4: What is retained in the chitin matrix if chitin is missing in kkv knockdown? Is it the ZP domain matrix (see the above question) or are the chitin matrix proteins also involved, such as Obst-A, Obst-C (Gasp), Knk and others? Obst proteins are particularly essential for the regular packaging of chitin and thus for the formation of the chitin layer, which is shown in Fig. 1. Beyond this story, it would also be interesting to see how the aforementioned chitin matrix proteins (Obst-A, Obst-C (Gasp), Knk and others) impact lens formation. *

      Adult corneal lenses derived from kkv knockdown retinas do not contain chitin, but there is remaining corneal lens material. We do not think that this is the ZP domain matrix, as this is normally lost in late pupal development, but we will check whether Dpy-YFP is retained in kkv knockdown adults. We will try to detect Obst-A and Gasp proteins using available antibodies. However, this may not be successful, as we have found that antibodies do not penetrate the corneal lens well. Our transcriptomic studies have identified numerous secreted proteins that are expressed at high levels in the mid-pupal retina and could be components of the corneal lens. We may be able to detect some of these using fluorescently tagged forms, but it is possible that the currently available tools will not be sufficient to answer this question.

      We have begun to work on how some of these proteins affect corneal lens structure, but this will take a significant amount of time and we think it would work better as a separate manuscript. We see our current manuscript as a short and focused story about the importance of the source of chitin in determining corneal lens shape.

      *Reviewer 1, minor comment 1: Figure 1 is not easily comprehensible for those who are not already familiar with the subject of eye development. Fig. 1A': please label the cone cells and pigment cells. *

      We have labeled these cells in Fig. 1A’’.

      *Reviewer 1, minor comment 2: Fig. 1H - The meaning of the abbreviations and numbers is not given in the legend. It would also be beneficial to include a meaningful cartoon illustrating the corneal lens situation before and after chitin secretion, as shown in Figure 3. *

      We have defined the abbreviations in the figure legend. Fig. 1H did show the corneal lens situation before, during and after chitin secretion, but we have added the cone and pigment cells to the 72 h APF and adult diagrams to make them more meaningful (now Fig. 1I).

      *Reviewer 1, minor comment 3: Fig. 1F: when do the authors recognize a first chitin assembly as the initial corneal lens of the eye, and what does it look like? Chitin expression is already high at 54 h APF, which means 20 hours earlier. *

      We think that the reviewer is asking when the chitin first starts to form a dome shape. We have added an orthogonal view of chitin in a 54 h APF retina viewed with LIGHTNING microscopy, showing that the external curvature is already present at this stage (new Fig. 1F).

      *Reviewer 1, minor comment 4: Page 6 / Fig 2E: cells autonomously synthesize chitin and there is no lateral diffusion. Please label which lens contains chitin and which does not. *

      Fig. 2E shows part of a retina in which kkv has been knocked down in all cells, so none of the corneal lenses contain chitin. We have clarified this in the legend to Fig. 2.

      *Reviewer 1, minor comment 5: Page 7: The authors state that reb/exp knockdown affects external and internal curvature. However, the statistics in Fig. S1 do not support this statement. *

      We were referring to the double knockdown, which Fig. 2L, M show is significant, and not to the single knockdowns quantified in Fig. S1. We have clarified this in the text.

      *Reviewer 1, minor comment 6: Fig.2 and Fig. S1: what is Chp (Chaoptin)? *

      We have stated in the legend to Fig. 2 that Chaoptin is a component of photoreceptor rhabdomeres.

      *Reviewer 1, minor comment 7: Fig. S1E,I: which part of the eye is marked by the chitin staining outside the cone and pigment cells? *

      Chitin is still present in the mechanosensory bristles in Fig. S1I, as these do not express lGMR-GAL4. We have stated this in the figure legend.

      *Reviewer 1, minor comment 8: Fig. 2 L,M: Why do the exp and reb knockdowns show different statistical results for the outer angle when compared with the lGMR driver line, although chitin is already eliminated in the exp knockdown from 54 h APF onwards? *

      The double knockdown of exp and reb has a more significant effect on the adult corneal lens outer angle than the single exp knockdown, even though the exp knockdown lacks chitin at 54 h APF. We believe that this is because Reb is sufficient for some chitin synthesis at later stages of development. This was mentioned in the text (p. 6) and we have added further clarification in the legend to Fig. S1.

      *Reviewer 1, minor comment 9: Fig 3 G-H: please clarify where the chitin reduction can be observed at the edge of the adult corneal lens and provide comparable wt stainings. Fig. S2 D. What was the normalization and the sample number? *

      We have added a high magnification image of a mosaic ommatidium with one wild-type and one kkv knockdown edge, showing the region at the edge of the corneal lens in which chitin fluorescence was quantified and the central region used for the normalization (Fig. 3I). The sample numbers are given in the legend to Fig. S2D.

      *Reviewer 1, minor comment 10: Page 6, last paragraph: I fully agree that ZP domain proteins may retain other corneal lens components. But deeper discussion is missing. It should be noted that the authors' hypothesis fits well with the proposed function of the ZP matrix in providing chitin matrix adhesion to the underlying cell surface. A loss of the ZP domain protein Piopio causes loss of the chitin matrix, as shown recently in trachea and at epidermal tendon cells (Göpfert et al., 2025; https://www.sciencedirect.com/science/article/pii/S1742706125003733). Furthermore, a recent publication identifies ZPD proteins as modular units that establish the mechanical environment essential for nanoscale morphogenesis (Itakura et al., https://www.biorxiv.org/content/10.1101/2024.08.20.608778v1.full.pdf). This should be cited and discussed accordingly.

      It could be that the outer and inner parts of the chitin differ in ultrastructure due to the expression pattern. In dragonflies, surface morphology analysis by scanning electron microscopy revealed that the outer part of corneal lenses consisted of long chitin fibrils with regular arrays of papillary structures, while the smoother inner part had concentric lamellated chitin formation with shorter chitin nanofibrils (Kaya et al., 2016; https://www.sciencedirect.com/science/article/pii/S0141813016303646?via%3Dihub#fig0020). Thus, an ultrastructural analysis would be very beneficial, or at least a detailed discussion. *

      We have added a discussion of these points and papers to the text (p. 6 and 9). Although we are not specifically addressing differences between the inner and outer parts of the corneal lens in this manuscript, we have now included a high-resolution LIGHTNING image showing how the layered structure of the corneal lens is affected when chitin production by central cells is increased (Fig. 4F).

      *Reviewer 2, point 1: Adult corneal lenses lacking chitin still form a thin structure in kkv RNAi. The authors suggest that this may be due to the presence of the ZP domain proteins Dyl, Dpy and Pio. Immunostaining for these ZP domain proteins could provide supporting evidence. *

      To clarify, we meant to say that the earlier presence of the ZP domain matrix could retain components other than chitin in the corneal lens. The ZP domain proteins are no longer present in the adult. We have made this clearer in the text. As described under reviewer 1, points 3 and 4, we will examine Dyl and Dpy-YFP expression in kkv knockdown retinas at mid-pupal and adult stages, and we will also look at the expression of another ZP domain protein, Piopio.

      *Reviewer 2, minor comment 1: At 50 h APF, Kkv (Fig. 2B, B') and Reb (Fig. S1A, A') appear to be expressed at higher levels in lattice cells than in central cells, even though chitin is mainly present in the central cells at this time (Fig. 1B-B'). Discuss possible explanation for their expression pattern and their roles at this stage. *

      We agree that this is a surprising result. We have added a discussion of possible explanations, such as the lack of another component necessary for chitin secretion in lattice cells at this stage, or the presence of high levels of chitinases (p. 7).

      *Reviewer 2, minor comment 2: Fig. 1F and G: Indicate that the cryosection images represent single ommatidia, and label "external" and "internal" to help orient readers. *

      We have made these changes to the figure panels (now G and H), and indicated in the legend that they are single ommatidia.

      *Reviewer 2, minor comment 3: Figure 2. The cartoon diagram showing the angle measurement (currently Fig S1K) should be moved to the main figure to help readers understand the quantifications. *

      We have moved this diagram to Figure 2L.

      *Reviewer 2, minor comment 4: Figure 3H. It would be helpful to clearly mark the edge of the corneal lens in the chitin intensity image. *

      As described under reviewer 1, minor comment 9, we have added a high magnification picture showing the edge region used for chitin quantification (Fig. 3I), which should also address reviewer 2’s concern.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Chitin plays a crucial role in the morphogenesis of the Drosophila corneal lens by supporting the structural integrity and biconvex shape of the lens. The Drosophila corneal lens is a biconvex structure that focuses light. Chitin, a major component, is produced mainly by the central cone and primary pigment cells. The production and arrangement of chitin by central cells directly impacts the thickness and curvature of the lens. Adequate chitin secretion is necessary to ensure the correct shape and function of the corneal lens, while disturbances in chitin production can lead to deformed lenses. Blocking chitin synthesis leads to a significant reduction in chitin deposition in the corneal lens, resulting in a thinner and deformed lens. In particular, the corneal lens shows reduced outer and inner curvature, which compromises its biconvex shape. These changes in chitin production and arrangement result in abnormal morphology of the corneal lens in the adult stage. The key messages of the paper's results are: 1.) The Drosophila corneal lens is a biconvex structure that focuses light. 2.) Chitin, a significant component, is produced mainly by central cells (cone and primary pigment cells). 3.) Downregulation of the chitin synthase gene krotzkopf verkehrt (kkv) reduces lens thickness and curvature. 4.) Overexpression of Rebuf increases chitin secretion and lens thickness. 5.) Localized chitin secretion is crucial for the typical shape of the corneal lens.

      Comments

      Main comments

      The manuscript provides an exciting insight into how the formation of the lens is regulated by the secretion of chitin. However, the data set appears to have shortcomings that must be considered for the next steps.

      1.) In general, the statistical analysis is not transparent. The sample size, i.e. the number of observations or data points, is never specified. This information is essential for further evaluation of the statistical details.

      2.) To gain a better understanding of chitin deposition, it would be beneficial to have data on Kkv overexpression in cone cells versus outer pigment cells. Does it cause reb/exp-like effects on chitin deposition and corneal lens formation? Furthermore, can the authors rule out the involvement of chitin synthase 2 in chitin matrix formation and the retention of the matrix in kkv knockdowns?

      3.) Recent results published by the authors regarding ZP domain proteins, such as dusky-like (dyl), have not been adequately discussed in the context of chitin secretion and Kkv expression; this must be addressed. It has been demonstrated that dyl mutants do not affect Kkv expression, but chitin levels are reduced. Does Dyl exhibit Kkv-like phenotypes? Furthermore, what is the expression of Dyl or Dumpy in Kkv knockdowns? Is there any interaction between the ZP domain protein matrix and the chitin matrix required for lens formation?

      4.) What is retained in the chitin matrix if chitin is missing in kkv knockdown? Is it the ZP domain matrix (see the above question) or are the chitin matrix proteins also involved, such as Obst-A, Obst-C (Gasp), Knk and others? Obst proteins are particularly essential for the regular packaging of chitin and thus for the formation of the chitin layer, which is shown in Fig. 1. Beyond this story, it would also be interesting to see how the aforementioned chitin matrix proteins impact lens formation.

      Minor comments:

      Page 6: Figure 1 is not easily comprehensible for those who are not already familiar with the subject of eye development.

      Fig. 1A': Please label the cone cells and pigment cells.

      Fig. 1H - The meaning of the abbreviations and numbers is not given in the legend. It would also be beneficial to include a meaningful cartoon illustrating the corneal lens situation before and after chitin secretion, as shown in Figure 3.

      Fig. 1F: When do the authors recognize a first chitin assembly as the initial corneal lens of the eye, and what does it look like? Chitin expression is already high at 54 h APF, that is, 20 hours earlier.

      Page 6 / Fig. 2E: Cells synthesize chitin autonomously and there is no lateral diffusion. Please label which lenses contain chitin and which do not.

      Page 7: The authors state that reb/exp knockdown affects external and internal curvature. However, the statistics in Fig. S1 do not support this statement.

      Fig. 2 and Fig. S1: What is Chp (Chaoptin)?

      Fig. S1E,I: which part of the eye is marked by the chitin staining outside the cone and pigment cells?

      Fig. 2L,M: Why do the exp and reb knockdowns show different statistical results for the outer angle when compared with the IGMR driver line, although chitin reduction is eliminated in exp knockdown already from 54 h APF onwards?

      Fig. 3G-H: Please clarify where the chitin reduction can be observed at the edge of the adult corneal lens and provide comparable wt stainings. Fig. S2D: What was the normalization and the sample number?

      Page 6, last paragraph: I fully agree that ZP domain proteins may retain other corneal lens components, but a deeper discussion is missing. It should be noted that the authors' hypothesis fits well with the proposed function of the ZP matrix in providing chitin matrix adhesion to the underlying cell surface. A loss of the ZP domain protein Piopio causes loss of the chitin matrix, as shown recently in trachea and at epidermal tendon cells (Göpfert et al., 2025; https://www.sciencedirect.com/science/article/pii/S1742706125003733). Furthermore, a recent publication identifies ZPD proteins as modular units that establish the mechanical environment essential for nanoscale morphogenesis (Itakura et al., https://www.biorxiv.org/content/10.1101/2024.08.20.608778v1.full.pdf). This should be cited and discussed accordingly.

      It could be that the outer and inner parts of the chitin differ in ultrastructure due to the expression pattern. In dragonfly, surface morphology analysis by scanning electron microscopy revealed that the outer part of corneal lenses consisted of long chitin fibrils with regular arrays of papillary structures, while the smoother inner part had concentric lamellated chitin formation with shorter chitin nanofibrils (Kaya et al., 2016; https://www.sciencedirect.com/science/article/pii/S0141813016303646?via%3Dihub#fig0020). Thus, an ultrastructural analysis would be very beneficial, or at least a detailed discussion.

      Significance

      The manuscript's strengths and most important aspects are the genetic, expression, and localization studies of chitin under the control of the chitin synthase kkv, reb and exp in the Drosophila pupal and adult eye. However, beyond this manuscript, the development of mechanistic details, such as interaction partners that trigger secretion and action at the ZP matrix and adjacent apical membranes, will be interesting.

      The manuscript uses nice genetic tools to describe the differences in chitin secretion in the Drosophila eye and their specific impact on corneal lens formation. Such a precise molecular analysis has not been performed before in insects. Therefore, the study substantially extends knowledge about the role of chitin synthases and chitin secretion in the insect eye.

      The primary audience will be rather specialized: researchers in basic zoology, developmental biology, and cell biology who are interested in how chitin synthases produce chitin. Nevertheless, as chitin is relevant to materials research and to medical and immunological questions, the manuscript will be interesting beyond the specific field and thus for a broader audience.

      I'm working on chitin in the tracheal system and epidermis in Drosophila.

    1. In crowdsourcing projects on old manuscripts, a few particularly active users dominate, so that their readings are adopted disproportionately often.

      That could be a desired effect, since active users have more experience.

    2. Historical bias

      This brings us back to structural and institutional discrimination; here, however, it seems sensible to me to focus only on bias in data generation!

    3. technical

      I do not understand the adjective in this context; this is about a neutrality that is neither (only) technically evoked nor (only) related to technology. If an adjective is needed, perhaps "communicative".

    4. Western research traditions; indigenous and non-Western perspectives

      Even though I have no direct counterproposal right now, I find this division problematic because it reproduces colonial dichotomies. "Global North" and "Global South" might be better, but that distinction is also blurry in many respects. In any case, it is worth discussing.

    5. Changes in norms

      Changes in linguistic and social norms? After all, this is not primarily about DIN standards and meta-standards, is it?

    1. Where does Anita live?

      1. She lives far from the centre. 2. It is near that park. 3. No, he works in the shop. 4. Because she wants to get to know Ricardo better. 5. Tomorrow, after the game.

    1. A transdisciplinary field that integrates computational and archival theories, methods, and resources, both to support the creation and preservation of reliable and authentic records/archives and to address large-scale records/archives processing, analysis, storage, and access, with the aim of improving efficiency, productivity, and precision, in support of recordkeeping, appraisal, arrangement and description, preservation and access decisions, and engaging and undertaking research with archival material.

      Un campo transdisciplinario que integra teorías, métodos y recursos computacionales y archivísticos, tanto para apoyar la creación y preservación de registros/archivos confiables y auténticos como para abordar el procesamiento, análisis, almacenamiento y acceso a registros/archivos a gran escala, con el objetivo de mejorar la eficiencia, la productividad y la precisión, en apoyo de las decisiones sobre el mantenimiento, la evaluación, la organización y la descripción de registros, la preservación y el acceso, y la realización de investigaciones con material de archivo.

    1. Briefing Document: The AESH Profession and the Inclusive School

      Summary

      This document analyses the working conditions of AESH (Accompagnants d'Élèves en Situation de Handicap, staff who support pupils with disabilities) and their impact on the implementation of inclusive schooling in France, twenty years after the founding law of 2005.

      A central paradox emerges: although AESH are indispensable to the functioning of school inclusion, their profession is marked by systemic precariousness, a glaring lack of institutional recognition and latent mistreatment.

      Working conditions are characterized by pay below the poverty line for imposed part-time work, the absence of qualifying training, vague duties that encourage makeshift improvisation ("bricolage"), and a considerable physical and emotional burden.

      This situation, in which AESH must constantly fight for their place and compensate for the system's failings, shows that mistreating these professionals inevitably translates into neglect of the pupils they support, undermining the very foundations of the inclusive school project.

      Detailed Analysis

      1. The Paradox of the AESH Profession: Pride and Mistreatment

      The AESH profession is marked by a deep duality, identified by the researcher Frédéric Grimau as a conflict between "great pride" and "great mistreatment".

      Pride and Social Usefulness: AESH express legitimate pride in their work, aware of their essential role. They deploy remarkable "ingenuity" to make inclusion work, often carrying it "by sheer effort".

      Their contribution is fundamental, as summed up by the phrase: "without AESH, there is no inclusive school".

      Pupils' testimonies confirm this crucial role, citing the "closeness" and "trust" built with their support worker.

      Institutional Mistreatment: At the same time, AESH are subjected to a form of institutional mistreatment that takes the shape of systematic invisibilization.

      Symbolic Exclusion: They are frequently left out of official communications from the hierarchy (for example, holiday greetings).

      Access to shared spaces such as the "salle des profs" (teachers' room) is sometimes denied to them, reinforcing a feeling of being sidelined.

      Renaming it the "salle des adultes" (adults' room) or "salle des personnels" (staff room) is suggested as a first step towards recognition.

      Hierarchical Confusion: The organization of work is marked by "vagueness in the instructions" and in the chain of command, illustrated by the testimony: "in my school, everyone is my boss".

      This situation is a source of discomfort and devaluation.

      2. Precarious Working Conditions and a Poorly Defined Role

      Material precariousness and the imprecise definition of the job are major obstacles to the professionalization and well-being of AESH.

      Pay and Precariousness: Pay is based on the hourly minimum wage (SMIC), but contracts are mostly part-time, which places many AESH below the poverty line.

      Many are forced to combine several jobs (canteen, homework help) to make ends meet, which leads to considerable fatigue.

      Access to the REP/REP+ bonuses for work in priority education was granted only in 2023.

      Institutional "Vagueness": The lack of a clear definition of duties is convenient for the institution, which can thereby make "savings".

      However, this "vagueness" forces AESH into permanent improvisation ("bricolage"), as illustrated by the degrading situation of a pupil being changed using bin bags and curtains as a makeshift screen, which underlines the "total indignity" for both the child and the professionals.

      Physical and Emotional Burden: The job is physically demanding (musculoskeletal disorders from lifting pupils, lack of suitable infrastructure).

      The mental load is also very heavy: AESH work under the constant "risk of an incident" (crisis, violence, a pupil running away), a pressure comparable to that faced by bus or train drivers.

      3. A Lack of Training and Professional Recognition

      One of the main grievances is the absence of any real training, which undermines the legitimacy and effectiveness of support workers.

      Non-existent Training: Initial "training" amounts to 60 hours of "adaptation to the job", often delivered as informational "slideshows" in a lecture hall, with no practical component.

      This scheme, inherited from the subsidized contracts (contrats aidés) of 2005, is considered wholly unsuited to the complexity of disability situations.

      The unions are calling for genuine training leading to a diploma at Bac+2 level, with entry by competitive examination.

      Self-training as the Norm: Faced with this void, AESH are forced to "train themselves".

      The character Yvan in the comic book Ulis by Fabien Toulmet, who goes to the library to read up on autism, illustrates this reality.

      Myiam Sonaï reports having had to discover on her own the specific features of the various conditions (dyslexia, dysorthographia, etc.).

      The Fight for a Place: Professional recognition is won day by day within schools.

      AESH have to "make a place for themselves" among teaching teams that may be distant at first.

      The institution provides no dedicated time for collaboration and consultation, even though these are essential to effective teamwork.

      In addition, AESH are often excluded from the Équipes de Suivi de la Scolarisation (ESS, schooling follow-up teams), even though their input is vital, since they are the professionals closest to the pupil day to day.

      4. AESH at the Heart of the Inclusive School's Dysfunctions

      AESH find themselves on the front line, managing the system's contradictions and gaps.

      The "Buffer" Role: According to Fabien Toulmet, AESH occupy an "intermediate layer" between pupils and teachers and act as a "buffer", absorbing the system's dysfunctions.

      They are often led to go beyond their duties to make up for staff shortages, looking after several pupils at once or supervising an entire class.

      Exceeding Duties and Technical Procedures:

      Some are given tasks that fall under care, or even the medical domain (changing a tracheostomy without training), although the mission of helping with "everyday activities" does not include care.

      Language and Stigma: AESH are also social mediators who fight stigmatization.

      They must navigate a world of technical acronyms (GEVASCO, MDPH, PIAL) and deal with language that is sometimes infantilizing ("the children" for teenagers).

      They also encounter the word "Ulis" being used as an insult among pupils, reflecting the persistence of prejudice.

      5. Developments and Concerns for the Future

      Recent and forthcoming reforms raise serious concerns about a further deterioration of working conditions.

      Pôles Inclusifs d'Accompagnement Localisés (PIAL): This scheme has made the work more complex by introducing a "pooling" of time, which often means multiple postings and long travel distances.

      Pôle d'Appui à la Scolarité (PAS): This new structure, provided for by law, is a particular cause for concern.

      It aims to extend AESH duties to all pupils with special educational needs (including allophone pupils, Traveller children, etc.), not only those with disabilities.

      This extension of tasks, without training or a pay rise, risks increasing a "mental load" that is already very high.

      The Political Problem: The speakers agree that the difficulties encountered are symptoms of a lack of political will and investment.

      The inclusive school cannot be built solely on the "dedication" of staff.

      It requires concrete investment in school buildings, adapted textbooks and, above all, in the recognition and training of those who make it possible day to day.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Overall, the authors present interesting and conclusive work on the activation of ERM proteins upon TBXA2R signaling. The use of the ebBRET biosensor to assess ERM protein activation enables elegant investigation of activation modalities. The thromboxane A2 analogue U46619 robustly activates ERM proteins in ebBRET assays and increases ERM protein phosphorylation. The functional effects of this signaling pathway are shown convincingly for moesin, which mediates a TBXA2R-dependent increase in cell motility, invasion and metastasis of triple-negative breast cancer Hs578T cells in vitro and in vivo. Nonetheless, some points need to be clarified.

      Significance

      Comment 1: In the title, the authors state that ERM activation via TBXA2R controls invasion and motility of triple-negative breast cancer cells. In the manuscript, there is only data supporting this assumption for moesin (MSN). Therefore, the authors need to change the title accordingly or provide additional experiments for the other two ERM proteins, radixin and ezrin. Throughout the experiments, the p-ERM antibody is used to measure ERM protein activation. Since the effects on invasion and motility observed in Hs578T cells are mainly mediated through moesin, it would be necessary to see, for at least one experiment per cell line (HEK293T, Hs578T), the detailed phosphorylation status of ezrin, radixin and moesin separately. As there are specific phospho-detecting antibodies for this purpose, this could be done rather easily. Furthermore, showing a specific increase in phosphorylated moesin would support the functional data shown in Figures 5 and 6. To investigate the functional effect of TBXA2R-mediated activation of ezrin and radixin on cell motility and invasion, similar experiments could be done in e.g. HMC-1-8 breast cancer cells (high ezrin expression) and HCC1187 (high radixin expression).

      Comment 2: Figure 1A, C, D: At 100 nM, the concentration of staurosporine is relatively high for kinase inhibition. It would be informative to see the assay with increasing staurosporine concentrations, e.g. from 1 nM to 50 nM. In general, a concentration of 1-10 nM should be sufficient for kinase inhibition while preventing unspecific effects of the drug.

      Comment 3: The citation for the p-ERM antibody is confusing, as there is only p-Moe used in the cited paper (Roubinet, 2011). There is a p-ERM antibody commercially available (Cell Signaling, Phospho-ezrin (Thr567)/radixin (Thr564)/moesin (Thr558) Antibody #3141). Could you clarify which antibody you are using?

      Comment 4: From the inhibitor experiments using C3 transferase toxin (Figure 2), the authors conclude that RhoA plays a role in TBXA2R-mediated ERM activation. As mentioned in the manufacturer's description, C3 toxin inhibits RhoA, RhoB and RhoC. Therefore, it would be necessary to repeat those experiments under RhoA knockdown conditions (e.g. using an siRNA-based approach) to state that RhoA specifically is involved.

      Comment 5: To assess whether the findings in Figures 5 and 6 are due to the higher moesin expression in Hs578T cells or are linked to a specific function of moesin, a re-expression experiment would be informative. To achieve this, the 2D and 3D migration experiments could be repeated after re-expression of moesin, ezrin and radixin separately under moesin knockdown conditions.

      Minor comments:

      • Even though U46619 is a known Thromboxane A2 analogue, including negative and positive controls would strengthen the results. In detail, this could be done by showing a known protein which gets phosphorylated downstream of TBXA2R signaling and a protein which is not affected by this signaling pathway alongside the shown effects on ERM-proteins.
      • Figure 1J: There are no statistics comparing SQ-29548-treated cells in the presence and absence of U46619; these should be added.
      • Figure 1 G, H: How was the quantification for cell periphery performed? In detail, how were the thresholds set for cell periphery / not cell periphery?
      • Figure 3 H:
        • The labelling indicating presence of U46619 is missing.
        • Also, what is the rationale behind normalizing MB-453 for 3 cell lines and comparing the BT-549 to MB-157?
      • Suppl. Fig. 4D: Define the y-axis better. Absorbance at what wavelength?
      • Define FERM and ERMAD abbreviations in introduction.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The ezrin, radixin, and moesin (ERM) family of proteins orchestrates morphological changes that potentiate metastatic invasion in cancer cells. In this study, Leguay et al. identify the GPCR TBXA2R as a key activator of the ERM proteins that promotes motility and invasion in triple-negative breast cancer (TNBC) cells. Using BRET-based sensors that they developed previously for monitoring the activation of ERM proteins, and building upon their previous findings on the role of the small GTPase RhoA in the activation of ERM proteins, the authors carefully dissect the molecular pathway leading to the activation of ERM proteins upon stimulation of TBXA2R. The authors also establish the pathological relevance of the pathway in TNBC using in vitro and in vivo models, opening up possibilities for targeting this pathway in cancer cells. Overall, the study is well-conceived and executed, and the results are clearly described and presented in the manuscript. However, the following comments must be addressed before publication.

      Major comments

      Fig 1C - Why was p-ERM normalized to ezrin and not ERM? It would be more appropriate and consistent to normalize against the ERM signal, as done in other experiments in the manuscript.

      Fig 1E and S3C - The levels of total ERM also seem to change with increasing treatment times. This must be clarified and discussed in the manuscript.

      Fig 1F - Why is the mean of all three independent experiments not presented here as in S3C?

      Fig 2E - Though SLK seems to play a dominant role in the phosphorylation of ERM in HEK293T cells, the depletion of LOK also substantially reduces the phosphorylation of ERM in the representative figure (Fig 2E), which is not reflected in the quantification (Fig 2F). Indeed, both SLK and LOK seem to be equally crucial in Hs578T cells (Fig 4I), unlike the conclusion here. The authors must check if the quantifications were affected by any white spots in the blot for total ERM as seen in the representative figure. If necessary, the authors must include additional replicates, and the model in Fig 2G should be updated accordingly. If the contributions of LOK are indeed quite minimal in HEK293T cells, then the difference in Hs578T cells must be adequately highlighted and discussed rather than broadly mentioning similar results were observed in both cell lines. The discussion mentions that SLK kinases are the only kinases needed for ERM activation, which conflicts with findings from Hs578T cells, where both SLK and LOK contribute to ERM phosphorylation (Fig 4I). The authors should revise this to reflect their data accurately.

      Minor comments

      Fig S3B should cite the source dataset and not just the database. Also, details of how the extracted data were processed (if at all) should be described clearly.

      When multiple treatments are involved (e.g. U46619 and staurosporine), the exact sequence of treatments and the overlap in timings of different treatments must be clearly mentioned, e.g. in Fig 1A and 1C.

      There are a few grammatical errors which need to be fixed, e.g. in paragraph 2 of the second section of the results: "We next aimed to identify (not identifying) which kinase(s) acts downstream of TBXA2R".

      Significance

      Triple-negative breast cancer, which is characterized by a lack of estrogen, progesterone or HER2 receptors, is a highly metastatic and aggressive form of breast cancer with poor prognosis. Currently, there are fewer treatment options than for other types of invasive breast cancer. The current study opens up the possibility of targeting TBXA2R or the downstream signalling components, which are still expressed in TNBC cells. However, certain TNBC subtypes express low levels of p-ERM and TBXA2R (Fig 3E, 3F), indicating a minor role for the TBXA2R pathway, so targeting this pathway in these subtypes may be inefficient. In addition, certain subtypes express high p-ERM and low TBXA2R, indicating alternative pathways for ERM activation. Currently, it is not clear which other GPCRs can contribute to ERM activation by engaging similar downstream effectors. A comprehensive screening of different GPCR antagonists could identify alternative strategies to target ERM-mediated metastasis in TNBC cells that show low expression of TBXA2R.

      Audience: The manuscript is relevant to a broad audience, especially to cell biologists, cancer biologists and clinical scientists.

      The reviewer's field of expertise includes cell signaling, gene expression, and RNA biology in mammalian systems. Moderate expertise in cancer biology. Limited knowledge of histopathological analysis.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      This manuscript investigates the role of the thromboxane A2 receptor (TBXA2R) in activating ERM (ezrin, radixin, and moesin) proteins to promote cell motility and invasion in triple-negative breast cancer (TNBC) cells. Using TBXA2R stimulation and a series of in vitro and in vivo experiments, the authors report that ERM activation is mediated through a TBXA2R signaling pathway involving Gαq/11 and Gα12/13 subunits, RhoA, and SLK/LOK kinases. They propose that this pathway enhances cell migration, invasion, and metastatic potential in TNBC.

      General criticisms

      Experimental design and analyses are adequate, even though certain experiments lack appropriate controls or employ the wrong statistical tests. However, the study primarily relies on a single TNBC cell line and heavy use of overexpression systems and/or small molecule inhibitors, raising concerns about the generalizability and specificity of the findings. Furthermore, several conclusions appear premature and unsupported by the current data. Critical controls and additional validation experiments are necessary to support the claims about the role of TBXA2R in metastasis and to justify the strong mechanistic conclusions drawn.

      Specific criticisms

      Figure 1

      TBXA2R expression should be shown to understand whether different ebBRET signals are dependent on the overexpression levels of TBXA2R.

      E-F: As ERM levels change over time, one would like to understand whether this is due to misloading or whether there is an underlying biological event going on in the stimulated cells. Are total ERM levels really changing over time? Please add a blot for 1-2 housekeeping proteins as loading controls. This is also crucial to clarify the kinetics of ERM activation; such notable intensity variations make quantifications of non-linear WB signals not fully reliable. In F, mean and SD should be plotted.

      G: The authors need to use a PM marker if they want to claim that pERM increases at the cell cortex. TBXA2R localization should also be shown.

      Figure 2

      A: This reviewer cannot see the purported partial inhibition in Gα12/13 KO cells. Are differences between the two KOs significant? Furthermore, there are reports indicating that YM-254890 may not be specific for Gαq. Experiments on double KO cells are needed to assess the possible redundancy between the two Gα subfamilies.

      C-D: It is important to add a positive control for the activity of Y-27632 in these experiments. Please show that a ROCK-dependent effect is inhibited in the treated cells.

      G: The working model is premature as it is unknown whether ROCKi was active. While asking for ROCK1/2 KO cells would be too much, this claim is far-fetched.

      Figure 3

      B: In the legend, it is not clear what the grey and light red colours mark.

      E-F: This reviewer finds it difficult to believe that p-ERM and TBXA2R signal intensities at the cell cortex could be reliably quantified using IHC images. The representative samples would indicate that p-ERM and TBXA2R positivity are not correlated. It would be crucial to show examples for each of the TNBC subgroups whose existence is inferred from p-ERM and TBXA2R staining. The conclusion that "no TNBC samples exhibited high TBXA2R expression and low levels of p-ERMs, further supporting a role for TBXA2R signalling in ERM activation in TNBC" is an overstatement.

      Figure 4

      The authors wrote that "We focused on the Hs578T cell line, which showed a median level of TBXA2R mRNA expression among the six TNBC cell lines tested". I do not understand the rationale for it as anti-TBXA2R antibodies detecting endogenous TBXA2R are available and thus why not use the median protein levels?

      Figure 5

      Effects of the knockouts are subtle, and rescue experiments would be needed to corroborate these results. The employed statistical analysis is prone to overestimating differences. The authors should use the superplots instead. The authors might also decide to use other TNBC cell lines to explore the functional relevance of this pathway in BC progression. This is particularly important because Hs578T are poorly tumorigenic, and they often do not form palpable tumours in mice.
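
      The point about superplots can be made concrete with a minimal sketch. It assumes three biological replicates per condition, paired by experiment day, and uses invented migration speeds; none of the numbers, names, or the choice of a paired t statistic come from the manuscript. The idea is that the statistical test is run on one summary value per replicate rather than on all pooled cells, so the sample size is the number of experiments and not the number of cells.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical per-cell migration speeds (um/h), grouped by biological replicate.
control = {
    "rep1": [10.0, 12.0, 11.0, 13.0],
    "rep2": [14.0, 15.0, 13.0, 14.0],
    "rep3": [9.0, 10.0, 11.0, 10.0],
}
knockout = {
    "rep1": [9.0, 10.0, 10.0, 11.0],
    "rep2": [12.0, 13.0, 12.0, 11.0],
    "rep3": [8.0, 9.0, 9.0, 10.0],
}


def replicate_means(condition):
    """Return one mean per biological replicate (the large dots of a superplot)."""
    return [mean(cells) for _, cells in sorted(condition.items())]


def paired_t(a, b):
    """Paired t statistic computed on replicate means (assumes paired replicates)."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


ctrl_means = replicate_means(control)
ko_means = replicate_means(knockout)

# Pooling every cell would inflate n from 3 replicates to 12 cells per condition.
n_pooled = sum(len(cells) for cells in control.values())
t_stat = paired_t(ctrl_means, ko_means)

print(f"replicate means (control): {ctrl_means}")
print(f"replicate means (knockout): {ko_means}")
print(f"n used for the test: {len(ctrl_means)} (pooled cells would give {n_pooled})")
print(f"paired t on replicate means: {t_stat:.3f}")
```

      In an actual superplot the individual cells would still be displayed, colour-coded by replicate, with the replicate means overlaid; only the test itself is restricted to the replicate-level values.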

      Figure 6

      The fact that Hs578T cells are poorly tumorigenic in mice is likely the reason why the authors used the experimental metastasis model. However, it is puzzling that metastases were studied in the liver but not in the lungs. Furthermore, the whole approach is rather artefactual as the TBXA2R agonist was administered for the entire duration of these experiments. What is the pathological relevance of such a study? Including a spontaneous metastasis model or alternative TNBC lines that mimic human disease more closely would help strengthen the functional relevance of this pathway in BC progression and the study's translational relevance.

      Figure S2

      B-M: the pERM signal appears to be perinuclear in some of the tested cell lines. Please use a PM marker.

      Figure S3

      The authors should use the superplots to analyse the cell migration data.

      Discussion

      The claim that "our findings demonstrated that kinases of the SLK family are the only kinases needed for ERM activation by TBXA2R" should be tuned down as only 2 cell lines were tested. In this section, the authors should also discuss the proposed pro-metastatic functions of TXA2 and TXA2R in more detail, including vascular permeability. The sweeping conclusion that "TBXA2R expression correlates with phosphorylation and activation of ERMs in TNBC patient samples" clashes with the authors' own results; please stick to the data.

      Concluding remarks

      This study investigates a signaling pathway whereby TBXA2R, through ERM activation, enhances the migratory and invasive potential of TNBC cells. However, several improvements are needed to support the main claims. The dependence on a single TNBC cell line, reliance on pharmacological inhibitors with potential off-target effects, and limited in vivo relevance detract from the generalizability of the findings. Additional TNBC models, adequate controls, and a broader focus on natural metastasis patterns would make the conclusions more compelling. Certain overstated claims need to be moderated to align the interpretations with the actual data.

      Cross-commenting

      I found comments in the other reviewers' reports that align with my criticisms on the mouse experiments as well as with those pertaining to the tissue culture work.

      Significance

      General comments

      The manuscript investigates the role of TBXA2R in the regulation of ERM in the context of TNBC metastasis. Much of this TBXA2R signalling axis is already known, as is the fact that SLK and LOK can phosphorylate ERM in other cell systems. Similarly, the positive role of ERM in cell migration/invasion and cancer progression has long been reported. The somewhat unexpected finding that ERM phosphorylation is independent of ROCK remains not fully convincing. The BC-related part is problematic as the continuous administration of a TBXA2R agonist is required for key tumour metrics to show some differences in vivo. This calls into question the main conclusion of the work, namely that the TBXA2R/ERM-dependent pathway is activated during BC progression in TNBC cells.

      Audience

      Specialists interested in GPCRs and signal transduction or in the cytoskeleton.

      Expertise

      Rev: cancer cell biology, signal transduction, cytoskeleton, actin biochemistry, multiplexed imaging, mouse model of human diseases.

      Co-rev: nanoparticles, cell biology.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #4

      Evidence, reproducibility and clarity

      Overall, the authors present interesting and conclusive work on the activation of ERM proteins upon TBXA2R signaling. The use of the ebBRET biosensor to assess ERM-protein activation enables elegant investigation of activation modalities. The thromboxane A2 analogue U46619 robustly activates ERM proteins in ebBRET assays and increases ERM-protein phosphorylation. The functional effects of this signaling pathway are shown convincingly for moesin, which mediates a TBXA2R-dependent increase in cell motility, invasion and metastasis of triple-negative breast cancer Hs578 cells in vitro and in vivo. Nonetheless, some points need to be clarified.

      Significance

      Comment 1: In the title, the authors state that ERM activation via TBXA2R controls invasion and motility of triple-negative breast cancer cells. The manuscript, however, only contains data supporting this claim for moesin (MSN). Therefore, the authors need to change the title accordingly or provide additional experiments for the other two ERM proteins, radixin and ezrin. Throughout the experiments, the p-ERM antibody is used to measure ERM-protein activation. Since the effects on invasion and motility observed in Hs578 cells are mainly mediated through moesin, it would be necessary to see, for at least one experiment per cell line (HEK293T, Hs578), the detailed phosphorylation status of ezrin, radixin and moesin separately. As specific phospho-detecting antibodies are available for this purpose, this could be done rather easily. Furthermore, showing a specific increase in phosphorylated moesin would support the functional data shown in Figures 5 and 6. To investigate the functional effect of TBXA2R-mediated activation of ezrin and radixin on cell motility and invasion, similar experiments could be done in, e.g., HMC-1-8 breast cancer cells (high ezrin expression) and HCC1187 cells (high radixin expression).

      Comment 2: Figure 1A, C, D: At 100 nM, the concentration of staurosporine is relatively high for kinase inhibition. It would be informative to see the assay with increasing staurosporine concentrations, e.g. from 1 nM to 50 nM. In general, a concentration of 1-10 nM should be sufficient for kinase inhibition while preventing unspecific effects of the drug.

      Comment 3: The citation for the p-ERM antibody is confusing, as there is only p-Moe used in the cited paper (Roubinet, 2011). There is a p-ERM antibody commercially available (Cell Signaling, Phospho-ezrin (Thr567)/radixin (Thr564)/moesin (Thr558) Antibody #3141). Could you clarify which antibody you are using?

      Comment 4: From the inhibitor experiments using C3 transferase toxin (Figure 2), the authors conclude that RhoA plays a role in TBXA2R-mediated ERM activation. As mentioned in the manufacturer's description, C3 toxin inhibits RhoA, RhoB and RhoC. Therefore, it would be necessary to repeat those experiments under RhoA knockdown conditions (e.g. using an siRNA-based approach) to state that RhoA specifically is involved.

      Comment 5: To assess whether the findings in Figures 5 and 6 are due to the higher moesin expression in Hs578 cells or are linked to a specific function of moesin, a re-expression experiment would be informative. To achieve this, the 2D and 3D migration experiments could be repeated after re-expression of moesin, ezrin and radixin separately under moesin knockdown conditions.

      Minor comments:

      • Even though U46619 is a known Thromboxane A2 analogue, including negative and positive controls would strengthen the results. In detail, this could be done by showing a known protein which gets phosphorylated downstream of TBXA2R signaling and a protein which is not affected by this signaling pathway alongside the shown effects on ERM-proteins.
      • Figure 1 J: There are no statistics comparing SQ-29548-treated cells in the presence and absence of U46619; these should be added.
      • Figure 1 G, H: How was the quantification for cell periphery performed? In detail, how were the thresholds set for cell periphery / not cell periphery?
      • Figure 3 H:
        • The labelling indicating presence of U46619 is missing.
        • Also, what is the rationale behind normalizing MB-453 for 3 cell lines and comparing the BT-549 to MB-157?
      • Suppl. Fig 4 D: Define the y-axis better. Absorbance at what wavelength?
      • Define FERM and ERMAD abbreviations in introduction.
    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The ezrin, radixin, and moesin (ERM) family of proteins orchestrates morphological changes that potentiate metastatic invasion in cancer cells. In this study, Leguay et al. identify the GPCR TBXA2R as a key activator of the ERM proteins which promotes motility and invasion in triple-negative breast cancer (TNBC) cells. Using BRET-based sensors that they previously developed for monitoring the activation of ERM proteins, and building upon their previous findings on the role of the small GTPase RhoA in the activation of ERM proteins, the authors carefully dissect the molecular pathway leading to the activation of ERM proteins upon stimulation of TBXA2R. The authors also establish the pathological relevance of the pathway in TNBC using in vitro and in vivo models, opening up possibilities for targeting this pathway in cancer cells. Overall, the study is well-conceived and executed, and the results are clearly described and presented in the manuscript. However, the following comments must be addressed before publication.

      Major comments

      Fig 1C - Why was p-ERM normalized to ezrin and not to total ERM? It would be more appropriate and consistent to normalize against the ERM signal, as done in other experiments in the manuscript.

      Fig 1E and S3C - The levels of total ERM also seem to change with increasing treatment times. This must be clarified and discussed in the manuscript.

      Fig 1F - Why is the mean of all three independent experiments not presented here as in S3C?

      Fig 2E - Though SLK seems to play a dominant role in the phosphorylation of ERM in HEK293T cells, the depletion of LOK also substantially reduces the phosphorylation of ERM in the representative figure (Fig 2E), which is not reflected in the quantification (Fig 2F). Indeed, both SLK and LOK seem to be equally crucial in Hs578T cells (Fig 4I), unlike the conclusion here. The authors must check if the quantifications were affected by any white spots in the blot for total ERM as seen in the representative figure. If necessary, the authors must include additional replicates, and the model in Fig 2G should be updated accordingly. If the contributions of LOK are indeed quite minimal in HEK293T cells, then the difference in Hs578T cells must be adequately highlighted and discussed rather than broadly mentioning similar results were observed in both cell lines. The discussion mentions that SLK kinases are the only kinases needed for ERM activation, which conflicts with findings from Hs578T cells, where both SLK and LOK contribute to ERM phosphorylation (Fig 4I). The authors should revise this to reflect their data accurately.

      Minor comments

      Fig S3B should cite the source dataset and not just the database. Also, details of how the extracted data were processed (if at all) should be described clearly.

      When multiple treatments are involved (e.g. U46619 and staurosporine), the exact sequence of treatments and the overlap in timings of different treatments must be clearly stated, e.g. in Fig 1A and 1C.

      There are a few grammatical errors which need to be fixed, e.g. Paragraph 2 in the second section of results: "We next aimed to identify" (not "identifying") which kinase(s) acts downstream of TBXA2R.

      Significance

      Triple-negative breast cancer, which is characterized by a lack of estrogen, progesterone or HER2 receptors, is a highly metastatic and aggressive form of breast cancer with poor prognosis. Currently, there are fewer treatment options than for other types of invasive breast cancer. The current study opens up the possibility of targeting TBXA2R or the downstream signalling components, which are still expressed in TNBC cells. However, certain TNBC subtypes express low levels of p-ERM and TBXA2R (Fig 3E, 3F), indicating a minor role for the TBXA2R pathway; targeting this pathway in these subtypes may be inefficient. In addition, certain subtypes express high p-ERM and low TBXA2R, indicating alternative pathways for ERM activation. Currently, it is not clear which other GPCRs can contribute to ERM activation by engaging similar downstream effectors. A comprehensive screening of different GPCR antagonists could identify alternative strategies to target ERM-mediated metastasis in TNBC cells that show low expression of TBXA2R.

      Audience

      The manuscript is relevant to a broad audience, especially to cell biologists, cancer biologists and clinical scientists.

      The reviewer's field of expertise includes cell signaling, gene expression, and RNA biology in mammalian systems. Moderate expertise in cancer biology. Limited knowledge of histopathological analysis.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Wang et al. studied an old, still unresolved problem: Why are reaching movements often biased? Using data from a set of new experiments and from earlier studies, they identified how the bias in reach direction varies with movement direction, and how this depends on factors such as the hand used, the presence of visual feedback, the size and location of the workspace, the visibility of the start position and implicit sensorimotor adaptation. They then examined whether a visual bias, a proprioceptive bias, a bias in the transformation from visual to proprioceptive coordinates and/or biomechanical factors could explain the observed patterns of biases. The authors conclude that biases are best explained by a combination of transformation and visual biases.

      A strength of this study is that it used a wide range of experimental conditions with also a high resolution of movement directions and large numbers of participants, which produced a much more complete picture of the factors determining movement biases than previous studies did. The study used an original, powerful, and elegant method to distinguish between the various possible origins of motor bias, based on the number of peaks in the motor bias plotted as a function of movement direction. The biomechanical explanation of motor biases could not be tested in this way, but this explanation was excluded in a different way using data on implicit sensorimotor adaptation. This was also an elegant method as it allowed the authors to test biomechanical explanations without the need to commit to a certain biomechanical cost function.

      We thank the reviewer for their enthusiastic comments.

      (1) The main weakness of the study is that it rests on the assumption that the number of peaks in the bias function is indicative of the origin of the bias. Specifically, it is assumed that a proprioceptive bias leads to a single peak, a transformation bias to two peaks, and a visual bias to four peaks, but these assumptions are not well substantiated. Especially the assumption that a transformation bias leads to two peaks is questionable. It is motivated by the fact that biases found when participants matched the position of their unseen hand with a visual target are consistent with this pattern. However, it is unclear why that task would measure only the effect of transformation biases, and not also the effects of visual and proprioceptive biases in the sensed target and hand locations. Moreover, it is not explained why a transformation bias would lead to this specific bias pattern in the first place.

      We would like to clarify two things.

      First, the measurements of the transformation bias are not entirely independent of proprioceptive and visual biases. Specifically, we define transformation bias as the misalignment between the internal representation of a visual target and the corresponding hand position. By this definition, the transformation error entails both visual and proprioceptive biases (see Author response image 1). Transformation biases have been empirically quantified in numerous studies using matching tasks, where participants either aligned their unseen hand to a visual target (Wang et al., 2021) or aligned a visual target to their unseen hand (Wilson et al., 2010). Indeed, those tasks are always considered to measure proprioceptive biases, on the assumption that visual bias is small given the minimal visual uncertainty.

      Author response image 1.

      Second, the critical difference between models is in how these biases influence motor planning rather than how those biases are measured. In the Proprioceptive bias model, a movement is planned in visual space. The system perceives the starting hand position in proprioceptive space and transforms this into visual space (Vindras & Viviani, 1998; Vindras et al., 2005). As such, bias only affects the perceived starting position; there is no influence on the perceived target location (no visual bias).

      In contrast, the Transformation bias model proposes that while both the starting and target positions are perceived in visual space, movement is planned in proprioceptive space. Consequently, both positions must be transformed from visual space to proprioceptive coordinates before movement planning (i.e., where is my sensed hand and where do I want it to be). Under this framework, biases can emerge from both the start and target positions. This is how the transformation model leads to different predictions compared to the perceptual models, even if the bias is based on the same measurements.

      We now highlight the differences between the Transformation bias model and the Proprioceptive bias model explicitly in the Results section (Lines 192-200):

      “Note that the Proprioceptive Bias model and the Transformation Bias model tap into the same visuo-proprioceptive error map. The key difference between the two models arises in how this error influences motor planning. For the Proprioceptive Bias model, planning is assumed to occur in visual space. As such, the perceived position of the hand (based on proprioception) is transformed into the visual space. This will introduce a bias in the representation of the start position. In contrast, the Transformation Bias model assumes that the visually-based representations of the start and target positions need to be transformed into proprioceptive space for motor planning. As such, both positions are biased in the transformation process. In addition to differing in terms of their representation of the target, the error introduced at the start position is in opposite directions due to the direction of the transformation (see fig 1g-h).”

      In terms of the motor bias function across the workspace, the peaks are quantitatively derived from the model simulations. The number of peaks depends on how we formalize each model. Importantly, this is a stable feature of each model, regardless of how the model is parameterized. Thus, the number of peaks provides a useful criterion to evaluate different models.

      Figure 1 g-h illustrates the intuition of how the models generate distinct peak patterns. We edited the figure caption and reference this figure when we introduce the bias function for each model.
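
      As a purely illustrative sketch of the peak-counting criterion (not the models fitted in the manuscript), the snippet below counts local maxima in a bias function sampled over 360 movement directions. The sinusoidal `toy_bias` function, its amplitude, and the helper names are assumptions introduced here as stand-ins for one-, two-, and four-peaked bias functions; the actual model predictions are derived from simulations of each model.

```python
import math


def count_peaks(values):
    """Count local maxima in a circular sequence (direction 359 wraps to 0).

    A sample is a peak if it exceeds its left neighbour and is at least as
    large as its right neighbour, so a maximum that falls between two equal
    samples is counted exactly once.
    """
    n = len(values)
    return sum(
        1
        for i in range(n)
        if values[i] > values[i - 1] and values[i] >= values[(i + 1) % n]
    )


def toy_bias(n_cycles, n_directions=360, amplitude=5.0):
    """Hypothetical bias (deg) per movement direction: a sinusoid with n_cycles."""
    return [
        amplitude * math.sin(n_cycles * math.radians(direction))
        for direction in range(n_directions)
    ]


# Stand-ins for one-, two-, and four-peaked bias functions.
peak_counts = {k: count_peaks(toy_bias(k)) for k in (1, 2, 4)}
for n_cycles, n_peaks in peak_counts.items():
    print(f"{n_cycles} cycle(s) per 360 deg -> {n_peaks} peak(s)")
```

      On real data the bias function would typically need smoothing or a fitted model before counting peaks, since trial-to-trial noise would otherwise produce spurious local maxima.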

      (2) Also, the assumption that a visual bias leads to four peaks is not well substantiated as one of the papers on which the assumption was based (Yousif et al., 2023) found a similar pattern in a purely proprioceptive task.

      What we referred to in the original submission as “visual bias” is not an eye-centric bias, nor is it restricted to the visual system. Rather, it may reflect a domain-general distortion in the representation of position within polar space. We called it a visual bias as it was associated with the perceived location of the visual target in the current task. To avoid confusion, we have opted to move to a more general term and now refer to this as “target bias.”

      We clarify the nature of this bias when introducing the model in the Results section (Lines 164-169):

      “Since the task permits free viewing without enforced fixation, we assume that participants shift their gaze to the visual target; as such, an eye-centric bias is unlikely. Nonetheless, prior studies have shown a general spatial distortion that biases perceived target locations toward the diagonal axes (Huttenlocher et al., 2004; Kosovicheva & Whitney, 2017). Interestingly, this bias appears to be domain-general, emerging not only for visual targets but also for proprioceptive ones (Yousif et al., 2023). We incorporated this diagonal-axis spatial distortion into a Target Bias model. This model predicts a four-peaked motor bias pattern (Fig 1f).”

      We also added a paragraph in the Discussion to further elaborate on this model (Lines 502-511):

      “What might be the source of the visual bias in the perceived location of the target? In the perception literature, a prominent theory has focused on the role of visual working memory account based on the observation that in delayed response tasks, participants exhibit a bias towards the diagonals when recalling the location of visual stimuli (Huttenlocher et al., 2004; Sheehan & Serences, 2023). Underscoring that the effect is not motoric, this bias is manifest regardless of whether the response is made by an eye movement, pointing movement, or keypress (Kosovicheva & Whitney, 2017). However, this bias is unlikely to be dependent on a visual input as similar diagonal bias is observed when the target is specified proprioceptively via the passive displacement of an unseen hand (Yousif et al., 2023). Moreover, as shown in the present study, a diagonal bias is observed even when the target is continuously visible. Thus, we hypothesize that the bias to perceive the target towards the diagonals reflects a more general distortion in spatial representation rather than being a product of visual working memory.”

      (3) Another weakness is that the study looked at biases in movement direction only, not at biases in movement extent. The models also predict biases in movement extent, so it is a missed opportunity to take these into account to distinguish between the models.

      We thank the reviewer for this suggestion. We have now conducted a new experiment to assess angular and extent biases simultaneously (Figure 4a; Exp. 4; N = 30). Using our KINARM system, participants were instructed to make center-out movements that would terminate (rather than shoot past) at the visual target. No visual feedback was provided throughout the experiment.

      The Transformation Bias model predicts a two-peaked error function in both the angular and extent dimensions (Figure 4c). Strikingly, when we fit the data from the new experiment to both dimensions simultaneously, this model captures the results qualitatively and quantitatively (Figure 4e). In terms of model comparison, it outperformed alternative models (Figure 4g) particularly when augmented with a visual bias component. Together, these results provide strong evidence that a mismatch between visual and proprioceptive space is a key source of motor bias.

      This experiment is now reported within the revised manuscript (Lines 280-301).
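
      To make the two error dimensions concrete, the following is a minimal sketch of how a single reach endpoint could be decomposed into an angular and an extent error. It is an illustration rather than the analysis code used for Exp 4: the function name, the coordinate units, and the sign conventions (counterclockwise positive, overshoot positive) are all assumptions.

```python
import math


def angular_and_extent_error(start, target, endpoint):
    """Decompose one reach into (angular error in degrees, extent error).

    Assumed conventions (not taken from the manuscript):
    positive angle = endpoint rotated counterclockwise from the target direction,
    positive extent = overshoot, in the same units as the inputs.
    """
    tx, ty = target[0] - start[0], target[1] - start[1]
    ex, ey = endpoint[0] - start[0], endpoint[1] - start[1]
    # Signed angle from the target vector to the movement vector.
    angle = math.degrees(math.atan2(tx * ey - ty * ex, tx * ex + ty * ey))
    # Difference between movement length and target distance.
    extent = math.hypot(ex, ey) - math.hypot(tx, ty)
    return angle, extent


# Hypothetical reach: target 10 units to the right, hand stops short and slightly above.
angle_err, extent_err = angular_and_extent_error((0.0, 0.0), (10.0, 0.0), (8.0, 0.5))
```

      Applying this to every target direction yields the two bias functions (angular and extent) that a model can then be fit to jointly.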

      Overall, the authors have done a good job mapping out reaching biases in a wide range of conditions, revealing new patterns in one of the most basic tasks, but unambiguously determining the origin of these biases remains difficult, and the evidence for the proposed origins is incomplete. Nevertheless, the study will likely have a substantial impact on the field, as the approach taken is easily applicable to other experimental conditions. As such, the study can spark future research on the origin of reaching biases.

      We thank the reviewer for these summary comments. We believe that the new experiments and analyses do a better job of identifying the origins of motor biases.

      Reviewer #2 (Public Review):

      Summary:

      This work examines an important question in the planning and control of reaching movements - where do biases in our reaching movements arise and what might this tell us about the planning process? They compare several different computational models to explain the results from a range of experiments including those within the literature. Overall, they highlight that motor biases are primarily caused by errors in the transformation between eye and hand reference frames. One strength of the paper is the large number of participants studied across many experiments. However, one weakness is that most of the experiments follow a very similar planar reaching design - with slicing movements through targets rather than stopping within a target. Moreover, there are concerns with the models and the model fitting. This work provides valuable insight into the biases that govern reaching movements, but the current support is incomplete.

      Strengths:

      The work uses a large number of participants both with studies in the laboratory which can be controlled well and a huge number of participants via online studies. In addition, they use a large number of reaching directions allowing careful comparison across models. Together these allow a clear comparison between models which is much stronger than would usually be performed.

      We thank the reviewer for their encouraging comments.

      Weaknesses:

      Although the topic of the paper is very interesting and potentially important, there are several key issues that currently limit the support for the conclusions. In particular I highlight:

      (1) Almost all studies within the paper use the same basic design: slicing movements through a target with the hand moving on a flat planar surface. First, this means that the authors cannot compare the second component of a bias - the error in the direction of a reach which is often much larger than the error in reaching direction.

      Reviewer 1 made a similar point, noting that we had missed an opportunity to provide a more thorough assessment of reaching biases. As described above, we conducted a new experiment in which participants made pointing movements, instructed to terminate the movements at the target. These data allow us to analyze errors in both angular and extent dimensions. The transformation bias model successfully predicts angular and extent biases, outperforming the other models at both group and individual levels. We have now included this result as Exp 4 in the manuscript. Please see response to Reviewer 1 Comment 3 for details.

      Second, there are several studies that have examined biases in three-dimensional reaching movements showing important differences to two-dimensional reaching movements (e.g. Soechting and Flanders 1989). It is unclear how well the authors' computational models could explain the biases that are present in these much more common reaching movements.

      This is an interesting issue to consider. We expect the mechanisms identified in our 2D work will generalize to 3D.

      Soechting and Flanders (1989) quantified 3D biases by measuring errors across multiple 2D planes at varying heights (see Author response image 2 for an example from their paper). When projecting their 3D bias data to a horizontal 2D space, the direction of the bias across the 2D plane looks relatively consistent across different heights even though the absolute value of the bias varies (Author response image 2). For example, the matched hand position is generally leftward and downward of the target. Therefore, the models we have developed and tested in a specific 2D plane are likely to generalize to other 2D planes at different heights.

      Author response image 2.

      However, we think the biases reported by Soechting and Flanders likely reflect transformation biases rather than motor biases. First, the movements in their study were performed very slowly (3–5 seconds), more similar to our proprioceptive matching tasks and much slower than natural reaching movements (<500 ms). Given the slow speed, we suspect that motor planning in Soechting and Flanders was likely done in a stepwise, incremental manner (closed loop to some degree). Second, the bias pattern reported in Soechting and Flanders, when projected into 2D space, closely mirrors the leftward transformation errors observed in previous visuo-proprioceptive matching tasks (e.g., Wang et al., 2021).

      In terms of the current manuscript, we think that our new experiment (Exp 4, where we measure angular and radial error) provides strong evidence that the transformation bias model generalizes to more naturalistic pointing movements. As such, we expect these principles will generalize were we to examine movements in three dimensions, an extension we plan to test in future work.

      (2) The model fitting section is under-explained and under-detailed currently. This makes it difficult to accurately assess the current model fitting and its strength to support the conclusions. If my understanding of the methods is correct, then I have several concerns. For example, the manuscript states that the transformation bias model is based on studies mapping out the errors that might arise across the whole workspace in 2D. In contrast, the visual bias model appears to be based on a study that presented targets within a circle (but not tested across the whole workspace). If the visual bias had been measured across the workspace (similar to the transformation bias model), would the model and therefore the conclusions be different?

      We have substantially expanded the Methods section to clarify the modeling procedures (detailed below in section “Recommendations for the Authors”). We also provide annotated code to enable others to easily simulate the models.

      Here we address three points relevant to the reviewer’s concern about whether the models were tested on equal footing, and in particular, concern that the transformation bias model was more informed by prior literature than the visual bias model.

      First, our center-out reaching task used target locations that have been employed in both visual and proprioceptive bias studies, offering reasonably comprehensive coverage of the workspace. For example, for a target to the left of the body’s midline, visual biases tend to be directed diagonally (Kosovicheva & Whitney, 2017), while transformation biases are typically leftward and downward (Wang et al., 2021). In this sense, the models were similarly constrained by prior findings.

      Second, while the qualitative shape of each model was guided by prior empirical findings, no previous data were directly used to quantitatively constrain the models. As such, we believe the models were evaluated on equal footing. No model had more information or, as best we can tell, an inherent advantage over the others.

      Third, reassuringly, the fitted transformation bias closely matches empirically observed bias maps reported in prior studies (Fig 2h). The strong correspondence provides convergent validity and supports the putative causal link between transformation biases and motor biases.

      (3) There should be other visual bias models theoretically possible that might fit the experimental data better than this one possible model. Such possibilities also exist for the other models.

      Our initial hypothesis, grounded in prior literature, was that motor biases arise from a combination of proprioceptive and visual biases. This led us to thoroughly explore a range of visual models. We now describe these alternatives below, noting that in the paper, we chose to focus on models that seemed the most viable candidates. (Please also see our response to Reviewer 3, Point 2, on another possible source of visual bias, the oblique effect.)

      Quite a few models have described visual biases in perceiving motion direction or object orientation (e.g., Wei & Stocker, 2015; Patten, Mannion & Clifford, 2017). Orientation perception would be biased towards the Cartesian axes, generating a four-peaked function. However, these models failed to account for the motor biases observed in our experiments. This is not surprising given that these models were not designed to capture biases related to a static location.

      We also considered a class of eye-centric models where biases for peripheral locations are measured under fixation. A prominent finding here is that the bias is along the radial axis in which participants overshoot targets when they fixate on the start position during the movement (Beurze et al., 2006; Van Pelt & Medendorp, 2008). Again, this is not consistent with the observed motor biases. For example, participants undershoot rightward targets when we measured the distance bias in Exp 4. Importantly, since most of our tasks involved free viewing in natural settings with no fixation requirements, we considered it unlikely that biases arising from peripheral viewing play a major role.

      We note, though, that in our new experiment (Exp 4), participants observed the visual stimuli from a fixed angle in the KinArm setup (see Figure 4a). This setup has been shown to induce depth-related visual biases (Figure 4b, e.g., Volcic et al., 2013; Hibbard & Bradshaw, 2003). For this reason, we implemented a model incorporating this depth bias as part of our analyses of these data. While this model performed significantly worse than the transformation bias model alone, a mixed model that combined the depth bias and transformation bias provided the best overall fit. We now include this result in the main text (Lines 286-294).

      We also note that the “visual bias” we referred to in the original submission is not restricted to the visual system. A similar bias pattern has been observed when the target is presented visually or proprioceptively (Kosovicheva & Whitney, 2017; Yousif, Forrence, & McDougle, 2023). As such, it may reflect a domain-general distortion in the representation of position within polar space. Accordingly, in the revision, we now refer to this in a more general way, using the term “target bias.” We justify this nomenclature when introducing the model in the Results section (Lines 164-169). Please also see Reviewer 1 comment 2.

      We recognize that future work may uncover a better visual model or provide a more fine-grained account of visual biases (or biases from other sources). With our open-source simulation code, such biases can be readily incorporated—either to test them against existing models or to combine them with our current framework to assess their contribution to motor biases. Given our explorations, we expect our core finding will hold: Namely, that a combination of transformation and target biases offers the most parsimonious account, with the bias associated with the transformation process explaining the majority of the observed motor bias in visually guided movements.

      Given the comments from the reviewer, we expanded the Discussion section to address the issue of alternative models of visual bias (Lines 522-529):

      “Other forms of visual bias may influence movement. Depth perception biases could contribute to biases in movement extent (Beurze et al., 2006; Van Pelt & Medendorp, 2008). Visual biases towards the principal axes have been reported when participants are asked to report the direction of moving targets or the orientation of an object (Patten et al., 2017; Wei & Stocker, 2015). However, the predicted patterns of reach biases do not match the observed biases in the current experiments. We also considered a class of eye-centric models in which participants overestimate the radial distance to a target while maintaining central fixation (Beurze et al., 2006; Van Pelt & Medendorp, 2008). At odds with this hypothesis, participants undershot rightward targets when we measured the radial bias in Exp 4. The absence of these other distortions of visual space may be accounted for by the fact that we allowed free viewing during the task.”

      (4) Although the authors do mention that the evidence against biomechanical contributions to the bias is fairly weak in the current manuscript, this needs to be further supported. Importantly both proprioceptive models of the bias are purely kinematic and appear to ignore the dynamics completely. One imagines that there is a perceived vector error in Cartesian space whereas the other imagines an error in joint coordinates. These simply result in identical movements which are offset either with a vector or an angle. However, we know that the motor plan is converted into muscle activation patterns which are sent to the muscles, that is, the motor plan is converted into an approximation of joint torques. Joint torques sent to the muscles from a different starting location would not produce an offset in the trajectory as detailed in Figure S1, instead, the movements would curve in complex patterns away from the original plan due to the non-linearity of the musculoskeletal system. In theory, this could also bias some of the other predictions as well. The authors should consider how the biomechanical plant would influence the measured biases.

      We thank the reviewer for encouraging us to pursue this topic and to formalize a biomechanical model. In response, we have implemented a state-of-the-art biomechanical framework, MotorNet (https://elifesciences.org/articles/88591), which simulates a six-muscle, two-skeleton planar arm model using recurrent neural networks (RNNs) to generate control policies (see Figure 6a). This model captures key predictions about movement curvature arising from biomechanical constraints. We view it as a strong candidate for illustrating how motor bias patterns could be shaped by the mechanical properties of the upper limb.

      Interestingly, the biomechanical model did not qualitatively or quantitatively reproduce the pattern of motor biases observed in our data. Specifically, we trained 50 independent agents (RNNs) to perform random point-to-point reaching movements across the workspace used in our task. We used a loss function that minimized the distance between the fingertip and the target over the entire trajectory. When tested on a center-out reaching task, the model produced a four-peaked motor bias pattern (Figure 6b), in contrast to the two-peaked function observed empirically. These results suggest that upper limb biomechanical constraints are unlikely to be a primary driver of motor biases in reaching. This holds true even though the reported bias is read out at 60% of the reaching distance, where biomechanical influences on the curvature of movement are maximal. We have added this analysis to the results (lines 367-373).
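
      As an illustration of a readout partway through the movement, the sketch below computes the angular error of a hand trajectory at the first sample whose distance from the start reaches 60% of the target distance. The first-crossing rule, the function name, and the counterclockwise-positive sign convention are assumptions made for this example; the exact readout procedure may differ.

```python
import math


def angular_error_at_fraction(trajectory, start, target, fraction=0.6):
    """Angular error (degrees, counterclockwise positive) at the first sample
    whose distance from `start` reaches `fraction` of the target distance.

    `trajectory` is a sequence of (x, y) hand positions ordered in time.
    Returns None if the hand never travels that far.
    """
    tx, ty = target[0] - start[0], target[1] - start[1]
    threshold = fraction * math.hypot(tx, ty)
    for x, y in trajectory:
        hx, hy = x - start[0], y - start[1]
        if math.hypot(hx, hy) >= threshold:
            # Signed angle from the target direction to the current hand direction.
            return math.degrees(math.atan2(tx * hy - ty * hx, tx * hx + ty * hy))
    return None


# Hypothetical straight reach deviating 5 degrees counterclockwise from a rightward target.
theta = math.radians(5.0)
reach = [(r * math.cos(theta), r * math.sin(theta)) for r in range(1, 11)]
bias_deg = angular_error_at_fraction(reach, (0.0, 0.0), (10.0, 0.0))
```

      For a curved trajectory the value depends on the chosen fraction, which is why an early readout is more sensitive to curvature than an endpoint measure.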

      It may seem counterintuitive that biomechanics plays a limited role in motor planning. This could be due to several factors. First, task demands (such as the need to grasp objects) may lead the biomechanical system to be inherently organized to minimize endpoint errors (Hu et al., 2012; Trumbower et al., 2009). Second, through development and experience, the nervous system may have adapted to these biomechanical influences, detecting and compensating for them over time (Chiel et al., 2009).

      That said, biomechanical constraints may make a larger contribution in other contexts; for example, when movements involve more extreme angles or span larger distances, or in individuals with certain musculoskeletal impairments (e.g., osteoarthritis) where physical limitations are more likely to come into play. We address this issue in the revised discussion.

      “Nonetheless, the current study does not rule out the possibility that biomechanical factors may influence motor biases in other contexts. Biomechanical constraints may have had limited influence in our experiments due to the relatively modest movement amplitudes used and minimal interaction torques involved. Moreover, while we have focused on biases that manifest at the movement endpoint, biomechanical constraints might introduce biases that are manifest in the movement trajectories (Alexander, 1997; Nishii & Taniai, 2009). Future studies are needed to examine the influence of context on reaching biases.”

      Reviewer #3 (Public review):

      The authors make use of a large dataset of reaches from several studies run in their lab to try to identify the source of direction-dependent radial reaching errors. While this has been investigated by numerous labs in the past, this is the first study where the sample is large enough to reliably characterize isometries associated with these radial reaches to identify possible sources of errors.

      (1) The sample size is impressive, but the authors should include confidence intervals and, ideally, the distribution of responses across individuals along with average performance across targets. It is unclear whether the observed “averaged function” is consistently found across individuals, or if it is mainly driven by a subset of participants exhibiting large deviations for diagonal movements. Providing individual-level data or response distributions would be valuable for assessing the ubiquity of the observed bias patterns and ruling out the possibility that different subgroups are driving the peaks and troughs. It is possible that the Transformation or some other model (see below) could explain the bias function for a substantial portion of participants, while other participants may have different patterns of biases that can be attributable to alternative sources of error.

      We thank the reviewer for encouraging a closer examination of the individual-level data. We did include standard error when we reported the motor bias function. Given that the error distribution is relatively Gaussian, we opted to not show confidence intervals since they would not provide additional information.

      To examine individual differences, we now report a best-fit model frequency analysis. For Exp 1, we fit each model at the individual level and counted the number of participants that are best predicted by each model. Among the four single source models (Figure 3a), the vast majority of participants are best explained by the transformation bias model (48/56). When incorporating mixture models, the combined transformation + target bias model emerged as the best fit for almost all participants across experiments (50/56). The same pattern holds for Exp 3b, although the frequency analysis is more distributed, likely due to the added noise that comes with online studies.
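
      The counting step of such a frequency analysis can be sketched as follows. This is a minimal illustration that assumes each participant already has one goodness-of-fit score per model with lower values indicating a better fit (for example, an information criterion or a cross-validated error); the criterion, the participant labels, and the numbers below are hypothetical.

```python
from collections import Counter


def best_fit_counts(scores):
    """Count how many participants are best fit by each model.

    `scores` maps participant id -> {model name: fit score}, where a lower
    score is assumed to mean a better fit.
    """
    winners = (min(per_model, key=per_model.get) for per_model in scores.values())
    return Counter(winners)


# Hypothetical fit scores for three participants and three single-source models.
example_scores = {
    "p01": {"target": 12.4, "proprioceptive": 15.0, "transformation": 9.8},
    "p02": {"target": 11.1, "proprioceptive": 10.2, "transformation": 8.7},
    "p03": {"target": 7.5, "proprioceptive": 13.9, "transformation": 9.1},
}
counts = best_fit_counts(example_scores)
```

      Dividing each count by the number of participants gives the model frequencies that can then be compared across experiments.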

      We report this new analysis in the Results (see Fig 3, Fig S2). Note that we opted to show some representative individual fits, selecting individuals whose data were best predicted by different models (Fig S2). Given that the number of peaks characterizes each model (independent of the specific parameter values), the two-peaked function exhibited for most participants indicates that the Transformation bias model holds at the individual level and not just at the group level.
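
      Because the argument rests on the number of peaks in a bias function, a short sketch of how peaks could be counted for a function sampled at evenly spaced target directions may be useful. This is an assumption-laden illustration: it treats the samples as circular, applies no smoothing, and counts a flat-topped peak once. Noisy individual data would need smoothing or a fitted curve before counting.

```python
import math


def count_peaks(values):
    """Count local maxima of a function sampled around the full circle.

    A sample is a peak if it is strictly greater than its predecessor and at
    least as large as its successor, so a two-sample plateau counts once.
    Indexing wraps around because the last direction neighbours the first.
    """
    n = len(values)
    return sum(
        1
        for i in range(n)
        if values[i] > values[i - 1] and values[i] >= values[(i + 1) % n]
    )


# Idealized bias functions sampled every 10 degrees (36 hypothetical target directions).
directions = [math.radians(10 * k) for k in range(36)]
two_peaked = [math.sin(2 * d) for d in directions]
four_peaked = [math.sin(4 * d) for d in directions]
n_two, n_four = count_peaks(two_peaked), count_peaks(four_peaked)
```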

      (2) The different datasets across different experimental settings/target sets consistently show that people show fewer deviations when making cardinal-directed movements compared to movements made along the diagonal when the start position is visible. This reminds me of a phenomenon referred to as the oblique effect: people show greater accuracy for vertical and horizontal stimuli compared to diagonal ones. While the oblique effect has been shown in visual and haptic perceptual tasks (both in the horizontal and vertical planes), there is some evidence that it applies to movement direction. These systematic reach deviations in the current study thus may reflect this epiphenomenon that applies across modalities. That is, estimating the direction of a visual target from a visual start position may be less accurate, and may be more biased toward the horizontal axis, than for targets that are strictly above, below, left, or right of the visual start position. Other movement biases may stem from poorer estimation of diagonal directions and thus reflect more of a perceptual error than a motor one. This would explain why the bias function appears in both the in-lab and on-line studies although the visual targets are at very different locations (different planes, different distances) since the oblique effects arise independent of plane, distance, or size of the stimuli. When the start position is not visible like in the Vindras study, it is possible that this oblique effect is less pronounced; masked by other sources of error that dominate when looking at 2D reach endpoints made from two separate start positions, rather than only directional errors from a single start position. Or perhaps the participants in the Vindras study are too variable and too few (only 10) to detect this rather small direction-dependent bias.

      The potential link between the oblique effect and the observed motor bias is an intriguing idea, one that we had not considered. However, after giving this some thought, we see several arguments against the idea that the oblique effect accounts for the pattern of motor biases.

      First, by the oblique effect, perceptual variability is greater along the diagonal axes compared to the cardinal axes. These differences in perceptual variability have been used to explain biases in visual perception through a Bayesian model under the assumption that the visual system has an expectation that stimuli are more likely to be oriented along the cardinal axes (Wei & Stocker, 2015). Importantly, the model predicts low biases at targets with peak perceptual variability. As such, even though those studies observed that participants showed large variability for stimuli at diagonal orientations, the bias for these stimuli was close to zero. Given we observed a large bias for targets at locations along the diagonal axes, we do not think this visual effect can explain the motor bias function.

      Second, the reviewer suggested that the observed motor bias might be largely explained by visual biases (or what we now refer to as target biases). If this hypothesis is correct, we would anticipate observing a similar bias pattern in tasks that use a similar layout for visual stimuli but do not involve movement. However, this prediction is not supported. For example, Kosovicheva & Whitney (2017) used a position reproduction/judgment task with keypress responses (no reaching). The stimuli were presented in a similar workspace as in our task. Their results showed a four-peaked bias function while our results showed a two-peaked function.

      In summary, we don’t think oblique biases make a significant contribution to our results.

      A bias in estimating visual direction or visual movement vector is a more realistic and relevant source of error than the proposed visual bias model. The Visual Bias model is based on data from a study by Huttenlocher et al where participants “point” to indicate the remembered location of a small target presented on a large circle. The resulting patterns of errors could therefore be due to localizing a remembered visual target, or due to relative or allocentric cues from the clear contour of the display within which the target was presented, or even movements used to indicate the target. This may explain the observed 4-peak bias function or zig-zag pattern of “averaged” errors, although this pattern may not even exist at the individual level, especially given the small sample size. The visual bias source argument does not seem well-supported, as the data used to derive this pattern likely reflect a combination of other sources of errors or factors that may not be applicable to the current study, where the target is continuously visible and relatively large. Also, any visual bias should be explained by coordinates centred on the eye and should vary as a function of the location of visual targets relative to the eyes. Where the visual targets are located relative to the eyes (or at least the head) is not reported.

      Thank you for this question. A few key points to note:

      The visual bias model has also been discussed in studies using a similar setup to our study. Kosovicheva & Whitney (2017) observed a four-peaked function in experiments in which participants report a remembered target position on a circle by either making saccades or using key presses to adjust the position of a dot. However, we agree that this bias may be attenuated in our experiment given that the target is continuously visible. Indeed, the model fitting results suggest the peak of this bias is smaller in our task (~3°) compared to previous work (~10°, Kosovicheva & Whitney, 2017; Yousif, Forrence, & McDougle, 2023).

      We also agree with the reviewer that this “visual bias” is not an eye-centric bias, nor is it restricted to the visual system. A similar bias pattern is observed even if the target is presented proprioceptively (Yousif, Forrence, & McDougle, 2023). As such, this bias may reflect a domain-general distortion in the representation of position within polar space. Accordingly, in the revision, we now refer to this in a more general way, using the term “target bias”, rather than visual bias. We justify this nomenclature when introducing the model in the Results section (Lines 164-169). Please also see Reviewer 1 comment 2 for details.

      Motivated by Reviewer 2, we also examined multiple alternative visual bias models (please refer to our response to Reviewer 2, Point 3).

      The Proprioceptive Bias Model is supposed to reflect errors in the perceived start position. However, in the current study, there is only a single, visible start position, which is not the best design for trying to study the contribution. In fact, my paradigms also use a single, visual start position to minimize the contribution of proprioceptive biases, or at least remove one source of systematic biases. The Vindras study aimed to quantify the effect of start position by using two sets of radial targets from two different, unseen start positions on either side of the body midline. When fitting the 2D reach errors at both the group and individual levels (which showed substantial variability across individuals), the start position predicted most of the 2D errors at the individual level – and substantially more than the target direction. While the authors re-plotted the data to only illustrate angular deviations, they only showed averaged data without confidence intervals across participants. Given the huge variability across their 10 individuals and between the two target sets, it would be more appropriate to plot the performance separately for two target sets and show confidence intervals (or individual data). Likewise, even the VT model predictions should differ across the two target sets since the visual-proprioceptive matching errors from the Wang et al study that the model is based on are larger for targets on the left side of the body.

      To be clear, in the Transformation bias model, the vector bias at the start position is also an important source of error. The critical difference between the proprioceptive and transformation models is how bias influences motor planning. In the Proprioceptive bias model, movement is planned in visual space. The system perceives the starting hand position in proprioceptive space and transforms this into visual space (Vindras & Viviani, 1998; Vindras et al., 2005). As such, the bias is only relevant in terms of the perceived start position; it does not influence the perceived target location. In contrast, the transformation bias model proposes that while both the starting and target positions are perceived in visual space, movements are planned in proprioceptive space. Consequently, when the start and target positions are visible, both positions must be transformed from visual space to proprioceptive coordinates before movement planning. Thus, bias will influence both the start and target positions. We also note that to set the transformation bias for the start/target position, we referred to studies in which bias is usually referred to as proprioception error measurement. As such, changing the start position has a similar impact on the Transformation and the Proprioceptive Bias models in principle, and would not provide a stronger test to separate them.

      We now highlight the differences between the models in the Results section, making clear that the bias at the start position influences both the Proprioceptive bias and Transformation bias models (Lines 192-200).

      “Note that the Proprioceptive Bias model and the Transformation Bias model tap into the same visuo-proprioceptive error map. The key difference between the two models arises in how this error influences motor planning. For the Proprioceptive Bias model, planning is assumed to occur in visual space. As such, the perceived position of the hand (based on proprioception) is transformed into visual space. This will introduce a bias in the representation of the start position. In contrast, the Transformation Bias model assumes that the visually-based representations of the start and target positions need to be transformed into proprioceptive space for motor planning. As such, both positions are biased in the transformation process. In addition to differing in terms of their representation of the target, the error introduced at the start position is in opposite directions due to the direction of the transformation (see fig 1g-h).”
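
      The contrast between the two models can be sketched in a few lines. This is one possible reading of the verbal description, not the fitted model: the error map (a leftward and downward shift that grows with distance from a hypothetical right-shoulder location), its gain, the coordinates, and the sign conventions are all assumptions chosen for illustration.

```python
import math


def signed_angle(u, v):
    """Signed angle in degrees from vector u to vector v (counterclockwise positive)."""
    return math.degrees(math.atan2(u[0] * v[1] - u[1] * v[0], u[0] * v[0] + u[1] * v[1]))


def make_error_field(gain=0.05, shoulder=(20.0, -30.0), offset=(0.0, 0.0)):
    """Hypothetical visual-to-proprioceptive error map: a leftward and downward
    shift scaled by distance from the shoulder, plus an optional uniform offset."""
    def field(p):
        d = math.hypot(p[0] - shoulder[0], p[1] - shoulder[1])
        return (-gain * d + offset[0], -gain * d + offset[1])
    return field


def transformation_bias(start, target, field):
    """Start and target are both mapped into proprioceptive space before planning."""
    s, t = field(start), field(target)
    ideal = (target[0] - start[0], target[1] - start[1])
    planned = (ideal[0] + t[0] - s[0], ideal[1] + t[1] - s[1])
    return signed_angle(ideal, planned)


def proprioceptive_bias(start, target, field):
    """Only the felt start position is mapped, in the reverse direction, into visual space."""
    s = field(start)
    ideal = (target[0] - start[0], target[1] - start[1])
    planned = (ideal[0] + s[0], ideal[1] + s[1])
    return signed_angle(ideal, planned)


# Predicted angular bias for 8 hypothetical targets, 10 units from a central start.
field = make_error_field()
targets = [(10 * math.cos(math.radians(a)), 10 * math.sin(math.radians(a))) for a in range(0, 360, 45)]
tr_pred = [transformation_bias((0.0, 0.0), t, field) for t in targets]
pr_pred = [proprioceptive_bias((0.0, 0.0), t, field) for t in targets]
```

      Under these assumptions the error at the start position enters the two planned vectors with opposite signs, and in the transformation sketch only the difference between the errors at the start and target positions matters, so a uniform offset added to the error map leaves its prediction unchanged.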

      In terms of fitting individual data, we have conducted a new experiment, reported as Exp 4 in the revised manuscript (details in our response to Reviewer 1, comment 3). The experiment has a larger sample size (n=30) and importantly, examined error for both movement angle and movement distance. We chose to examine the individual differences in 2-D biases using this sample rather than Vindras’ data as our experiment has greater spatial resolution and more participants. At both the group and individual level, the Transformation bias model is the best single source model, and the Transformation + Target Bias model is the best combined model. These results strongly support the idea that the transformation bias is the main source of the motor bias.

      As for the different initial positions in Vindras et al (2005), the two target sets have very similar patterns of motor biases. As such, we opted to average them to decrease noise. Notably, the transformation model also predicts that altering the start location should have limited impact on motor bias patterns: What matters for the model is the relative difference between the transformation biases at the start and target positions rather than the absolute bias.

      Author response image 3.

      I am also having trouble fully understanding the V-T model and its associated equations, and whether visual-proprioception matching data is a suitable proxy for estimating the visuomotor transformation. I would be interested to first see the individual distributions of errors and a response to my concerns about the Proprioceptive Bias and Visual Bias models.

      We apologize for the lack of clarity on this model. To generate the T+V (Now Transformation + Target bias, or TR+TG) model, we assume the system misperceives the target position (Target bias, see Fig S5a) and then transforms the start and misperceived target positions into proprioceptive space (Fig S5b). The system then generates a motor plan in proprioceptive space; this plan will result in the observed motor bias (Fig. S5c). We now include this figure as Fig S5 and hope that it makes the model features salient.

      Regarding whether the visuo-proprioceptive matching task is a valid proxy for transformation bias, we refer the reviewer to the comments made by Public Reviewer 1, comment 1. We define the transformation bias as the discrepancy between corresponding positions in visual and proprioceptive space. This can be measured using matching tasks in which participants either aligned their unseen hand to a visual target (Wang et al., 2021) or aligned a visual target to their unseen hand (Wilson et al., 2010).

      Nonetheless, when fitting the model to the motor bias data, we did not directly impose the visual-proprioceptive matching data. Instead, we used the shape of the transformation biases as a constraint, while allowing the exact magnitude and direction to be free parameters (e.g., a leftward and downward bias scaled by distance from the right shoulder). Reassuringly, the fitted transformation biases closely matched the magnitudes reported in prior studies (Fig. 2h, 1e), providing strong quantitative support for the hypothesized causal link between transformation and motor biases.

      Recommendations for the authors:

      Overall, the reviewers agreed this is an interesting study with an original and strong approach. Nonetheless, there were three main weaknesses identified. First, is the focus on bias in reach direction and not reach extent. Second, the models were fit to average data and not individual data. Lastly, and most importantly, the model development and assumptions are not well substantiated. Addressing these points would help improve the eLife assessment.

      Reviewer #1 (Recommendations for the authors):

      It is mentioned that the main difference between Experiments 1 and 3 is that in Experiment 3, the workspace was smaller and closer to the shoulder. Was the location of the laptop relative to the participant in Experiment 3 known by the authors? If so, variations in this location across participants can be used to test whether the Transformation bias was indeed larger for participants who had the laptop further from the shoulder.

      Another difference between Experiments 1 and 3 is that in Experiment 1, the display was oriented horizontally, whereas it was vertical in Experiment 3. To what extent can that have led to the different results in these experiments?

      This is an interesting point that we had not considered. Unfortunately, for the online work we do not record the participants’ posture.

      Regarding the influence of display orientation (horizontal vs. vertical), Author response image 4 presents three relevant data points: (1) Vandevoorde and Orban de Xivry (2019), who measured motor biases in-person across nine target positions using a tablet and vertical screen; (2) Our Experiment 1b, conducted online with a vertical setup; (3) Our in-person Experiment 3b, using a horizontal monitor. For consistency, we focus on the baseline conditions with feedback, the only condition reported in Vandevoorde. Motor biases from the two in-person studies were similar despite differing monitor orientations: Both exhibited two-peaked functions with comparable peak locations. We note that the bias attenuation in Vandevoorde may be due to their inclusion of reward-based error signals in addition to cursor feedback. In contrast, compared to the in-person studies, the online study showed reduced bias magnitude with what appears to be a four peaked function. While more data are needed, these results suggest that the difference in the workspace (more restricted in our online study) may be more relevant than monitor orientation.

      Author response image 4.

      For the joint-based proprioceptive model, the equations used are for an arm moving in a horizontal plane at shoulder height, but the figures suggest the upper arm was more vertical than horizontal. How does that affect the predictions for this model?

      Please also see our response to your public comment 1. When the upper limb (or the lower limb) is not horizontal, this affects the projection of the upper limb onto the 2-D space. In the joint-based proprioceptive model, this effectively changes the ratio between L1 and L2 (see Author response image 5b below). However, adding a parameter to vary the L1/L2 ratio would not change the set of motor bias functions that the model can produce. Importantly, it would still generate a one-peak function. We simulated 50 motor bias functions across the possible parameter space. As shown in Author response image 5c-d, the peak and the magnitude of the motor bias functions are very similar with and without the L1/L2 term. We characterize each bias function by its peak position and its peak-to-valley distance. Based on those two factors, the distributions of the motor bias functions are very similar (Author response image 5e-f). Moreover, the L1/L2 ratio parameter is not recoverable by model fitting (Author response image 5c), suggesting that it is redundant with other parameters. As such, we only include the basic version of the joint-based proprioceptive model in our model comparisons.

      Author response image 5.

      It was unclear how the models were fit and how the BIC was computed. It is mentioned that the models were fit to average data across participants, but the BIC values were based on all trials for all participants, which does not seem consistent. And the models are deterministic, so how can a log-likelihood be determined? Since there were inter-individual differences, fitting to average data is not desirable. Take for instance the hypothetical case that some participants have a single peak at 90 deg, and others have a single peak at 270 deg. Averaging their data will then lead to a pattern with two peaks, which would be consistent with an entirely different model.

      We thank the reviewer for raising these issues.

      Given the reviewers’ comments, we now report fits at both the group and individual level (see response to reviewer 3 public comment 1). The group-level fitting is for illustration purposes. Model comparison is now based on the individual-level analyses, which show that the results are best explained by the transformation model when comparing single-source models and by the T+V (now TR+TG) model when considering all models. These new results strongly support the transformation model.

      Log-likelihoods were computed assuming normally distributed motor noise around the motor biases predicted by each model.

      We updated the Methods section as follows (lines 841-853):

      “We used the fminsearchbnd function in MATLAB to minimize the sum of loglikelihood (LL) across all trials for each participant. LL were computed assuming normally distributed noise around each participant’s motor biases:

      [11] LL = normpdf(x, b, c)

      where x is the empirical reaching angle, b is the predicted motor bias by the model, c is motor noise, calculated as the standard deviation of (x − b). For model comparison, we calculated the BIC as follows:

      [12] BIC = -2LL+k∗ln(n)

      where k is the number of parameters of the models. Smaller BIC values correspond to better fits. We report the sum of ΔBIC by subtracting the BIC value of the TR+TG model from all other models.

      For illustrative purposes, we fit each model at the group level, pooling data across all participants to predict the group-averaged bias function.”
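      The likelihood and BIC computation described above can be sketched as follows. This is a minimal illustration in Python, not the authors' MATLAB code: the function names are hypothetical, and it assumes that LL denotes the summed log of the normal density over trials and that the noise SD is the sample standard deviation of the residuals (x − b).

```python
import math

def gaussian_log_likelihood(observed, predicted):
    """Summed Gaussian log-likelihood of observed reach angles.

    `predicted` holds the model's predicted bias for each trial. The noise SD
    is estimated from the residuals (sample SD, n - 1 denominator), which is
    an assumption about how the motor noise term is computed.
    """
    residuals = [x - b for x, b in zip(observed, predicted)]
    n = len(residuals)
    mean_res = sum(residuals) / n
    sd = math.sqrt(sum((r - mean_res) ** 2 for r in residuals) / (n - 1))
    log_norm = -0.5 * math.log(2.0 * math.pi * sd ** 2)
    # Each trial contributes the log of a zero-centred normal density at its residual.
    return sum(log_norm - r ** 2 / (2.0 * sd ** 2) for r in residuals)

def bic(log_likelihood, n_params, n_obs):
    """BIC = -2 * LL + k * ln(n); smaller values indicate a better fit."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)
```

      Under these assumptions, a per-participant ΔBIC would be obtained by computing `bic` for each model on that participant's trials and subtracting the value for the reference model.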

      What was the delay of the visual feedback in Experiment 1?

      The visual delay in our setup was ~30 ms, with the procedure used to estimate this described in detail in Wang et al (2024, Curr. Bio.). We note that in calculating motor biases, we primarily relied on the data from the no-feedback block.

      Minor corrections

      In several places it is mentioned that movements were performed with proximal and distal effectors, but it's unclear where that refers to because all movements were performed with a hand (distal effector).

      By 'proximal and distal effectors,' we were referring to the fact that in the online setup, “reaching movements” are primarily made by finger and/or wrist movements across a trackpad, whereas in the in-person setup, the participants had to use their whole arm to reach about the workspace. To avoid confusion, we now refer to these simply as 'finger' versus 'hand' movements.

      In many figures, Bias is misspelled as Bais.

      Fixed.

      In Figure 3, what is meant by deltaBIC (*1000) etc? Literally, it would mean that the bars show 1,000 times the deltaBIC value, suggesting tiny deltaBIC values, but that's probably not what's meant.

      '×1000' in the original figure indicates the unit scaling, with ΔBIC values ranging from approximately 1000 to 4000. However, given that we now fit the models at the individual level, we have replaced this figure with a new one (Figure 3e) showing the distribution of individual BIC values.

      Reviewer #2 (Recommendations for the authors):

      I have concerns that the authors only examine slicing movements through the target and not movements that stop in the target. Biases create two major errors - errors in direction and errors in magnitude and here the authors have only looked at one of these. Previous work has shown that both can be used to understand the planning processes underlying movement. I assume that all models should also make predictions about the magnitude biases which would also help support or rule out specific models.

      Please see our response to Reviewer 1 public review 3.

      As discussed above, three-dimensional reaching movements also have biases and are not studied in the current manuscript. In such studies, biomechanical factors may play a much larger role.

      Please see our response to your public review.

      It may be that I am unclear on what exactly is done, as the methods and model fitting barely explain the details, but on my reading of the methods I have several major concerns.

      First, it feels that the visual bias model is not as well mapped across space if it only results from one study which is then extrapolated across the workspace. In contrast, the transformation model is actually measured throughout the space to develop the model. I have some concerns about whether this is a fair comparison. There are potentially many other visual bias models that might fit the current experimental results better than the chosen visual bias model.

      Please refer to our response to your public review.

      It is completely unclear to me why a joint-based proprioceptive model would predict curved planned movements and not straight movements (Figure S1). Changes in the shoulder and elbow joint angles could still be controlled to produce a straight movement. On the other hand, as mentioned above, the actual movement is likely much more complex if the physical starting position is offset from the perceived hand.

      Natural movements are often curved, reflecting a drive to minimize energy expenditure or biomechanical constraints (e.g., joint and muscle configuration). This is especially the case when the task emphasizes endpoint precision (Codol et al., 2024), as ours does. Trajectory curvature was also observed in a recent simulation study in which a neural network was trained to control a biomechanical model (2-limb, 6 muscles) with the cost function specified to minimize trajectory error (reach to a target with as straight a movement as possible). Even under these constraints, the movements showed some curvature. To examine whether the endpoint reaching bias somehow reflects the curvature (or bias during reaching), we included the prediction of this new biomechanical model in the paper to show that it does not explain the motor bias we observed.

      To be clear, while we implemented several models (the joint-based proprioceptive model and the new biomechanical model) to examine whether motor biases can be explained by movement curvature, our goal in this paper was to identify the source of the endpoint bias. Our modeling results reveal that a previously underappreciated source of motor bias, a transformation error that arises between visual and proprioceptive space, plays a dominant role in shaping motor bias patterns across a wide range of experiments, including naturalistic reaching contexts where vision and hand are aligned at the start position. Movement curvature might be influenced by selectively manipulating factors that introduce a mismatch between the visual starting position and the actual hand position (as in Sober and Sabes, 2003); we see this question as an avenue for future work.

      The model fitting section is barely described. It is unclear how the data is fit or almost any other aspects of the process. How do the authors ensure that they have found the minimum? How many times was the process repeated for each model fit? How were starting parameters randomized? The main output of the model fitting is BIC comparisons across all subjects. However, there are many other ways to compare the models which should be considered in parallel. For example, how well do the models fit individual subjects using BIC comparisons? Or how often are specific models chosen for individual participants? While across all subjects one model may fit best, it might be that individual subjects show much more variability in which model fits their data. Many details are missing from the methods section. Further support beyond the mean BIC should be provided.

      We fit each model 150 times and for each iteration, the initial value of each parameter was randomly selected from a uniform distribution. The range for each parameter was hand tuned for each model, with an eye on making sure the values covered a reasonable range. Please see our response to your first minor comment below for the range of all parameters and how we decide the iteration number for each model.
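      The restart procedure described above can be sketched as follows. This is only an illustration in Python: the authors used MATLAB's fminsearchbnd, whereas the local optimizer below is a simple bounded pattern search swapped in so that the example is self-contained. All function names, the step schedule, and the seed are assumptions.

```python
import random

def local_search(objective, start, bounds, step_frac=0.25, tol=1e-6):
    """Bounded coordinate pattern search (a stand-in, not a port of fminsearchbnd)."""
    point = list(start)
    best = objective(point)
    steps = [step_frac * (hi - lo) for lo, hi in bounds]
    while max(steps) > tol:
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            for direction in (1.0, -1.0):
                trial = list(point)
                # Clamp each trial value so the search never leaves the bounds.
                trial[i] = min(hi, max(lo, point[i] + direction * steps[i]))
                value = objective(trial)
                if value < best:
                    point, best, improved = trial, value, True
        if not improved:
            steps = [s * 0.5 for s in steps]  # refine once no move helps
    return point, best

def multi_start_fit(objective, bounds, n_starts=150, seed=0):
    """Run the local search from uniformly sampled starts and keep the best result."""
    rng = random.Random(seed)
    best_point, best_value = None, float("inf")
    for _ in range(n_starts):
        start = [rng.uniform(lo, hi) for lo, hi in bounds]
        point, value = local_search(objective, start, bounds)
        if value < best_value:
            best_point, best_value = point, value
    return best_point, best_value
```

      In a fitting context, `objective` would be the negative log-likelihood of one participant's data as a function of the model parameters, and `bounds` the hand-tuned range for each parameter.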

      Given the reviewers’ comments on individual differences, we now fit the models at the individual level and report a frequency analysis describing the best-fitting model for each participant. In brief, the data for the vast majority of participants were best explained by the transformation model when comparing single-source models and by the T+V (TR+TG) model when considering all models. Please see our response to reviewer 3 public comment 1 for the updated results.

      We updated the Methods section, and it now reads as follows (lines 841-853):

      “We used the fminsearchbnd function in MATLAB to minimize the sum of loglikelihood (LL) across all trials for each participant. LL were computed assuming normally distributed noise around each participant’s motor biases:

      [11] LL = normpdf(x, b, c)

      where x is the empirical reaching angle, b is the predicted motor bias by the model, c is motor noise, calculated as the standard deviation of x-b.

      For model comparison, we calculated the BIC as follows:

      [12] BIC = -2LL+k∗ln(n)

      where k is the number of parameters of the models. Smaller BIC values correspond to better fits. We report the sum of ΔBIC by subtracting the BIC value of the TR+TG model from all other models.”

      Line 305-307. The authors state that biomechanical issues would not predict qualitative changes in the motor bias function in response to visual manipulation of the start position. However, I question this statement. If the start position is offset visually then any integration of the proprioceptive and visual information to determine the start position would contain a difference from the real hand position. A calculation of the required joint torques from such a position sent through the mechanics of the limb would produce biases. These would occur purely because of the combination of the visual bias and the inherent biomechanical dynamics of the limb.

      We thank the reviewer for this comment. We have removed the statement regarding inferences about the biomechanical model based on visual manipulations of the start position. Additionally, we have incorporated a recently proposed biomechanical model into our model comparisons to expand our exploration of sources of bias. Please refer to our response to your public review for details.

      Measurements are made while the participants hold a stylus in their hand. How can the authors be certain that the biases are due to the movement and not due to small changes in the hand posture holding the stylus during movements in the workspace. It would be better if the stylus was fixed in the hand without being held.

      Below, we have included an image of the device used in Exp 1 for reference. The digital pen was fixed in a vertical orientation. At the start of the experiment, the experimenter ensured that the participant had the proper grip alignment and held the pen at the red-marked region. With these constraints, we see minimal change in posture during the task.

      Author response image 6.

      Minor Comments

      Best fit model parameters are not presented. Estimates of the accuracy of these measures would also be useful.

      In the original submission, we included a Table S1 that presented the best-fit parameters for the TR+TG (Previously T+V) model. Table S1 now shows the parameters for the other models (Exp 1b and 3b, only). We note the parameter values from these non-optimal models are hard to interpret given that core predictions are inconsistent with the data (e.g., number of peaks).

      We assume that by "accuracy of these measures," the reviewers are referring to the reliability of the model fits. To assess this, we conducted a parameter recovery analysis in which we simulated a range of model parameters for each model and then attempted to recover them through fitting. Each model was simulated 50 times, with the parameters randomly sampled from distributions used to define the initial fitting parameters. Here, we only present the results for the combined models (TR+TG, PropV+V, and PropJ+V), as the nested models would be even easier to fit.

      As shown in Fig. S4, all parameters were recovered with high accuracy, indicating strong reliability in parameter estimation. Additionally, we examined the log-likelihood as a function of fitting iterations (Fig. S4d). Based on this curve, we determined that 150 iterations were sufficient given that the log-likelihood values were asymptotic at this point. Moreover, in most cases, the model fitting can recover the simulated model, with minimal confusion across the three models (Fig. S4e).

      What are the (*1000) and (*100) in the Change in BIC y-labels? I assume they indicate that the values should be multiplied by these numbers. If these indicate that the BIC is in the hundreds or thousands it would be better the label the axes clearly, as the interpretation is very different (e.g. a BIC difference of 3 is not significant).

      '×1000' in the original figure indicates the unit scaling, with ΔBIC values ranging from approximately 1000 to 4000. However, given that we now fit the models at the individual level, we have replaced this figure with a new one showing the distribution of individual BIC values.

      Lines 249, 312, and 315, and maybe elsewhere - the degree symbol does not display properly.

      Corrected.

      Line 326. The authors mention that participants are unaware of their change in hand angle in response to clamped feedback. However, there may be a difference between sensing for perception and sensing for action. If the participants are unaware in terms of reporting but aware in terms of acting would this cause problems with the interpretation?

      This is an interesting distinction, one that has been widely discussed in the literature. However, it is not clear how to address this in the present context. We have looked at awareness in different ways in prior work with clamped feedback. In general, even when the hand direction might have deviated by >20°, participants report their perceived hand position after the movement as near the target (Tsay et al, 2020). We have also used post-experiment questionnaires to probe whether they thought their movement direction had changed over the course of the experiment (volitionally or otherwise). Again, participants generally insist they moved straight to the target throughout the experiment. So it seems that they are unaware of any change in action or perception.

      Reaction time data provide additional support that participants are unaware of any change in behavior. The RT function remains flat after the introduction of the clamp, unlike the increases typically observed when participants engage in explicit strategy use (Tsay et al, 2024).

      Figure 1h: The caption suggests this is from the Wang 2021 paper. However, in the text 180-182 it suggests this might be the map from the current results. Can the authors clarify?

      Fig 1e is the data from Wang et al, 2021. We formalized an abstract map based on the spatial constraints observed in Fig 1e, and simulated the error at the start and target position based on this abstraction (Fig 1h). We have revised the text to now read (Lines 182-190):

      “Motor biases may thus arise from a transformation error between these coordinate systems. Studies in which participants match a visual stimulus to their unseen hand or vice-versa provide one way to estimate this error (Jones et al., 2009; Rincon-Gonzalez et al., 2011; van Beers et al., 1998; Wang et al., 12/2020). Two key features stand out in these data: First, the direction of the visuo-proprioceptive mismatch is similar across the workspace: For right-handers using their dominant limb, the hand is positioned leftward and downward from each target. Second, the magnitude increases with distance from the body (Fig 1d). Using these two empirical constraints, we simulated a visual-proprioceptive error map (Fig. 1h) by applying a leftward and downward error vector whose magnitude scaled with the distance from each location to a reference point.”
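      To make the description of the error map concrete, the sketch below builds a map of this kind and derives an angular bias from it. It is an illustration under stated assumptions, not the authors' equations: the function names, the fixed 225° (leftward and downward) direction, the single gain parameter, and the sign convention for the bias are all hypothetical.

```python
import math

def transformation_error(position, reference, gain, direction_deg=225.0):
    """Error vector at `position`: fixed direction, magnitude = gain * distance to `reference`."""
    distance = math.hypot(position[0] - reference[0], position[1] - reference[1])
    angle = math.radians(direction_deg)  # 225 deg points leftward and downward
    return (gain * distance * math.cos(angle), gain * distance * math.sin(angle))

def predicted_motor_bias(start, target, reference, gain):
    """Angle (deg) between the movement vector after transformation and the visual one."""
    e_start = transformation_error(start, reference, gain)
    e_target = transformation_error(target, reference, gain)
    visual = (target[0] - start[0], target[1] - start[1])
    # Only the difference between the two error vectors changes the planned direction.
    planned = (visual[0] + e_target[0] - e_start[0],
               visual[1] + e_target[1] - e_start[1])
    diff = math.degrees(math.atan2(planned[1], planned[0])
                        - math.atan2(visual[1], visual[0]))
    return (diff + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
```

      In this sketch the bias vanishes whenever the start and target errors are identical, which mirrors the point made earlier in the response that the relative difference between the biases at the two positions, rather than their absolute size, is what matters.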

      Reviewer #3 (Recommendations for the authors):

      The central idea behind the research seems quite promising, and I applaud the efforts put forth. However, I'm not fully convinced that the current model formulations are plausible explanations. While the dataset is impressively large, it does not appear to be optimally designed to address the complex questions the authors aim to tackle. Moreover, the datasets used to formulate the 3 different model predictions are SMALL and exhibit substantial variability across individuals, and based on average (and thus "smoothed") data.

      We hope to have addressed these concerns with the two major changes to the revised manuscript: 1) the new experiment in which we examine biases in both angle and extent and 2) the inclusion in the analyses of fits based on individual data sets.

    1. Frequently asked questions
      1. FAQ SECTION (Reducing concerns)

      It must answer the 5-10 most common questions. For eyewear:

      Best practice: FAQ section at the bottom (scroll depth = 70%+), but it must be

    2. Add to cart
      1. PRIMARY CTA - CALL-TO-ACTION (Conversion)

      Button design

      • Text: “Kupię teraz” ("I'll buy now"; not “Submit”, not “Kup” ("Buy"))
      • Color: High contrast against the background (e.g. green on white, dark on light)
      • Size: Large - minimum 50px height
      • Position: Sticky on mobile (always visible)
      • Feedback: As soon as I click, the button shows “Adding…” and then changes to “In cart ✓”

      Second CTA (optional but recommended)

      “+ ADD TO WISHLIST” (heart icon) - does not divert the flow, but builds a retargeting list

    3. Lightweight polarized sunglasses in an iconic retro style. They give your eyes 100% UV protection. Ideal for everyday wear - you will not feel fatigue even after 8 hours of working at a computer. Italian materials that last for years.

      Short overview (50-80 words, benefit-focused) ❌ “Frame material: TAC. Lens type: Polarized. Weight: 15g. Fit: Regular.”
 ✅ “Lightweight polarized sunglasses in an iconic retro style. They give your eyes 100% UV protection. Ideal for everyday wear - you will not feel fatigue even after 8 hours of working at a computer. Italian materials that last for years.”

      Why the second one? It answers the question: “What do they do for me?” (emotion + logic)

    4. Home page Carrera CARRERA GLORY – Sunglasses
      1. ABOVE THE FOLD (What is visible without scrolling) - THE MOST CRITICAL SECTION These are the first 3-5 seconds on the page. If the customer does not get oriented here, bounce rate = 50%. Must be visible (without scrolling):

    5. HERO IMAGE & GALLERY

      Facts: • 75% of consumers base their decision on photos • Lifestyle photos increase conversion by 25-50% • Without lifestyle photos: “This is a product”; with lifestyle photos: “This could be me” What the gallery must contain (minimum):

    1. EFFSAFE1 == 1 ~ 6, # strongly disagree = 1

      I have checked the survey you ran, and the responses are already in the correct order: strongly disagree (1), disagree (2), ..., strongly agree (6).

      Your code recodes strongly disagree (1) to 6, so the order is reversed.

      The correct code should only recode -50 and -99 to NA and keep everything else in its original form.
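
      The recommended recode can be sketched as follows. This is a minimal illustration in Python rather than the R of the annotated script; the function and constant names are hypothetical, and `None` stands in for NA.

```python
MISSING_CODES = {-50, -99}  # survey codes to be treated as missing

def recode_missing(responses):
    """Replace missing-value codes with None; leave valid 1-6 responses unchanged."""
    return [None if value in MISSING_CODES else value for value in responses]
```

      For example, `recode_missing([1, 6, -50, 3, -99])` keeps 1, 6 and 3 as they are and turns the two missing codes into `None`, so the original scale direction (1 = strongly disagree) is preserved.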

    1. the counterpart of feminism

      In reality, it is a fierce fight against feminism, which is seen as the ideological enemy. In most cases, masculinism does not exist to defend men while coexisting with feminism.

    1. We thank the editor and reviewers for their thoughtful comments. We believe they will substantially strengthen the manuscript and clarify our arguments.

      We appreciate R1’s positive assessment of the quality and relevance of our work. Regarding the closing remark on inadequate documentation and guidance, we agree that this has been a critical issue. Fortunately, CONICET has recently issued additional guidance (available here in Spanish), which we view as a positive development; we look forward to observing its effects on evaluation practices.

      R2 raises important and thought-provoking points. On IRB approval: in Argentina, document-based research in the social sciences does not require IRB review. Nevertheless, before accessing the materials we discussed the ethical implications of the study with CONICET authorities and signed an agreement specifying how the data would be used and committing to anonymize any excerpts made public. We also limited our analysis to closed promotion cases (final decisions, no further appeals possible) to ensure our research would not affect ongoing processes.

      We thank R2 for directing us to the COREQ guideline. We understand it is intended for reporting focus groups and interviews; we will review it and consider which elements can be adapted to improve the reporting of our document-based study.

      We also agree that a more explicit articulation between the qualitative and quantitative components will help present the results in a more integrated way, and we will work toward that aim. Concerning the structure of the discussion, our intention is not to introduce new concepts but to connect our findings to prior scholarship already cited in previous sections and to situate them within the broader global conversation. Finally, we concur that the study’s limitations should be stated more clearly, given that our analysis is restricted to a single career system and three disciplinary fields.


    1. L'école dont nous rêvons: Synthesis of the consultation of education stakeholders

      Summary

      This synthesis document summarizes the key points of the consultation "L'école dont nous rêvons" ("The school we dream of"), organized by the Institut de France and the Académie des Sciences, with a local event led by the INSPÉ of the Académie de Lille and the Université de Lille.

      The consultation aims to conduct a forward-looking, structural reflection on the future of schooling in France, moving away from a simple list of grievances to focus on the challenges to be met and the levers of transformation.

      The national initiative is built around five major themes: the pupil, the teaching profession, the organization of schools, social and academic diversity, and the inclusion of pupils with special needs.

      The methodology rests on a two-part consultation: institutional hearings and field meetings, intended to showcase existing initiatives and gather concrete proposals.

      The ultimate goal is to propose costed transformation scenarios phased over time, intended to inform public debate without imposing a single solution, with a horizon set at 2050.

      The Académie de Lille, characterized by its wide diversity of territories and a high proportion of pupils in priority education, highlights its commitment to fighting social determinism and to developing tailored local responses.

      It stresses the importance of a supportive school environment, a cohesive educational community, strong territorial partnerships, and a dynamic of innovation and experimentation supported by research.

      The inspiring initiatives presented during the consultation illustrate concrete levers for action:

      Professional collaboration (AEPS) to transform the profession and spread good practices.

      Valuing multilingualism (CASNAV) as an asset for the whole school community.

      The merging of knowledge (ATD Quart Monde) between school, families and neighbourhood, for better mutual understanding and the success of all.

      Territorial anchoring (Cités éducatives) to break the isolation of schools and create collaborative dynamics.

      Interprofessional cooperation (PIA3) between the school and medico-social sectors for successful inclusion.

      Artistic practice (CFMI) as a tool for cohesion, enjoyment of teaching, and interdisciplinary development.

      Together, these perspectives sketch the outline of a school that is more agile, collaborative, inclusive and rooted in its territory, able to adapt to social change and to restore the power to act to all of its stakeholders.

      --------------------------------------------------------------------------------

      1. Context and objectives of the consultation

      The event "L'école dont nous rêvons" is part of a large national consultation initiated by the Institut de France and the Académie des Sciences.

      The Lille stage was co-led by the Maison pour la science of the INSPÉ of the Académie de Lille and the culture department of the Université de Lille.

      The approach is intended to be forward-looking, aiming to consider what is possible for the future construction of the school.

      Its purpose is to move beyond observations about what is dysfunctional and to focus on the major challenges that the school system faces and will face, including:

      • The integration of artificial intelligence.

      • The coming demographic shock.

      • The need to take account of social changes and their impact on pupils, teachers and families.

      The ambition is to build the school of tomorrow in an "offensive and positive" way, rather than reacting defensively to changes that would already have overtaken the institution.

      2. The national project "L'école dont nous rêvons"

      2.1. Philosophy and methodology

      Launched two years ago, the national project does not seek to list what is not working, but to identify the challenges the school must meet.

      The aim is to propose structural changes to the education system that give it more flexibility, agility and capacity to adapt to local conditions and to pupils, while restoring "power to act" to those involved.

      The consultation runs along two parallel strands:

      1. Institutional hearings: Meetings with unions, conferences of rectors, associations of elected officials (mayors of France, rural mayors), networks of parents and of researchers, and partners such as the MDPH.

      2. Field meetings: Visits to two or three sites per region, seeking wide geographical and sociological diversity. These meetings have a twofold purpose:

      • Speaking well of school: Highlighting successes and recognizing the talent and energy of people in the field.

      • Collective reflection: Bringing good ideas to the surface and identifying obstacles, beyond mere lists of grievances.

      2.2. The five areas of reflection

      The consultation is structured around five main themes, focused on the structure of the system rather than on purely pedagogical questions (such as the choice of a reading method).

      Area of reflection: Content and key questions

      1. The pupil: Putting the pupil at the center. Reflection on rhythms (annual, weekly), the effective implementation of learning cycles, the personalization of pathways and education in making choices, so that pupils take an active part in their own guidance.

      2. The teaching profession: Defining the scope of the job beyond teaching hours. Integrating tutoring, teamwork, continuing professional development and career progression to make the profession more attractive.

      3. The organization of the school: Strengthening local anchoring and partnership work (local authorities, families, associations, medico-social sector). Thinking of autonomy in terms of subsidiarity, to adapt better to the local context.

      4. Social and academic mix: Ensuring a mix of pupils, holding a shared ambition for all of them and putting in place effective remediation for the most vulnerable.

      5. The inclusion of pupils with specific needs: Designing adapted pathways and continuity of support for pupils with disabilities, but also for those with learning disorders, attention disorders or mental health problems.

      2.3. Purpose of the project

      The project will lead to the creation of a consortium (potentially a Groupement d'Intérêt Public, a public interest grouping) bringing together prestigious institutions (Collège de France, the ENS of Paris, Lyon and Rennes, Académie des Sciences, CNAM, etc.). This consortium will be tasked with:

      • Identifying priority areas for each theme.

      • Proposing several transformation scenarios.

      • Costing these scenarios in terms of financial and human resources.

      • Defining a long-term transformation trajectory.

      The result will be a document accessible to all, intended to inform public debate so that society can take ownership of the question of school. The final choice will be a matter for democratic decision.

      3. Perspectives of the Académie de Lille

      3.1. A territory of challenges and commitments

      The Académie de Lille is marked by high population density and a wide diversity of territories, from densely populated urban areas to rural areas.

      Key figures:

      • More than 750,000 pupils.

      • More than 3,000 primary schools and 667 secondary schools.

      • Nearly 60,000 teachers.

      • One third of pupils are in priority education (41 REP+, 158 priority neighborhoods).

      Fighting social determinism is a long-standing priority of the académie, which strives to provide local responses suited to territorial issues, such as the Territoires Éducatifs Ruraux (TER) or the "Calais territoire bilingue" project.

      3.2. Conditions for success and dynamic of innovation

      For the académie, school must be a place of emancipation where pupils feel well and confident. Several conditions are considered necessary to achieve this:

      • Cohesion of the educational community: Involvement of all staff (teachers, management, CPE, health and social services, AED, etc.) and of parents.

      • A strong partnership with local actors: Elected officials, the voluntary sector.

      • School and career guidance conducted in liaison with higher education and the business world.

      The académie is characterized by a strong dynamic of innovation, with researchers helping to evaluate and improve projects:

      • 25 experimental projects operating under derogation in secondary education (around 50 schools).

      • More than 300 innovative projects monitored in primary and secondary education.

      • Integrated laboratories (more than 100) in mathematics, French, music, etc., at the crossroads of innovation and training.

      3.3. Training as an essential lever

      Continuing professional development is seen as a pillar for supporting change.

      The École Académique de la Formation Continue (EAFC) has set up nearly 5,000 training courses for 2024-2025, aimed at nearly 40,000 staff of all categories. Two examples illustrate this investment:

      • Artificial intelligence: A plan has trained more than 5,000 people.

      • Psychosocial skills (Compétences Psychosociales, CPS): Creation of a university diploma (Diplôme Universitaire) with the INSPÉ to train 50 trainers by the end of 2026, with the aim of reaching every place and time in a child's life, not just the classroom.

      4. Presentation of inspiring initiatives

      Several local initiatives were presented to illustrate concrete avenues and "open up the field of possibilities".

      4.1. AEPS: The strength of the collective for the teaching profession

      The Association pour l'Enseignement de l'Éducation Physique (AEPS) acts as a national network for disseminating knowledge in physical education (EPS).

      It fosters connection and the transformation of the profession by enabling teachers to train and exchange ideas in their own time.

      The association, recognized up to the level of the general inspectorate, demonstrates the importance of professional collectives in changing practice.

      Key quote: "What is worth teaching is what unites and what liberates." - Olivier Reboul

      4.2. CASNAV: Plurilingualism as an educational lever

      The Centre Académique pour la Scolarisation des élèves allophones Nouvellement Arrivés (CASNAV) highlights a major shift: inclusion in mainstream classes is now seen as the condition for learning, no longer as a goal to be reached once French has been mastered.

      The flagship initiative is the Council of Europe's "Feuille de route" ("Roadmap"), piloted in a lower secondary school (collège) in Lille. It aims to:

      • Take a "snapshot" of all the languages present in a school (languages taught and family languages).

      • Involve everyone (pupils, teachers, management, parents, non-teaching staff).

      • Value linguistic diversity as an asset and a core skill for all.

      4.3. ATD Quart Monde: Merging knowledge for the success of all

      The association proposes an approach based on the "merging of knowledge and practices" (croisement des savoirs et des pratiques) to build links between school, families (particularly those in extreme poverty) and the neighborhood.

      The method rests on the recognition that everyone holds useful knowledge:

      • Academic knowledge (school).

      • Knowledge from action (professionals).

      • Knowledge from life experience (parents).

      A crucial point of the approach is to begin the work in peer groups before bringing everyone together, so that all voices carry more equal weight.

      This makes it possible to clear up misunderstandings, build trust and become aware of the "hidden dimensions" of poverty that hold back learning.

      4.4. Cités Éducatives: Territorial intelligence in action

      The experience of the Cités Éducatives shows how local anchoring can energize schools.

      By making each collège the lead for a theme linked to the strengths of its territory (health, mathematics, arts/languages), the project has made it possible to:

      • Break the isolation of teams and schools.

      • Increase teachers' involvement by shifting their focus beyond their own class so that they act at the level of the network.

      • Create emulation and synergy in which "the strengths of one come to the aid of the weaknesses of the other".

      • Give concrete meaning to the role of subject coordinator.

      4.5. PIA3: Coordinating the school and medico-social sectors for inclusion

      In the face of rapidly changing legislation on inclusive education, the PIA3 project "100% IDT" has developed interprofessional training shared between staff of the Éducation nationale and those of the medico-social sector.

      The aim is to break down barriers between professional cultures and get these actors working together to better support the schooling of pupils with special educational needs.

      The approach, grounded in research and evaluation, meets a strong need in the territory.

      4.6. CFMI: Music as the DNA of co-construction

      The Centre de Formation de Musiciens Intervenants (CFMI) has the co-construction of projects between artists and teachers in its DNA.

      The "Chœur ressource interprofessionnel" (interprofessional resource choir) initiative brought together, for nearly 10 years, teachers of all levels, artists and conservatoire teachers to sing together.

      This project made it possible to:

      • Build links between schools and the cultural resources of the territory.

      • Foster enjoyment of teaching, regarded as an essential condition.

      • Show how artistic practice can feed into every subject.

      The dream carried by the CFMI is of a school "where people sing every day".

    1. Training Future Teachers in a Sensory Approach to Space: Summary and Analysis

      Summary

      This document summarizes the key arguments of a study on the need to train future teachers in a sensory, embodied and interdisciplinary approach to space.

      The central thesis is that schooling, which has historically sought to neutralize and constrain pupils' bodies, must evolve to make the body a fundamental tool for learning about and understanding the immediate environment.

      By building a bridge between architecture and geography, two disciplines that share an interest in lived space but remain poorly integrated in curricula, it is possible to develop a richer and more emancipatory pedagogy.

      An experiment conducted at the INSPÉ of Bordeaux with future teachers served as a case study.

      Using activities such as the "augmented walk" or the "mental map", the study revealed a tendency among participants to favor conceptual and objective representations of space (view from above, absence of bodies), in line with traditional school norms.

      This finding shows how urgent it is to train teachers to go beyond a purely cartographic approach and integrate the lived, sensory and emotional dimension.

      The conclusion recommends integrating training modules based on sensory experience, able to bridge disciplines and give pupils the means to become conscious and engaged actors in their environment.

      --------------------------------------------------------------------------------

      1. The Paradigm of School Space: Body and Pedagogy

      Analysis of school space reveals deep tensions between physical settings, pedagogical models and the place given to the pupil's body.

      1.1. Spatial Determinism in Question

      A common assumption holds that changing the learning space (furniture, location) is enough to transform pedagogy.

      A field experiment contradicts this "spatial determinism". Of three teachers invited to hold class in the playground, two replicated their frontal, controlled model, rearranging the pupils into rows.

      Only the teacher who already practiced differentiated pedagogy (with desks grouped in clusters) in the classroom allowed pupils greater bodily freedom.

      Finding: A change of space does not guarantee a change of pedagogy.

      Pascal Clerc's Triptych: This observation illustrates the persistence of the model in which "a class holds class in a classroom", reproduced even outdoors.

      Need for Support: It is crucial to train and support teachers so that they can make different use of the pedagogical potential of spaces, indoors and out.

      1.2. The Neutralization of the Body at School

      The school institution has historically sought to neutralize pupils' bodies, constraining them through furniture and rules.

      This sidelining of the body, relegated to physical education (Éducation Physique et Sportive) lessons, is at odds with the ambition of a holistic education that should encompass the physical, sensory and intellectual dimensions.

      The Body as a Tool for Measurement and Perception: The personal experience of the researcher, Maylis Leuret, as an architecture student illustrates how the body can become a primary tool for understanding and representing space (measuring in paces, judging proportions by eye).

      Body, Intimacy and Space: Making room for the body in the classroom means addressing the question of intimacy and of one's relationship to oneself and to others.

      Topics such as education in emotional and relational life and sexuality (EVARS) or the design of school toilets are extensions of this issue, pointing to a relationship with the body that is often taboo or neglected.

      From a Space Endured to a Space Inhabited: The central aim of the research is to transform school space from a setting that is endured into a place truly inhabited by pupils, one that carries learning and experience.

      2. A Bridge between Architecture and Geography

      The research proposes creating a "fruitful relationship" between architecture and geography in order to develop a fuller education about space, rooted in lived experience.

      2.1. Shared Concepts and Issues

      Although they belong to distinct academic fields, the two disciplines converge on several points:

      • Shared concepts: The body, space, dwelling (l'habiter), the immediate environment.

      • Common evolution: They increasingly integrate lived experience, sensory experience and appropriation to analyze how individuals find their place in a territory.

      2.2. Weak Curricular Integration

      Despite their potential synergies, the link between geography and architecture is weak in training pathways:

      • Geography occupies a limited place in France's 21 schools of architecture.

      • Architecture is not an explicit object of study in geography curricula at primary, secondary or undergraduate level.

      2.3. Towards an "Experiential Geography"

      A strand of school geography, "experiential geography", promotes an approach aligned with these principles.

      Championed by researchers such as Sophie Gaujal and Caroline Leininger-Frézal, it values a geography rooted in pupils' spatial experience.

      "Experiential geography enables pupils to think about space, to think of themselves in space and to connect their spatial practice with the geography lesson." - Caroline Leininger-Frézal (2020)

      This approach uses active methods such as field trips, role play and sensory activities to make children investigators of and actors in their environment.

      3. The INSPÉ Bordeaux Experiment: Methodology and Findings

      A training session was held on 26 March 2025 at the INSPÉ of Bordeaux with 11 second-year Master's students, in collaboration with the geographer Julie Picard, to test the impact of sensory activities.

      3.1. Aims and Hypotheses

      Aim: To analyze how introducing interdisciplinary, sensory activities transforms future teachers' relationship with their immediate environment.

      Hypothesis: These approaches, combining architecture and geography, diversify representations of space and constitute transferable teaching tools.

      3.2. Teaching Activities Deployed

      The experiment was built around three main activities in the INSPÉ courtyard:

      Commented Walk. Description: Participants record themselves alone (on their phone), describing aloud what they see, observe and feel in the space. Aim: To verbalize perception and identify emblematic or striking places.

      Augmented Walk. Description: Experiencing the space with senses altered or foregrounded (e.g. being guided with eyes closed, walking barefoot). Aim: To engage senses other than sight (touch, hearing) in order to generate new perceptions and emotions.

      Iconographic Walk. Description: Taking photographs that represent a characteristic place, or a place of well-being or unease. Aim: To capture a subjective, visual representation of the relationship to space.

      3.3. Analysis of Results: The Mental Map as a Revealing Tool

      After the walks, participants were asked to draw a mental map of the INSPÉ from memory.

      Analysis of these productions revealed several trends:

      Predominance of the View from Above: The representations mostly resembled cartographic plans, in line with school norms.

      Pressure to Be Objective: The maps were often structured, given legends and divided into zones, in a quest for clarity and objectivity.

      Near-Total Absence of Living Beings: Participants represented neither themselves nor others. Bodies were absent, which put lived experience at a distance.

      Participants explained this by the constant movement of bodies, which is hard to "fix" on a map.

      Obstacles to Representation: Difficulty in "drawing well" and the lack of a model held back expressiveness.

      3.4. Analysis of Sensory Perceptions

      Analysis of the feedback showed that the walking experiences mainly engaged three senses:

      Sight: Omnipresent and essential for orientation.

      Hearing: Became dominant when sight was reduced, often associated with soothing sounds (birdsong).

      Touch: Emerged in specific situations (walking barefoot), prompting varied sensations (curiosity, discomfort, coolness, dampness, disgust).

      Being outdoors was mostly experienced as positive, enhancing well-being (calm, relaxation).

      4. Conclusions and Recommendations for Training

      Taken as a whole, this work underlines the importance of training teachers in an education about space that goes beyond the conceptual to integrate the body and the senses.

      4.1. Towards Concrete Classroom Application

      Using these activities with pupils requires precise framing to be effective. Asked about having pupils draw their "ideal school", Maylis Leuret warns against an overly open brief, which often leads to stereotyped imaginings (swimming pool, dromedaries) with no practical bearing.

      To aim for a real improvement in the living environment, the goal must be specific and build on a shared narrative constructed from everyday experience.

      4.2. The Need for an Interdisciplinary Pedagogy

      A sensory approach to space is a powerful lever for interdisciplinarity, bringing together multiple skills and fields:

      • Geography and Mathematics: Relationship to the environment, structuring of space.

      • French and Visual Arts: Storytelling, sensory representation.

      • Physical Education: Sensory walk, measuring space with the body.

      4.3. Proposals for Teacher Training

      Despite an institutional framework that tends to prioritize French and mathematics, it is essential to integrate this dimension into training:

      • Integrate modules devoted to research methods grounded in sensory experience into INSPÉ training programs.

      • Organize practical sessions: sensory walks, augmented walks, workshops using maps, photos, sound or models.

      • Strengthen institutional coordination and promote links between experiential geography and the studio project practices of schools of architecture.

      Ultimately, making room for the body in teaching about space opens the way to a more emancipatory education, attentive to lived experience and able to form citizens who are more aware of the issues affecting their environment.

    1. School Spaces: Analysis of a Living and Learning Environment

      Summary

      This summary document analyzes the complex nature of school learning spaces, drawing on the research of Guilhem Labinal.

      The analysis shows that the traditional "school form" (forme scolaire), characterized by a closed classroom and a frontal arrangement of pupils ("bus" or "wagon" type), remains predominant despite its acknowledged mismatch with active pedagogies and pupil well-being.

      A central paradox emerges: a majority of teachers, particularly in secondary education, use every day a spatial configuration that they do not consider ideal for their teaching.

      The persistence of this model is explained by multiple systemic constraints: time management (lessons of 45-50 minutes), large class sizes, social norms among colleagues, security requirements and the inertia of a historically entrenched model.

      The analysis stresses that there is no spatial determinism: changing the furniture is not enough to transform pedagogy.

      Lasting transformation requires an ecosystemic approach that integrates the material, pedagogical and relational (lived space) dimensions.

      The study of school spaces must be multiscalar (from the classroom to the whole school) and multidisciplinary, drawing on geography, sociology and environmental psychology.

      Qualitative methods such as "commented walks" prove particularly effective for documenting the singular spatial experience of different actors (pupils, teachers, CPE, non-teaching staff), showing that one person's space is not another's.

      Ultimately, the layout of school spaces is the product of constant trade-offs between varied tensions and needs, requiring comprehensive and concerted reflection.

      --------------------------------------------------------------------------------

      1. The Traditional School Form in Question

      Guilhem Labinal's lecture opens with a fundamental critique of the traditional "school form", a spatial and temporal model that separates school from the rest of society and imposes a rigid normative framework on learning.

      1.1. Characteristics of the Dominant Model

      The conventional classroom model is described as a "square layout that is, after all, closed".

      It is the product of history and of old norms, such as the instructions of Marcellin Berthelot in the 19th century (1.25 m² per pupil, rooms of 40-50 m²). This model entails:

      Fixity and Immobility: Chairs, sometimes "anchored to the floor", constrain bodies and limit movement, raising questions about pupils' motor development and well-being.

      Separation: School is conceived as a closed place, separate from society, from families and even from everyday objects such as the smartphone.

      The architecture of some schools, with a "double castle portcullis", reinforces this image of security-driven confinement.

      Frontal Organization: From CP (the first year of primary school) onwards, spatial organization shifts to a "teacher-centered order", with a preferred arrangement known as "bus" or "wagon".

      1.2. The Paradox of Teachers' Use

      A survey of history-geography teachers in the académie de Versailles reveals a striking paradox.

      Spatial layout: "Bus" or "wagon" type (frontal). Daily use: More than 80% (74% of the 32 respondents). Layout judged ideal: Majority.

      Spatial layout: Other (clusters, U-shape, etc.). Daily use: Less than 25% (25% of the 32 respondents). Layout judged ideal: Minority.

      This significant gap raises a central question: "why do teachers use a layout that, on the face of it, does not correspond at all to the ideal layout for carrying out their teaching sequence?".

      The low response rate to the questionnaire also suggests that this layout is so "normalized that it has been absorbed into the school form without much questioning", especially in secondary education.

      2. The Constraints Preventing the Transformation of Spaces

      The persistence of the traditional model is not simply a matter of individual choice but the result of a set of systemic constraints weighing on those who work in education.

      The Time Constraint: Lesson length (45 to 50 minutes) is cited as a major obstacle.

      One teacher interviewed said they did not have "the time or the courage to change the layout" and then put it back, describing the process as "too time-consuming".

      Social Norms and Peer Pressure: Because rooms are shared, changing the layout can create tension.

      The need to put the room back "in the order colleagues are likely to expect", on pain of a "rather unpleasant coffee break", is a powerful force of inertia.

      Class Sizes: Managing a class of 35 pupils makes any rearrangement logistically complex and hard to carry out.

      Appropriation of Space: Teachers need to make their workspace their own.

      The post-Covid experience, in which teachers rather than pupils had to change rooms, showed that many felt "not at home".

      This appropriation is essential for setting up a "didactically purposeful" organization.

      The Absence of Spatial Determinism: The idea that changing the furniture is enough to change pedagogy is a mistaken premise.

      Guilhem Labinal insists: "pedagogy is not determined by the spatial layout".

      The layout must accompany a pedagogical project, not precede it.

      Security Requirements: The logic of securing schools, heightened since the terrorist attacks, leads to greater confinement (security gates, painted-over windows).

      This approach is sometimes counterproductive, since it can create dangerous crowds outside entrances.

      3. Theoretical Approaches and Analytical Frameworks

      Understanding the complexity of learning spaces calls for a multidisciplinary approach and a robust conceptual framework.

      3.1. The Triptych: Distributional, Functional and Transactional Order

      Guilhem Labinal proposes a three-dimensional analytical model for approaching the classroom as a "microsystem":

      1. Distributional Order: This is the material and architectural setting, the "way objects are arranged in space". It is raw materiality.

      2. Functional Order: This is the pedagogical arrangement, "the way the different elements are organized in space in relation to a pedagogical purpose".

      3. Transactional Order: This is lived space, which recognizes that places "are experienced differently [...] by different people".

      It encompasses regulation, transgression and interpersonal relations.

      3.2. The Contribution of Different Disciplines

      The study of school spaces is a field of research where several disciplines converge:

      Geography: Long focused on macro scales (neighborhood, city), it now turns to the micro scale, applying concepts such as distance, proximity or itinerary to interactions within a classroom.

      The analysis looks at the place of the body, the gaze and gestures.

      Environmental Psychology: It pioneered the study of the influence of architecture and design on psychological states and social behavior.

      Didactics: In this field, space long remained an unexamined question (an "impensé"): research concentrated on the knowledge-pupil-teacher triangle or on technological media, but rarely on "the effects of the materiality of the architectural setting".

      Sociology and Anthropology: These disciplines analyze peer relations, power relations, and dynamics of gender and regulation in spaces such as the playground or the classroom.

      4. Empirical Data and Research Methods

      Various studies, quantitative and qualitative, document the impact of spaces on learning and well-being.

      4.1. Experimental and Quantitative Studies

      These studies analyze the effects of isolated variables on learning: light levels, temperature, air quality, acoustics.

      A large-scale study led by Peter Barrett found that 16 specific parameters (light, temperature, ownership, complexity, color, flexibility...) could explain 16% of the variation in pupils' academic progress over one year.

      However, these approaches have limitations:

      • Difficulty of isolating variables in a complex environment.

      • A "school effect" (exposure to sunlight, location of rooms) that makes comparisons difficult.

      • A "teacher effect": a teacher's pedagogical effectiveness is a major variable that is hard to control for.

      • Over-codification in questionnaires, which prevents the expression of individual lived experience.

      4.2. Qualitative and Phenomenological Approaches

      To overcome these limitations, qualitative approaches focus on lived experience ("le vécu").

      The Concept of "Inhabiting" ("Habiter"): It is not only a matter of being present in a place, but of "living it [...] in the diversity of modes of inhabiting", depending on the moment, the people encountered, and the actions carried out.

      Commented Walks ("Parcours Commentés"): This method consists of walking through the school with one of its actors (teacher, CPE, student...) and letting them comment on their experience of the places.

      It brings out the "singularity of the relationship one has with places" and reveals a multisensory approach (sounds, noise, circulation).

      Mental Maps: They make it possible to express the subjective relationship to a place, but need to be triangulated with other methods (explicitation interviews) to avoid omissions.

      5. The School: A Multiscalar Ecosystem

      The analysis cannot be limited to the classroom.

      The school as a whole functions as a complex ecosystem in which spaces are experienced and appropriated differently by each actor.

      5.1. One Person's Space Is Not Another's

      Commented walks reveal distinct "regimes of inhabiting":

      The teacher mainly frequents the staff room, the CDI (the school's documentation and information center), the copy room, and their own classroom.

      The CPE (Conseiller Principal d'Éducation, the senior education adviser) has a much wider route, from the entrance gate to their office, crossing the playground where students constantly stop them (a trip that can take "between 15 and 20 minutes").

      Kitchen and maintenance staff have their own itineraries and schedules, often unknown to the other actors, which can lead to misunderstandings (e.g., the cleanliness of the toilets during the lunch break).

      5.2. A Typology of Lived Spaces

      Within the school, different types of spaces coexist:

      Privatized spaces: The space behind the teacher's desk, where by convention no student ventures.

      Conditional-access spaces: The administration offices.

      Sacralized spaces: A bench or a nook that becomes a "landmark of the student's own personal life".

      Gendered spaces: The playground, where uses vary widely (ball games vs. other activities).

      Sociability spaces: The physics or history-geography "cabinets" (departmental rooms), which can strengthen disciplinary cohesion at the expense of interdisciplinarity.

      Managing these spaces is a constant "arbitration" between the needs of different actors, illustrating that "space is the product of an equation of tension between social actors".

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer 1:

      Strengths:

      The innovation on the task alone is likely to be impactful for the field, extending recent continuous report (CPR) tasks to examine other aspects of perceptual decision-making and allowing more naturalistic readouts. One interesting and novel finding is the observation of dyadic convergence of confidence estimates even when the partner is incidental to the task performance, and that dyads tend to be more risk-seeking (indicating greater confidence) than when playing solo. The paper is well-written and clear.

      We thank reviewer 1 for this encouraging evaluation. Below we address the identified weaknesses and recommendations.

      (1) Do we measure metacognitive confidence?

      One concern with the novel task is whether confidence is disambiguated from a tracking of stimulus strength or coherence. […] But in the context of an RDK task, one simple strategy here is to map eccentricity directly to (subjective) motion coherence - such that the joystick position at any moment in time is a vector with motion direction and strength. This would still be an interesting task - but could be solved without invoking metacognition or the need to estimate confidence in one's motion direction decision. […] what the subjects might be doing is tracking two features of the world - motion strength and direction. This possibility needs to be ruled out if the authors want to claim a mapping between eccentricity and decision confidence […].

      We thank reviewer 1 for pointing out that the joystick tilt responses of our subjects could potentially be driven by stimulus coherence instead of metacognitive decision confidence. Below, we present four arguments to address this point of concern:

      (1.1) Similar physical coherence between high and low confidence states

      Nominal motion coherence is a discrete value, but the random noisiness in the stimulus causes the actual frame-by-frame coherence to be distributed around this nominal value. Because of this, subjects might scale their joystick tilt report according to the coherence fluctuations around the nominal value. To check if this was the case, we use a median split to separate stimulus states into states with large versus small joystick tilt, individually for each nominal coherence. For each stimulus state, we extracted the actual instantaneous (frame-to-frame) motion coherence, which is based on the individual movements of dots in the stimulus patch between two frames, recorded in our data files.

      First, we compared the motion coherence between stimulus states with large versus small joystick tilt. For each stimulus state, we calculated the average instantaneous motion coherence, and analyzed the difference of the medians for the large versus small tilt distributions for each subject and each coherence level. The resulting histograms show the distribution of differences across all 38 subjects for each nominal coherence, and are, except for the coherence of 22%, not significantly different from zero across subjects (Author response image 1). For the 22% coherence condition, the difference amounts to 0.19% – a very small, non-perceptible difference. Thus, we do not find systematic differences between the average motion coherence in states with high versus low joystick tilt.

      Author response image 1.

      Histograms of within-subject difference between medians of average coherence distributions with large and small joystick tilt for all subjects. Coherence is color-coded (cyan – 0%, magenta – 98%). On top, the title of each panel illustrates the number of significant differences (Ranksum test in each subject) without correction for multiple comparisons (see Author response table 1 below). In the second row of the title, we show the result of the population t-test against zero. Only 22% coherence shows a significant bias. Positive values indicate higher average coherence for large joystick tilt.  

      Author response table 1.

      List of all individual significantly different coherence distributions between high and low tilt states, without correction for multiple comparisons. Median differences do not show a consistent bias (i.e. positive values) that would indicate higher average coherence for the large tilts.
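      The median-split comparison described in (1.1) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' analysis code: the function name, the array layout (one value per stimulus state) and the convention that states exactly at the median count as "small tilt" are all assumptions.

```python
import numpy as np

def median_split_coherence_difference(tilt, coherence):
    """Median coherence of large-tilt states minus that of small-tilt states.

    tilt, coherence: 1-D arrays with one value per stimulus state, all taken
    from the same subject and the same nominal coherence level (assumed layout).
    """
    tilt = np.asarray(tilt, dtype=float)
    coherence = np.asarray(coherence, dtype=float)
    large = tilt > np.median(tilt)  # states at the median count as "small"
    return float(np.median(coherence[large]) - np.median(coherence[~large]))

# Synthetic example: tilt is drawn independently of the state-averaged
# coherence, so the difference should be close to zero.
rng = np.random.default_rng(0)
tilt = rng.uniform(0, 1, size=400)
coherence = 0.22 + rng.normal(0, 0.01, size=400)
diff = median_split_coherence_difference(tilt, coherence)
print(f"median coherence difference (large - small tilt): {diff:.4f}")
```

      A positive value would indicate higher average coherence in large-tilt states; values scattered around zero across subjects correspond to the pattern reported above.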

      (1.2) Short-term stimulus fluctuations have no effect

      […] But to fully characterise the task behaviour it also seems important to ask how and whether fluctuations in motion energy (assuming that the RDK frames were recorded) during a steady state phase are affecting continuous reporting of direction and eccentricity, prior to asking how social information is incorporated into subjects' behaviour.

      In addition to the analysis of stimulus coherence and tilt averaged across each stimulus state (1.1), we analyzed the moment-to-moment relationship between instantaneous coherence and ongoing reports of accuracy and tilt. Below, we provide evidence that short-term fluctuations in the instantaneous coherence (i.e. the motion energy of the stimulus) do not result in correlated changes in joystick responses, neither for tilt nor accuracy. For each continuous stimulus state, we calculated cross-correlation functions between the instantaneous coherence, tilt and accuracy, and then averaged the cross-correlation across all states of the same nominal coherence, and then across subjects. The resulting average cross-correlation functions are essentially flat. This further supports our interpretation that the joystick reports do not reflect short-term fluctuations of motion energy.

      Author response image 2.

      Cross-correlation between the length of the resultant vector with joystick accuracy (left) and tilt (right). Coherence is color-coded. Shaded background illustrates 95% confidence intervals.
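      A lagged cross-correlation of the kind described in (1.2) might look like the sketch below. It assumes a Pearson correlation computed separately at each lag; the response does not state which normalization was used, and the function name and synthetic signals are illustrative only.

```python
import numpy as np

def cross_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for each lag.

    Positive lags mean that y follows x.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lags = np.arange(-max_lag, max_lag + 1)
    cc = np.empty(lags.size)
    for i, lag in enumerate(lags):
        if lag >= 0:
            a, b = x[: x.size - lag], y[lag:]
        else:
            a, b = x[-lag:], y[: y.size + lag]
        cc[i] = np.corrcoef(a, b)[0, 1]
    return lags, cc

rng = np.random.default_rng(1)
coherence = rng.normal(size=2000)

# Case 1: the response is unrelated to the stimulus fluctuations,
# so the cross-correlation function stays close to zero at every lag.
unrelated_tilt = rng.normal(size=2000)
lags, cc_flat = cross_correlation(coherence, unrelated_tilt, max_lag=20)

# Case 2: the response copies the stimulus with a 5-sample delay,
# which produces a clear peak at lag +5.
delayed_tilt = np.roll(coherence, 5)
_, cc_peak = cross_correlation(coherence, delayed_tilt, max_lag=20)
peak_lag = int(lags[np.argmax(cc_peak)])
```

      In the full analysis, such functions would be averaged over stimulus states of the same nominal coherence and then over subjects, as described above.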

      (1.3) Joystick tilt changes over time despite stable average stimulus coherence

      If perceptual confidence is derived from evidence integration, we should see changes over time even when the stimulus is stable. Here, we have analyzed the average slope of the joystick tilt as a function of time within each stimulus state for each subject and each coherence, to verify if our participants tilted their joystick more with additional evidence. This is illustrated with a violin plot below (Author response image 3). The linear slopes of the joystick tilt progression over the course of stimulus states are different between coherence levels. High coherence causes more tilt over time, resulting in positive slopes for most subjects. In contrast, low/no coherence results mostly in flat or negative slopes. This tilt progression over time indicates that low coherence results in lower confidence, as subjects do not wager more with weak evidence. In contrast, high coherence causes subjects to exhibit more confidence, indicated by positive slope of the joystick tilt.

      Author response image 3.

      Violin plots showing the fitted slopes of the joystick tilt time course in the last 200 samples (1667 ms) leading up to a next stimulus direction (cf. Figure 2D). Positive values signify an increase in joystick tilt over time. Each dot shows the average slope for one subject. Coherence is color-coded. The dashed line at zero indicates unchanged joystick tilt over the analyzed time window.
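      The slope estimate behind Author response image 3 can be illustrated with an ordinary least-squares line fit over the last 200 samples of a stimulus state. The 120 Hz sampling rate is inferred from the figures quoted in the text (200 samples in about 1667 ms); the function name and the synthetic time courses are assumptions.

```python
import numpy as np

SAMPLE_RATE_HZ = 120  # inferred: 200 samples span about 1667 ms

def tilt_slope(tilt, n_samples=200):
    """Least-squares slope (tilt units per second) over the final samples."""
    window = np.asarray(tilt, dtype=float)[-n_samples:]
    t = np.arange(window.size) / SAMPLE_RATE_HZ
    slope, _intercept = np.polyfit(t, window, deg=1)
    return float(slope)

# Synthetic time courses: a ramp (as might occur at high coherence)
# and a flat trace (as might occur at low or zero coherence).
ramp = np.linspace(0.2, 0.6, 200)
flat = np.full(200, 0.3)
ramp_slope = tilt_slope(ramp)
flat_slope = tilt_slope(flat)
```

      A positive slope means the joystick tilt increased over the analyzed window; averaging these slopes per subject and coherence level yields one dot per subject in a plot of the kind described above.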

      (1.4) Cross-correlation between response accuracy and joystick tilt

      Similar to 1.2 above, we have cross-correlated the frame-by-frame changes of joystick accuracy and tilt for each individual stimulus state and each subject. Across subjects, changes in tilt occur later than changes in accuracy, indicating that changes in the quality of the report are followed by changes in the size of the wager. Given that this process is not driven by short-term changes in the motion energy of the stimulus (see 1.2 above), we interpret this as additional evidence for a metacognitive assessment of the quality of the behavioral report (i.e. accuracy) reflected in the size of the wager (our measure for confidence). (See Figure 2E).
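      One way to read off which signal leads is to cross-correlate the sample-to-sample changes and take the lag of the peak. The sketch below uses an unnormalized cross-correlation of mean-centered differences, which is an assumption rather than the authors' stated procedure, and the random-walk signal is synthetic rather than a bounded accuracy trace.

```python
import numpy as np

def peak_lag(accuracy, tilt):
    """Lag (in samples) at which tilt changes best match accuracy changes.

    A positive result means that tilt changes follow accuracy changes.
    """
    d_acc = np.diff(np.asarray(accuracy, dtype=float))
    d_tilt = np.diff(np.asarray(tilt, dtype=float))
    d_acc = d_acc - d_acc.mean()
    d_tilt = d_tilt - d_tilt.mean()
    # np.correlate(a, v, "full")[i] pairs a[n + k] with v[n], with
    # k running from -(len(v) - 1) to len(a) - 1.
    xc = np.correlate(d_tilt, d_acc, mode="full")
    lags = np.arange(-(d_acc.size - 1), d_tilt.size)
    return int(lags[np.argmax(xc)])

# Synthetic example: tilt is a copy of accuracy delayed by 12 samples.
rng = np.random.default_rng(2)
accuracy = np.cumsum(rng.normal(size=600))
tilt = np.concatenate([np.full(12, accuracy[0]), accuracy[:-12]])
lag = peak_lag(accuracy, tilt)
```

      With real data the peak would be broader and noisier, so the lag is better judged from the cross-correlation function averaged across states and subjects than from a single state.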

      (2) Peri-decision wagering is different to post-decision wagering

      […] One route to doing this would be to ask whether the eccentricity reports show statistical signatures of confidence that have been established for more classical punctate tasks. Here a key move has been to identify qualitative patterns in the frame of reference of choice accuracy - with confidence scaling positively with stimulus strength for correct decisions, and negatively with stimulus strength for incorrect decisions (the so-called X-pattern, for instance Sanders et al. 2016 Neuron […].

      We thank reviewer 1 for the constructive feedback. Our behavioral data do not show similar signatures to the previously reported post-decision confidence expression (Desender et al., 2021; Sanders et al., 2016). The previously described patterns show, first of all, that confidence for the incorrect type 1 decisions diverges from the correct type 1 decisions, declining with stimulus strength (e.g. coherence), as compared to an increase for correct decisions. In our task, there is a graded accuracy and (putative) confidence expression, but there are no correct or incorrect decisions – instead, there are hits and misses of the reward targets presented at nominal directions. Instead of a decline for misses, we observe an equally positive scaling with coherence for the confidence, both for hits and misses (Author response image 4A). This is because in our peri-decision wagering task, the expression of confidence causally determines the binary hit or miss outcome. The outcome in our task is a function of the two-dimensional joystick response: higher tilt (confidence) requires a more accurate response to successfully hit a target. Thus, a subject can display a high (but not high enough) level of accuracy and confidence but still remain unsuccessful. If we instead median-split the confidence reports by high and low accuracy (Author response image 4C), we observe a slight separation, especially for higher coherences, but still no clear difference in slopes.

      We do observe the other two dynamic signatures of confidence (Desender et al., 2021): signature 2 – monotonically increasing accuracy as a function of confidence (Author response image 4), and signature 3 – steeper type 1 psychometric performance (accuracy) for high versus low confidence (Author response image 4D).

      Author response image 4.

      Confidence (i.e., joystick tilt, left column) and accuracy reports (right column) for different stimulus coherence, sorted by discrete outcome (hit versus miss, upper row) and the complementary joystick dimension (lower row, based on median split).

      Author response image 5.

      Accuracy reports correlate positively with confidence reports. For each stimulus state, we averaged the joystick response in the time window between 500 ms (60 samples) after a direction change until the first reward target appearance. If there was no target, we took all samples until the next RDP direction change into account. This corresponds to data snippets averaged in Figure 2D. Thus, for each stimulus state, we extracted a single value for joystick accuracy and for tilt (confidence). Subsequently, we fitted a linear regression to the accuracy-confidence scatter within each subject and within each coherence level. The plot above shows the average linear regression between accuracy and confidence across all subjects (i.e., the slopes and intercepts were averaged across n=38 subjects). Coherence is color-coded.
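      The averaging of per-subject regression lines described for Author response image 5 can be sketched as below. Placing confidence on the x-axis and accuracy on the y-axis is an assumption (the caption only mentions an accuracy-confidence scatter), and the function name and the two toy subjects are hypothetical.

```python
import numpy as np

def average_regression(confidence_by_subject, accuracy_by_subject):
    """Fit accuracy = slope * confidence + intercept within each subject,
    then average the slopes and intercepts across subjects."""
    fits = [
        np.polyfit(np.asarray(c, dtype=float), np.asarray(a, dtype=float), deg=1)
        for c, a in zip(confidence_by_subject, accuracy_by_subject)
    ]
    slope, intercept = np.mean(fits, axis=0)
    return float(slope), float(intercept)

# Two toy subjects whose per-state values lie exactly on a line.
conf = [np.array([0.2, 0.5, 0.8]), np.array([0.1, 0.4, 0.9])]
acc = [0.5 + 0.4 * conf[0], 0.6 + 0.2 * conf[1]]
mean_slope, mean_intercept = average_regression(conf, acc)
```

      Averaging the fitted parameters gives every subject equal weight regardless of how many stimulus states they contributed, which differs from pooling all states into a single regression.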

      (3)  Additional analyses regarding the continuous nature of our data

      I was surprised not to see more analysis of the continuous report data as a function of (lagged) task variables. […]

      Reviewer 1 requested more analyses regarding the continuous nature of our data. We agree that this is a useful addition to our paper, and thank reviewer 1 for this suggestion. To address this point, we revised main Figure 2 and provided additional panels. Panel D illustrates the continuous ramp-up of both accuracy and tilt (confidence) for high coherence levels, suggesting ongoing evidence integration and meta-cognitive assessment. Panel E shows the cross-correlation between frame-by-frame changes in accuracy and tilt (see 1.4 above). Here, we demonstrate that changes in the accuracy precede changes in joystick tilt, characterizing the continuous nature of the perceptual decision-making process.

      (4) Explicit motivation regarding continuous social experiments

      This paper is innovating on a lot of fronts at once - developing a new CPR task for metacognition, and asking exploratory questions about how a social setting influences performance on this novel task. However, the rationale for this combination was not made explicit. Is the social manipulation there to help validate the new task as a measure of confidence as dissociated from other perceptual variables? (see query 1 below). Or is the claim that the social influence can only be properly measured in the naturalistic CPR task, and not in a more established metacognition task?

      Our rationale for the combination of real-time decision making and social settings was twofold:

      i. Primates, including humans, are social species. Naturally, most behavior is centered around a social context and continuously unfolds in real-time. We wanted to showcase a paradigm in which distinct aspects of continuous perceptual decision-making could be assessed over time in individual and social environments.

      ii. Human behavior is susceptible to what others think and do. We wanted to demonstrate that the sheer presence of a co-acting social partner affects continuous decision-making, and quantify the extent and direction of social modulation.

      We agree that the motivation for combining the new task and this specific type of social co-action should be clearer. We have clarified this aspect in the Introduction, lines 92-109. In brief, the continuous, free-flowing nature of the CPR task and real-time availability of social information made this design a very suitable paradigm for assessing unconstrained social influences. We see this study as the first step toward disentangling the neural basis of social modulation in primates. See also the response to reviewer 2, point 2, below.

      (5) Response to minor points

      (5.1)  Clarification on behavioral modulation patterns

      Lines 295-298, isn't it guaranteed to observe these three behavioral patterns (both participants improving, both getting worse, only one improving while the other gets worse) even in random data?

      The reviewer is correct. We now simply illustrate these possibilities in Figure 4B and how these patterns could lead to divergence or convergence between the participants (see also line 282). Unlike random data, our results predominantly demonstrate convergence.

      (5.2) Clarification on AUC distributions

      Lines 703-707, it wasn't clear what the AUC values referred to here (also in Figure 3) - what are the distributions that are being compared? I think part of the confusion here comes from AUC being mentioned earlier in the paper as a measure of metacognitive sensitivity (correct vs. incorrect trial distributions), whereas my impression here is that here AUC is being used to investigate differences in variables (e.g., confidence) between experimental conditions.

      We apologize for the confusion. Indeed, the AUC analysis was used for the two purposes:

      (i) To assess the metacognitive sensitivity (line 175, Supplementary Figure 2).

      (ii) To assess the social modulation of accuracy and confidence (starting at line 232, Figures 3-6). 

      We now introduce the second AUC approach for assessing social modulation, and the underlying distributions of accuracy and confidence derived from each stimulus state, separately in each subject, in line 232.
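      For readers unfamiliar with AUC as a measure of separation between two conditions, the sketch below computes it directly from two samples as the probability that a value from one condition exceeds a value from the other, counting ties as one half. The direction (values above 0.5 meaning higher in the dyadic condition) and the function name are assumptions; the response does not specify how the AUC was computed.

```python
import numpy as np

def auc(solo, dyadic):
    """Probability that a dyadic value exceeds a solo value (ties count 0.5).

    0.5 means the two distributions are not separated; values above 0.5
    mean higher values in the dyadic condition (assumed convention).
    """
    solo = np.asarray(solo, dtype=float)
    dyadic = np.asarray(dyadic, dtype=float)
    greater = (dyadic[:, None] > solo[None, :]).mean()
    ties = (dyadic[:, None] == solo[None, :]).mean()
    return float(greater + 0.5 * ties)

# Toy per-state confidence values for one subject in the two conditions.
solo_confidence = [0.30, 0.35, 0.40, 0.45]
dyadic_confidence = [0.40, 0.50, 0.55, 0.60]
modulation = auc(solo_confidence, dyadic_confidence)
```

      The same computation applies to accuracy values, giving one AUC per subject and response dimension.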

      (5.3) Clarification of potential ceiling effects

      Could the findings of the worse solo player benefitting more than the better solo player (Figure 4c) be partly due to a compressive ceiling effect - e.g., there is less room to move up the psychometric function for the higher-scoring player?

      We thank the reviewer for this insight. First, even better performing participants were not at ceiling most of the time, even at the highest coherence (cf. Figure 2 and Supplementary Figure 3C). To test for the potential ceiling effect in the better solo players, we correlated their social modulation (expressed as AUC as in Figure 4) to the solo performance. There was no significant negative correlation for the accuracy (p > 0.063), but there was a negative correlation for the confidence (r = -0.39, p = 0.0058), indicating that indeed low performing “better players in a dyad” showed more positive social modulation. We note however that this correlation was driven mainly by a few such initially low performing “better” players, who mostly belonged to the dyads where both participants improved in confidence (green dots, Figure 4B), and that even the highest solo average confidence was not at ceiling (<0.95). To conclude, the asymmetric social modulation effect we observe is mainly due to the better players declining (orange and red dots, Figure 4B), rather than due to both players improving but the better player improving less (green dots, Figure 4B).

      Reviewer 2:

      Strengths:

      There are many things to like about this paper. The visual psychophysics has been undertaken with much expertise and care to detail. The reporting is meticulous and the coverage of the recent previous literature is reasonable. The research question is novel.

      We thank reviewer 2 for this positive evaluation. Below we address the identified weaknesses and recommendations.

      (1) Streamlining the text to make the paper easier to read

      The paper is difficult to read. It is very densely written, with little to distinguish between what is a key message and what is an auxiliary side note. The Figures are often packed with sometimes over 10 panels and very long captions that stick to the descriptive details but avoid clarity. There is much that could be shifted to supplementary material for the reader to get to the main points.

      We thank reviewer 2 for the honest assessment that our article was difficult to read and understand, and for providing specific examples of confusion. We substantially improved the clarity:

      We added a Glossary that defines key terms, including Accuracy and Hit rate. 

      We replaced the confusing term “eccentricity” with joystick “tilt”.

      We simplified Figures 3 and 5, moving some panels into supplementary figures.

      We substantially redesigned and simplified our main Figure 4, displaying the data in a more straightforward, less convoluted way, and removing several panels. This change was accompanied by corresponding changes in the text (section starting at line 277).

      More generally, we shortened the Introduction, substantially revised the Results and the figure legends, and streamlined the Discussion.

      (2) Dyadic co-action vs joint dyadic decision making

      A third and very important one is what the word "dyadic" refers to in the paper. The subjects do not make any joint decisions. However, the authors calculate some "dyadic score" to measure if the group has been able to do better than individuals. So the word dyadic sometimes refers to some "nominal" group. In other places, dyadic refers to the social experimental condition. For example, we see in Figure 3c that AUC is compared for solo vs dyadic conditions. This is confusing.

      […] my key criticism is that the paper makes strong points about collective decision-making and compares its own findings with many papers in that field when, in fact, the experiments do not involve any collective decision-making. The subjects are not incentivized to do better as a group either. […]

      The reviewer is correct to highlight these important aspects. We did, in fact, not investigate a situation where two players had to reach a joint decision with interdependent payoff and there was no incentive to collaborate or even incorporate the information provided by the other player. To make the meaning of “dyadic” in our context more explicit, we have clarified the nature of the co-action and independent payoff (e.g. lines 107, 211, 482, 755 - Glossary), and used the term “nominal combined score” (line 224) and “nominal “average accuracy” within a dyad” (line 439).

      Concerning the key point about embedding our findings into the literature on collective decision-making, we would like to clarify our motivation. Outside of the recent study by Pescetelli and Yeung, 2022, we are not aware of any perceptual decision-making studies that investigated co-action without any explicit joint task. So naturally, we were stimulated by the literature on collective decisions, and felt it was appropriate to compare our findings to the principles derived from this exciting field. Besides developing the continuous – in time and in “space” (direction) – peri-decision wagering CPR game, the social co-action context is the main novel contribution of our work. Although it is possible to formulate cooperative or competitive contexts for the CPR, we leveraged the free-flowing continuous nature of the task, which makes it most readily amenable to studying spontaneously emerging social information integration.

      We now more explicitly emphasize, in the Introduction and Discussion, that most prior work has been done using joint decision tasks, in contrast to the co-action we study here.

      (3) Addition of relevant literature to Discussion

      […] To see why this matters, look at Lorenz et al PNAS (https://www.pnas.org/doi/10.1073/pnas.1008636108) and the subsequent commentary that followed it from Farrell (https://www.pnas.org/doi/full/10.1073/pnas.1109947108). The original paper argued that social influence caused herding which impaired the wisdom of crowds. Farrell's reanalysis of the paper's own data showed that social influence and herding benefited the individuals at the expense of the crowd demonstrating a form of tradeoff between individual and joint payoff. It is naive to think that by exposing the subjects to social information, we should, naturally, expect them to strive to achieve better performance as a group.

      Another paper that is relevant to the relationship between the better and worse performing members of the dyad is Mahmoodi et al PNAS 2015 (https://www.pnas.org/doi/10.1073/pnas.1421692112). Here too the authors demonstrate that two people interacting with one another do not "bother" figuring out each others' competence and operate under "equality assumption". Thus, the lesser competent member turns out to be overconfident, and the more competent one is underconfident. The relevance of this paper is that it manages to explain patterns very similar to Schneider et al by making a much simpler "equality bias" assumption.

      We thank reviewer 2 for pointing out these highly relevant references, which we have now integrated in the Discussion (lines 430 and 467). Regarding the debate between Lorenz et al. and Farrell, although it concerns a very different type of task (single-shot factual knowledge estimation), it is very illuminating for understanding the differing perspectives on individual vs group benefit. We fully agree that it is naïve to assume that during independent co-action in our highly demanding task participants would strive to achieve better performance as a group – if anything, we expected less normative and more informational, reliability-driven effects as a way to cope with task demands.

      Mahmoodi et al. is a particularly pertinent and elegant study, and the equality bias they demonstrate may indeed underlie the effects we see. We admit that we did not know this paper at the time of our initial writing, but it is encouraging to see the convergence [pun intended] despite task and analysis differences. As highlighted above (2), our novel contributions remain that we observe mutual alignment, or convergence, in real-time without an explicitly formulated collective decision task and associated social pressure, and that we separate asymmetric social effects on accuracy and confidence.

      Other reviewer-independent changes:

      Additional information: Angular error in Figure 2

      In panel A of the main Figure 2, we have added the angular error of the solo reports (blue dashed line) to give readers an impression about the average deviation of subjects’ joystick direction from the nominal stimulus direction. We have pointed out that angular error is the basis for accuracy calculation.

      Data alignment

      In the previous version of the manuscript, we presented data with different alignments: Accuracy values were aligned to the appearance of the first target in a stimulus state (target-alignment) to avoid the predictive influence of target location within the remaining stimulus state, while the joystick tilt was extracted at the end of each stimulus state (state-alignment) to allow subjects more time to make a deliberate, confidence-guided report (Methods). We realized that this is confusing, as it compares the social modulation of the two response dimensions at different points in time. In the revision, we use state-aligned data in most figures and analyses and clearly indicate which alignment type has been used. We kept the target-alignment for the illustration of the angular error in the solo behavior (Figure 2). Specifically, this has only changed the reporting on accuracy statistics. None of the results have changed fundamentally, but the social modulation of accuracy became even stronger in state-aligned data.

      In summary, we hope that these revisions have resulted in an easier-to-understand and convincing article, with clear terminology and concise and important takeaway messages.

      We thank both reviewers and the editors again for their time and effort, and look forward to the reevaluation of our work.

      References

      Desender K, Donner TH, Verguts T. 2021. Dynamic expressions of confidence within an evidence accumulation framework. Cognition 207:104522. doi:10.1016/j.cognition.2020.104522

      Pescetelli N, Yeung N. 2022. Benefits of spontaneous confidence alignment between dyad members. Collective Intelligence 1. doi:10.1177/26339137221126915

      Sanders JI, Hangya B, Kepecs A. 2016. Signatures of a Statistical Computation in the Human Sense of Confidence. Neuron 90:499–506. doi:10.1016/j.neuron.2016.03.025

    1. Reviewer #2 (Public review):

      Summary:

      The authors of this manuscript performed a fascinating set of zebrafish mutant analyses on hox cluster deletions and pinpointed the cause of the pectoral fin loss in one combinatorial hox cluster mutant of hoxba and hoxbb. I support the publication of this manuscript.

      Strengths:

      The study is well designed and builds on a variety of existing experimental tools that enabled the authors' earlier construction of hox cluster mutants. The manuscript is well written and clearly reports the authors' findings on the mechanism that positions the pectoral fin.

      Weaknesses:

      The study does not address hox clusters other than ba and bb, and is confined to zebrafish and to comparison with existing reports from mouse experiments.

      Comments on revisions:

      The authors have sufficiently addressed the concerns raised in my previous review. The revised manuscript substantially strengthens the original work.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Summary: 

      The authors have used gene deletion approaches in zebrafish to investigate the function of genes of the hox clusters in pectoral fin "positioning" (but perhaps more accurately pectoral fin "formation"). 

      Strengths: 

      The authors have employed a robust and extensive genetic approach to tackle an important and unresolved question. The results are largely presented in a very clear way. 

      We thank the reviewer for the positive summary and for recognizing the strengths of our genetic approach and presentation.

      Weaknesses: 

      The Abstract suggests that no genetic evidence exists in model organisms for a role of Hox genes in limb positioning. There are, however, several examples in mouse and other models (both classical genetic and other) providing evidence for a role of Hox genes in limb position, which is elaborated on in the Introduction.

      It would perhaps be more accurate to state that several lines of evidence in a range of model organisms (including the mouse) support a role for Hox genes in limb positioning. The authors' work is not weakened by a more inclusive introduction that cites the current literature more comprehensively. 

      Thank you for this constructive comment. We agree that our Abstract implied an absence of genetic evidence across model organisms and could be misleading. We have revised the Abstract to acknowledge that multiple lines of evidence—including classical and molecular studies in mouse and other models—support a role for Hox genes in limb/fin positioning. We have also expanded the Introduction to cite this literature more comprehensively. These changes clarify the current state of knowledge while preserving the novelty of our zebrafish genetic findings.

      It would be helpful for the authors to make a clear distinction between "positioning" of the limb/fin and whether a limb/fin "forms" at all, independent of the relative position of this event along the body axis.

      We thank the reviewer for pointing this out. In the revised manuscript, we now make a distinction between these two aspects: we describe “positioning” as being specified by the expression domains of Hox genes along the anterior–posterior axis, while the “formation” of pectoral fins reflects the functional requirement of Hox genes to induce tbx5a expression and thereby initiate fin development. We have clarified this distinction in the text to better separate these related but distinct roles of Hox genes.

      Discussion of why the zebrafish is sensitive to Hoxb loss with reference to the fin, but mouse Hoxb mutants do make a limb?  

      We thank the reviewer for this important comment. Our interpretation is that paired fins first appeared in vertebrates that already possessed four Hox clusters. It is likely that novel functions related to pectoral fin positioning emerged within the HoxB cluster at that time, contributing to the origin of pectoral fins. In zebrafish, we found that these functions remain largely restricted to the hoxba and hoxbb clusters, such that loss of both results in complete absence of pectoral fins. In contrast, mice exhibit a high degree of functional redundancy across Hox clusters. For example, deletion of all HoxB genes except Hoxb13 does not result in forelimb loss (Medina-Martinez et al., 2000), and forelimbs are still present in Hoxa5;Hoxb5;Hoxc5 triple knockouts (Xu et al., 2013). Thus, although we cannot fully explain why HoxB cluster deletions alone do not abolish forelimb formation in mice, it is plausible that overlapping functions from other Hox clusters compensate for the loss of HoxB genes, consistent with the general robustness of the mammalian Hox system. We have revised the Discussion to clarify this point.

      Is this down to exclusive expression of Hoxbs in the zebrafish pectoral fin forming region rather than a specific functional role of the protein? This is important as it has implications for the interpretation of results throughout the paper and could explain some apparently conflicting results.  

      We thank the reviewer for this insightful comment. To address this point, we newly analyzed the expression patterns of PG4–8 genes in the hoxba and hoxbb clusters. Our in situ hybridization results revealed that only hoxb4a, hoxb5a, and hoxb5b are detectably expressed in the pectoral fin buds (Figure 5C, 5E, Figure 7M-R). While we cannot completely exclude the possibility of functional differences among Hox proteins, our data strongly suggest that the loss of pectoral fins in hoxba;hoxbb cluster mutants is primarily due to the expression domains of these specific Hox genes in the fin-forming region, rather than to unique biochemical properties of the proteins. We have added these new data as a figure in the revised manuscript (Figure 7M-R) and clarified this point in the text (lines 312-316).

      Why is Hoxba more potent than Hoxbb? Is this because Hoxba has Hox4/5 present, while Hoxbb has only Hoxb5? (The Hoxba locus has retained many more Hox genes in its cluster than hoxbb; therefore, one might expect to see greater redundancy in this locus.)

      We thank the reviewer for raising this important point. At present, we do not know the precise reason why hoxba appears more potent than hoxbb. The possibility raised by the reviewer, that differences in retained gene content (e.g., Hox4/5 in hoxba versus only Hoxb5 in hoxbb) may underlie this discrepancy, is certainly plausible. However, our previous study on the formation of dorsal and anal fins showed a similar situation: although PG11–13 Hox genes are present in both hoxca and hoxcb clusters, deletion of hox genes in the hoxca cluster had a more pronounced effect on median fin development (Adachi et al., 2024). This suggests that, following the teleost-specific whole-genome duplication, duplicated Hox clusters are not functionally equivalent, and asymmetric retention or deployment of functions may occur. The mechanistic basis of such bias remains unclear and warrants further investigation.

      Deletion of either Hoxa or Hoxd in the background of the Hoxba mutant does have some effect. Is this a reflection of protein function or expression dynamics of Hoxa/Hoxd genes?  

      We appreciate the reviewer’s comment and the opportunity to clarify this point. In Figure 2, we compared several double mutants with the hoxba single mutant. Among them, only the hoxba;hoxbb mutant exhibited a complete loss of tbx5a expression, whereas other combinations did not differ substantially from the hoxba mutant alone. Therefore, we consider that additional deletions such as hoxaa, hoxab, and hoxda do not have a strong effect beyond the hoxba deletion itself, and it is unlikely that Hoxa or Hoxd proteins functionally compensate for Hoxba in regulating tbx5a expression. Consistent with this interpretation, in our previous study we did not detect abnormalities in tbx5a expression in the hoxaa;hoxab;hoxda triple mutant (Ishizaka et al., 2024). Taken together, these observations support our view that the hoxba and hoxbb clusters are specifically required for the induction of tbx5a in the pectoral fin field.

      Can we really be confident that there is a "transformation of pectoral fin progenitor cells into cardiac cells"? 

      The failure to repress Nkx2.5 in the posterior (pelvic fin) domain is clear, but have these cells actually acquired cardiac identity? They would be expected to express Tbx5a (or b) as cardiac precursors, but this domain does not broaden. There is no apparent expansion of the heart (field)/domain or progenitors beyond the 16 somite stage. The claimed "migration" of heart precursors in the mutant is not clear. The heart/cardiac domain that does form in the mutant is not clearly expanded in the mutant. The domain of cmlc2 looks abnormal in the mutant, but I am not convinced it is "enlarged" as claimed by the authors. The authors have not convincingly shown that "the cells that should form the pectoral fin instead differentiate into cardiac cells."  The only clear conclusion is the loss of pectoral fin-forming cells rather than these fin-forming cells being "transformed" into a new identity. It would be interesting to know what has happened to the cells of the pectoral fin-forming region in these double mutants. 

      We sincerely thank the reviewer for this important comment. We agree that our data do not yet allow us to conclude with certainty that the presumptive pectoral fin progenitor cells in hoxba;hoxbb cluster mutants are fully “transformed” into cardiac cells. Our intention was to describe the striking posterior expansion of nkx2.5 expression and the altered morphology of the cmlc2-positive cardiac field in the mutants, which suggested a shift in cell fate. However, as the reviewer correctly points out, we did not directly demonstrate that the missing fin progenitors acquire bona fide cardiac identity.

      To address this, we have revised the text to clarify that the most robust conclusion from our current dataset is the loss of pectoral fin-forming cells in hoxba;hoxbb cluster mutants. We have softened or removed the claim of “transformation” and instead emphasize that our observations are consistent with an expansion of cardiac marker expression domains into the region where fin progenitors normally arise. We also acknowledge that the cmlc2 domain is abnormal rather than unequivocally enlarged, and have adjusted our wording accordingly.

      It is not clear what the authors mean by a "converse" relationship between forelimb/pectoral fin and heart formation. The embryological relationship between these two populations is distinct in amniotes.  

      We thank the reviewer for pointing this out. Our intention was to highlight the reciprocal balance between pectoral fin and cardiac progenitors in zebrafish. In particular, Waxman et al. (2008) demonstrated that retinoic acid signaling promotes pectoral fin formation while restricting the expansion of cardiac progenitors, thereby illustrating this reciprocal relationship. To avoid confusion, we have revised the text to explicitly state that this applies to zebrafish.

      The authors show convincing data that RA cannot induce Tbx5a in the absence of Hoxb clusters, but I am not convinced by the interpretation of this result. The results shown would still be consistent with RA acting directly upstream of tbx5a, but merely that RA acts in concert with hox genes to activate tbx5a. In the absence of one or the other, Tbx5a would not be expressed. It is not necessary that RA and hoxbs act exclusively in a linear manner (i.e., RA regulates hoxb that in turn regulates tbx5a).

      We appreciate the reviewer’s thoughtful comment. We agree that our original wording in the Results section implied a strictly linear model of RA→Hox→tbx5a. In response, we have revised the Results to state only the experimental observation, namely that RA-dependent induction of tbx5a does not occur in the absence of the hoxba and hoxbb clusters.

      We have moved the broader interpretation to the Discussion, where we now emphasize that our data are compatible with multiple models. One possibility is a linear pathway in which RA induces Hox expression that subsequently activates tbx5a. Alternatively, it is also plausible that RA induces Hox expression and that RA and Hox proteins act cooperatively to induce tbx5a. Our findings do not distinguish between these possibilities, and both models remain consistent with the data. We believe this restructuring addresses the reviewer’s concern by keeping the Results factual and limiting mechanistic interpretation to the Discussion.

      The authors have carried out a functional test for the function of hoxb6 and hoxb8 in the hemizygous hoxb mutant background. What is lacking is any expression analysis to demonstrate whether Hoxb6b or Hoxb8b are even expressed in the appropriate pectoral fin territory to be able to contribute to pectoral fin development, either in this assay or in normal pectoral fin development. 

      We thank the reviewer for emphasizing the importance of expression analyses. In response, we performed a comprehensive whole-mount in situ hybridization survey of all eight PG4–8 Hox genes from the hoxba and hoxbb clusters (hoxb4a, hoxb5a, hoxb5b, hoxb6a, hoxb6b, hoxb7a, hoxb8a, and hoxb8b) during pectoral fin development (18–30 hpf). Among these, only hoxb4a, hoxb5a, and hoxb5b displayed detectable expression in the developing pectoral fin buds. In contrast, hoxb6a, hoxb6b, hoxb7a, hoxb8a, and hoxb8b were not expressed in this territory. These new data have been incorporated into the revised manuscript (Fig. 7M-R). We believe that this dataset provides a more complete and systematic picture of which Hoxb genes are available to function in pectoral fin development, and we are grateful to the reviewer for this valuable suggestion, which significantly strengthened our study.

      (The term "compensate" used in this section is confusing/misleading.) 

      We thank the reviewer for this helpful remark. We agree that the term “compensate” was misleading in this context, as it could be confused with genetic compensation mechanisms such as transcriptional adaptation. To avoid this ambiguity, we have revised the wording.

      Specifically, we replaced “compensate for” with “mimic the effect of” or “phenocopy” depending on the context. We believe this revision improves clarity and prevents misunderstanding.

      The authors' confounding results described in Figures 6-7 are consistent with the challenges faced in other model organisms in trying to explore the function of genes in the hox cluster and the known redundancy that exists across paralogous groups and across individual clusters.  Given the experimental challenges in deciphering the actual functions of individual or groups of hox genes, a discussion of the normal expression pattern of individual and groups of hox genes (and how this may change in different mutant backgrounds) could be helpful to make conclusions about likely normal function of these genes and compensation/redundancy in different mutant scenarios.  

      We appreciate the reviewer’s thoughtful comment. We agree that functional analyses of Hox genes are often complicated by redundancy within and across clusters. In this revision, we have included additional expression data of PG4–8 genes from the hoxba and hoxbb clusters, showing that only hoxb4a, hoxb5a, and hoxb5b are expressed in the fin buds. Although we did not analyze expression changes across mutant backgrounds in this study, we consider this an important direction for future experiments.

      Reviewer #2 (Public review): 

      Summary: 

      The authors of this manuscript performed a fascinating set of zebrafish mutant analyses on hox cluster deletion and pinpointed the cause of the pectoral fin loss in one combinatorial hox cluster mutant of Hoxba and Hoxbb. 

      Strengths: 

      The study is based on a variety of existing experimental tools that enabled the authors' past construction of hox cluster mutants, and is well-designed. The manuscript is well written to report the authors' findings on the mechanism that positions the pectoral fin. 

      Weaknesses: 

      The study does not focus on hox clusters other than ba and bb, and is confined to the use of zebrafish, as well as the comparison with existing reports from mouse experiments.

      We thank the reviewer for the thoughtful and encouraging evaluation of our manuscript. We are pleased that the strengths of our study design and clarity of writing were recognized. We also acknowledge the noted limitations, and while our focus here is on zebrafish hoxba and hoxbb clusters, we agree that future studies should expand to other hox clusters and additional models. Below, we provide individual responses to the specific points raised.

      Reviewer #1 (Recommendations for the authors): 

      (1) Some additional expression analyses of Hoxb6/b8 etc, could be carried out to address some issues raised in the main review.  

      We thank the reviewer for this suggestion. In response, we performed additional whole-mount in situ hybridization analyses of PG4–8 genes from the hoxba and hoxbb clusters, including hoxb6b and hoxb8b. These experiments showed that only hoxb4a, hoxb5a, and hoxb5b are expressed in the developing fin buds, whereas hoxb6a, hoxb6b, hoxb7a, hoxb8a, and hoxb8b are not. We have incorporated these new data into the revised manuscript (Figure 7M-R), which we believe clarify why functional tests of hoxb6b and hoxb8b did not uncover specific requirements in fin development.

      (2) The discussion section, particularly the more speculative section on evolutionary significance, could be reduced. Discussion of pelvic fin could be removed also, as this has not and could not be addressed with the current experimental design.  

      We thank the reviewer for this helpful suggestion. In line with the recommendation, we have reduced the speculative section on evolutionary significance in the Discussion to make it more concise and focused. We have also removed the discussion of pelvic fins, as these were not directly addressed by our current experimental design. We believe these changes improve the clarity and focus of the Discussion section.

      (3) The conclusions on transformation to cardiac identity could be reevaluated and presented differently.  

      We appreciate the reviewer’s insightful comment. In the revised manuscript, we have toned down our interpretation regarding a transformation to cardiac identity. Instead, we now describe the findings more cautiously, emphasizing the clear loss of fin precursors rather than a definitive acquisition of cardiac fate. We believe this revision presents a more balanced interpretation of the data.

      (4) Minor typographical - I would suggest removing "Genetic Evidence:" from the title.  

      We appreciate the reviewer’s suggestion. In accordance with this comment, we have revised the title to: “HoxB-derived hoxba and hoxbb clusters are essential for the anterior-posterior positioning of zebrafish pectoral fins”.

      Reviewer #2 (Recommendations for the authors): 

      (1) The authors mention the redundancy (between the a type and b type) of Hox clusters derived from an additional whole genome duplication in the teleost fish lineage. But, they do not refer to whether the zebrafish Tbx5 ortholog has an additional copy. This information helps the readers' interpretation of the data presented. First of all, tbx5a suddenly appears on line 143 without introducing its relationship with Tbx5, which needs to be explained in a revised manuscript.  

      We thank the reviewer for highlighting this important point. In zebrafish, there are indeed two Tbx5 orthologs, tbx5a and tbx5b. In the revised manuscript, we have modified the text around line 124 to introduce tbx5a in the context of its orthology to Tbx5, ensuring that its appearance in the Results is clear to the readers.

      (2) I did not readily get whether the limb/fin 'positioning' that the authors focus on in this study is 'anteroposterior' positioning, but not anything else. If it is what is meant, the word 'anteroposterior' should just be inserted at the first appearance of the word 'positioning'.  

      We thank the reviewer for pointing this out. Our study specifically addresses the anteroposterior positioning of paired appendages, that is, how the initial site of pectoral fin formation is defined along the anterior–posterior axis of the body. To clarify this, we have revised the text to insert the word “anteroposterior” at the first appearance of the term “positioning” in both the Abstract and Introduction (lines 26 and 53). We believe this change resolves the ambiguity and makes the focus of our study explicit.

      (3) Figure 5B also shows the remarkable reduction of hoxc1a expression, which the authors do not mention at all. I wonder how this is explained and how the authors justify no remark on this throughout the manuscript. 

      We thank the reviewer for this insightful comment. As correctly noted, we did observe a marked reduction of hoxc1a expression in Figure 5B. However, based on our genetic analyses, we consider that the causal genes underlying the phenotype are most likely located in hoxba and hoxbb clusters. Therefore, although the change in hoxc1a expression is indeed a notable phenomenon, we did not emphasize it in the manuscript in order to maintain focus on the primary clusters responsible for the observed phenotype (lines 240-241). We agree that this point should be acknowledged, and we have now added a brief note in the Results to clarify our findings.

      (4) Figure 1 consists of multiple panels (A-M) but lacks panel D.  

      We apologize for the oversight. We have corrected it.

      (5) Line 85 - precise role -> exact role.  

      We have corrected it (line 95).

      (6) Line 87 - the vertebrate class Actinopterygii & the class Sarcopterygii. 

      We thank the reviewer for pointing this out. We have corrected it (lines 98-99).

      (7) Line 90 - homologous -> orthologous. 

      We have corrected it (line 102).

      (8) Figure 5 - For interpretability of the data, I suggest writing 'Paralogous groups' on the top of the panels A and B, and 'Cluster' vertically on the left.  

      We thank the reviewer for this helpful suggestion. As recommended, we have added “Paralogous groups” at the top of panels A and B, and “Clusters” vertically on the left side of Figure 5 to facilitate interpretation of the data.

      (9) Some subheading titles are too long. They can be shortened into 'hoxb5a and -b5b expression in pectoral fin buds are RA-dependent' instead of 'Expression patterns of hoxb5a and hoxb5b in pectoral fin buds are dependent on RA', for example.  

      We appreciate the reviewer’s suggestion regarding the length of the subheading titles. In response, we have shortened the relevant subheadings in both the Results and Discussion sections to make them more concise while retaining their scientific meaning. For example, the subheading originally written as “Expression patterns of hoxb5a and hoxb5b in pectoral fin buds are dependent on RA” has been revised to “hoxb5a/b5b expression in pectoral fin buds is RA-dependent.” Similar adjustments have been made to other subheadings throughout these sections. We believe these changes improve readability and consistency without altering the intended content.

      (10) Line 408 - why tetrapods, instead of cartilaginous fishes, which are thought of as natural in this context? 

      We appreciate the reviewer’s careful reading and insightful comment. However, in response to Reviewer 1’s suggestion, we have substantially reduced the speculative section on evolutionary significance in the Discussion. As a result, this specific part of the text has now been deleted. We thank the reviewer for raising this point.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript uses optical coherence tomography (OCT) to visualize tissue microstructures about 1-2 mm under the finger pad skin surface. Their geometric features are tracked and used to generate tissue strains upon skin surface indentation by a series of transparent stimuli both normal and tangential to the surface. Then movements of the stratum corneum and the upper portion of the viable epidermis are evaluated. Based upon this data, across a number of participants and ridges, around 300 in total, the findings report upon particular movements of these tissue microstructures in various loading states. A better understanding of the mechanics of the skin microstructures is important to understand how surface forces propagate toward the locations of mechanoreceptive end organs, which lie near the edge of the epidermis and dermis, from which tactile responses of at least two peripheral afferents originate. Indeed, the microstructures of the skin are likely to be important in shaping how neural afferents respond and enhance their sensitivity, receptive field characteristics, etc. 

      Strengths: 

      The use of OCT in the context of analyzing the movements of skin microstructures is novel. Also novel and powerful is the use of distinct loading cases, e.g., normal, tangential, and stimulus features, e.g., edges, and curves. I am unaware of other empirical visualization studies of this sort. They are state-of-the-art in this field.

      Moreover, in addition to the empirical imaging observations, strain vectors in the tissues are calculated over time. 

      Weaknesses: 

      The interpretation of the results and their framing relative to the overall hypotheses/questions and prior works could be articulated more clearly. In particular, the major findings of the manuscript are in newly describing a central concept regarding "ridge flanks," but such structures are neither anatomically nor mechanistically defined in a clear fashion. For example, "... it appears that the primary components of ridge deformation and, potentially, neural responses are deformations of the ridge flanks and their relative movement, rather than overall bending of the ridges themselves." From an anatomical perspective, I think what the authors mean by "ridge flanks" is a differential in strain from one lateral side of a papillary ridge to the other. But it is unclear what about the continuous layers of tissue would cause such behaviors. Perhaps a sweat duct or some other structure (not visible to OCT) would subdivide the "flanks" of a papillary ridge somehow? If not due to particular anatomy, then is the importance of the "ridge flank" due to a mechanistic phenomenon of some sort? Given that the findings of the manuscript center upon the introduction of this new concept, I think a greater effort should be made to define what exactly are the "ridge flanks." It is clear from the results, especially the sliding case, that there is something important that the manuscript is getting at with this concept.

      We apologize for the confusion around our use of ‘ridge flanks’. To recap the overall goal briefly, we wanted to measure the deformation of papillary ridges and their associated sub-surface structures in response to different tactile stimuli. Capturing these deformations and comparing them against different proposed ideas, for example bending (horizontal shear) of the entire ridge versus differential deformations of different sub-parts, constrains neural activation mechanisms and has implications for how well tactile stimuli can be spatially resolved on the skin and for whether sub-surface deformations can be easily predicted from surface movements alone. Our mesh was dense enough to compare the stratum corneum (SC) and the viable epidermis (VE) directly, where we expected some differences due to their previously documented mechanical differences, as well as the ridge flanks, which refer to the two (proximal and distal) sides of a single papillary ridge and their associated structure in the SC and VE (as correctly surmised by the reviewer). Differential behaviour across ridge flanks might be seen because various observations of the surface of the stratum corneum had suggested mechanical differences between the papillary ridges and the grooves dividing them, potentially leading to differential deformations of these two halves, depending on the direction in which each faces tissue with different mechanical properties.

      We now provide a clearer definition of ridge flanks in Figure 1 and in the main text. Importantly, existing prior research is better connected to our own investigation in the Introduction and we now specifically explain why we investigate ridge flanks.

      The OCT used herein cannot visualize deep and fully into what the manuscript refers to as a "ridge" (note others have previously broken this concept apart into "papillary", "intermediate" and "limiting" ridges), near where the mechanoreceptive end organs lie at the epidermal-dermal border. Therefore, the OCT must make inferences about the movements of these deeper tissues, but cannot see them directly, and it is the movements of these deeper tissues that are likely driving the intricacies of neural firing. Note the word "ridge" is used often in the manuscript's abstract, introduction, and discussion but the definition in Fig. 1 and elsewhere differs in important ways from prior works of Cauna (expert in anatomy). Therefore, the manuscript should clarify if "ridge" refers to the papillary ridge (visible at the exterior of the skin), intermediate ridge (defined by Cauna as what the authors refer to as the primary ridge), and limiting ridge (defined by Cauna as what the authors refer to as the secondary ridge). What the authors really mean (I think) is some combination of the papillary and intermediate ridge structures, but not the full intermediate ridge. The manuscript acknowledges this in the "Limitations and future work" section, stating that these ridges cannot be resolved. This is important because the manuscript is oriented toward tracking this structure. It sets up the narrative and hypotheses to evaluate the prior works of Cauna, Gerling, Swensson, and others who all directly addressed the movement of this anatomical feature which is key to understanding ultimately how stresses at these locations might move the peripheral end organs (i.e., Merkel cells, Meissner corpuscles).

      Thank you for these observations. Indeed, our terminology was not consistent. We have now switched to Cauna’s terminology and added additional labels in Figure 1, explaining all mentioned structures in the main text. We have also changed the language in many instances in the main text to make it clearer whether we are referring to individual anatomical ridges (papillary, limiting, etc.) or the whole structure. Additionally, it is now clearer from the start which features are tracked, and we specifically state that intermediate ridges are excluded from our tracking.

      Regarding the intermediate ridge, it indeed plays a big role in Cauna’s lever hypothesis. Given the intermediate ridge is excluded from our analysis, we can neither prove nor disprove this hypothesis in our current work. However, there are many mechanical mysteries to solve regarding the structures directly above, which are the main focus of this paper. We have rewritten the introduction to make these questions clearer. For example, Cauna observed pliability of the papillary ridges in surface experiments. Swensson found differential expression patterns of keratin in epidermis tissue in and above the intermediate ridges, but the direct mechanical consequences that are proposed in their paper concern the behaviour of papillary ridges, rather than relying on a mechanical role of intermediate ridges. Even Cauna’s lever idea implies specific deformation of the stratum corneum, which would be measurable in our study, as the upper handle of the ‘lever’ needs turning. We observed little movement in accordance with this idea, putting the lever mechanism into question. While this does not rule out a mechanical role of the intermediate ridge, these findings constrain its potential mechanisms.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors investigate sub-skin surface deformations to a number of different, relevant tactile stimuli, including pressure and moving stimuli. The results demonstrate and quantify the tension and compression applied from these types of touch to fingerprint ridges, where pressure flattens the ridges. Their study further revealed that on lateral movement, prominent vertical shearing occurred in ridge deformation, with somewhat inconsistent horizontal shear. This also shows how much the deeper skin layers are deformed in touch, meaning the activation of all cutaneous mechanoreceptors, as well as the possibility of other deeper non-cutaneous mechanoreceptors. 

      Strengths: 

      The paper has many strengths. As well as being impactful scientifically, the methods are sound and innovative, producing interesting and detailed results. The results reveal the intricate workings of the skin layers to pressure touch, as well as sliding touch over different conditions. This makes it applicable to many touch situations and provides insights into the differential movements of the skin, and thus the encoding of touch in regards to the function of fingerprints. The work is very clearly written and presented, including how their work relates to the literature and previous hypotheses about the function of fingerprint ridges. The figures are very well-presented and show individual and group data well. The additional supplementary information is informative and the video of the skin tracking demonstrates the experiments well. 

      Weaknesses: 

      There are very few weaknesses in the work, rather the authors detail well the limitations in the discussion. Therefore, this opens up lots of possibilities for future work. 

      We thank the reviewer for these encouraging comments.

      Impact/significance: 

      Overall, the work will likely have a large impact on our understanding of the mechanics of the skin. The detail shown in the study goes beyond current understanding, to add profound insights into how the skin actually deforms and moves on contact and sliding over a surface, respectively. The method could be potentially applied in many other different settings (e.g. to investigate more complex textures, and how skin deformation changes with factors like dryness and aging). This fundamental piece of work could therefore be applied to understand skin changes and how these impact touch perception. It can further be applied to understand skin mechanoreceptor function better and model these. Finally, the importance of fingertip ridges is well-detailed, demonstrating how these play a role in directly shaping our touch perception and how they can shape the interactions we have with surfaces. 

      Reviewer #3 (Public Review): 

      Summary: 

      The publication presents unique in-vivo images of the upper layer of the epidermis of the glabrous skin when a flat object compresses or slides on the fingertip. The images are captured using OCT and are processed to recover the strain that fingerprints experience during the mechanical stimulation. 

      The most important finding is, in my opinion, that fingerprints undergo pure compression/tension without horizontal shear, hinting at the fact that the shear stress caused by the tangential load is transferred to the deeper tissues and ultimately to the mechanoreceptors (SA-I / RA-I). 

      Strengths: 

      Fascinating new insights into the mechanics of glabrous skin. To the best of my knowledge, this is the first experimental evidence of the mechanical deformation of fingerprints when subjected to dynamic mechanical stimulation. The OCT measurement allows an unprecedented measurement of the depth of the skin whereas previous works were limited to tracking the surface deformation.

      The robust data analysis reveals the continuum mechanics underlying the deformation of the fingerprint ridges. 

      Weaknesses: 

      I do not see any major weaknesses. The work is mainly experimental and is rigorously executed. Two points pique my curiosity, however: 

      (1) How do the results presented in this study compare with previous finite element analysis? I am curious to know if the claim that the horizontal shear strain is transferred to the previous layer is also captured by these models. The reason is that the FEA models typically use homogeneous materials and whether or not the behavior in-silico and in-vivo matches would offer an idea of the nature of the stratum corneum. 

      Very few modeling studies have examined combined normal and tangential loading of the fingertip. Additionally, results are often expressed in terms of Von Mises stresses, and not deformation [1,2], making direct comparison challenging. Nevertheless, one multilayered study [3] supports our finding that the largest deformations are found in deeper tissues.

      (1) Shao, F., Childs, T. H. C., Barnes, C. J. & Henson, B. Finite element simulations of static and sliding contact between a human fingertip and textured surfaces. Tribology International 43, 2308–2316 (2010).

      (2) Tang, W. et al. Investigation of mechanical responses to the tactile perception of surfaces with different textures using the finite element method. Advances in Mechanical Engineering 8, (2016).

      (3) Amaied, E., Vargiolu, R., Bergheau, J. M. & Zahouani, H. Aging effect on tactile perception: Experimental and modelling studies. Wear 332–333, 715–724 (2015). 

      (2) Was there a specific reason why the authors chose to track only one fingerprint ridge? From the method section, it seems that nothing would have prevented tracking a denser point cloud and reconstructing the strain on a section of the skin rather than just one ridge. With such data, the authors could extend their analysis to interactions between multiple ridges and get a better sense of the behavior of the entire strip of skin. 

      We apologise for the confusion regarding this point. While in our illustration and the accompanying videos, we only show a single tracked ridge for clarity, we do indeed track all visible ridges in every frame. As imaging slices were 4 mm wide, often 8-9 ridges were visible concurrently. However, during the sliding experiments the skin was sometimes dragged along with the stimulus, causing some ridges to disappear from view for certain periods and then re-enter the frame. This would make it difficult to expand the analysis to multiple ridges, but in any case, we found neighbouring ridges to behave very consistently within a given trial, so that their mechanical behaviour (relative to the tactile feature, if any) could be averaged in the analysis.

      Reviewer #1 (Recommendations For The Authors): 

      Discussion, line 213, "Thus, the primary mechanism through which the ridge conforms to the object involves the relative movement and shearing of the ridge flanks, rather than relying on the grooves as articulated joints." I don't see this as definitely proven in the imaging and analysis. This could be a hypothesis arising from this work for further evaluation, but it is quite a strong statement that is not obviously supported by the evidence. 

      We have rephrased this statement as a proposal for further testing:

      “Therefore, we propose that the primary mechanism through which a ridge conforms to an object might involve the relative movement and shearing of the ridge flanks, rather than relying on the grooves as articulated joints.”

      Discussion, line 220, "Our findings strongly indicate that the majority of the surface movement of the skin was absorbed by deeper tissue rather than surface layers of the skin." But since there are no measurements of such tissues, or of collagen bundle tightening, etc., it is not obvious to me how this can be proven, as it is not directly observable and was not modeled. 

      We have reworded this paragraph to be more cautious and have included potential avenues for future testing of this idea:

      “It is possible that the majority of the surface movement of the skin was absorbed by deeper tissues rather than the surface layers of the skin imaged in the present study. If that is the case, recent modeling work has suggested that tissue deformations are highly dependent on the orientation of collagen fibers in these tissues (Duprez et al., 2024), which might be amenable to tracking in future OCT work to test this idea directly. Additionally, previous work investigating tactile afferent responses to tangential skin movements has reported strong activation of SA-2 receptors, thought to measure skin stretch mainly in deeper tissues (Saal et al., 2025), providing further indirect evidence.”

      Figure 1, A. As noted elsewhere, there are issues with the naming of the anatomy, and there is no definition of the concept of "ridge flanks." Also, it does not indicate the depth point to which OCT can resolve. 

      We have updated and expanded the labels in Figure 1A to clarify the anatomy (along with changes in the text described above). Figure 1C now includes a sentence about the resolvability of features below the mesh:

      “Detail view of a single OCT frame showing ridged skin structure and clear boundary between the stratum corneum and viable epidermis. A mesh covering the stratum corneum and the upper part of the viable epidermis (without the intermediate ridge) is overlaid spanning a single papillary ridge. The border between the viable epidermis and dermis is less clearly delineated, but some deeper features are still resolved, albeit less well.”

      The concept of a ridge flank is now illustrated in Figure 1B(i) and Figure 1B(iv), and referred to in both the caption and main text. Updated figure caption text:

      “These deformations need not apply to the whole ridge structure but might affect different parts separately, e.g. via shearing in different directions across both ridge flanks as shown on the far right (see darker shading to highlight a single ridge flank).”

      Updated text in the main manuscript:

      “Additionally, if there are indeed mechanical differences between papillary ridges and their neighbouring grooves at the level of the stratum corneum, this might result in differential movements of the two sides of each papillary ridge, here referred to as ridge flanks (see Figure 1B-iv, right, for a potential example).”

      Note that Figure 4B also includes an illustration of this concept.

      Figure 1, B. This mechanical representation does not capture the entirety of the papillary-intermediate ridge unit in question, as set up by the authors in the introduction. Also, in the caption it is not ridge deformation, but upper SC and VE deformation. And the OCT cannot resolve the whole ridge. 

      We have reworded the figure caption:

      “Potential deformations of the tracked ridge structure, including the stratum corneum and the bulk of the viable epidermis, during tactile interactions, with arrows indicating the directions of relative deformation. [...]”

      Importantly, the main manuscript text has been rewritten in the introduction section to clarify our research question and how much of the sub-surface ridge structure is tracked:

      “From a mechanical standpoint, these conflicting interpretations raise the question of how the outermost two skin layers typically deform at the resolution of single papillary ridges, whether by tension, compression, or shear (see examples in Figure 1B). Additionally, such deformations might apply to individual papillary ridges and all their sub-surface structures equally, for example horizontal shearing that bends the papillary ridge in a certain direction, while levering its sub-surface aspects in the opposite direction. Conversely, individual parts of the ridge structure might deform differently. For example, the viable epidermis might deform to a different extent or in different directions due to its lower stiffness and different morphology. Additionally, if there are indeed mechanical differences between papillary ridges and their neighbouring grooves at the level of the stratum corneum, this might result in differential movements of the two sides of each papillary ridge, here referred to as ridge flanks (see Figure 1B-iv, right, for a potential example). To empirically address these questions, we employed Optical Coherence Tomography (OCT) to precisely measure the sub-surface deformation of individual fingerprint ridges in response to a variety of mechanical events. Specifically, we focused on the stratum corneum and the bulk of the viable epidermis (excluding intermediate ridges), which could be robustly resolved and tracked by our setup.”

      Figure 1, C: While it is noted in the caption that the locations of the intermediate and limiting ridges, as well as the collagen bundles, are clearly visible, it is not clear to me, although the caption uses these words. This is especially the case below the orange mesh. From the picture, and because this is not labeled, it leaves it up to my interpretation, it seems like the secondary ridge (limiting) is larger than the primary (intermediate). 

      We have reworded the caption as follows:

      “Detail view of a single OCT frame showing ridged skin structure and clear boundary between the stratum corneum and viable epidermis. A mesh covering the stratum corneum and the upper part of the viable epidermis (without the intermediate ridge) is overlaid spanning a single papillary ridge. The border between the viable epidermis and dermis is less clearly delineated.”

      Indeed, while the intermediate ridge was often visible in the OCT images, its size was rather inconsistent and it could appear as larger or smaller than the limiting ridge, while in histological images it is generally shown as larger (however note that there is somewhat limited data). This difference might be due to imaging artifacts, e.g. limited visibility into the deeper tissues, might reflect individual differences between participants, or could indicate that intermediate ridges are not of a consistent height in the (out-of-plane) direction along a given ridge. We have clarified this in the Limitations section of the Discussion:

      “[...] while we could confidently track landmarks associated with the stratum corneum, we could not reliably identify intermediate ridges in the viable epidermis, though they were visible in some of the frames, limiting the depth of the fitted mesh. We hypothesize that the additional depth of these ridges combined with their slender morphology might have degraded the signal. 3D OCT imaging (see below) might help to resolve these features in future work and settle open questions regarding their precise morphology.”

      Figure 1, D, and E: How do these measurements compare with the literature? They seem reasonable to me based on a cursory review, but there is a need to directly compare, especially since measurements in this context with the OCT are novel and could be valuable. 

      We have clarified this in the main text and added more references to the existing literature:

      “We measured an average ridge width of 0.47 mm across participants (Figure 1D), consistent with previous studies (Moore, 1989; Ohler and Cummins, 1942). Average skin layer thickness was 0.38 mm for the stratum corneum and 0.12 mm for the viable epidermis across our dataset (Figure 1E), again in agreement with previous studies using both in vivo imaging and ex vivo histology (Fruhstorfer et al., 2000; Lintzeri et al., 2022; Maiti et al., 2020).”

      Abstract 4th sentence's structure makes me think that hundreds of individual fingerprint ridges can be tracked at the same time. Perhaps it could be tweaked to clearly indicate that hundreds were tracked between trials between participants. 

      We have changed the sentence to now read:

      “Here, we used optical coherence tomography to image and track sub-surface deformations of hundreds of individual fingerprint ridges across ten participants and four individual contact events at high spatial resolution in vivo.”

      Introduction, 1st sentence, the fingertip per se is not an organ, though the skin is an organ. 

      Changed the wording from “organ” to “structure”.

      Introduction, 1st sentence, "... that convert skin deformations ..." Need to add word skin to be clear. 

      Done.

      Introduction, 3rd paragraph, "Alternately, the grooves may be stiffer or less ...". In this paragraph, and this sentence in particular, Cauna is cited and the words grooves and ridges are used. But this is not adequately explained. Cauna had distinct terminology, where he referred to papillary, intermediate, and limiting ridges, that exist in addition to ready ridges. It is important because the manuscript uses the word "ridges" in a non-specific way. This is done not just here but throughout the manuscript, and is central to the questions which can be addressed with OCT. 

      Anatomy has been better defined and more extensively labelled in Figure 1A, including labels for ‘papillary ridges’ and ‘grooves’. We have reworded this paragraph to better explain the concepts and how they relate to the subsequent analyses in the paper:

      “Consequently, the mechanical response of the skin below its immediate surface remains largely unknown, leading to conflicting interpretations in the literature. For instance, it has been proposed that the papillary ridges are stiffer than the neighbouring grooves (Swensson et al., 1998), which might imply that normal loading of the skin might not affect the ridges’ profile appreciably. Conversely, other observations have suggested that the grooves are relatively stiff, allowing the papillary ridges to deform considerably (Cauna, 1954; Johansson and LaMotte, 1983). However, the sub-surface consequences of this putative pliability during object contact or stick-to-slip transitions (see e.g. Delhaye et al., 2016) are unclear: the whole ridge structure might bend as proposed in Cauna’s lever mechanism (Cauna, 1954), but this view has proved controversial (see e.g. Gerling and Thomas, 2008), with direct empirical evidence lacking.”

      Figure 1. Avoid red-green dots for colorblind accessibility. PMMA is not in the caption. 

      We have switched the colors of the mechanoreceptors in panel A to a colorblind-friendly scheme. We now also specify the material of the plates in the figure 1 caption.

      Results, line 102. "... papillary ridge structure...." Is this the ridge to which is being referred? 

      In conjunction with the updated labeling in Figure 1A, we have updated the terminology throughout the paper to be more consistent.

      Results, line 99. "We noted a small increase in the area of the stratum corneum, which was likely an artifact due to the fit of the mesh to the ridge's curvature ..." There is very little discussion of Fig. F's finding related to an increase in area in the SC and decrease in the VE. It makes me question if this finding in this panel is an artifact. With stiff tissue like stratum corneum, how would the area increase? 

      This finding could be a measurement artifact or it could be the result of skin from neighbouring regions pushing into the imaged space. We have reworded the brief description in the Results:

      “We noted a small increase in the area of the stratum corneum, which was possibly an artifact due to the imperfect fit of the mesh to the ridge's curvature (but see Discussion for an alternative explanation).”

      Additionally, we have added a short section in the Discussion in the Limitations section:

      “Some of our tactile interactions might have caused skin deformations out-of-plane that were thus not measurable. For example, the slight increase in thickness of the stratum corneum under normal load might be explained as a measurement artifact due to the coarse nature of the mesh fitted, but could alternatively reflect tissue from out-of-plane regions pushing into the imaged space. Indeed, recent surface measurements of the skin's behaviour during initial object contact have reported compression of the skin in the plane parallel to its surface (Doumont et al., 2025), which would result in increasing thickness, assuming that the stratum corneum is incompressible. Future studies could consider creating three-dimensional reconstructions of the fingerprint structure to study such effects.”

      Figure 3. The colors used in slip and stick are not colorblind accessible. 

      We have changed the background colors in Figure 3A,B,C to a colorblind accessible version.

      Results, line 151, "Thus, most of this shearing must be sustained by deeper tissues." But there are no direct observations as such. Also, in the next sentence, "collagen fiber bundles" are referred to in a non-specific way. This section is highly speculative with no systematic visualization of these structures, and should probably be moved to the discussion. 

      We have reworded this sentence to be more cautious. We have now also highlighted collagen fiber bundles visible in the figure. Systematic analysis of these is beyond the scope of the present study, as these were not tracked, but might be possible in future studies. The reworded sentence reads as follows:

      “Thus, it is possible that shearing is sustained by deeper tissues, an effect that could be tested in future studies by directly tracking the angle and orientation of collagen fiber bundles anchoring the epidermis to deeper tissues (see highlighted examples in Figure 3B).”

      Results, line 161, " Horizontal shear ..." do you mean surface shear, per the Fig. 1 definition? 

      For consistency, we have changed the labels to ‘Horizontal shear’ and ‘Vertical shear’ in Figure 1A(iii) and Figure 1A(iv) as these are the terms used throughout the paper.

      Discussion, line 198, "... flatten even at relatively low forces." This is an interesting point and it would be useful to note how low exactly. 

      We have reworded this sentence to better reflect the findings described earlier:

      “We found that individual ridges tended to flatten considerably at relatively low forces of 0.5 N, with higher forces increasing deformations only moderately.”

      Reviewer #2 (Recommendations For The Authors): 

      Minor comments that could improve the paper even further 

      In the abstract, it may be good to specify that the stimuli were all applied to the finger, this was not an active, self-generated tactile interaction, e.g. change 'in response to a variety of tactile stimuli' to 'in response to a variety of passively-applied tactile stimuli'. 

      Done.

      Comment on the grey/blue colours in the figures. I like the combination of blue/orange for different conditions, but sometimes the blue is very difficult to see against the grey background. Is there any way of making the grey background shading lighter and/or the blue darker/more vivid?

      We have changed the color of the SC mesh to a darker shade of blue, which is more easily distinguished from the grey background. This applies to figures 2B/C, 3D, 4A/B/D/E, and all supplementary figures.

      Methods. Could you please add a little more detail about exactly where the images were taken, e.g. in the exact middle of the fingerpad, at the fingertip? Did you line up the skin fingerprint ridges to be in a plane? It is just to better understand how the stimulus moved against the skin, which itself is rounded, and whether it was at a point where the ridges were relatively linear or curved. 

      We have added the following text in the “Experimental set-up” section of the Methods:

      “The participant's finger was secured in a finger holder, which was positioned in such a way that the flat part of the fingertip distal to the whorl made initial contact with the plate as it was lowered onto the fingertip. The scanner was positioned such that its scan path aligned with the distal-proximal axis of the plate, targeting the centre line of the fingerpad so that the fingerprint ridges were oriented orthogonally to the line scan.”

      and

      “For these experiments, imaging focused on the central flat part of the contact area, such that all fingerprint ridges visible in the imaged region were in contact with the plate throughout the trial.”

      Methods. There is no section about statistics, yet you do use them in the paper. It may be good to add a few details in the methods to outline the package you used to do the statistics, as well as why you chose the tests you carried out. 

      We have added a new Statistics section at the end of the Methods:

      “Statistical tests were run in Python using the scipy.stats package. As distributions were skewed, we used non-parametric analyses throughout the study. Bonferroni corrections were used when multiple comparisons were made.”
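
      The Bonferroni correction mentioned in the Statistics section can be sketched as follows. This is a minimal illustration only, not the authors' analysis code: the function name and the example p-values are hypothetical, and in practice the p-values would come from a non-parametric test in the scipy.stats package.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-values, reject flags) under a Bonferroni correction."""
    m = len(p_values)
    # Multiply each p-value by the number of comparisons, capping at 1.
    adjusted = [min(1.0, p * m) for p in p_values]
    # Reject the null hypothesis only where the adjusted p-value is below alpha.
    reject = [p_adj < alpha for p_adj in adjusted]
    return adjusted, reject


# Hypothetical p-values from three comparisons (illustrative only, not study data).
example_p = [0.004, 0.03, 0.2]
adjusted_p, rejected = bonferroni(example_p)
```

      With three comparisons, only the first hypothetical p-value remains below 0.05 after adjustment.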

      A very minor point. Discussion, line 210: 'In this study...' is vague, which study exactly? It is preferable to be more precise, e.g. 'In the present/current study...'. 

      Fixed.

      Discussion. One point you may want to add is the possibility of looking at other skin regions. For example, would this approach work on the palm, on border glabrous/hairy skin, on various hairy skin sites, and on the foot? The possibilities could be endless if it could be applied anywhere, but it may depend on the technical positioning and skin itself. However, it would be interesting to know. 

      We have added the following text at the end of the Discussion section:

      “Finally, while we focused on the fingertip only, many other skin regions present interesting mechanical challenges waiting to be explored. The general ridged structure observed on the fingertip is common to all glabrous skin, but the local ridge mechanics might still differ: glabrous skin on the foot sole exhibits some morphological differences in order to support large weights that might well influence its mechanical response (Boyle et al., 2019). For example, the morphology of transverse ridges (running orthogonal to and connecting limiting with intermediate ridges) differs across regions on the foot sole (Nagashima and Tsuchida, 2011) and very likely from the hand (Yamada et al., 1996). Our method should be directly applicable to study deformations of these ridges, though three-dimensional observations might be needed to resolve some of the open questions. Hairy skin in contrast differs from glabrous skin in that the stratum corneum is much thinner. It also lacks the clearly organised ridge structure, but exhibits more loosely oriented skin folds instead, which very likely also serve a mechanical function (Leyva-Mendivil et al., 2015) and in principle are amenable to study using OCT.”

      In the last lines of the discussion, you mention the possible effects of skin moisturization. The Tomlinson et al. paper refers to the hydration of the skin with regard to water, which I would say is a slightly different factor. I think you can mention this paper and talk about the water level of the skin/hydration, but also add specifically that moisturization (i.e. by an emollient, humectant, or occlusive substance) is another factor to consider (e.g. effects found by Dione et al, 2023 Sci Rep). Overall, these two points relate to the dryness of the skin and the humidity of surfaces being contacted, therefore you could expand on both. 

      Thank you for the correction! We now mention both skin hydration and moisturization separately in this section.

    1. The diagram

      I don't think presenting the two diagrams is working well if the idea is to compare the two cohorts. I suggest trying a single diagram and adding the numbers for the second cohort in a different color.

    1. This form of empathy can also be called extimizing insofar as it brings into play the desire for extimacy, which presupposes, let us recall, recognizing that others have the power to inform us about ourselves.

      The author coins the term “extimizing empathy” (“empathie extimisante”). It means that the gaze of the other helps us know ourselves better. There is therefore reciprocity in the exchange.

    2. Discourse on intimacy is inseparable from the possibility of establishing an empathic relationship [27].

      Tisseron links extimacy to empathy: to dare to reveal oneself, one must believe that the other can understand what one feels. The Internet creates new forms of empathy.

    3. intimacy shared with a large number of people has been described as “light” intimacy. Its function is to maintain a light social bond

      Here Tisseron speaks of “light intimacy”. It is a way of being connected without really forming deep relationships. It brings to mind “weak ties” on social networks.

    4. The pseudonym sometimes allows concealment, but at other times it is a mask that permits a form of authenticity [17].

      Paradoxically, hiding behind a pseudonym can allow one to be more sincere. It is an interesting idea: anonymity can free up speech.

    5. The Internet is first and foremost a space in which one explores multiple identities.

      On the Internet, one can “test” different versions of oneself. It is a means of exploration, especially in adolescence. This ties in with the idea of play and the imaginary in development.

    6. We need intimacy to build the foundations of self-esteem, but its full construction then passes through the desire for extimacy.

      Tisseron shows that intimacy and extimacy go together. Hiding allows one to build oneself, but showing oneself helps confirm oneself in the gaze of others.

    7. for us it is the process by which fragments of the intimate self are offered to the gaze of others in order to be validated.

      This is the key definition of the word “extimacy” (“extimité”). One shares a part of oneself to get feedback from others. It is not at all the same as “showing oneself to attract attention”.

    8. Subjects would no longer be victims of the discipline described by M. Foucault in Surveiller et Punir [6]; they would have become actors in the construction of their own specular and panoptic prison.

      Interesting: he shows that we are no longer “watched” by others as before, but that we expose ourselves. We become actors in our own visibility.

    1. Reviewer #3 (Public review):

      Summary

      This paper investigates how disinformation affects reward learning processes in the context of a two-armed bandit task, where feedback is provided by agents with varying reliability (with lying probability explicitly instructed). They find that people learn more from credible sources, but also deviate systematically from optimal Bayesian learning: They learned from uninformative random feedback and updated too quickly from fully credible feedback (especially following low-credibility feedback). People also appeared to learn more from positive feedback and there is tentative evidence that this bias is exacerbated for less credible feedback.

      Overall, this study highlights how misinformation could distort basic reward learning processes, without appeal to higher order social constructs like identity.

      Strengths

      - The experimental design is simple and well-controlled; in particular, it isolates basic learning processes by abstracting away from social context
      - Modeling and statistics meet or exceed standards of rigor
      - Limitations are acknowledged where appropriate, especially those regarding external validity and challenges in dissociating positivity bias from perseveration
      - The comparison model, Bayes with biased credibility estimates, is strong; deviations are much more compelling than e.g. a purely optimal model
      - The conclusions are of substantial interest from both a theoretical and applied perspective

      Weaknesses

      The authors have done a great job addressing my concerns with the two previous submissions. The one issue that they were not able to truly address is the challenge of dissociating positivity bias from perseveration; this challenge weakens evidence for the conclusion that less credible feedback yields a stronger positivity bias. However, the authors have clearly acknowledged this limitation and tempered their conclusions accordingly. Furthermore, the supplementary analyses on this point are suggestive (if not fully conclusive) and do a better job of at least trying to address the confound than most work on positivity/confirmation bias.

      I include my previous review describing the challenge in more detail for reference. I encourage interested readers to see the author response as well. It has convinced me that this weakness is not a reflection of the work, but is instead a fundamental challenge for research on positivity bias.

      Absolute or relative positivity bias?

      The conclusion of greater positivity bias for lower credible feedback (Fig 5) hinges on the specific way in which positivity bias is defined. Specifically, we only see the effect when normalizing the difference in sensitivity to positive vs. negative feedback by the sum. I appreciate that the authors present both and add the caveat whenever they mention the conclusion. However, without an argument that the relative definition is more appropriate, the fact of the matter is that the evidence is equivocal.

      There is also a good reason to think that the absolute definition is more appropriate. As expected, participants learn more from credible feedback. Thus, normalizing by average learning (as in the relative definition) amounts to dividing the absolute difference by increasingly large numbers for more credible feedback. If there is a fixed absolute positivity bias (or something that looks like it), the relative bias will necessarily be lower for more credible feedback. In fact, the authors' own results demonstrate this phenomenon (see below). A reduction in relative bias thus provides weak evidence for the claim.
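
      The normalization argument above can be made concrete with a small numerical sketch. The sensitivity values below are hypothetical, chosen only so that the absolute difference stays fixed while overall learning grows with credibility; they are not taken from the paper. The function names are likewise assumptions, with the relative measure defined, as described above, as the difference divided by the sum.

```python
def absolute_bias(pos, neg):
    """Difference in sensitivity to positive vs. negative feedback."""
    return pos - neg


def relative_bias(pos, neg):
    """The same difference, normalized by the sum of the two sensitivities."""
    return (pos - neg) / (pos + neg)


# Hypothetical (positive, negative) sensitivities per credibility level.
# The absolute difference is held at about 0.1 while overall learning increases.
hypothetical = {0.5: (0.25, 0.15), 0.75: (0.45, 0.35), 1.0: (0.65, 0.55)}

summary = {
    cred: (absolute_bias(pos, neg), relative_bias(pos, neg))
    for cred, (pos, neg) in hypothetical.items()
}

for cred, (abs_b, rel_b) in summary.items():
    print(f"credibility={cred}: absolute={abs_b:.3f}, relative={rel_b:.3f}")
```

      In this toy case the absolute bias is constant across credibility levels, yet the relative bias falls as credibility rises, purely because the denominator grows.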

      It is interesting that the discovery study shows evidence of a drop in absolute bias. However, for me, this just raises questions. Why is there a difference? Was one just a fluke? If so, which one?

      Positivity bias or perseveration?

      Positivity bias and perseveration will both predict a stronger relationship between positive (vs. negative) feedback and future choice. They can thus be confused for each other when inferred from choice data. This potentially calls into question all the results on positivity bias.

      The authors clearly identify this concern in the text and go to considerable lengths to rule it out. However, the new results (in revision 1) show that a perseveration-only model can in fact account for the qualitative pattern in the human data (the CA parameters). This contradicts the current conclusion:

      Critically, however, these analyses also confirmed that perseveration cannot account for our main finding of increased positivity bias, relative to the overall extent of CA, for low-credibility feedback.

      Figure 24c shows that the credibility-CA model does in fact show stronger positivity bias for less credible feedback. The model distribution for credibility 1 is visibly lower than for credibilities 0.5 and 0.75.

      The authors need to be clear that it is the magnitude of the effect that the perseveration-only model cannot account for. Furthermore, they should additionally clarify that this is true only for models fit to data; it is possible that the credibility-CA model could capture the full size of the effect with different parameters (which could fit best if the model was implemented slightly differently).

      The authors could make the new analyses somewhat stronger by using parameters optimized to capture just the pattern in CA parameters (for example by MSE). This would show that the models are in principle incapable of capturing the effect. However, this would be a marginal improvement because the conclusion would still rest on a quantitative difference that depends on specific modeling assumptions.

      New simulations clearly demonstrate the confound in relative bias

      Figure 24 also speaks to the relative vs. absolute question. The model without positivity bias shows a slightly stronger absolute "positivity bias" for the most credible feedback, but a weaker relative bias. This is exactly in line with the logic laid out above. In standard bandit tasks, perseveration can be quite well-captured by a fixed absolute positivity bias, which is roughly what we see in the simulations (I'm not sure what to make of the slight increase; perhaps a useful lead for the authors). However, when we divide by average credit assignment, we now see a reduction. This clearly demonstrates that a reduction in relative bias can emerge without any true differences in positivity bias.

      Given everything above, I think it is unlikely that the present data can provide even "solid" evidence for the claim that positivity bias is greater with less credible feedback. This confound could be quickly ruled out, however, by a study in which feedback is sometimes provided in the absence of a choice. This would empirically isolate positivity bias from choice-related effects, including perseveration.

      Comments on revisions:

      Great work on this. The new paper is very interesting as well. I'm delighted to see that the excessive amount of time I spent on this review has had a concrete impact.

    2. Author response:

      The following is the authors’ response to the previous reviews

      eLife Assessment

      This study provides an important extension of credibility-based learning research with a well-controlled paradigm by showing how feedback reliability can distort reward-learning biases in a disinformation-like bandit task. The strength of evidence is convincing for the core effects reported (greater learning from credible feedback; robust computational accounts, parameter recovery) but incomplete for the specific claims about heightened positivity bias at low credibility, which depend on a single dataset, metric choices (absolute vs relative), and potential perseveration or cueing confounds. Limitations concerning external validity and task-induced cognitive load, and the use of relatively simple Bayesian comparators, suggest that incorporating richer active-inference/HGF benchmarks and designs that dissociate positivity bias from choice history would further strengthen this paper.

      We thank the editors and reviewers for a careful assessment.

      In response, we have toned down our claims regarding heightened positivity biases, explicitly stating that the findings are equivocal and depend on the scale (i.e., metric) and study (whereas previously we stated our hypothesis was supported). We have also clarified which aspects of the findings extend beyond perseveration. We believe the evidence now presented provides convincing support for this more nuanced claim.

      We wish to emphasize that dissociating positivity bias from perseveration is a challenge not just for our work, but for the entire field of behavioral reinforcement learning. In fact, in a recent preprint (Learning asymmetry or perseveration? A critical re-evaluation and solution to a pervasive confound, Vidal-Perez et al., 2025; https://osf.io/preprints/psyarxiv/xdse5_v1) we argue that, to date, all studies claiming evidence for positivity bias beyond perseveration suffered flaws, and that there are currently no robust, behavioral, model-agnostic signatures that dissociate effects of positivity bias from perseveration. While this remains a limitation, we would stress that, relative to the state of the art in the field, our work goes beyond what has previously been reported. We believe this should also be reflected in the assessment of our work.

      We elaborate more on these issues in our responses to R3 below.

      Public Reviews:

      Reviewer #1 (Public review):

      Comments on revisions:

      In their updated version the authors have made some edits to address my concerns regarding the framing of the 'normative' Bayesian model, clarifying that they utilized a simple Bayesian model which is intended to adhere in an idealized manner to the intended task structure, though further simulations would have been ideal.

      The authors, however, did not take my recommendation to explore the symptoms in the symptom scales they collected as being a potential source of variability. They note that these were for hypothesis generation and were exploratory, fair enough, but this study is not small and there should have been sufficient sample size for a very reasonable analysis looking at symptom scores.

      However, overall the toned down claims and clarifications of intent are adequate responses to my previous review.

      We thank the reviewer. We remain convinced that targeted hypotheses tested using better-powered designs are the most effective way to examine how our findings relate to symptom scales, something we hope to pursue in future studies.

      Reviewer #2 (Public review):

      This important paper studies the problem of learning from feedback given by sources of varying credibility. The convincing combination of experiment and computational modeling helps to pin down properties of learning, while opening unresolved questions for future research.

      Summary:

      This paper studies the problem of learning from feedback given by sources of varying credibility. Two bandit-style experiments are conducted in which feedback is provided with uncertainty, but from known sources. Bayesian benchmarks are provided to assess normative facets of learning, and alternative credit assignment models are fit for comparison. Some aspects of normativity appear, in addition to possible deviations such as asymmetric updating from positive and negative outcomes.

      Strengths:

      The paper tackles an important topic, with a relatively clean cognitive perspective. The construction of the experiment enables the use of computational modeling. This helps to pinpoint quantitatively the properties of learning and formally evaluate their impact and importance. The analyses are generally sensible, and advanced parameter recovery analyses (including cross-fitting procedure) provide confidence in the model estimation and comparison. The authors have very thoroughly revised the paper in response to previous comments.

      Weaknesses:

      The authors acknowledge the potential for cognitive load and the interleaved task structure to play a meaningful role in the results, though leave this for future work. This is entirely reasonable, but remains a limitation in our ability to generalize the results. Broadly, some of the results obtain in cases where the extent of generalization is not always addressed and remains uncertain.

      We thank the reviewer once more for a thoughtful assessment of our work.

      Reviewer #3 (Public review):

      Summary

      This paper investigates how disinformation affects reward learning processes in the context of a two-armed bandit task, where feedback is provided by agents with varying reliability (with lying probability explicitly instructed). They find that people learn more from credible sources, but also deviate systematically from optimal Bayesian learning: They learned from uninformative random feedback, learned more from positive feedback, and updated too quickly from fully credible feedback (especially following low-credibility feedback). Overall, this study highlights how misinformation could distort basic reward learning processes, without appeal to higher order social constructs like identity.

      Strengths

      • The experimental design is simple and well-controlled; in particular, it isolates basic learning processes by abstracting away from social context

      • Modeling and statistics meet or exceed standards of rigor

      • Limitations are acknowledged where appropriate, especially those regarding external validity

      • The comparison model, Bayes with biased credibility estimates, is strong; deviations are much more compelling than e.g. a purely optimal model

      • The conclusions are of substantial interest from both a theoretical and applied perspective

      Weaknesses

      The authors have addressed most of my concerns with the initial submission. However, in my view, evidence for the conclusion that less credible feedback yields a stronger positivity bias remains weak. This is due to two issues.

      Absolute or relative positivity bias?

      The conclusion of greater positivity bias for lower credible feedback (Fig 5) hinges on the specific way in which positivity bias is defined. Specifically, we only see the effect when normalizing the difference in sensitivity to positive vs. negative feedback by the sum. I appreciate that the authors present both and add the caveat whenever they mention the conclusion. However, without an argument that the relative definition is more appropriate, the fact of the matter is that the evidence is equivocal.

      We thank the reviewer for an insightful engagement with our manuscript. The reviewer’s comments on the subtle interplay between perseveration and learning asymmetries were so thought-provoking that they have inspired a new article that delves deeply into how gradual choice-perseveration can lead to spurious conclusions about learning asymmetries in Reinforcement Learning (Learning asymmetry or perseveration? A critical re-evaluation and solution to a pervasive confound, Vidal-Perez et al., 2025; https://osf.io/preprints/psyarxiv/xdse5_v1).

      To the point: we agree with the reviewer that the evidence for this hypothesis is equivocal, and we took on board the suggestion to tone down our interpretation of the findings. We now state explicitly, both in the results section (“Positivity bias in learning and credibility”) and in the Discussion, that the results provide equivocal support for our hypothesis:

      RESULTS

      “However, we found evidence for agent-based modulation of positivity bias when this bias was measured in relative terms. Here we calculated, for each participant and agent, a relative Valence Bias Index (rVBI) as the difference between the Credit Assignment for positive feedback (CA+) and negative feedback (CA-), relative to the overall magnitude of CA (i.e., |CA+| + |CA-|) (Fig. 5c). Using a mixed effects model, we regressed rVBIs on their associated credibility (see Methods), revealing a relative positivity bias for all credibility levels [overall rVBI (b=0.32, F(1,609)=68.16), 50% credibility (b=0.39, t(609)=8.00), 75% credibility (b=0.41, F(1,609)=73.48) and 100% credibility (b=0.17, F(1,609)=12.62), all p’s<0.001]. Critically, the rVBI varied depending on the credibility of feedback (F(2,609)=14.83, p<0.001), such that the rVBI for the 3-star agent was lower than that for both the 1-star (b=-0.22, t(609)=-4.41, p<0.001) and 2-star agent (b=-0.24, F(1,609)=24.74, p<0.001). Feedback with 50% and 75% credibility yielded similar rVBI values (b=0.028, t(609)=0.56, p=0.57). Finally, a positivity bias could not stem from a Bayesian strategy as both Bayesian models predicted a negativity bias (Fig. 5b-c; Fig. S8; and SI 3.1.1.3 Table S11-S12, 3.2.1.1, and 3.2.1.2). Taken together, this provides equivocal support for our initial hypothesis, depending on the measurement scale used to assess the effect (absolute or relative).”

      “Previous research has suggested that positivity bias may spuriously arise from pure choice-perseveration (i.e., a tendency to repeat previous choices regardless of outcome) (49–51). While our models included a perseveration-component, this control may not be perfect. Therefore, in additional control analyses, we generated (using ex-post simulations based on best fitting parameters) synthetic datasets using models including choice-perseveration but devoid of feedback-valence bias, and fitted them with our credibility-valence model (see SI 3.6.1). These analyses confirmed that a pure perseveration account can masquerade as an apparent positivity bias and even predict the qualitative pattern of results related to credibility (i.e., a higher relative positivity bias for low-credibility feedback). Critically, however, this account consistently predicted a reduced magnitude of credibility-effect on relative positivity bias as compared to the one we observed in participants, suggesting some of the relative amplification of positivity bias goes above and beyond a contribution from perseveration.”

      DISCUSSION

      “Previous reinforcement learning studies report greater credit-assignment based on positive compared to negative feedback, albeit only in the context of veridical feedback (43,44,63). Here, we investigated whether a positivity bias is amplified for information of low credibility, but our findings are equivocal and vary as a function of scaling (absolute or relative) and study. We observe selective absolute amplification of a positivity bias for information of low and intermediate credibility in the discovery study alone. In contrast, we find a relative (to the overall extent of CA) amplification of confirmation bias in both studies. Importantly, the magnitude of these amplification effects cannot be reproduced in ex-post simulations of a model incorporating simple choice perseveration without an explicit positivity bias, suggesting that at least part of the amplification reflects a genuine increase in positivity bias.”

      There is also a good reason to think that the absolute definition is more appropriate. As expected, participants learn more from credible feedback. Thus, normalizing by average learning (as in the relative definition) amounts to dividing the absolute difference by increasingly large numbers for more credible feedback. If there is a fixed absolute positivity bias (or something that looks like it), the relative bias will necessarily be lower for more credible feedback. In fact, the authors' own results demonstrate this phenomenon (see below). A reduction in relative bias thus provides weak evidence for the claim.

      We agree with the reviewer that absolute and relative measures can yield conflicting impressions. To some extent, this is precisely why we report both (i.e., if the two would necessarily agree, reporting both would be redundant). However, we are unconvinced that one measure is inherently more appropriate than the other. In our view, both are valid as long as they are interpreted carefully and in the right context. To illustrate, consider salary changes, which can be expressed on either an absolute or a relative scale. If Bob’s £100 salary increases to £120 and Alice’s £1000 salary increases to £1050, then Bob’s raise is absolutely smaller but relatively larger. Is one measure more appropriate than the other? Economists would argue not; rather, the choice of scale depends on the question at hand.

      In the same spirit, we have aimed to be as clear and transparent as possible in stating that 1) in the main study, there is no effect in the absolute sense, and 2) framing positivity bias in relative terms is akin to expressing it as a percentage change.
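
      The salary illustration can be checked with a few lines of arithmetic. The helper name `raise_summary` is introduced here purely for illustration; the figures are the ones given in the example.

```python
def raise_summary(old_salary, new_salary):
    """Return the raise on an absolute scale and on a relative scale."""
    absolute = new_salary - old_salary
    relative = absolute / old_salary
    return absolute, relative

bob_abs, bob_rel = raise_summary(100, 120)
alice_abs, alice_rel = raise_summary(1000, 1050)

print(f"Bob:   +{bob_abs} ({bob_rel:.0%})")
print(f"Alice: +{alice_abs} ({alice_rel:.0%})")
```

      The two scales order the raises differently, which is the same pattern as an absolute and a relative valence bias index disagreeing across credibility levels.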

      It is interesting that the discovery study shows evidence of a drop in absolute bias. However, for me, this just raises questions. Why is there a difference? Was one just a fluke? If so, which one?

      We are unsure why we didn’t find an absolute amplification effect in the main study. However, we don’t think the results from the preliminary study were just a ‘fluke’. We have recently conducted two new studies (in preparation for publication), where we have been able to replicate the finding of increased positivity bias for lower-credibility sources in both absolute and relative terms. We agree current results leave unresolved questions and we hope to follow up on these in the near future.

      Positivity bias or perseveration?

      Positivity bias and perseveration will both predict a stronger relationship between positive (vs. negative) feedback and future choice. They can thus be confused for each other when inferred from choice data. This potentially calls into question all the results on positivity bias.

      The authors clearly identify this concern in the text and go to considerable lengths to rule it out. However, the new results (in revision 1) show that a perseveration-only model can in fact account for the qualitative pattern in the human data (the CA parameters). This contradicts the current conclusion:

      Critically, however, these analyses also confirmed that perseveration cannot account for our main finding of increased positivity bias, relative to the overall extent of CA, for low-credibility feedback.

      Figure 24c shows that the credibility-CA model does in fact show stronger positivity bias for less credible feedback. The model distribution for credibility 1 is visibly lower than for credibilities 0.5 and 0.75.

      The authors need to be clear that it is the magnitude of the effect that the perseveration-only model cannot account for. Furthermore, they should additionally clarify that this is true only for models fit to data; it is possible that the credibility-CA model could capture the full size of the effect with different parameters (which could fit best if the model was implemented slightly differently).

      The authors could make the new analyses somewhat stronger by using parameters optimized to capture just the pattern in CA parameters (for example by MSE). This would show that the models are in principle incapable of capturing the effect. However, this would be a marginal improvement because the conclusion would still rest on a quantitative difference that depends on specific modeling assumptions.

      We thank the reviewer for raising this important point. We agree our original wording could have been more carefully formulated and are grateful for this opportunity to refine this. The reviewer is correct that a model with only perseveration can qualitatively reproduce the pattern of increased relative positivity bias for less credible feedback in the main study (but not in the discovery study), and our previous text did not acknowledge this. As stated in the previous section, we have revised the manuscript (in the Results, Discussion, and SI) to ensure we address this in full. Our revised text now makes it explicit that while a pure perseveration account predicts the qualitative pattern, it does not predict the magnitude of the effects we observe in our data.

      RESULTS

      “Previous research has suggested that positivity bias may spuriously arise from pure choice-perseveration (i.e., a tendency to repeat previous choices regardless of outcome) (49–51). While our models included a perseveration-component, we acknowledge this control is not perfect. Therefore, in additional control analyses, we generated (using ex-post simulations based on best fitting parameters) synthetic datasets using models including choice-perseveration, but devoid of feedback-valence bias, and fitted these with our credibility-valence model (see SI 3.6.1). These analyses confirmed that a pure perseveration account can masquerade as an apparent positivity bias, and even predict the qualitative pattern of results related to credibility (i.e., a higher relative positivity bias for low-credibility feedback). Critically, however, this account consistently predicted a reduced magnitude of credibility-effect on relative positivity bias as compared to the one we observed in participants, suggesting at least some of the relative amplification of positivity bias goes above and beyond contributions from perseveration.”

      DISCUSSION

      “Previous reinforcement learning studies report greater credit-assignment based on positive compared to negative feedback, albeit only in the context of veridical feedback (43,44,63). Here, we investigated whether a positivity bias is amplified for information of low credibility, but our findings on this matter were equivocal and varied as a function of scaling (absolute or relative) and study. We observe selective absolute amplification of the positivity bias for information of low and intermediate credibility in the discovery study only. In contrast, we find a relative (to the overall extent of CA) amplification of confirmation bias in both studies. Importantly, the magnitude of these amplification effects cannot be reproduced in ex-post simulations of a model incorporating simple choice perseveration without an explicit positivity bias, suggesting that at least part of the amplification reflects a genuine increase in positivity bias.”

      SI (3.6.1)

      “Interestingly, a pure perseveration account predicted an amplification of the relative positivity bias under low (compared to full) credibility (with the two rightmost histograms in Fig. S24d falling in the positive range). However, the magnitude of this effect was significantly smaller than the empirical effect (as the bulk of these same histograms lies below the green points). Moreover, this account predicted a negative amplification (i.e., attenuation) of an absolute positivity bias, which was again significantly smaller than the empirical effect (see corresponding histograms in S24b). This pattern raises an intriguing possibility that perseveration may, at least partially, mask a true amplification of absolute positivity bias.”

      Furthermore, our revisions now make it explicit that these analyses are based on ex-post simulations using the models’ best-fitting parameters. We do not argue that this pattern can’t be captured by other parameters crafted specifically to capture this pattern. However, we believe that the ex-post fitting is the best practice to check whether a model can produce an effect of interest (see for example The Importance of Falsification in Computational Cognitive Modeling, Palminteri et al., 2017; https://www.sciencedirect.com/science/article/pii/S1364661317300542?via%3Dihub). Based on this we agree with the reviewer the benefit from the suggested additional analyses is minimal.

      New simulations clearly demonstrate the confound in relative bias

      Figure 24 also speaks to the relative vs. absolute question. The model without positivity bias shows a slightly stronger absolute "positivity bias" for the most credible feedback, but a weaker relative bias. This is exactly in line with the logic laid out above. In standard bandit tasks, perseveration can be quite well-captured by a fixed absolute positivity bias, which is roughly what we see in the simulations (I'm not sure what to make of the slight increase; perhaps a useful lead for the authors). However, when we divide by average credit assignment, we now see a reduction. This clearly demonstrates that a reduction in relative bias can emerge without any true differences in positivity bias.

      This relates back to the earlier point about scaling. However, we wish to clarify that this is not a confound in the usual sense, i.e., an external variable that varies systematically with the independent variable (credibility) and influences the dependent variable (positivity bias), thereby undermining causal inference. Rather, we consider it a scaling issue: measuring absolute versus relative changes in the same variable can yield conflicting impressions.

      Given everything above, I think it is unlikely that the present data can provide even "solid" evidence for the claim that positivity bias is greater with less credible feedback. This confound could be quickly ruled out, however, by a study in which feedback is sometimes provided in the absence of a choice. This would empirically isolate positivity bias from choice-related effects, including perseveration.

      We trust our responses make clear we have tempered our claims and stated explicitly where a conclusion is equivocal. We believe we have convincing evidence for a nuanced claim regarding how credibility affects positivity bias.

      We are grateful for the reviewer’s suggestion of a study design to empirically isolate positivity bias from choice-related effects. We have considered this carefully, but do not believe the issue is as straightforward as suggested. As we understand it, the suggestion assumes that positivity bias should persist when people process feedback in the absence of choice (where perseverative tendencies would not be elicited). While this is possible, there is existing work that indicates otherwise. In particular, Chambon et al. (2020, Nature Human Behaviour) compared learning following free versus forced choices and found that learning asymmetries, including a positivity bias, were selectively evident in free-choice trials but not in forced-choice trials. This implies that a positivity bias is intricately tied to the act of choosing, rather than a general learning artifact that emerges independently of choice context. This is further supported by arguments that the positivity bias in reinforcement learning is better understood as a form of confirmation bias, whereby feedback confirming a choice is weighted more heavily (Palminteri et al., 2017, PLoS Comput. Biol.). In other words, it is unclear whether one should expect positivity/confirmation bias to emerge when feedback is provided in the absence of choice.

      That said, we agree fully with a need to have task designs that better dissociate positivity bias from perseveration. We now acknowledge in our Discussion that such designs can benefit future studies on this topic:

      Future studies could also benefit from using designs that are better suited for dissociating learning asymmetries from gradual perseveration (51).

      We hope to be able to pursue this direction in the future.

      Recommendations for the Authors:

      I greatly appreciate the care with which you responded to my comments. I'm sorry that I can't improve my overall evaluation, given the seriousness of the concerns in the public review (which the new results have unfortunately bolstered more than assuaged). If it were me, I would definitely collect more data because both issues could very likely be strongly addressed with slight modifications of the current task.

      Alternatively, you could just dramatically de-emphasize the claim that positivity bias is higher for less credible feedback. I will be sad because it was my favorite result, but you have many other strong results, and I would still label the paper "important" without this one.

      We thank the reviewer for an exceptionally thorough and insightful engagement with our manuscript. Your meticulous attention to detail, and sharp conceptual critiques, have been invaluable, and our paper is immeasurably stronger and more rigorous as a direct result of this input. Indeed, the referee’s comments inspired us to prepare a new article that delves deeply into the confound of dissociating between gradual choice-perseveration and learning asymmetries in RL (Learning asymmetry or perseveration? A critical re-evaluation and solution to a pervasive confound, Vidal-Perez et al., 2025; https://osf.io/preprints/psyarxiv/xdse5_v1).

      Specifically, in this new paper we address the point that dissociating positivity bias from perseveration is a challenge not just for our work, but for the entire field of behavioral reinforcement learning. In fact, we argue that all studies claiming evidence for positivity bias, over and above an effect of perseveration, are subject to flaws, including being biased to find evidence for positivity/confirmation bias. Furthermore, we agree with the reviewer’s wish to see model-agnostic support and note there are currently no robust, behavioral, model-agnostic signatures implicating positivity bias over and above an effect of perseveration. While this remains an acknowledged limitation within our current work, we trust the reviewer will agree that relative to other efforts in the field, our current work pushes the boundary and takes several important steps beyond what has previously been done in this area.

      Below are some minor notes, mostly on the new content (hopefully easy); please don't put much time into addressing these!

      Main text

      where individuals preferably learn from . Perhaps "preferentially"?

      The text has been modified to accommodate the reviewer’s comment:

      “Additionally, in both experiments, participants exhibited increased learning from trustworthy information when it was preceded by non-credible information and an amplified normalized positivity bias for noncredible sources, where individuals preferentially learn from positive compared to negative feedback (relative to the overall extent of learning).”

      One interpretation of this model is as a "sophisticated" logistic ... the CA parameters take the role of "regression coefficients"

      Consider removing "sophisticated" and also the quotations around "regression coefficients". This came across as unprofessional to me.

      The text has been modified to accommodate the reviewer’s comment:

      “The probability to choose a bandit (say A over B) in this family of models is a logistic function of the contrast choice-propensities between these two bandits. One interpretation of this model is as a logistic regression, where the CA parameters take the role of regression coefficients corresponding to the change in log odds of repeating the just-taken action in future trials based on the feedback (+/- CA for positive or negative feedback, respectively; the model also includes gradual perseveration which allows for constant log-odd changes that are not affected by choice feedback).”
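
      The logistic-regression reading of the model can be illustrated with a short sketch. The propensity values and the CA value below are hypothetical, and the function names are introduced only for this illustration. The sketch shows that adding CA to the propensity of the just-chosen bandit shifts the log odds of choosing it again by exactly CA.

```python
import math

def prob_choose_a(propensity_a, propensity_b):
    """Logistic function of the contrast between two choice propensities."""
    return 1.0 / (1.0 + math.exp(-(propensity_a - propensity_b)))

def log_odds(p):
    """Log odds corresponding to a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

CA_POS = 0.8  # hypothetical credit assignment for positive feedback

# Bandit A is chosen and receives positive feedback, so its propensity
# increases by CA_POS while bandit B's propensity is unchanged.
before = prob_choose_a(0.3, 0.1)
after = prob_choose_a(0.3 + CA_POS, 0.1)
shift = log_odds(after) - log_odds(before)

print(f"P(A) before={before:.3f}, after={after:.3f}, log-odds shift={shift:.3f}")
```

      Because the choice rule is logistic in the propensity contrast, the log-odds shift equals the CA value regardless of the starting propensities, which is what makes the CA parameters behave like regression coefficients.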

      These models operate as our instructed-credibility and free-credibility Bayesian models, but also incorporate a perseveration values, updated in each trial as in our CA models (Eqs. 3 and 5).

      Is Eq 3 supposed to be Eq 4 here? I don't see how Eq 3 is relevant. Relatedly, please use a variable other than P for perseveration because P(chosen) reads as "probability chosen" - and you actually use P in latter sense in e.g. Eq 11

      The text has been modified to accommodate the reviewer’s comment. P values have been changed to Pers and P(bandit) has been replaced by Prob(bandit). “All models also included gradual perseveration for each bandit. In each trial the perseveration values (Pers) were updated according to

      Where PERS is a free parameter representing the P-value change for the chosen bandit, and fP (∈ [0,1]) is the free parameter denoting the forgetting rate applied to the Pers value. Additionally, the Pers-values of all the non-chosen bandits (i.e., again, the unchosen bandit of the current pair, and all the bandits from the not-shown pairs) were forgotten as follows:

      We modelled choices using a softmax decision rule, representing the probability of the participant to choose a given bandit over the alternative:
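
      The equations themselves are not reproduced in this response, so the following Python sketch is only an illustration of the update and choice rules described in words above. It assumes a specific form that may differ from the manuscript's equations: every Pers value decays by (1 - fP), the chosen bandit additionally receives the PERS increment, and the two-option softmax is a logistic function of the contrast between the summed propensities. The function names and the `q` dictionary of choice propensities are hypothetical.

```python
import math


def update_pers(pers, chosen, pers_step, f_p):
    """Return perseveration values after one trial (assumed update form).

    Every bandit's value is decayed by the forgetting rate f_p; the chosen
    bandit additionally receives the free-parameter increment pers_step.
    """
    updated = {}
    for bandit, value in pers.items():
        decayed = (1.0 - f_p) * value
        updated[bandit] = decayed + pers_step if bandit == chosen else decayed
    return updated


def prob_choose(q, pers, a, b):
    """Probability of choosing bandit a over b.

    A two-option softmax reduces to a logistic function of the contrast
    between the total propensities (value plus perseveration).
    """
    contrast = (q[a] + pers[a]) - (q[b] + pers[b])
    return 1.0 / (1.0 + math.exp(-contrast))


# Illustrative starting point: three bandits, no learned values yet.
q = {"A": 0.0, "B": 0.0, "C": 0.0}
pers = {"A": 0.0, "B": 0.0, "C": 0.0}

pers = update_pers(pers, chosen="A", pers_step=0.5, f_p=0.2)
p_a = prob_choose(q, pers, "A", "B")
```

      With equal values for A and B, the only thing pushing `p_a` above 0.5 in this sketch is the perseveration increment, which is why perseveration can mimic a preference for repeating a choice regardless of feedback.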

      SI

      Figure 24 and Figure 26: in the x tick labels, consider using e.g. "0.5 vs 1" rather than "0.5-1". I initially read this as a bin range.

      We thank the reviewer for pointing this out. Our intention was to denote a direct subtraction (i.e., the effect for 0.5 credibility minus the effect for 1.0 credibility). We were concerned that not noting the subtraction might confuse readers about the direction of the plotted effect. We have clarified this in the figure legends:

      “Figure 24: Predicted positivity bias results for participants and for simulations of the Credibility-CA (including perseveration, but no valence-bias component). a, Valence bias results measured in absolute terms (by regressing the ML CA parameters, on their associated valence and credibility). b, Difference in positivity bias (measured in absolute terms) across credibility levels. On the x-axis, the hyphen (-) represents subtraction, such that a label of '0.5-1' indicates the difference in the measurement for the 0.5 and 1.0 credibility conditions. Such differences are again based in the same mixed effects model as plot a. The inflation of aVBI for lower-credibility agents is larger than the one predicted by a pure perseveration account. c, Valence bias results measured in relative terms (by regressing the rVBIs on their associated credibility). Participants present a higher rVBI than what would be predicted by a perseveration account (except for the completely credible agent). d, Difference in rVBI across credibility levels. Such differences are again based in the same mixed effects model as plot c. The inflation of rVBI for lower-credibility agents is larger than the one predicted by a pure perseveration account. Histograms depict the distribution of coefficients from 101 simulated group-level datasets generated by the Credibility-CA model and fitted with the Credibility-Valence CA model. Gray circles represent the mean coefficient from these simulations, while black/green circles show the actual regression coefficients from participant behaviour (green for significant effects in participants, black for non-significant). Significance markers (* p<.05, ** p<.01) indicate that fewer than 5% or 1% of simulated datasets, respectively, predicted an effect as strong as or stronger than that observed in participants, and in the same direction as the participant effect.”
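
      The significance markers described in this legend amount to a simulation-based tail fraction. A minimal Python sketch of that rule is given below, purely as an illustration: the function names are hypothetical, the numbers are invented, and this is not the analysis code used for the figure.

```python
def simulated_tail_fraction(observed, simulated):
    """Fraction of simulated coefficients that are at least as strong as
    the observed coefficient and lie in the same direction."""
    if observed >= 0:
        hits = sum(1 for s in simulated if s >= observed)
    else:
        hits = sum(1 for s in simulated if s <= observed)
    return hits / len(simulated)


def significance_marker(fraction):
    """Map a tail fraction onto the legend's markers."""
    if fraction < 0.01:
        return "**"
    if fraction < 0.05:
        return "*"
    return ""


# Invented example: 101 simulated group-level coefficients (0.000 to 0.100)
# and one invented participant coefficient.
simulated_coefs = [i / 1000 for i in range(101)]
observed_coef = 0.0985

fraction = simulated_tail_fraction(observed_coef, simulated_coefs)
marker = significance_marker(fraction)
```

      Because the fraction is computed over a finite set of simulated datasets, its resolution is limited by their number: with 101 datasets the smallest non-zero fraction is 1/101.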

      However, importantly, these simulations did not predict a change in the level of positivity bias as a function of feedback credibility

      You're confirming the null hypothesis here; running more simulations would likely yield a significant effect. The simulation shows a pretty clear pattern of increasing positivity bias with higher credibility. Crucially, this is the opposite of what people show. Please adjust the language accordingly.

      The text has been modified to accommodate the reviewer’s comment.

      “However, importantly, these simulations did not reveal a significant change in the level of positivity bias as a function of feedback credibility, neither at an absolute level (F(3,412)=1.43,p=0.24), nor at a relative level (F(3,412)=2.06,p=0.13) (Fig. S25a-c). Numerically, the trend was towards an increasing (rather than decreasing) positivity bias as a function of credibility.”

      More importantly, the inflation in positivity bias for lower credibility feedback is substantially higher in participants than what would be predicted by a pure perseveration account, a finding that holds true for both absolute (Fig. S24b) and relative (Fig. S24d) measures.

      A statistical test would be nice here, e.g. a regression like rVBI ~ credibility_1 * is_model. Alternatively, clearly state what to look for in the figure, where it is pretty clear when you know exactly what you're looking for.

      The text has been modified to make sure that the figure is easier to interpret (we pointed out to readers what they should look at):

      “Interestingly, a pure perseveration account predicted an amplification of the relative positivity bias under low (compared to full) credibility (with the two rightmost histograms in Fig. S24c falling in the positive range). However, the magnitude of this effect was significantly smaller than the empirical effect (as the bulk of these same histograms lies below the green points). Moreover, this account predicted a negative amplification (i.e., attenuation) of an absolute positivity bias, which was again significantly smaller than the empirical effect (see corresponding histograms in S24b). This pattern raises an intriguing possibility that perseveration may partially mask a true amplification of absolute positivity bias.”

    1. Author response:

      General Statements

      We thank the reviewers for providing us the opportunity to revise our manuscript titled “Identifying regulators of associative learning using a protein-labelling approach in C. elegans.” We appreciate the insightful feedback that we received to improve this work. In response, we have extensively revised the manuscript with the following changes: we have (1) clarified the criteria used for selecting candidate genes for behavioural testing, presenting additional data from ‘strong’ hits identified in multiple biological replicates (now testing 26 candidates, previously 17), (2) expanded our discussion of the functional relevance of validated hits, including providing new tissue-specific and neuron class-specific analyses, and (3) improved the presentation of our data, including visualising networks identified in the ‘learning proteome’, to better highlight the significance of our findings. We also substantially revised the text to indicate our attempts to address limitations related to background noise in the proteomic data and outlined potential refinements for future studies. All revisions are clearly marked in the manuscript in red font. A detailed, point-by-point response to each comment is provided below.

      Point-by-point description of the revisions:

      Reviewer #1 (Evidence, reproducibility and clarity):

      Summary:

      Rahmani et al. utilize the TurboID method to characterize the global proteome changes in the worm's nervous system induced by a salt-based associative learning paradigm. Altogether, Rahmani et al. uncover 706 proteins that are tagged by the TurboID method specifically in samples extracted from worms that underwent the memory inducing protocol. Next, the authors conduct a gene enrichment analysis that implicates specific molecular pathways in salt-associative learning, such as MAP-kinase and cAMP-mediated pathways. The authors then screen a representative group of the hits from the proteome analysis. The authors find that mutants of candidate genes from the MAP-kinase pathway, namely dlk-1 and uev-3, do not affect the performance in the learning paradigm. Instead, multiple acetylcholine signaling mutants significantly affected the performance in the associative memory assay, e.g., acc-1, acc-3, gar-1, and lgc-46. Finally, the authors demonstrate that the acetylcholine signaling mutants did not exhibit a phenotype in similar but different conditioning paradigms, such as aversive salt-conditioning or appetitive odor conditioning, suggesting their effect is specific to appetitive salt conditioning.

      Major comments:

      (1) The statistical approach and analysis of the behavior assay:

      The authors use a 2-way ANOVA test which assumes normal distribution of the data. However, the chemotaxis index used in the study is bounded between -1 and 1, which prevents values near the boundaries from being normally distributed.

      Since most of the control data in this assay in this study is very close to 1, it strongly suggests that the CI data is not normally distributed and therefore 2-way ANOVA is expected to give skewed results.

      I am aware this is a common mistake and I also anticipate that most conclusions will still hold also under a more fitting statistical test.

      We appreciate the point raised by Reviewer 1 and understand the importance of performing the correct statistical tests.

      The statistical tests used in this study were chosen since parametric tests, particularly ANOVA tests to assess differences between multiple groups, are commonly used to assess behaviour in the C. elegans learning and memory field. Below is a summary of the tests used by studies that perform similar behavioural tests cited in this work, as examples:

      Author response table 1.

      A summary of the statistical tests performed by similar studies for chemotaxis assay data. References (listed in the leftmost column) were observed to (A) use parametric tests only or (B) perform either a parametric or non-parametric test on each chemotaxis assay dataset depending on whether the data passed a normality test. Listings for ANOVA tests are in bold to demonstrate their common use in the C. elegans learning and memory field.

      We note Reviewer 1's concern that this may stem from a common mistake. As stated, Two-way ANOVA generally relies on normally distributed data. We used GraphPad Prism to perform the Shapiro-Wilk normality test on our chemotaxis assay data as it is generally appropriate for sample sizes < 50 (α = 0.05), and found that most data passes this test including groups with skewed indices. For example, this is the data for Figure S8C:

      Author response table 2.

      Shapiro-Wilk normality test results for chemotaxis assay data in Figure S8C. Chemotaxis assay data was generated to assess salt associative learning capacity for wild-type (WT) versus lgc-46(-) mutant C. elegans. Three experimental groups were prepared for each C. elegans strain (naïve, high-salt control, and trained). From top-to-bottom, the data below displays the ‘W’ value, ‘P value’, a binary yes/no for whether the data passes the Shapiro-Wilk normality test, and a ‘P value summary’ (ns = nonsignificant). W values measure the similarity between a normal distribution and the chemotaxis assay data. Data is considered normal in the Shapiro-Wilk normality test when a W value is near 1.0 and the null hypothesis is not rejected (i.e., P value > 0.05).

      The manuscript now includes the use of the Shapiro-Wilk normality test to assess chemotaxis assay data before using two-way ANOVA on page 51.

      Nevertheless an appropriate statistical analysis should be performed. Since I assume the authors would wish to take into consideration both the different conditions and biological repeats, I can suggest two options:

      - Using a Generalized linear mixed model, one can do with R software.

      - Using a custom bootstrapping approach.

      We thank Reviewer 1 for suggesting these two options. We carefully considered both approaches and consulted with the in-house statistician at our institution (Dr Pawel Skuza, Flinders University) for expert advice to guide our decision. In summary:

      (1) Generalised linear mixed models: Generalised linear mixed models (GLMMs) are generally most appropriate for nested/hierarchical data. However, our chemotaxis assay data does not exhibit such nesting. Each biological replicate (N) consists of three technical replicates, which are averaged to yield a single chemotaxis index per N. Our statistical comparisons are based solely on these averaged values across experimental groups, making GLMMs less applicable in this context.

      (2) Bootstrapping: Based on advice from our statistician, while bootstrapping can be a powerful tool, its effectiveness is limited when applied to datasets with a low number of biological replicates (N). Bootstrapping relies on resampling existing data to simulate additional observations, which may artificially inflate statistical power and potentially suggest significance where the biological effect size is minimal or not meaningful. Increasing the number of biological replicates to accommodate bootstrapping could introduce additional variability and compromise the interpretability of the results.
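
      To make the data structure in points (1) and (2) concrete, the short Python sketch below averages technical replicates into one chemotaxis index per biological replicate and counts how many distinct resamples a bootstrap can draw from N such values. It is an illustration only: the index values are invented, the function names are hypothetical, and N = 5 is used simply as an example of a small number of biological replicates.

```python
from math import comb
from statistics import mean


def replicate_index(technical_indices):
    """Average the technical replicates of one biological replicate
    into a single chemotaxis index."""
    return mean(technical_indices)


def distinct_bootstrap_resamples(n):
    """Number of distinct resamples of size n drawn with replacement
    from n values (multisets of size n): C(2n - 1, n)."""
    return comb(2 * n - 1, n)


# Invented chemotaxis indices: two biological replicates,
# each with three technical replicates.
technical = [[0.8, 0.9, 1.0], [0.6, 0.7, 0.8]]
per_replicate = [replicate_index(t) for t in technical]

# With five averaged values there are only 126 distinct resamples.
n_distinct = distinct_bootstrap_resamples(5)
```

      The count grows quickly with N, so the small number of distinct resamples is a feature of low N specifically rather than of bootstrapping in general.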

      The total number of assays, especially controls, varies quite a bit between the tested mutants. For example compare the acc-1 experiment in Figure 4.A., and gap-1 or rho-1 in Figure S4.A and D. It is hard to know the exact N of the controls, but I assume that for example, lowering the wild type control of acc-1 to equivalent to gap-1 would have made it non significant. Perhaps the best approach would be to conduct a power analysis, to know what N should be acquired for all samples.

      We carefully considered performing a power analysis; however, this is typically performed with the assumption that an N = 1 represents a singular individual/person. An N = 1 in this study is one biological replicate that includes hundreds of worms, which is why a power analysis is not typically employed in our field for this type of behavioural test.

      Considering these factors, we have opted to continue using a two-way ANOVA for our statistical analysis. This choice aligns with recent publications that employ similar experimental designs and data structures. Crucially, we have verified that our data meet the assumptions of normality, addressing key concerns regarding the suitability of parametric testing. We believe this approach is sufficiently rigorous to support our main conclusions. This rationale is now outlined on page 51.

      To be fully transparent, our aim is to present differences between wild-type and mutant strains that are clearly visible in the graphical data, such that the choice of statistical test does not become a limiting factor in interpreting biological relevance. We hope this rationale is understandable, and we sincerely appreciate the reviewer’s comment and the opportunity to clarify our analytical approach.

      We hope that Reviewer 1 will appreciate these considerations as sufficient justification to retain the statistical tests used in the original manuscript. Nevertheless, to constructively address this comment, we have performed the following revisions:

      (1) Consistent number of biological replicates: We performed additional biological replicates of the learning assay to confirm the behavioural phenotypes for the key candidates described (KIN-2 , F46H5.3, ACC-1, ACC-3, LGC-46). We chose N = 5 since most studies cited in this paper that perform similar behavioural tests do the same (see Author response table 3 below).

      Author response table 3.

      A summary of sample sizes generated by similar studies for chemotaxis assay data. References (listed in the leftmost column) were observed to use the sample sizes (N) below, corresponding to biological replicates of chemotaxis assay data. N values are in bold when the study uses N ≤ 5.

      (2) Grouped presentation of behavioural data: We now present all behavioural data by grouping genotypes tested within the same biological replicate, including wild-type controls, rather than combining genotypes tested separately. This ensures that each graph displays data from genotypes sharing the same N, also an important consideration for performing parametric tests. Accordingly, we re-performed statistical analyses using this reduced N for relevant graphs. As anticipated, this rendered some comparisons non-significant. All statistical comparisons are clearly indicated on each graph.

      (3) Improved clarity of figure legends: We revised figure legends for Figures 5, 6, S7, S8, & S9 to make clear how many biological replicates have been performed for each genotype by adding N numbers for each genotype in all figures.

      The authors use the phrasing "a non-significant trend", I find such claims uninterpretable and should be avoided. Examples: Page 16. Line 7 and Page 18, line 16.

      This is an important point. While we were not able to find the specific phrasing "a non-significant trend" from this comment in the original manuscript, we acknowledge that referring to a phenotype as both a trend and non-significant may confuse readers, which was originally stated in the manuscript in two locations.

      The main text has been revised on pages 27 & 28 when describing comparisons between trained groups between two C. elegans lines, by removing mentions of trends and retaining descriptions of non-significance.

      (2) Neuron-specific analysis and rescue of mutants:

      Throughout the study the authors avoid focusing on specific neurons. This is understandable as the authors aim at a systems biology approach, however, in my view this limits the impact of the study. I am aware that the proteome changes analyzed in this study were extracted from a pan neuronally expressed TurboID. Yet, neuron-specific changes may nevertheless be found. For example, running the protein lists from Table S2, in the Gene enrichment tool of wormbase, I found, across several biological replicates, enrichment for the NSM, CAN and RIG neurons. A more careful analysis may uncover specific neurons that take part in this associative memory paradigm. In addition, analysis of the overlap in expression of the final gene list in different neurons, comparing them, looking for overlap and connectivity, would also help to direct towards specific circuits.

      This is an important and useful suggestion. We appreciate the benefit in exploring the data from this study from a neuron class-specific lens, in addition to the systems-level analyses already presented.

      The WormBase gene enrichment tool is indeed valuable for broad transcriptomic analyses (the findings from utilising this tool are now on page 16); however, its use of Anatomy Ontology (AO) terms also contains annotations from more abundant non-neuronal tissues in the worm. To strengthen our analysis and complement the Wormbase tool, we also used the CeNGEN database as suggested by Reviewer 3 Major Comment 1 (Taylor et al., 2021), which uses single cell RNA-Seq data to profile gene expression across the C. elegans nervous system. We input our learning proteome data into CeNGEN as a systemic analysis, identifying neurons highly represented by the learning proteome (on pages 16-20). To do this, we specifically compared genes/proteins from high-salt control worms and trained worms to identify potential neurons that may be involved in this learning paradigm. Briefly, we found:

      - WormBase gene enrichment tool: Enrichment for anatomy terms corresponding to specific interneurons (ADA, RIS, RIG), ventral nerve cord neurons, pharyngeal neurons (M1, M2, M5, I4), PVD sensory neurons, DD motor neurons, serotonergic NSM neurons, and CAN.

      - CeNGEN analysis: Representation of neurons previously implicated in associative learning (e.g., AVK interneurons, RIS interneurons, salt-sensing neuron ASEL, CEP & ADE dopaminergic neurons, and AIB interneurons), as well as neurons not previously studied in this context (pharyngeal neurons I3 & I6, polymodal neuron IL1, motor neuron DA9, and interneuron DVC). Methods are detailed on pages 50 & 51.

      These data are summarised in the revised manuscript as Table S7 & Figure 4.

      To further address the reviewer’s suggestion, we examined the overlap in expression patterns of the validated learning-associated genes acc-1, acc-3, lgc-46, kin-2, and F46H5.3 across the neuron classes above, using the CeNGEN database. This was done to explore potential neuron classes in which these regulators may act in to regulate learning. This analysis revealed both shared and distinct expression profiles, suggesting potential functional connectivity or co-regulation among subsets of neurons. To summarise, we found:

      - All five learning regulators are expressed in RIM interneurons and DB motor neurons.

      - KIN-2 and F46H5.3 share the same neuron expression profile and are present in many neurons, so they may play a general function within the nervous system to facilitate learning.

      - ACC-3 is expressed in three sensory neuron classes (ASE, CEP, & IL1).

      - In contrast, ACC-1 and LGC-46 are expressed in neuron classes (in brackets) implicated in gustatory or olfactory learning paradigms (AIB, AVK, NSM, RIG, & RIS) (Beets et al., 2012, Fadda et al., 2020, Wang et al., 2025, Zhou et al., 2023, Sato et al., 2021), neurons important for backward or forward locomotion (AVE, DA, DB, & VB) (Chalfie et al., 1985), and neuron classes whose function is yet to be detailed in the literature (ADA, I4, M1, M2, & M5).

      These neurons form a potential neural circuit that may underlie this form of behavioural plasticity, which we now describe in the main text on pages 16-20 & 34-35 and summarise in Figure 4.

      OPTIONAL: A rescue of the phenotype of the mutants by re-expression of the gene is missing, this makes sure to avoid false-positive results coming from background mutations. For example, a pan neuronal or endogenous promoter rescue would help the authors to substantiate their claims, this can be done for the most promising genes. The ideal experiment would be a neuron-specific rescue but this can be saved for future works.

      We appreciate this suggestion and recognise its potential to strengthen our manuscript. In response, we made many attempts to generate pan-neuronal and endogenous promoter re-expression lines. However, we faced several technical issues in transgenic line generation, including poor survival following microinjection likely due to protein overexpression toxicity (e.g., C30G12.6, F46H5.3), and reduced animal viability for chemotaxis assays, potentially linked to transgene-related reproductive defects (e.g., ACC-1). As we have previously successfully generated dozens of transgenic lines in past work (e.g. Chew et al., Neuron 2018; Chew et al., Phil Trans B 2018; Gadenne/Chew et al., Life Science Alliance 2022), we believe the failure to produce most of these lines is not likely due to technical limitations. For transparency, these observations have been included in the discussion section of the manuscript on pages 39 & 40 as considerations for future troubleshooting.

      Fortunately, we were able to generate a pan-neuronal promoter line for KIN-2 that has been tested and included in the revised manuscript. These new data are shown in Figure 5B and described on pages 23 & 24. Briefly, this shows that pan-neuronal expression of KIN-2 from the ce179 mutant allele is sufficient to reproduce the enhanced learning phenotype observed in kin-2(ce179) animals, confirming the role of KIN-2 in gustatory learning.

      To address the potential involvement of background mutations (also indicated by Reviewer 4 under ‘cross-commenting’), we have also performed experiments with backcrossed versions of several mutants. These experiments aimed to confirm that salt associative learning phenotypes are due to the expected mutation. Namely, we assessed kin-2(ce179) mutants that had been backcrossed previously by another laboratory, as well as C30G12.6(-) and F46H5.3(-) animals backcrossed in this study. Although not all backcrossed mutants retained their original phenotype (i.e., C30G12.6) (Figure 6D, a newly added figure), we found that backcrossed versions of KIN-2 and F46H5.3 both robustly showed enhanced learning (Figures 5A & 6B).

      This is described in the text on pages 23-26.

      Minor comments:

      (1) Lack of clarity regarding the validation of the biotin tagging of the proteome.

      The authors show in Figure 1 that they validated that the combination of the transgene and biotin allows them to find more biotin-tagged proteins. However there is significant biotin background also in control samples as is common for this method. The authors mention they validated biotin tagging of all their experiments, but it was unclear in the text whether they validated it in comparison to no-biotin controls, and checked for the fold change difference.

      This is an important point: We validated our biotin tagging method prior to mass spectrometry by comparing ‘no biotin’ and ‘biotin’ groups. This is shown in Figure S1 in the revised manuscript, which includes a western blot comparing untreated and biotin-treated animals that are non-transgenic or expressing TurboID. As expected, by comparing biotinylated protein signal for untreated and treated lanes within each line, biotin treatment increased the signal 1.30-fold for non-transgenic and 1.70-fold for TurboID C. elegans. This is described on page 8 of the revised manuscript.

      To clarify, for mass spectrometry experiments, we tested a no-TurboID (non-transgenic) control, but did not perform a no-biotin control. We included the following four groups: (1) No-TurboID ‘control’ (2) No-TurboID ‘trained’, (3) pan-neuronal TurboID ‘control’ and (4) pan-neuronal TurboID ‘trained’, where trained versus control refers to whether ‘no salt’ was used as the conditioned stimulus or not, respectively (illustrated in Figure 1A). Due to the complexity of the learning assay (which involves multiple washes and handling steps, including a critical step where biotin is added during the conditioning period), and the need to collect sufficient numbers of worms for protein extraction (>3,000 worms per experimental group), adding ‘no-biotin’ controls would have doubled the number of experimental groups, which we considered unfeasible for practical reasons. This is explained on pages 8 & 9 of the revised manuscript.

      Also, it was unclear which exact samples were tested per replicate. In Page 9, Lines 17-18: "For all replicates, we determined that biotinylated proteins could be observed ...", But in Page 8, Line 24 : "We then isolated proteins from ... worms per group for both 'control' and 'trained' groups,... some of which were probed via western blotting to confirm the presence of biotinylated proteins".

      Could the authors specify which samples were verified and clarify how?

      Thank you for pointing out these unclear statements: We have clarified the experimental groups used for mass spectrometry experiments as detailed in the response above on pages 8 & 9. In addition, western blots corresponding to each biological replicate of mass spectrometry data are described in the main text on page 10 and have been added to the revised manuscript (as Figure S3). These western blots compare biotinylation signal for proteins extracted from (1) No-TurboID ‘control’, (2) No-TurboID ‘trained’, (3) pan-neuronal TurboID ‘control’ and (4) pan-neuronal TurboID ‘trained’. These blots function to confirm that there were biotinylated proteins in TurboID samples, before enrichment by streptavidin-mediated pull-down for mass spectrometry.

      OPTIONAL: include the fold changes of biotinylated proteins of all the ones that were tested. Similar to Figure 1.C.

      This is an excellent suggestion. As recommended by the reviewer, we have included fold changes for biotinylated protein levels between high-salt control and trained groups (on pages 9 & 10 for replicate #1 and in Table S2 for replicates #2-5). This was done by measuring protein levels in whole lanes for each experimental group per biological replicate within western blots (Figure 1C for replicate #1 and Figure S3 for replicates #2-5) of protein samples generated for mass spectrometry (N = 5).

      (2) Figure 2 does not add much to the reader, it can be summarized in the text, as the fraction of proteins enriched for specific cellular compartments.

      I would suggest to remove Figure 2 (originally written as figure 3) to text, or transfer it to the supplementary material.

      As noted in the cross-comment response to Reviewer 4, there were typos in the original figure references; we have corrected them above. Essentially, this comment is referring to Figure 2.

      We appreciate this feedback from Reviewer 1. We agree that the original Figure 2 functions as a visual summary from analysis of the learning proteome at the subcellular compartment level. However, it also serves to highlight the following:

      - Representation for neuron-specific GO terms is relatively low, but even this small percentage represents entire protein-protein networks that are biologically meaningful, but that are difficult to adequately describe in the main text.

      - TurboID was expressed in neurons so this figure supports the relevance of the identified proteome to biological learning mechanisms.

      - Many of these candidates could not be assessed by learning assay using single mutants since related mutations are lethal or substantially affect locomotion. These networks therefore highlight the benefit in using strategies like TurboID to study learning.

      We have chosen to retain this figure, moving it to the supplementary material as Figure S4 in the revised manuscript, as suggested.

      OPTIONAL- I would suggest the authors to mark in a pathway summary figure similar to Figure 3 (originally written as Figure 4) the results from the behavior assay of the genetic screen. This would allow the reader to better get the bigger picture and to connect to the systemic approach taken in Figures 2 and 3.

      We think this is a fantastic suggestion and thank Reviewer 1 for this input. In the revised manuscript, we have added Figure 7, which summarises the tested candidates that displayed an effect on learning, mapped onto potential molecular pathways derived from networks in the learning proteome. This figure provides a visual framework linking the behavioural outcomes to the network context. This is described in the main text on pages 32-33.

      (3) Typo in Figure 3: the circle of PPM1: The blue right circle half is bigger than the left one.

      We thank the Reviewer for noticing this; the node size for PPM-1.A has been corrected in what is now Figure 2 in the revised work.

      (4) Unclarity in the discussions. In the discussion Page 24, Line 14, the authors raise this question: "why are the proteins we identified not general learning regulators?". The phrasing and logic of the argumentation of the possible answers was hard to follow. - Can you clarify?

      We appreciate this feedback in terms of unclarity, as we strive to explain the data as clearly and transparently as possible. Our goal in this paragraph was to discuss why some candidates were seen to only affect salt associative learning, as opposed to showing effects in multiple learning paradigms (i.e., which we were defining as a ‘general learning regulator’). We have adjusted the wording in several places in this paragraph now on pages 36 & 37 to address this comment. We hope the rephrased paragraph provides sufficient rationalisation for the discussion regarding our selection strategy used to isolate our protein list of potential learning regulators, and its potential limitations.

      Cross-Commenting

      Firstly, we would like to express our appreciation for the opportunity for reviewers to cross-comment on feedback from other reviewers. We believe this is an excellent feature of the peer review process, and we are grateful to the reviewers for their thoughtful engagement and collaborative input.

      I would like to thank Reviewer #4 for the great cross-comment summary; I find it accurate and helpful.

      I would also like to thank Reviewer #4 for spotting the typos in my minor comments; their page and figure numbers are the correct ones.

      We have corrected these typos in the relevant comments, and have responded to them accordingly.

      Small comment on common point 1 - My feeling is that it is challenging to do quantitative mass spectrometry, especially with TurboID. In general, the nature of MS data is that it hints towards a direction but follow-up validation work is required in order to assess it. For example, I am not surprised that the fraction of repeats a hit appeared in does not predict well whether this hit would be validated behaviorally. Given these limitations, I find the authors' approach reasonable.

      We thank Reviewer 1 for this positive and thoughtful feedback. We also appreciate Reviewer 4’s comment regarding quantitative mass spectrometry and have addressed this in detail below (see response to Reviewer 4). However, we agree with Reviewer 1 that there are practical challenges to performing quantitative mass spectrometry with TurboID, primarily due to the enrichment for biotinylated proteins that is a key feature of the sample preparation process.

      Importantly, we whole-heartedly agree with Reviewer 1’s statement that “In general, the nature of MS data is that it hints towards a direction but a follow-up validation work is required in order to assess it”. This is the core of our approach. However, we appreciate that there are limitations to a qualitative ‘absent/present’ approach. We have addressed some of these limitations by clarifying the criteria used for selecting candidate genes, which are additionally based on the presence of the candidate in multiple biological replicates (categorised as ‘strong’ hits). Based on this method, we were able to validate the role of several novel learning regulators (Figures 5, 6, & S7). We sincerely hope that this manuscript can function as a direction for future research, as suggested by this Reviewer.

      I also would like to highlight this major comment from reviewer 4:

      "In Experimental Procedures, authors state that they excluded data in which naive or control groups showed average CI < 0.6499, and/or trained groups showed average CI < -0.0499 or > .5499 for N2 (page 36, lines 5-7). "

      This threshold seems arbitrary to me too, and it requires the clarifications requested by reviewer 4.

      As detailed in our response to Reviewer 4, Major Comment 2, data were excluded only in rare cases, specifically when N2 worms failed to show strong salt attraction prior to training, or when trained N2 worms did not exhibit the expected behavioural difference compared to untrained controls – this can largely be attributed to clear contamination or over-population issues, which are visible prior to assessing CTX plates and counting chemotaxis indices.

      These criteria were initially established to provide an objective threshold for excluding biological replicates, particularly when planning to assay a large number of genetic mutants. However, after extensive testing across many replicates, we found that N2 worms (that were not starved, or not contaminated) consistently displayed the expected phenotype, rendering these thresholds unnecessary. We acknowledge that emphasizing these criteria may have been misleading, and have therefore removed them from page 50 in the revised manuscript to avoid confusion and ensure clarity.

      Reviewer #1 (Significance):

      This study does a great job of effectively utilizing the TurboID technique to identify new pathways implicated in salt-associative learning in C. elegans. This technique was used in C. elegans before, but not in this context. The salt-associative memory induced proteome list is a valuable resource that will help future studies on associative memory in worms. Some of the implicated molecular pathways, such as cAMP, were previously found to be involved in memory in worms, as correctly referenced in the manuscript. The implication of the acetylcholine pathway is novel for C. elegans, to the best of my knowledge. The finding that the uncovered genes are specifically required for salt associative memory and not for other memory assays is also interesting.

      However overall I find the impact of this study limited. The premise of this work is to use the Turbo-ID method to conduct a systems analysis of the proteomic changes. The work starts by conducting network analysis and gene enrichment which fit a systemic approach. However, since the authors find that ~30% of the tested hits affect the phenotype, and since only 17/706 proteins were assessed, it is challenging to draw conclusive broad systemic claims.

      Alternatively, the authors could have focused on the positive hits, understood them better, and found the specific circuits where these genes act. This could have increased the impact of the work. Since neither of these two options is satisfied, I view this work as solid, but not wide in its impact, and therefore estimate the audience of this study would be more specialized.

      My expertise is in C. elegans behavior, genetics, and neuronal activity, programming and machine learning.

      We thank the Reviewer for these comments and appreciate the recognition of the value of the proteomic dataset and the identification of novel molecular pathways, including the acetylcholine pathway, as well as the specificity of the uncovered genes to salt-associative memory. Regarding the reviewer’s concern about the overall impact and scope of the study, we respectfully offer the following clarification. Our aim was to establish a systems-level approach for investigating learning-related proteomic changes using TurboID, and we acknowledge that only a subset of the identified proteins was experimentally tested (now 26/706 proteins in the revised manuscript). Although only five of the tested single gene mutants showed a robust learning phenotype in the revised work (after backcrossing, more stringent candidate selection, and improved statistical analysis in response to reviewer comments), our proteomic data provides us with a unique opportunity to define these candidates within protein-protein networks (as illustrated in Figure 7). Importantly, our functional testing focused on single-gene mutants, which may not reveal phenotypes for genes that act redundantly (now mentioned on pages 28-30). This limitation is inherent to many genetic screens and highlights the value of our proteomic dataset, which enables the identification of broader protein-protein interaction networks and molecular pathways potentially involved in learning.

      To support this systems-level perspective, we have added Figure 7, which visually integrates the tested candidates into molecular pathways derived from the learning proteome for learning regulators KIN-2 and F46H5.3. We also emphasise more explicitly in the text (on pages 32-33) the value of our approach by highlighting the functional protein networks that can be derived from our proteomics dataset.

      We fully acknowledge that the use of TurboID across all neurons limits the resolution needed to pinpoint individual neuron contributions, and we understand the benefit of further experiments to explore specific circuits. Many circuits required for salt sensing and salt-based learning are well explored in the literature and defined explicitly (see Rahmani & Chew, 2021), so our intention was to complement the existing literature by exploring the protein-protein networks involved in learning, rather than neuron-neuron connectivity. However, we recognise the benefit of integrating circuit-level analyses, given that our proteomic data suggests hundreds of candidates potentially involved in learning. While validating each of these candidates is beyond the scope of the current study, we have taken steps to suggest candidate neurons/circuits by incorporating tissue enrichment analyses and single-cell transcriptomic data (Table S7 & Figure 4). These additions highlight neuron classes of interest and suggest possible circuits relevant to learning.

      We hope this clarification helps convey the intended scope and contribution of our study. We also believe that the revisions made in response to Reviewer 1’s feedback have strengthened the manuscript and enhanced its significance within the field.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary:

      In this study by Rahmani and colleagues, the authors sought to define the "learning proteome" for a gustatory associative learning paradigm in C. elegans. Using a cytoplasmic TurboID expressed under the control of a pan-neuronal promoter, the authors labeled proteins during the training portion of the paradigm, followed by proteomics analysis. This approach revealed hundreds of proteins potentially involved in learning, which the authors describe using gene ontology and pathways analysis. The authors performed functional characterization of some of these genes for their requirement in learning using the same paradigm. They also compared the requirement for these genes across various learning paradigms, and found that most hits they characterized appear to be specifically required for the training paradigm used for generating the "learning proteome".

      Major Comments:

      (1) The definition of a "hit" from the TurboID approach does not appear stringent enough. According to the manuscript, a hit was defined as one unique peptide detected in a single biological replicate (out of 5), which could give rise to false positives. In Figure S2, it is clear that there is relatively little overlap between replicates with regard to the proteins detected, and while perhaps unintentional, presenting a single unique peptide appears to be an attempt to inflate the number of hits. Defining hits as present in more than one sample would be more rigorous. Changing the definition of hits would only require the time to re-list genes and change data presented in the manuscript accordingly.

      We thank Reviewer 2 for this valuable comment, and the following related suggestion. We agree with the statement that “Defining hits as present in more than one sample would be more rigorous”. Therefore, to address this comment, we have now separated candidates into two categories in Table 2 in the revised manuscript: ‘strong’ (present in 3 or more biological replicates) and ‘weak’ candidates (present in 2 or fewer biological replicates). However, we think these weaker candidates should still be included in the manuscript, considering we did observe relationships between these proteins and learning. For example, ACC-1, which influences salt associative learning in C. elegans, was detected in one replicate of mass spectrometry as a potential learning regulator (Figure S8A). We describe this classification in the main text on pages 21-22.

      We also agree with Reviewer 2 that the overlap between individual candidate hits is low between biological replicates; the inclusion of Figure S2 in the original manuscript serves to highlight this limitation. However, it is also important to consider that there is notable overlap for whole molecular pathways between biological replicates of mass spectrometry data as shown in Figure 2 in the revised manuscript (this consideration is now mentioned on pages 13-14). We have included Figure 3 to illustrate representation for two metabolic processes across several biological replicates normally indispensable to animal health, as an example to provide additional visual aid for the overlap between replicates of mass spectrometry. We provide this figure (described on pages 13 & 15) to demonstrate the strength of our approach in that it can detect candidates not easily assessable by conventional forward or reverse genetic screens.

      We also appreciate the opportunity to explain our approach. The criterion of “at least one unique peptide” was chosen based on previous work that we adapted for this manuscript (Prikas et al., 2020). It was not intended to inflate the number of hits but rather to ensure sensitivity in detecting low-abundance neuronal proteins. We have clarified this in our Methods (page 46).

      (2) The "hits" that the authors chose to functionally characterize do not seem like strong candidate hits based on the proteomics data that they generated. Indeed, most of the hits are present in a single, or at most 2, biological replicate. It is unclear as to why the strongest hits were not characterized, which if mutant strains are publicly available, would not be a difficult experiment to perform.

      We thank the reviewer for this important suggestion. To address this, we have described two molecular pathways with multiple components that appear in more than one biological replicate of mass spectrometry data in Figure 3 (main text on page 13). In addition, we have included Figures 6 & S7 where 9 additional single mutants corresponding to candidates in three or more biological replicates of mass spectrometry were tested for salt associative learning. Briefly, we found the following (number of replicates that a protein was unique to TurboID trained animals is in brackets):

      - Novel arginine kinase F46H5.3 (4 replicates) displays an effect in both salt associative learning and salt aversive learning in the same direction (Figures 6A, 6B, & S9A, pages 31-32 & 37-38).

      - Worms with a mutation for armadillo-domain protein C30G12.6 (3 replicates) only displayed an enhanced learning phenotype when non-backcrossed, not backcrossed. This suggests the enhanced learning phenotype was caused by a background mutation (Figure 6, pages 24-25).

      - We did not observe an effect on salt associative learning when assessing mutations for the ciliogenesis protein IFT-139 (5 replicates), guanyl nucleotide factors AEX-3 or TAG52 (3 replicates), p38/MAPK pathway interactor FSN-1 (3 replicates), IGCAM/RIG-4 (3 replicates), and acetylcholine components ACR-2 (4 replicates) and ELP-1 (3 replicates) (Figure S7, on pages 27-30). However, we note throughout the section for which these candidates are described that only single gene mutants were tested, meaning that genes that function in redundant or compensatory pathways may not exhibit a detectable phenotype.

      Because of the lack of strong evidence that these are indeed proteins regulated in the context of learning based on proteomics, including evidence of changes in the proteins (by imaging expression changes of fluorescent reporters or a biochemical approach), would increase confidence that these hits are genuine.

      We thank Reviewer 2 for this suggestion – we agree that it would have been ideal to have additional evidence suggesting that changes in candidate protein levels are associated directly with learning. Ideally, we would have explored this aspect further; however, as outlined in response to Reviewer 1 Major Comment 2 (OPTIONAL), this was not feasible within the scope of the current study due to several practical challenges. Specifically, we attempted to generate pan-neuronal and endogenous promoter rescue lines for several candidates, but encountered significant challenges, including poor survival post-microinjection (likely due to protein overexpression toxicity) and reduced viability for behavioural assays, potentially linked to transgene-related reproductive defects. This information is now described on pages 39 & 40 of the revised work.

      To address these limitations, we performed additional behavioural experiments where possible. We successfully generated a pan-neuronal promoter line for kin-2, which was tested and included in the revised manuscript (Figure 5B, pages 30 & 31). In addition, to confirm that observed learning phenotypes were due to the expected mutations and not background effects, we conducted experiments using backcrossed versions of several mutant lines as suggested by Reviewer 4 Cross Comment 3 (Figure 6, pages 23-24 & 24-26). Briefly, this shows that pan-neuronal expression of KIN-2 from the ce179 mutant allele is sufficient to repeat the enhanced learning phenotype observed in backcrossed kin-2(ce179) animals, providing additional evidence that the identified hits are required for learning. We also confirmed that F46H5.3 modulates salt associative learning, given that both non-backcrossed and backcrossed F46H5.3(-) mutants display a learning enhancement phenotype. The revised text now describes this data on the page numbers mentioned above.

      Minor Comments:

      (1) The authors highlight that the proteins they discover seem to function uniquely in their gustatory associative paradigm, but this is not completely accurate. kin-2, which they characterize in figure 4, is required for positive butanone association (the authors even say as much in the manuscript) in Stein and Murphy, 2014.

      We appreciate this correction and thank the Reviewer for pointing this out. We have amended the wording appropriately on page 31 to clarify our meaning.

      “Although kin-2(ce179) mutants were not shown to impact salt aversive learning, they have been reported previously to display impaired intermediate-term memory (but intact learning and short-term memory) for butanone appetitive learning (Stein and Murphy, 2014).”

      Reviewer #2 (Significance):

      General Assessment:

      The approach used in this study is interesting and has the potential to further our knowledge about the molecular mechanisms of associative behaviors. Strengths of the study include the design with carefully thought-out controls, and the premise of combining their proteomics with behavioral analysis to better understand the biological significance of their proteomics findings. However, the criteria for defining hits and prioritization of hits for behavioral characterizations were major weaknesses of the paper.

      Advance:

      There have been multiple transcriptomic studies in the worm looking at gene expression changes in the context of behavioral training (Lakhina et al., 2015, Freytag 2017). This study complements and extends those studies by examining how the proteome changes in a different training paradigm. The approach here could be employed for multiple different training paradigms, presenting a new technical advance for the field.

      Audience:

      This paper would be of interest to the broader field of behavioral and molecular neuroscience. Though it uses an invertebrate system, many findings in the worm regarding learning and memory translate to higher organisms.

      I am an expert in molecular and behavioral neuroscience in both vertebrate and invertebrate models, with experience in genetics and genomics approaches.

      We appreciate Reviewer 2’s thoughtful assessment and constructive feedback. In response to concerns regarding definition and prioritisation of hits, we have revised our approach as detailed above to place more consideration on ‘strong’ hits present in multiple biological replicates. We have also added new behavioural data for additional mutants that fall into this category (Figures 6 & S7). We hope these revisions strengthen our study and enhance its relevance to the behavioural/molecular neuroscience community.

      Reviewer #3 (Evidence, reproducibility and clarity):

      Summary:

      In the manuscript titled "Identifying regulators of associative learning using a protein-labelling approach in C. elegans" the authors attempted to generate a snapshot of the proteomic changes that happen in the C. elegans nervous system during learning and memory formation. They employed the TurboID-based protein labeling method to identify the proteins that are uniquely found in samples that underwent training to associate no-salt with food, and consequently exhibited lower attraction to high salt in a chemotaxis assay. Using this system they obtained a list of target proteins that included proteins represented in molecular pathways previously implicated in associative learning. The authors then further validated some of the hits from the assay by testing single gene mutants for effects on learning and memory formation.

      Major Comments:

      In the discussion section, the authors comment on the sources of "background noise" in their data and ways to improve the specificity. They provide some analysis on this aspect in Supplementary Figure S2. However, a better visualization of non-specificity in the sample could be a GO analysis of tissue-specificity, presented as a pie chart as in Figure 2A. Non-neuronal proteins such as MYO-2 or MYO-3 repeatedly show up on the "TurboID trained" lists in several biological replicates (Tables S2 and S3). If a major fraction of the proteins after subtraction of control lists are non-specific, that increases the likelihood that the "hits" observed are by chance. This analysis should be presented in one of the main figures as it is essential for the reader to gauge the reliability of the experiment.

      We agree with this assessment and thank Reviewer 3 for this constructive suggestion. In response, we have now incorporated a comprehensive tissue-specific analysis of the learning proteome in the revised manuscript. Using the single neuron RNA-Seq database CeNGEN, we identified the proportion of neuronal vs non-neuronal proteins from each biological replicate of mass spectrometry data. Specifically, we present Table 1 on page 17 (which we originally intended to include in the manuscript, but inadvertently left out), which shows that 87-95% (i.e. a large majority) of proteins identified across replicates corresponded to genes detected in neurons, supporting that the TurboID enzyme was able to target the neuronal proteome as expected. Table 1 is now described in the main text of the revised work on page 16.

      In addition, we performed neuron-specific analyses using both the WormBase gene enrichment tool and the CeNGEN single-cell transcriptomic database, which we describe in detail in our response to Reviewer 1 Major Comment 2. To summarise, these analyses revealed enrichment of several neuron classes, including those previously implicated in associative learning (e.g., ASEL, AIB, RIS, AVK) as well as neurons not previously studied in this context (e.g., IL1, DA9, DVC) (summarised in Table S7). By examining expression overlap across neuron types, we identified shared and distinct profiles that suggest potential functional connectivity and candidate circuits underlying behavioural plasticity (Figure 4). Taken together, these data show that the proteins identified in our dataset are (1) neuronal and (2) expressed in neurons that are known to be required for learning. Methods are detailed on pages 50-51.

      Other than the above, the authors have provided sufficient details in their experimental and analysis procedures. They have performed appropriate controls, and their data has sufficient biological and technical replicates for statistical analysis.

      We appreciate this positive feedback and thank the Reviewer for acknowledging the clarity of our experimental and analysis procedures.

      Minor Comments:

      There is an error in the first paragraph of the discussion, in the sentences discussing the learning effects in gar-1 mutant worms. The sentences in lines 12-16 on page 22 say that gar-1 mutants have improved salt-associative learning and defective salt-aversive learning, while in fact the data and figures state the opposite.

      We appreciate the Reviewer noting this discrepancy. As clarified in our response to Reviewer 1, Major Comment 1 above, we reanalysed the behavioural data to ensure consistency across genotypes by comparing only those tested within the same biological replicates (thus having the same N for all genotypes). Upon this reanalysis, we found that the previously reported phenotype for gar-1 mutants in salt-associative learning was not statistically different from wildtype controls. Therefore, we have removed references to GAR-1 from the manuscript.

      Reviewer #3 (Significance):

      Strengths and limitations:

      This study used neuron-specific TurboID expression with transient biotin exposure to capture a temporally restricted snapshot of the C. elegans nervous system proteome during salt-associative learning. This is an elegant method to identify proteins temporally specific to a certain condition. However, there are several limitations in the way the experiments and analyses were performed which affect the reliability of the data. As the authors themselves have noted in the discussion, background noise is a major issue and several steps could be taken to reduce the noise at the experimental or analysis steps (use of integrated C. elegans lines to ensure uniformity of samples, flow cytometry to isolate neurons, quantitative mass spec to detect fold change vs. strict presence/absence).

      Advance:

      Several studies have demonstrated the use of proximity labeling to map the interactome by using a bait protein fusion. In fact, expressing TurboID not fused to a bait protein is often used as a negative control in proximity labeling experiments. However, this study demonstrates the use of free TurboID molecules to acquire a global snapshot of the proteome under a given condition.

      Audience:

      Even with the significant limitations, this study is specifically of interest to researchers interested in understanding learning and memory formation. Broadly, the methods used in this study could be modified to gain insights into the proteomic profiles at other transient developmental stages. The reviewer's field of expertise: Cell biology of C. elegans neurons.

      We thank the reviewer for their thoughtful evaluation of our work. We appreciate the recognition of the novelty and potential of using neuron-specific TurboID to capture a temporally restricted snapshot of the C. elegans nervous system proteome during learning. We agree that this approach offers a unique opportunity to identify proteins associated with specific behavioural states in future studies.

      We also appreciate the reviewer’s comments regarding limitations in experimental and analytical design. In revising the manuscript, we have taken several steps to address these concerns and improve the clarity, rigour, and interpretability of our data. Specifically:

      - We now provide a frequency-based representation of proteomic hits (Table 2), which helps clarify how candidate proteins were selected and highlights differences between trained and control groups.

      - We have added neuron-specific enrichment analyses using both WormBase and CeNGEN databases (Table S7 & Figure 4), which help identify candidate neurons and potential circuits involved in learning (methods on pages 50-51).

      - We have clarified the rationale for using qualitative proteomics in the context of TurboID, in addition to acknowledging the challenges of integrating quantitative mass spectrometry with biotin-based enrichment (page 39). Additional methods for improving sample purity, such as using integrated lines or FACS-enrichment of neurons, could further refine this approach in future studies. For transparency, we did attempt to integrate the TurboID transgenic line to improve the strength and consistency of biotinylation signals. However, despite four rounds of backcrossing, this line exhibited unexpected phenotypes, including a failure to respond reliably to the established training protocol. As a result, we were unable to include it in the current study. Nonetheless, we believe our current approach provides a valuable proof-of-concept and lays the groundwork for future refinement.

      By addressing the major concerns of peer reviewers, we believe our study makes a significant and impactful contribution by demonstrating the feasibility of using TurboID to capture learning-induced proteomic changes in the nervous system. The identification of novel learning-related mutants, including those involved in acetylcholine signalling and cAMP pathways, provides new directions for future research into the molecular and circuit-level mechanisms of behavioural plasticity.

      Reviewer #4 (Evidence, reproducibility and clarity):

      Summary:

      In this manuscript, the authors used a learning paradigm in C. elegans: when worms are fed on a saltless plate, their chemotaxis to salt is greatly reduced. To identify learning-related proteins, the authors employed nervous system-specific transcriptome analysis to compare whole proteins in neurons between high-salt-fed animals and saltless-fed animals. The authors identified "learning-specific genes" which are observed only after saltless feeding. They categorized these proteins by GO analyses and pathway analyses, and went a step further to test mutants in selected genes identified by the proteome analysis. They find several mutants that are defective or hyper-proficient for learning, including acc-1/3 and lgc-46 acetylcholine receptors, gar-1 acetylcholine receptor GPCR, glna-3 glutaminase involved in glutamate biosynthesis, and kin-2, a cAMP pathway gene. These mutants were not previously reported to have abnormalities in the learning paradigm.

      Major comments:

      (1) There are problems in the data processing and presentation of the proteomics data in the current manuscript which reduce the utility of the data. First, as the authors discuss (page 24, lines 5-12), the current approach does not consider the amount of the peptides. The authors state that their current approach is "conservative", because some of the proteins may be present in both control and learned samples but in different amounts. This reviewer has the opposite concern: some of the identified proteins may be pseudo-positive artifacts caused by analytical noise. The problem is that the authors included peptides that are "present" in the "TurboID, trained" sample but "absent" in the "Non-Tg, trained" and "TurboID, control" samples in any one of the biological replicates, to identify the "learning proteome" (706 proteins, page 8, last line - page 9, line 8; page 32, line 21-22). The word "present" implies that they included even peptides whose amounts are just above the detection threshold, which is subject to random noise caused by the detector or during sample collection and preparation processes. This consideration is partly supported by the fact that only a small fraction of the proteins are common between biological replicates (honestly and respectably shown in Figure S2). Because of this problem, there is no statistical estimate of the identity of the "learning proteome" in the current manuscript. Therefore, the presentation style in Tables S2 and S3 is not very useful for readers, especially because the authors already subtracted proteins identified in Non-Tg samples, which must also suffer from stochastic noise. I suggest either quantifying the MS/MS signal, or, if the authors need to stick to the "present"/"absent" description of the MS/MS data, using the number of appearances of each protein in biological replicates as an estimate of its quantity. For example, found in 2 replicates in "TurboID, learned" and in 0 replicates in "Non-Tg, trained". One can apply statistics to these counts. This said, I would like to stress that proteins related to acquisition of memory may be very rare, especially because learning-related changes likely occur in a small subset of neurons. Therefore, 1 time vs 0 times may still be important, as well as something like 5 times vs 1 time. In summary, a quantitative description of the proteomics results is desired.
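
      As an illustration of what "applying statistics to these counts" could look like, the sketch below computes a one-sided Fisher's exact test on replicate detection counts using only the Python standard library. This is not an analysis from the manuscript or the review: the function name `fisher_one_sided` is hypothetical, and the assumption of five biological replicates in each group is made purely for illustration.

```python
from math import comb


def fisher_one_sided(present_a: int, n_a: int, present_b: int, n_b: int) -> float:
    """One-sided Fisher's exact test on detection counts.

    Returns the probability of seeing at least `present_a` detections in
    group A (out of `n_a` replicates) if the `present_a + present_b`
    detections were distributed at random across all replicates.
    """
    total_present = present_a + present_b
    n_total = n_a + n_b
    denom = comb(n_total, n_a)
    p = 0.0
    for k in range(present_a, min(n_a, total_present) + 1):
        # Hypergeometric term: k detections in group A, the rest in group B.
        p += comb(total_present, k) * comb(n_total - total_present, n_a - k) / denom
    return p


# Hypothetical counts, assuming 5 replicates per group (an assumption).
p_2_vs_0 = fisher_one_sided(2, 5, 0, 5)  # detected in 2 trained, 0 control replicates
p_5_vs_1 = fisher_one_sided(5, 5, 1, 5)  # detected in 5 trained, 1 control replicate
print(f"2 vs 0: p = {p_2_vs_0:.3f}")
print(f"5 vs 1: p = {p_5_vs_1:.3f}")
```

      Under these assumptions, a protein detected in 2 versus 0 replicates gives p of roughly 0.22, so with five replicates per group only near-complete separation reaches conventional significance. This is consistent with the reviewer's caution that low counts may still matter biologically even when they cannot be distinguished statistically.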

      We thank the reviewer for these valuable comments and suggestions.

      We acknowledge that quantitative proteomics would provide beneficial information; however, as also indicated by Reviewer 1 (in cross-comment), it is practically challenging to perform with TurboID. We have included discussion of potential future experiments involving quantitative mass spectrometry, as well as a comprehensive discussion of some of the limitations of our approach as summarised by this Reviewer, in the Discussion section (page 39). However, we note that our qualitative approach also provides beneficial knowledge, such as the identification of functional protein networks acting within biological pathways previously implicated in learning (Figure 2), and novel learning regulators ACC-1/3, LGC-46, and F46H5.3.

      We agree with the assessment that the frequency of occurrence for each candidate we test per biological replicate is useful to disclose in the manuscript as a proxy for quantification. This was also highlighted by Reviewer 2 (Major Comment 1). As detailed above in response to R2, we have now separated candidates into two categories: ‘strong’ (present in 3 or more biological replicates) and ‘weak’ candidates (present in 2 or fewer biological replicates). We have also added behavioural data after testing 9 of these strong candidates in Figures 6 & S7.

      We have also added Table 2 to the revised manuscript, which summarises the frequency-based representation of the proteomics results, as suggested. This is described on pages 22-23.

      Briefly, this shows the range of candidates further explored using single mutant testing. Specifically, this data showed that many of the tested candidates were more frequently detected in trained worms compared to high-salt controls. This includes both strong and weak candidates, providing a clearer view of how proteomic frequency informed our selection for functional testing.

      (2) There is another problem in the treatment of the behavioural data. In Experimental Procedures, authors state that they excluded data in which naive or control groups showed average CI < 0.6499, and/or trained groups showed average CI < -0.0499 or > 0.5499 for N2 (page 36, lines 5-7). How were these values determined? One common example for judging a data point as an outlier is > mean + 1.5, 2 or 3 SD, or < mean - 1.5, 2 or 3 SD. Are these values any of these standards, or determined through other methods? If these values were determined simply by authors' decision, it could potentially introduce a bias and in the worst cases lead to incorrect conclusions. A related question is, authors state "trained animals showed a lower CI (~0.3)" where in the referred Figure 1B, the corresponding data shows averages close to 0. Why is the inconsistency? The assay that authors use is close to those described in the previous literature (Kunitomo et al., http://dx.doi.org/10.1038/ncomms3210). In this previous paper, it was described that animals conditioned under no salt with food show negative CI and are attracted to the low salt concentration area. Quantitative analysis of behavioural patterns showed migration bias towards lower salt concentrations (negative chemotaxis). Essentially the same concept was reported by Luo et al. (http://dx.doi.org/10.1016/j.neuron.2014.05.010). The experimental procedure employed in the current work is very similar with those by the Japanese group, with a notable difference: the chemotaxis assay plate included 50mM NaCl in Kunitomo et al, while authors used chemotaxis plate without added NaCl (p35, line 18). The latter is expected to cause shallow gradient towards the low-salt area, which may be the reason for the weak negative CI in the trained animals. In any case, the value of CI itself is not a problem, and authors' current assay is valid. 
The only concern of mine is the potential of author-introduced cognitive bias, possibly affecting, for example, whether a certain mutant has a significant defect or not. What happens if the cut-offs of -0.0499 and 0.5499 are omitted and all data were included in the analyses? What are the average CIs of N2 in all performed experiments for each of naive, control and trained groups?
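
      To make the contrast between a data-driven outlier rule and fixed cut-offs concrete, here is a minimal sketch. The list of per-experiment average CIs is invented for illustration, and the helper name `sd_outliers` is hypothetical; only the fixed window (-0.0499 to 0.5499 for trained N2) comes from the text quoted above.

```python
from statistics import mean, stdev


def sd_outliers(values, k=2.0):
    """Return the values lying more than k sample standard deviations from the mean."""
    m = mean(values)
    s = stdev(values)
    low, high = m - k * s, m + k * s
    return [v for v in values if v < low or v > high]


def outside_fixed_window(values, low=-0.0499, high=0.5499):
    """Return the values excluded by a fixed window such as the one quoted above."""
    return [v for v in values if v < low or v > high]


# Hypothetical per-experiment average CIs for trained N2 (not real data).
trained_ci = [0.10, 0.05, 0.20, 0.15, 0.00, 0.12, 0.08, 0.90]

flagged_by_sd = sd_outliers(trained_ci, k=2.0)
flagged_by_window = outside_fixed_window(trained_ci)
print(flagged_by_sd, flagged_by_window)
```

      In this invented data set both rules flag the same experiment, but the SD-based rule adapts to the observed spread, whereas the fixed window depends on how its bounds were chosen.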

      Thank you for pointing this out. As mentioned by both Reviewer 1 and Reviewer 4, the original manuscript states the following: “Data was excluded for salt associative learning experiments when wild-type N2 displayed (1) an average CI ≤ 0.6499 for naïve or control groups and/or (2) an average CI either < -0.0499 or >0.5499 for trained groups.”

      To clarify, we only excluded experiments in rare cases where N2 worms did not display robust high salt attraction before training, or where trained N2 did not display the expected behavioural difference compared to untrained or high-salt control N2. These anomalies were typically attributable to clear contamination or starvation issues that could clearly be observed prior to counting chemotaxis indices on CTX plates.

      We established these exclusion criteria in advance of conducting multiple learning assays to ensure an objective threshold for identifying and excluding assays affected by these rare but observable issues. However, these criteria were later found to be unnecessary, as N2 worms robustly displayed the expected untrained and trained phenotypes for salt associative learning when not compromised by starvation or contamination.

      We understand that the original criteria may have appeared to introduce arbitrary bias in data selection. To address this concern, we have removed these criteria from page 50 of the revised manuscript.

      Minor comments:

      (1) Related to Major comments 1), the successful effect of neuron-specific TurboID procedure was not evaluated. Authors obtained both TurboID and Non-Tg proteome data. Do they see enrichment of neuron-specific proteins? This can be easily tested, for example by using the list of neuron-specific genes by Kaletsky et al. (http://dx.doi.org/10.1038/nature16483 or http://dx.doi.org/10.1371/journal.pgen.1007559), or referring to the CenGEN data.

      We thank this Reviewer for this helpful suggestion, which was echoed by Reviewer 3 (Major Comment 1). As indicated in the response to R3 above, the revised manuscript now includes Table 1 as a tissue-specific analysis of the learning proteome, using the single neuron RNASeq database CeNGEN to identify the proportion of neuronal proteins from each biological replicate of mass spectrometry data. Generally, we observed a range of 87-95% of proteins corresponded to genes from the CeNGEN database that had been detected in neurons, providing evidence that the TurboID enzyme was able to target the neuronal proteome as expected. Table 1 is now described in the main text of the revised work on pages 16 & 17.

      (2) The behavioural paradigm needs to be described accurately. Page 5, line 16-17, "C. elegans normally have a mild attraction towards higher salt concentration": in fact, C. elegans raised on NGM plates, which include approximately 50mM of NaCl, is attracted to around 50mM of NaCl (Kunitomo et al., Luo et al.) but not 100-200 mM.

      We thank the Reviewer for pointing this out. We agree that clarification is necessary. The revised text reads as follows on page 5: “C. elegans are typically grown in the presence of salt (usually ~ 50 mM) and display an attraction toward this concentration when assayed for chemotaxis behaviour on a salt gradient (Kunitomo et al., 2013, Luo et al., 2014).

      Training/conditioning with ‘no salt + food’ partially attenuates this attraction (group referred to ‘trained’).”

      Authors call this assay "salt associative learning", which refers to the fact that worms associate salt concentration (CS) and either presence or absence of food (appetitive or aversive US) during conditioning (Kunitomo et al., Luo et al., Nagashima et al.) but they are looking at only association with presence of food, and for proteome analysis they only change the CS (NaCl concentration, as discussed in Discussion, p24, lines 4-5). It is better to attempt to avoid confusion to the readers in general.

      Thank you Reviewer 4 for highlighting this clarity issue. We clarify our definition of “salt associative learning” for the purpose of this study in the revised manuscript on page 6 with the following text:

      “Similar behavioural paradigms involving pairings between salt/no salt and food/no food have been previously described in the literature (Nagashima et al. 2019). Here, learning experiments were performed by conditioning worms with either ‘no salt + food’ (referred to as ‘salt associative learning’) or ‘salt + no food’ (called ‘salt aversive learning’).”

      (3) page 32, line 23: the wording "excluding" is obscure and misleading because the elo-6 gene was included in the analysis.

      We thank the Reviewer for pointing out this misleading wording, which was unintentional. We have now removed it from the text (on page 21).

      (4) Typo at page 24, line 18: "that ACC-1" -> "than ACC-1".

      This has been corrected (on page 37).

      (5) Reference. In "LEO, T. H. T. et al.", given names and surnames are flipped for all authors. Also, the paper has been formally published (http://dx.doi.org/10.1016/j.cub.2023.07.041).

      We appreciate the Reviewer drawing our attention to this – the reference has been corrected and updated.

      I would like to express my modest cross comments on the reviews:

      (1) Many of the reviewers comment on the shortage in the quantitative nature of the proteome analysis, so it seems to be a consensus.

      Thank you Reviewer 4 for this feedback. We appreciate the benefit in performing quantitative mass spectrometry, in that it provides an additional way to parse molecular mechanisms in a biological process (e.g., fold-changes in protein expression induced by learning). However, we note that quantitative mass spectrometry is challenging to integrate with TurboID due to the requirement to enrich for biotinylated peptides during sample processing (we now mention this on page 39). Nevertheless, it would be exciting to see this approach performed in a future study.

      To address the limitations of our original qualitative approach and enhance the clarity and utility of our dataset, we have made the following revisions in the manuscript:

      (1) Candidate selection criteria: We now clearly define how candidates were selected for functional testing, based on their frequency across biological replicates. Specifically, “strong candidates” were detected in three or more replicates, while “weak candidates” appeared in two or fewer.

      (2) Frequency-based representation (Table 2): We appreciate the suggestion by Reviewer 4 (Major Comment 1) to quantify differences between high-salt control and trained groups. Table 2 now provides a frequency-based representation, within our proteomics data, of the candidates tested in this study. These data show that many of the tested candidates were detected more frequently in trained worms than in high-salt controls. This includes both strong and weak candidates.

      We hope these additions help clarify our approach and demonstrate the value of the dataset, even within the constraints of qualitative proteomics.

      (2) Also, tissue- or cell-specificity of the identified proteins was commonly discussed. In reviewer #3's first Major comment, the appearance of non-neuronal proteins in the list was pointed out, which corroborates my (reviewer #4's) question on the successful identification of neuronal proteins by this method. On the other hand, reviewer #1 pointed out proteins specific to subsets of neurons in the list. Obviously, these issues need to be systematically described by the authors.

      We agree with Reviewer 4 that these analyses provide a critical angle of analysis that is not explored in the original manuscript.

      Tissue analysis (Reviewer 3 Major Comment 1): We have used the single neuron RNA-Seq database CeNGEN, to identify that 87-95% (i.e. a large majority) of proteins identified across replicates corresponded to genes detected in neurons. These findings support that the TurboID enzyme was able to target the neuronal proteome as expected. Table 1 provides this information as is now described in the main text of the revised work on page 16.

      Neuron class analyses (Reviewer 1 Major Comment 2): In response, we have used the suggested Wormbase gene enrichment tool and CeNGEN. We specifically input proteins from the learning proteome into Wormbase, after filtering for proteins unique to TurboID trained animals. For CeNGEN, we compared genes/proteins from control worms and trained worms to identify potential neurons that may be involved in this learning paradigm.

      Briefly, we highlight a range of neuron classes with known roles in learning (e.g., RIS interneurons), cells that affect behaviour but have not been explored in learning (e.g., IL1 polymodal neurons), and neurons whose functions are unknown (e.g., pharyngeal neuron I3). Corresponding text for this new analysis has been added on pages 16-20, with a new table and figure added to illustrate these findings (Table S7 & Figure 4). Methods are detailed on pages 50-51.

      (3) Given reviewer #1's OPTIONAL Major comment, as an expert of behavioral assays in C. elegans, I would like to comment based on my experience that mutants received from Caenorhabditis Genetics Center or other labs often lose the phenotype after outcrossing by the wild type, indicating that a side mutation was responsible for the observed behavioral phenotype. Therefore, outcrossing may be helpful and easier than rescue experiments, though the latter are of course more accurate.

      Thank you for this suggestion. To address the potential involvement of background mutations, we have done experiments with backcrossed versions of mutants tested where possible, as shown in Figure 6. We found that F46H5.3(-) mutants maintained enhanced learning capacity after backcrossing with wild type, compared to their non-backcrossed mutant line. This was in contrast to C30G12.6(-) animals which lost their enhanced learning phenotype following backcrossing using wild type worms. This is described in the text on pages 24-26.

      (4) Just let me clarify the first Minor comment by reviewer #2. Authors described that the kin-2 mutant has abnormality in "salt associative learning" and "salt aversive learning", according to authors' terminology. In this comment by reviewer #2, "gustatory associative learning" probably refers to both of these assays.

      Reviewer 4 is correct. We have amended the wording appropriately on page 31 to clarify our meaning to address Reviewer 2’s comment.

      “Although kin-2(ce179) mutants were not shown to impact salt aversive learning, they have been reported previously to display impaired intermediate-term memory (but intact learning and short-term memory) for butanone appetitive learning (Stein and Murphy, 2014).”

      (5) There seem to be several typos in reviewer #1's Minor comments.

      "In Page 9, Lines 17-18" -> "Page 8, Lines 17-18".

      "Page 8, Line 24" -> "Page 7, Line 24".

      "I would suggest to remove figure 3" -> "I would suggest to remove figure 2"

      "summary figure similar to Figure 4" -> "summary figure similar to Figure 3"

      "In the discussion Page 24, Line 14" -> "In the discussion Page 23, Line 14"

      (I note that because a top page was inserted in the "merged" file but not in art file for review, there is a shift between authors' page numbers and pdf page numbers in the former.) It would be nice if reviewer #1 can confirm on these because I might be wrong.

      We appreciate Reviewer 4 noting this, and can confirm that these are the correct references (as indicated by Reviewer 1 in their cross-comments).

      Reviewer #4 (Significance):

      (1) Total neural proteome analysis has not been conducted before for learning-induced changes, though transcriptome analysis has been performed for odor learning (Lakhina et al., http://dx.doi.org/10.1016/j.neuron.2014.12.029). This guarantees the novelty of this manuscript, because for some genes, protein levels may change even though mRNA levels remain the same. We note an example in which a proteome analysis utilizing TurboID, though not the comparison between trained/control, has led to finding of learning related proteins (Hiroki et al., http://dx.doi.org/10.1038/s41467-022-30279-7). As described in the Major comments 1) in the previous section, improvement of data presentation will be necessary to substantiate this novelty.

      We appreciate this thoughtful feedback. We agree that while the neuronal transcriptome has been explored in Lakhina et al., 2015 for C. elegans in the context of memory, our study represents the first to examine learning-induced changes in the total neuronal proteome. We particularly agree with the statement that “for some genes, protein levels may change even though mRNA levels remain the same”. This is essential rationale that we now discuss on page 42.

      Additionally, we acknowledge the relevance of the study by Hiroki et al., 2022, which used TurboID to identify learning-related proteins, though not in a trained versus control comparison. Our work builds on this by directly comparing trained and control conditions, thereby offering new insights into the proteomic landscape of learning. This is now clarified on page 36.

      To substantiate the novelty and significance of our approach, we have revised the data presentation throughout the manuscript, including clearer candidate selection criteria, frequency-based representation of proteomic hits (Table 2), and neuron-specific enrichment analyses (Table S7 & Figure 4). We hope these improvements help convey the unique contribution of our study to the field.

      (2) Authors found six mutants that have abnormality in the salt learning (Fig. 4). These genes have not been described to have the abnormality, providing novel knowledge to the readers, especially those who work on C. elegans behavioural plasticity. Especially, involvement of acetylcholine neurotransmission has not been addressed. Although site of action (neurons involved) has not been tested in this manuscript, it will open the venue to further determine the way in which acetylcholine receptors, cAMP pathway etc. influences the learning process.

      Thank you Reviewer 4, for this encouraging feedback. To further strengthen the study and expand its relevance, we have tested additional mutants in response to Reviewer 3’s comments, as shown in Figures 6 & S7. These results provide even more candidate genes and pathways for future exploration, enhancing the significance and impact of our study.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #3 (Public review):

      The central issue for evaluating the overfilling hypothesis is the identity of the mechanism that causes the very potent (>80% when inter pulse is 20 ms), but very quickly reverting (< 50 ms) paired pulse depression (Fig 1G, I). To summarize: the logic for overfilling at local cortical L2/3 synapses depends critically on the premise that probability of release (pv) for docked and fully primed vesicles is already close to 100%. If so, the reasoning goes, the only way to account for the potent short-term enhancement seen when stimulation is extended beyond 2 pulses would be by concluding that the readily releasable pool overfills. However, the conclusion that pv is close to 100% depends on the premise that the quickly reverting depression is caused by exocytosis dependent depletion of release sites, and the evidence for this is not strong in my opinion. Caution is especially reasonable given that similarly quickly reverting depression at Schaffer collateral synapses, which are morphologically similar, was previously shown to NOT depend on exocytosis (Dobrunz and Stevens 1997). Note that the authors of the 1997 study speculated that Ca2+-channel inactivation might be the cause, but did not rule out a wide variety of other types of mechanisms that have been discovered since, including the transient vesicle undocking/re-docking (and subsequent re-priming) reported by Kusick et al (2020), which seems to have the correct timing.

      Thank you for your comments on an alternative possibility besides Ca<sup>2+</sup> channel inactivation. Kusick et al. (2020) showed that transient destabilization of docked vesicle pool is recovered within 14 ms after stimulation. This rapid recovery implies that post-stimulation undocking events might be largely resolved before the 20 ms inter-stimulus interval (ISI) used in our paired-pulse ratio (PPR) experiments, arguing against the possibility that post-AP undocking/re-docking events significantly influence PPR measured at 20 ms ISI. Furthermore, Vevea et al. (2021) showed that post-stimulus undocking is facilitated in synaptotagmin-7 (Syt7) knockout synapses. In our study, Syt7 knockdown did not affect PPR at 20 ms ISI, suggesting that the undocking process described in Kusick et al. may not be a major contributor to the paired-pulse depression observed at 20 ms interval in our study. Therefore, it is unlikely that transient vesicle undocking primarily underlies the strong PPD at 20 ms ISI in our experiments. Taken together, the undocking/redocking dynamics reported by Kusick et al. are too rapid to affect PPR at 20 ms ISI, and our Syt7 knockdown data further argue against a significant role of this process in the PPD observed at 20 ms interval.

      In an earlier round of review, I suggested raising extracellular Ca<sup>2+</sup>, to see if this would increase synaptic strength. This is a strong test of the authors' model because there is essentially no room for an increase in synaptic strength. The authors have now done experiments along these lines, but the result is not clear cut. On one hand, the new results suggest an increase in synaptic strength that is not compatible with the authors' model; technically the increase does not reach statistical significance, but, likely, this is only because the data set is small and the variation between experiments is large. Moreover, a more granular analysis of the individual experiments seems to raise more serious problems, even supporting the depletion-independent counter hypothesis to some extent. On the other hand, the increase in synaptic strength that is seen in the newly added experiments does seem to be less at local L2/3 cortical synapses compared to other types of synapses, measured by other groups, which goes in the general direction of supporting the critical premise that pv is unusually high at L2/3 cortical synapses. Overall, I am left wishing that the new data set were larger, and that reversal experiments had been included as explained in the specific points below.

      Specific Points:

      (1) One of the standard methods for distinguishing between depletion-dependent and depletion-independent depression mechanisms is by analyzing failures during paired pulses of minimal stimulation. The current study includes experiments along these lines showing that pv would have to be extremely close to 1 when Ca<sup>2+</sup> is 1.25 mM to preserve the authors' model (Section "High double failure rate ..."). Lower values for pv are not compatible with their model because the k<sub>1</sub> parameter already had to be pushed a bit beyond boundaries established by other types of experiments.

      It should be noted that we did not arbitrarily push the k<sub>1</sub> parameter beyond these boundaries, but estimated the range of k<sub>1</sub> from the fast time constant for recovery from paired pulse depression, as shown in Fig. 3-S2-Ab.

      The authors now report a mean increase in synaptic strength of 23% after raising Ca to 2.5 mM. The mean increase is not quite statistically significant, but this is likely because of the small sample size. I extracted a 95% confidence interval of [-4%, +60%] from their numbers, with a 92% probability that the mean value of the increase in the full population is > 5%. I used the 5% value as the greatest increase that the model could bear because 5% implies pv < 0.9 using the equation from Dodge and Rahamimoff referenced in the rebuttal. My conclusion from this is that the mean result, rather than supporting the model, actually undermines it to some extent. It would have likely taken 1 or 2 more experiments to get above the 95% confidence threshold for statistical significance, but this is ultimately an arbitrary cut off.

      Our key claim in Fig. 3-S3 is not the statistical non-significance of the EPSC changes, but the small magnitude of the change (1.23-fold). This small increase is far less than the 3.24-fold increase predicted by the fourth-power relationship (D&R equation, Dodge & Rahamimoff, 1967), which would be valid if the fusion probability of docked vesicles (p<sub>v</sub>) were not saturated. We do not believe that additional experiments would raise the magnitude of the EPSC change to the level that the Dodge & Rahamimoff equation predicts, even if a larger n yielded statistical significance. In other words, even a small but statistically significant EPSC change would still contradict what we expect from low-p<sub>v</sub> synapses. Our main point is the extent of the EPSC increase induced by high external [Ca<sup>2+</sup>], not a p-value. In this regard, it is hard for us to accept the Reviewer’s request for a larger sample size in the expectation of a lower p-value.

      Although we agree with the Reviewer’s assertion that our data may indicate a 92% probability that high Ca<sup>2+</sup> increases EPSCs by more than 5%, we do not agree with the Reviewer’s interpretation that the EPSC increase necessarily implies an increase in p<sub>v</sub>. We are sorry that we could not clearly understand the Reviewer’s inference that a 5% increase in EPSCs implies p<sub>v</sub> < 0.9. Please note that release probability (p<sub>r</sub>) is the product of p<sub>v</sub> and the occupancy of docked vesicles in an active zone (p<sub>occ</sub>). We imagine that this inference rests on the premise that p<sub>occ</sub> is constant irrespective of external [Ca<sup>2+</sup>]. Contrary to this premise, Figure 2c in Kusick et al. (2020) showed that the number of docked SVs increased by ca. 20% upon increasing external [Ca<sup>2+</sup>] to 2 mM. Moreover, Figure 7F in Lin et al. (2025) demonstrated that the number of TS vesicles, equivalent to p<sub>occ</sub>, increased by 23% at high external [Ca<sup>2+</sup>]. These increases in p<sub>occ</sub> are similar in magnitude to the high external Ca<sup>2+</sup>-induced increase in EPSC that we observed (1.23-fold). Of course, it is possible that increases in both p<sub>occ</sub> and p<sub>v</sub> contributed to the high [Ca<sup>2+</sup>]<sub>o</sub>-induced increase in EPSC. The low PPR and the failure rate analysis, however, suggest that p<sub>v</sub> is already saturated under baseline conditions of 1.3 mM [Ca<sup>2+</sup>]<sub>o</sub>, and thus it is more likely that an increase in p<sub>occ</sub> is primarily responsible for the 1.23-fold increase. Moreover, the 1.23-fold increase does not match the prediction of the D&R equation, which would be valid at synapses with low p<sub>v</sub>. 
Therefore, interpreting our observation (1.23-fold increase) as a slight increase in p<sub>occ</sub> is consistent with recent papers (Kusick et al., 2020; Lin et al., 2025) as well as with our other results supporting the baseline saturation of p<sub>v</sub>, as shown in Figure 2 and the associated supplement figures (Fig. 2-S1 and Fig. 2-S2).

      (2) The variation between experiments seems to be even more problematic, at least as currently reported. The plot in Figure 3-figure supplement 3 (left) suggests that the variation reflects true variation between synapses, not measurement error.

      Note that there was a substantial variance in the number of docked or TS vesicles at baseline and its fold changes at high external Ca<sup>2+</sup> condition in previous studies too (Lin et al., 2025; Kusick et al., 2020). Our study did not focus on the heterogeneity but on the mean dynamics of short-term plasticity at L2/3 recurrent synapses. Acknowledging this, the short-term plasticity of these synapses could be best explained by assuming that vesicular fusion probability (p<sub>v</sub>) is near to unity, and that release probability is regulated by p<sub>occ</sub>. In other words, even though p<sub>v</sub> is near to unity, synaptic strength can increase upon high external [Ca<sup>2+</sup>], if the baseline occupancy of release sites (p<sub>occ</sub>) is low and p<sub>occ</sub> is increased by high [Ca<sup>2+</sup>]. Lin et al. (2025) showed that high external [Ca<sup>2+</sup>] induces an increase in the number of TS vesicles (equivalent to p<sub>occ</sub>) by 23% at the calyx synapses. Different from our synapses, the baseline p<sub>v</sub> (denoted as p<sub>fusion</sub> in Lin et al., 2025) of the calyx synapse is not saturated (= 0.22) at 1.5 mM external [Ca<sup>2+</sup>], and thus the calyx synapses displayed 2.36-fold increase of EPSC at 2 mM external [Ca<sup>2+</sup>], to which increases in p<sub>occ</sub> as well as in p<sub>v</sub> (from 0.22 to 0.42) contributed. Therefore, the small increase in EPSC (= 23%) supports that p<sub>v</sub> is already saturated at L2/3 recurrent synapses.
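
      The arithmetic behind this comparison can be sketched as follows, using the relation p<sub>r</sub> = p<sub>v</sub> × p<sub>occ</sub> stated above and the values quoted from Lin et al. (2025). The sketch assumes that the number of release sites and the quantal size are unchanged, so that the EPSC fold change equals the fold change in p<sub>r</sub>; treating p<sub>v</sub> as exactly 1 at L2/3 synapses is an idealisation of "near unity". The function name is hypothetical.

```python
def epsc_fold_change(pv_before, pv_after, pocc_fold):
    """Fold change in release probability p_r = p_v * p_occ.

    Assumes release-site number and quantal size are constant, so the
    EPSC scales with p_r (an assumption of this sketch).
    """
    return (pv_after / pv_before) * pocc_fold


# Calyx values quoted above: p_v rises from 0.22 to 0.42, p_occ rises by 23%.
calyx = epsc_fold_change(0.22, 0.42, 1.23)

# Idealised saturated synapse: p_v stays at 1, p_occ rises by 23%.
saturated = epsc_fold_change(1.0, 1.0, 1.23)

print(f"{calyx:.2f} {saturated:.2f}")  # prints: 2.35 1.23
```

      The calyx estimate (about 2.35-fold) is close to the 2.36-fold EPSC increase cited above, whereas with p<sub>v</sub> fixed at 1 the same 23% rise in p<sub>occ</sub> yields only a 1.23-fold increase.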

      And yet, synaptic strength increased almost 2-fold in 2 of the 8 experiments, which back extrapolates to pv < 0.2.

      We are sorry that we could not understand the first comment in this paragraph. Could you explain in detail why two-fold increase implies pv < 0.2?

      If all of the depression is caused by depletion as assumed, these individuals would exhibit paired pulse facilitation, not depression. And yet, from what I can tell, the individuals depressed, possibly as much as the synapses with low sensitivity to Ca<sup>2+</sup>, arguing against the critical premise that depression equals depletion, and even arguing - to some extent - for the counter hypothesis that a component of the depression is caused by a mechanism that is independent of depletion.

      For the first statement in this paragraph, we imagine that ‘the depression’ means paired pulse depression (PPD). If so, we cannot understand why depletion-dependent PPD should lead to PPF. If the paired pulse interval is too short for docked vesicles to be replenished, the vesicle depletion induced by the first pulse would result in PPD. We are very sorry that we could not follow the Reviewer’s subsequent inference, because we could not understand the first statement.

      I would strongly recommend adding an additional plot that documents the relationship between the amount of increase in synaptic strength after increasing extracellular Ca<sup>2+</sup> and the paired pulse ratio as this seems central.

      We found no clear correlation of EPSC<sub>1</sub> with PPR changes (ΔPPR) as shown in the figure below.

      Author response image 1.

      Plot of PPR changes as a function of EPSC<sub>1</sub>.

      (3) Decrease in PPR. The authors recognize that the decrease in the paired-pulse ratio after increasing Ca<sup>2+</sup> seems problematic for the overfilling hypothesis by stating: "Although a reduction in PPR is often interpreted as an increase in pv, under conditions where pv is already high, it more likely reflects a slight increase in p<sub>occ</sub> or in the number of TS vesicles, consistent with the previous estimates (Lin et al., 2025)."

      We admit that there is a logical jump in our statement you mentioned here. We appreciate your comment. We re-wrote that part in the revised manuscript (line 285) as follows:

      “Recent morphological and functional studies revealed that elevation of [Ca<sup>2+</sup>]<sub>o</sub> induces an increase in the number of TS or docked vesicles to a similar extent as our observation (Kusick et al., 2020; Lin et al., 2025), raising the possibility that an increase in p<sub>occ</sub> is responsible for the 1.23-fold increase in EPSC at high [Ca<sup>2+</sup>]<sub>o</sub>. A slight but significant reduction in PPR was also observed under high [Ca<sup>2+</sup>]<sub>o</sub>. An increase in p<sub>occ</sub> is thought to be associated with an increase in the baseline vesicle refilling rate. While PPR is always reduced by an increase in p<sub>v</sub>, the effect of the refilling rate on PPR is complicated. For example, PPR can be reduced by both a decrease (Figure 2—figure supplement 1) and an increase (Lin et al., 2025) in the refilling rate, induced by EGTA-AM and PDBu, respectively. Thus, the slight reduction in PPR is not contradictory to the possible contribution of p<sub>occ</sub> to the high [Ca<sup>2+</sup>]<sub>o</sub> effects.”

      I looked quickly, but did not immediately find an explanation in Lin et al 2025 involving an increase in pocc or number of TS vesicles, much less a reason to prefer this over the standard explanation that reduced PPR indicates an increase in pv.

      Fig. 7F of Lin et al. (2025) shows a 1.23-fold increase in the number of TS vesicles at high external [Ca<sup>2+</sup>]. Fig. 7E of the same study shows a roughly two-fold increase in p<sub>fusion</sub> (equivalent to p<sub>v</sub> in our study) at high external [Ca<sup>2+</sup>] (from 0.22 to 0.42). Because p<sub>occ</sub> is the occupancy of TS vesicles in a limited number of slots in an active zone, the fold change in the number of TS vesicles should be similar to that of p<sub>occ</sub>.

      The authors should explain why the most straightforward interpretation is not the correct one in this particular case to avoid the appearance of cherry picking explanations to fit the hypothesis.

      The results of Lin et al. (2025) indicate that high external [Ca<sup>2+</sup>] induces a milder increase in p<sub>occ</sub> (23%) than in p<sub>v</sub> (about 90%, from 0.22 to 0.42) at the calyx synapses. Because the increase in p<sub>occ</sub> is much smaller than that in p<sub>v</sub>, and multiple lines of evidence in our study support that the baseline p<sub>v</sub> is already saturated, we raised the possibility that an increase in p<sub>occ</sub> primarily contributes to the unexpectedly low increase in EPSC at 2.5 mM [Ca<sup>2+</sup>]<sub>o</sub>. As mentioned above, our interpretation is also consistent with the EM study of Kusick et al. (2020). Nevertheless, the reduction of PPR at 2.5 mM Ca<sup>2+</sup> seems to support an increase in p<sub>v</sub>, arguing against this possibility. On the other hand, because p<sub>occ</sub> = k<sub>1</sub>/(k<sub>1</sub>+b<sub>1</sub>) under the simple vesicle refilling model (Fig. 3-S2Aa), a change in p<sub>occ</sub> should be associated with changes in k<sub>1</sub> and/or b<sub>1</sub>. While PPR is always reduced by an increase in p<sub>v</sub>, the effect of the refilling rate on PPR is complicated. For example, although EGTA-AM would not increase p<sub>v</sub>, it reduced PPR, probably by reducing the refilling rate (Fig. 2-S1). Conversely, PDBu is thought to increase k<sub>1</sub> because it induces a two-fold increase in p<sub>occ</sub> (Fig. 7L of Lin et al., 2025). Such a marked increase in p<sub>occ</sub>, rather than in p<sub>v</sub>, seems to be responsible for the marked PDBu-induced reduction of PPR (Fig. 7I of Lin et al., 2025), because PDBu induced only a slight increase in p<sub>v</sub> (Fig. 7K of Lin et al., 2025). Therefore, the slight reduction of PPR is not contradictory to our interpretation that an increase in p<sub>occ</sub> might be responsible for the slight increase in EPSC induced by high [Ca<sup>2+</sup>]<sub>o</sub>.
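      The occupancy relation and the fold changes quoted in this response can be checked with a short sketch. The formula p<sub>occ</sub> = k<sub>1</sub>/(k<sub>1</sub>+b<sub>1</sub>) and the values 0.22, 0.42 and 1.23 are taken from the text above; the function names, the baseline occupancy of 0.6 and the value of b<sub>1</sub> are hypothetical and chosen only for illustration.

```python
def occupancy(k1: float, b1: float) -> float:
    """Steady-state site occupancy, p_occ = k1 / (k1 + b1)."""
    if k1 < 0 or b1 < 0 or k1 + b1 == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return k1 / (k1 + b1)


def k1_for_occupancy(p_occ: float, b1: float) -> float:
    """Invert the occupancy formula: k1 = b1 * p_occ / (1 - p_occ)."""
    if not 0 <= p_occ < 1:
        raise ValueError("p_occ must lie in [0, 1)")
    return b1 * p_occ / (1 - p_occ)


def fold_change(before: float, after: float) -> float:
    """Ratio of a quantity after a manipulation to its baseline value."""
    return after / before


# Values quoted from Lin et al. (2025) in the response above.
pv_fold = fold_change(0.22, 0.42)  # p_fusion at control vs. high external Ca2+
pocc_fold = 1.23                   # fold increase in TS vesicles (taken as p_occ)

# Hypothetical baseline (assumption, not from the source).
b1 = 1.0
p_occ_base = 0.6
k1_base = k1_for_occupancy(p_occ_base, b1)
k1_high = k1_for_occupancy(p_occ_base * pocc_fold, b1)

print(f"p_v fold change: {pv_fold:.2f}")
print(f"k1 fold change needed with b1 fixed: {k1_high / k1_base:.2f}")
```

      Under these assumed numbers, a modest rise in occupancy with b<sub>1</sub> held fixed requires a noticeably larger fold change in k<sub>1</sub>, which is why a change in p<sub>occ</sub> is expected to come with a change in the refilling kinetics.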

      (4) The authors concede in the rebuttal that mean pv must be < 0.7, but I couldn't find any mention of this within the manuscript itself, nor any explanation for how the new estimate could be compatible with the value of > 0.99 in the section about failures.

      We have never stated, in the rebuttal or elsewhere, that the mean p<sub>v</sub> must be < 0.7. On the contrary, both our manuscript and our previous rebuttals have consistently argued that the baseline p<sub>v</sub> is already saturated, based on our observations including low PPR, tight coupling, a high double failure rate and the minimal effect of external Ca<sup>2+</sup> elevation.

      (5) Although not the main point, comparisons to synapses in other brain regions reported in other studies might not be accurate without directly matching experiments.

      Please understand that it is not trivial to establish optimal experimental settings for studying other synapses using the same methods employed in this study. We think that this should be done in a separate study. Furthermore, we have already shown in the manuscript that action potentials (APs) evoked by oChIEF activation occur in a physiologically natural manner, and the STP induced by these oChIEF-evoked APs is indistinguishable from the STP elicited by APs evoked by dual-patch electrical stimulation. Therefore, we believe that our use of optogenetic stimulation did not introduce any artificial bias in measuring STP.

      As it is, 2 of 8 synapses got weaker instead of stronger, hinting at possible rundown, but this cannot be assessed because reversibility was not evaluated. In addition, comparing axons with and without channel rhodopsins might be problematic because the channel rhodopsins might widen action potentials.

      We continuously monitored series resistance and baseline EPSC amplitude throughout the experiments. The figure below shows the mean time course of EPSCs at two different [Ca<sup>2+</sup>]<sub>o</sub>. As it shows, we observed no tendency for run-down of EPSCs during the experiments; any recordings that did show run-down were discarded from the analysis. In addition, please understand that there is substantial variance in the number of docked vesicles at both baseline and high external Ca<sup>2+</sup> (Lin et al., 2025; Kusick et al., 2020), as well as in the short-term dynamics of EPSCs at our synapses.

      Author response image 2.

      Time course of normalized amplitudes of the first EPSCs during paired-pulse stimulation at 20 ms ISI in control and in elevated external Ca<sup>2+</sup> (n = 8).

      (6) Perhaps authors could double check with Schotten et al about whether PDBu does/does not decrease the latency between osmotic shock and transmitter release. This might be an interesting discrepancy, but my understanding is that Schotten et al didn't acquire information about latency because of how the experiments were designed.

      Schotten et al. (2015) directly compared experimental and simulation data for hypertonicity-induced vesicle release. They showed a pronounced shortening of the latency as the tonicity increases (Fig. 2-S2), but this tonicity-dependent shortening was not reproduced by reducing the activation energy barrier for fusion (ΔEa) in their simulations (Fig. 2-S1). Thus, the authors suggested that an unknown compensatory mechanism counteracting the osmotic perturbation might be responsible for the tonicity-dependent changes in the latency. Importantly, their modeling demonstrated that reducing ΔEa, which would correspond to increasing p<sub>v</sub>, results in larger peak amplitudes and a shorter time-to-peak, but does not shorten the latency. Therefore, there is currently no direct support for the notion that PDBu or similar manipulations shorten latency via an increase in p<sub>v</sub>.

      (7) The authors state: "These data are difficult to reconcile with a model in which facilitation is mediated by Ca2+-dependent increases in pv." However, I believe that discarding the premise that depression is always caused by depletion would open up wide range of viable possibilities.

      We hope that the Reviewer understands why we concluded that the baseline p<sub>v</sub> is saturated at our synapses. First, strong paired-pulse depression (PPD) cannot be attributed to Ca<sup>2+</sup> channel inactivation, because Ca<sup>2+</sup> influx at the axon terminal remained constant during 40 Hz train stimulation (Fig. 2-S2). Moreover, even if Ca<sup>2+</sup> channel inactivation were responsible for the strong PPD, it could not explain the delayed facilitation that emerges at subsequent pulses (third EPSC and onward) in the 40 Hz train stimulation (Fig. 1-4), because Ca<sup>2+</sup> channel inactivation gradually accumulates during train stimulation, as directly shown by Wykes et al. (2007) in chromaffin cells. Second, the strong PPD and the very fast recovery from PPD indicate a very fast refilling rate constant (k<sub>1</sub>). Under this high k<sub>1</sub>, the failure rates were best explained by p<sub>v</sub> close to unity. Third, the EPSC increase induced by high external Ca<sup>2+</sup> was much smaller than at other synapses, such as calyx synapses, at which p<sub>v</sub> is not saturated (Lin et al., 2025), and was instead similar to the increases in p<sub>occ</sub> estimated at calyx synapses or in the EM study (Kusick et al., 2020; Lin et al., 2025).
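      The link between p<sub>v</sub>, the refilling rate and PPR invoked in the second point can be illustrated with a minimal two-state sketch. It assumes that each release site is either empty or occupied, that occupancy relaxes toward k<sub>1</sub>/(k<sub>1</sub>+b<sub>1</sub>) at rate k<sub>1</sub>+b<sub>1</sub>, and that p<sub>v</sub> is identical for both pulses. Under these assumptions PPR = 1 − p<sub>v</sub>·exp(−(k<sub>1</sub>+b<sub>1</sub>)·Δt). This is a simplification and not necessarily the model used in the manuscript; in particular, it cannot reproduce the more complicated refilling effects described above. The function name and all rate values are hypothetical.

```python
import math


def paired_pulse_ratio(p_v: float, k1: float, b1: float, interval: float) -> float:
    """PPR for a two-state (empty/occupied) site model with constant p_v.

    After the first pulse the occupancy drops from p_occ to p_occ * (1 - p_v)
    and then relaxes back toward p_occ with rate (k1 + b1), giving
    PPR = 1 - p_v * exp(-(k1 + b1) * interval).
    Rates are in 1/s and the interval is in seconds.
    """
    if not 0 <= p_v <= 1:
        raise ValueError("p_v must lie in [0, 1]")
    if k1 < 0 or b1 < 0 or interval < 0:
        raise ValueError("rates and interval must be non-negative")
    return 1.0 - p_v * math.exp(-(k1 + b1) * interval)


# Hypothetical rates (assumptions); 20 ms matches the ISI quoted in the response.
interval = 0.020
for k1 in (10.0, 50.0, 200.0):
    ppr = paired_pulse_ratio(p_v=1.0, k1=k1, b1=0.0, interval=interval)
    print(f"k1 = {k1:6.1f} /s  ->  PPR = {ppr:.3f}")
```

      In this sketch, a p<sub>v</sub> near unity gives strong depression at short intervals, and the recovery time constant is 1/(k<sub>1</sub>+b<sub>1</sub>), so fast recovery from depression implies fast refilling.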

      Reference

      Wykes et al. (2007). Differential regulation of endogenous N-and P/Q-type Ca<sup>2+</sup> channel inactivation by Ca<sup>2+</sup>/calmodulin impacts on their ability to support exocytosis in chromaffin cells. Journal of Neuroscience, 27(19), 5236-5248.

      Reviewer #3 (Recommendations for the authors):

      I continue to think that measuring changes in synaptic strength when raising extracellular Ca<sup>2+</sup> is a good experiment for evaluating the overfilling hypothesis. Future experiments would be better if the authors would include reversibility criteria to rule out rundown, etc. Also, comparisons to other types of synapses would be stronger if the same experimenter did the experiments at both types of synapses.

      We observed no systematic tendency for run-down of EPSCs during these experiments (Author response image 2). Furthermore, the observed variability is well within the expected variance range in the number of docked vesicles at both baseline and high external Ca²⁺ (Lin et al., 2025; Kusick et al., 2020) and reflects biological variability rather than experimental artifact. Therefore, we believe that additional reversibility experiments are not warranted. However, we are open to further discussion if the Reviewer has specific methodological concerns not resolved by our present data.

      For the second issue, as mentioned above, we think that other synapse types should be studied in a separate study.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Recommendations for the authors):

      (1) The onus of making the revisions understandable to the reviewers lies with the authors. In its current form, how the authors have approached the review is hard to follow, in my opinion. Although the authors have taken a lot of effort in answering the questions posed by reviewers, parallel changes in the manuscript are not clearly mentioned. In many cases, the authors have acknowledged the criticism in response to the reviewer, but have not changed their narrative, particularly in the results section.

      We fully acknowledge your concern regarding the narrative linking EB-induced GluCl expression to JH biosynthesis and fecundity enhancement, particularly the need to address alternative interpretations of the data. Below, we outline the specific revisions made to address your feedback and ensure the manuscript’s narrative aligns more precisely with the experimental evidence:

      (1) Revised Wording in the Results Section

      To avoid overinterpretation of causality, we have modified the language in key sections of the Results (e.g., Figure 5 and related text):

      Original phrasing:

      “These results suggest that EB activates GluCl which induces JH biosynthesis and release, which in turn stimulates reproduction in BPH (Figure 5J).”

      Revised phrasing:

      “We also examined whether silencing Gluclα impacts the AstA/AstAR signaling pathway in female adults. Knock-down of Gluclα in female adults was found to have no impact on the expression of AT, AstA, AstB, AstCC, AstAR, and AstBR. However, the expression of AstCCC and AstCR was significantly upregulated in dsGluclα-injected insects (Figure 5-figure supplement 2A-H). Further studies are required to delineate the direct or indirect mechanisms underlying this effect of Gluclα-knockdown.” (lines 643-649). We have also removed Figure 5J from the revised manuscript.

      (2) Expanded Discussion of Alternative Mechanisms

      In the Discussion section, we have incorporated a dedicated paragraph to explore alternative pathways and compensatory mechanisms:

      Key additions:

      “This EB action on GluClα expression is likely indirect, and we do not consider EB as a transcriptional regulator of GluClα. Thus, the mechanism behind EB-mediated induction of GluClα remains to be determined. It is possible that prolonged EB exposure triggers feedback mechanisms (e.g. cellular stress responses) to counteract EB-induced GluClα dysfunction, leading to transcriptional upregulation of the channel. Hence, considering that EB exposure in our experiments lasts several days, these findings might represent indirect (or secondary) effects caused by other factors downstream of GluCl signaling that affect channel expression.” (lines 837-845).

      (2) In the response to reviewers, the authors have mentioned line numbers in the main text where changes were made. But very frequently, those lines do not refer to the changes or mention just a subsection of changes done. As an example please see point 1 of Specific Points below. The problem is throughout the document making it very difficult to follow the revision and contributing to the point mentioned above.

      Thank you for highlighting this critical oversight. We sincerely apologize for the inconsistency in referencing line numbers and incomplete descriptions of revisions, which undoubtedly hindered your ability to track changes effectively. We have eliminated all vague or incomplete line number references from the response letter. Instead, revisions are now explicitly tied to specific sections, figures, or paragraphs.

      (3) The authors need to infer the performed experiments rationally without over interpretation. Currently, many of the claims that the authors are making are unsubstantiated. As a result of the first review process, the authors have acknowledged the discrepancies, but they have failed to alter their interpretations accordingly.

      We fully agree that overinterpretation of data undermines scientific rigor. In response to your feedback, we have systematically revised the manuscript to align claims strictly with experimental evidence and to eliminate unsubstantiated assertions. We sincerely apologize for the earlier overinterpretations and appreciate your insistence on precision. The revised manuscript now rigorously distinguishes between observations (e.g., EB-GluCl-JH correlations) and hypotheses (e.g., GluCl’s mechanistic role). By tempering causal language and integrating competing explanations, we aimed to present a more accurate and defensible narrative.

      SPECIFIC POINTS (to each question initially raised and their rebuttals)

      (1a) "Actually, there are many studies showing that insects treated with insecticides can increase the expression of target genes". Please note that what is asked for is evidence that the ligand itself induces the expression of its receptor. Of course, insecticide treatment will result in changes in the expression of targets. Of all the evidence furnished in the rebuttal, only Peng et al. 2017 fits the above definition. Even in this case, the accepted mode of action of chlorantraniliprole is to induce a structural change in the ryanodine receptor. The observed induction of the ryanodine receptor by chlorantraniliprole can best be described as a secondary effect. None of the other references addresses the point asked for.

      We appreciate the reviewer’s suggestions for improving the manuscript. First, we have added further studies supporting the notion that insecticide treatment can increase the expression of target genes: “There are several studies showing that insects treated with insecticides display increases in the expression of target genes. For example, the relative expression level of the ryanodine receptor gene of the rice stem borer, Chilo suppressalis was increased 10-fold after treatment with chlorantraniliprole, an insecticide which targets the ryanodine receptor (Peng et al., 2017). In Drosophila, starvation (and low insulin) elevates the transcription level of the receptors of the neuropeptides short neuropeptide F and tachykinin (Ko et al., 2015; Root et al., 2011). In BPH, reduction in mRNA and protein expression of a nicotinic acetylcholine receptor α8 subunit is associated with resistance to imidacloprid (Zhang et al., 2015). Knockdown of the α8 gene by RNA interference decreased the sensitivity of N. lugens to imidacloprid (Zhang et al., 2015). Hence, the expression of receptor genes may be regulated by diverse factors, including insecticide exposure.” We have inserted text in lines 846-857 to elaborate on these possibilities.

      Second, we would like to reiterate our position: we have merely described this phenomenon, specifically that EB treatment increases GluClα expression. “This EB action on GluClα expression is likely indirect, and we do not consider EB as a transcriptional regulator of GluClα. Thus, the mechanism behind EB-mediated induction of GluClα remains to be determined. It is possible that prolonged EB exposure triggers feedback mechanisms (e.g. cellular stress responses) to counteract EB-induced GluClα dysfunction, leading to transcriptional upregulation of the channel. Hence, considering that EB exposure in our experiments lasts several days, these findings might represent indirect (or secondary) effects caused by other factors downstream of GluCl signaling that affect channel expression.” We have inserted text in lines 837-845 to elaborate on these possibilities.

      Once again, we sincerely appreciate this discussion, which has provided us with a deeper understanding of this phenomenon.

      b. The authors in their rebuttal accept that they do not consider EB to be a transcriptional regulator of Gluclα and that the induction of Gluclα by EB can best be considered a secondary effect. But that is not reflected in the manuscript, particularly in the result section. The current state of writing implies that EB up-regulation of Gluclα is an important event that contributes majorly to the hypothesis. So much so that they have retained the schematic diagram (Fig. 5J) where EB -> Gluclα is drawn. Even the heading of the subsection says "EB-enhanced fecundity in BPHs is dependent on its molecular target protein, the Gluclα channel". As mentioned in the general points, it is not enough to have a good rebuttal written to the reviewer; the parent manuscript needs to reflect the changes asked for.

      Thank you for your comments. We have carefully addressed your suggestions and made corresponding revisions to the manuscript.

      We fully acknowledge the reviewer's valid concern. The revised manuscript now states: “However, we do not propose that EB is a direct transcriptional regulator of Gluclα, since EB and other avermectins are known to alter the channel conformation and thus their function (Wolstenholme, 2012; Wu et al., 2017). Thus, it is likely that the observed increase in Gluclα transcript is a secondary effect downstream of EB signaling.” (lines 625-629). We agree that the original presentation in the manuscript, particularly within the Results section, did not adequately reflect this nuance and could be misinterpreted as suggesting a direct regulatory role for EB on Gluclα transcription.

      Regarding Fig. 5J, we have removed the figure and all mentions of Fig. 5J and its legend in the revised manuscript.

      c. "We have inserted text on lines 738 - 757 to explain these possibilities." Not a single line in the section mentioned above discussed the topic in hand. This is serious undermining of the review process or carelessness to the extreme level.

      In the Results section, we have now added the statement: “Taken together, these results reveal that EB exposure is associated with an increase in JH titer and that this elevated JH signaling contributes to enhanced fecundity in BPH.” (lines 375-377).

      For the figures, we have removed Fig. 4N and all mentions of Fig. 4N and its legend in the revised manuscript.

      Lastly, regarding the issue of locating specific lines, we deeply regret any inconvenience caused. Due to the track changes mode used during revisions, line numbers may have shifted, resulting in incorrect references. We sincerely apologize for this and have now corrected the line numbers.

      (2) The section written in the rebuttal should be included in the discussion as well, explaining why the authors think a nymphal treatment with JH may work in increasing fecundity of the adults. Also, the authors accept that EB's effect on JH titer is indirect. The text of the manuscript, results section and figures should reflect that. It is NOT ok to accept that EB impacts JH titer indirectly in a rebuttal letter while continuing to portray a direct effect of EB on JH titer. In terms of diagrams, authors cannot put a -> sign unless the effect is direct. This is an accepted norm in biological publications.

      We appreciate the reviewer’s valuable suggestions here. We have now carefully revised the manuscript to address all concerns, particularly regarding the mechanism linking nymphal EB exposure to adult fecundity and the indirect nature of EB’s effect on JH titers. Below are our point-by-point responses and corresponding manuscript changes. Revised text is clearly marked in the resubmitted manuscript.

      (1) Clarifying the mechanism linking nymphal EB treatment to adult fecundity:

      Reviewer concern: Explain why nymphal EB treatment increases adult fecundity despite undetectable EB residues in adults.

      Response & Actions Taken:

      We agree this requires explicit discussion. We now propose that nymphal EB exposure triggers developmental reprogramming (e.g., metabolic/epigenetic changes) that persist into adulthood, indirectly enhancing JH synthesis and fecundity. This is supported by two key findings:

      (1) No detectable EB residues in adults after nymphal treatment (new Figure 1–figure supplement 1C).

      (2) Increased adult weight and nutrient reserves (Figure 1–figure supplement 3E,F), suggesting altered resource allocation.

      Added to Discussion (Lines 793–803): Notably, after exposing fourth-instar BPH nymphs to EB, no EB residues were detected in the subsequent adult stage. This finding indicates that the EB-induced increase in adult fecundity is initiated during the nymphal stage and manifests in adulthood - a mechanism distinct from the direct enhancement of fecundity observed when EB is applied to adults. We propose that sublethal EB exposure during critical nymphal stages may reprogram metabolic or endocrine pathways, potentially via insulin/JH crosstalk. For instance, increased nutrient storage (e.g., proteins, sugars; Figure 2–figure supplement 2) could enhance insulin signaling, which in turn promotes JH biosynthesis in adults (Ling and Raikhel, 2021; Mirth et al., 2014; Sheng et al., 2011). Future studies should test whether EB alters insulin-like peptide expression or signaling during development.

      (3) Emphasizing EB’s indirect effect on JH titers:

      Reviewer concern: The manuscript overstated EB’s direct effect on JH. Arrows in figures implied causality where only correlation exists.

      Response & Actions Taken:

      We fully agree. EB’s effect on JH is indirect and multifactorial (via AstA/AstAR suppression, GluCl modulation, and metabolic changes). We have:

      Removed oversimplified schematics (original Figures 3N, 4N, 5J).

      Revised all causal language (e.g., "EB increases JH" → "EB exposure is associated with increased circulating JH III") (line 739).

      Clarified in Results/Discussion that EB-induced JH changes are likely secondary to neuroendocrine disruption.

      Key revisions:

      Results (Lines 375–377):

      "Taken together, these results reveal that EB exposure is associated with an increase in JH titer and that JH signaling contributes to enhanced fecundity in BPH."

      Discussion (Lines 837–845):

      This EB action on GluClα expression is likely indirect, and we do not consider EB as a transcriptional regulator of GluClα. Thus, the mechanism behind EB-mediated induction of GluClα remains to be determined. It is possible that prolonged EB exposure triggers feedback mechanisms (e.g. cellular stress responses) to counteract EB-induced GluClα dysfunction, leading to transcriptional upregulation of the channel. Hence, considering that EB exposure in our experiments lasts several days, these findings might represent indirect (or secondary) effects caused by other factors downstream of GluCl signaling that affect channel expression.

      a. Lines 281-285, as mentioned, do not carry the relevant information.

      Thank you for your careful review of our manuscript. We sincerely apologize for the confusion regarding line references in our previous response. Due to extensive revisions and tracked changes during the revision process, the line numbers shifted, resulting in incorrect citations for Lines 281–285. The correct location for the added results (EB-induced increase in mature eggs in adult ovaries) is now in lines 253-258: “We furthermore observed that EB treatment of female adults also increases the number of mature eggs in the ovary (Figure 2-figure supplement 1).”

      b. Lines 351-356, as mentioned, do not carry the relevant information. Lines 281-285, as mentioned, do not carry the relevant information.

      Thank you for your careful review of our manuscript. We sincerely apologize for the confusion regarding line references in our previous response. The correct location for the added results is now in lines 366-371: “We also investigated the effects of EB treatment on the JH titer of female adults. The data indicate that the JH titer was also significantly increased in the EB-treated female adults compared with controls (Figure 3-figure supplement 3A). However, again, the steroid 20-hydroxyecdysone was not significantly different between EB-treated BPH and controls (Figure 3-figure supplement 3B).”

      c. Lines 378-379, as mentioned, do not carry the relevant information. Lines 387-390, as mentioned, do not carry the relevant information.

      We sincerely apologize for the confusion regarding line references in our previous response.

      The correct location for the added results is now in lines 393-394: We furthermore found that EB treatment in female adults increases JHAMT expression (Figure 3-figure supplement 3C).

      The other correct location for the added results is now in lines 405-408: We found that Kr-h1 was significantly upregulated in the adults of EB-treated BPH at the 5M, 5L nymph and 4 to 5 DAE stages (4.7-fold to 27.2-fold) when 4th instar nymphs or female adults were treated with EB (Figure 3H and Figure 3-figure supplement 3D).

      (3) The writing quality is still extremely poor. It does not meet any publication standard, let alone that of eLife.

      We fully understand your concerns and frustrations, and we sincerely apologize for the deficiencies in our writing quality, which did not meet the high standards expected by you and the journal. We fully accept your criticism regarding the writing quality and have rigorously revised the manuscript according to your suggestions.

      (4) I am confused whether Figure 2B was redone or just edited. Otherwise this seems acceptable to me.

      Regarding Fig. 2B, we have edited the text on the y-axis. The previous wording included the term “retention,” which may have caused misunderstanding for both the readers and yourself, leading to the perception of contradiction. We have now revised this wording to ensure accurate comprehension.

      (5) The rebuttal is accepted. However, some of the lines mentioned still do not hold the relevant information.

      This error has been corrected.

      The correct location for the added results is now in lines 255-258 and lines 279-282: “Hence, although EB does not affect the normal egg developmental stages (see description in next section), our results suggest that EB treatment promotes oogenesis and, as a result, the insects both produce more eggs in the ovary and lay a larger number of eggs.” and “However, considering that the number of eggs laid by EB treated females was larger than in control females (Figure 1 and Figure 1-figure supplement 1), our data indicate that EB treatment of BPH can promote both oogenesis and oviposition.”

      (6) Thank you for the clarification. Although now discussed extensively in discussion section, the nuances of indirect effect and minimal change in expression should also be reflected in the result section text. This is to ensure that readers have clear idea about content of the paper.

      Corrected. To ensure readers gain a clear understanding of our data, we have briefly presented these points in the Results section. Please see lines 397-402: The levels of Met mRNA slightly increased in EB-treated BPH at the 5M and 5L instar nymph and 1 to 5 DAE adult stages compared to controls (1.7-fold to 2.9-fold) (Figure 3G). However, it should be mentioned that JH action does not result in an increase of Met. Thus, it is possible that other factors (indirect effects) induced by EB treatment cause the increase in the mRNA expression level of Met.

      (7) As per the authors' interpretation, it becomes critical to quantitate the amount of EB present at the adult stages after a 4th instar exposure to it. Only this experiment will unambiguously prove the authors' claim. Also, since they have done adult insect exposure to EB, such experiments should be systematically performed for as many sections as possible. Don't just focus on the few instances where reviewers have pointed out the issue.

      Thank you for raising this critical point. To address this concern, we have conducted new supplementary experiments. The new experimental results demonstrate that residual levels of emamectin benzoate (EB) in adult-stage brown planthoppers (BPH) were below the instrument detection limit following treatment of 4th instar nymphs with EB. Lines 172-184: “To determine whether EB administered during the fourth-instar nymphal stage persists as residues in the adult stage, we used HPLC-MS/MS to quantify the amount of EB present at the adult stage after exposing 4th-instar nymphs to this compound. However, we found no detectable EB residues in the adult stage following fourth-instar nymphal treatment (Figure 1-figure supplement 1C). This suggests that the mechanism underlying the increased fecundity of female adults induced by EB treatment of nymphs may differ from that caused by direct EB treatment of female adults. Combined with our previous observation that EB treatment significantly increased the body weight of adult females (Figure 1-figure supplement 3E and F), a possible explanation for this phenomenon is that EB may enhance food intake in BPH, potentially leading to elevated production of insulin-like peptides and thus increased growth. Increased insulin signaling could potentially also stimulate juvenile hormone (JH) biosynthesis during the adult stage (Badisco et al., 2013).”

      (8) Thank you for the revision. Lines 725-735, as mentioned, do not carry the relevant information. However, since the authors have decided to remove this systematically from the manuscript, discussion on this may not be required.

      Thank you for identifying the limited relevance of the content in Lines 725–735 of the original manuscript. As recommended, we have removed this section in the revised version to improve logical coherence and maintain focus on the core findings.

      (9) Normally, dsRNA would last for some time in the insect system and would down-regulate any further induction of target genes by EB. I suggest the authors measure the level of the target genes by qPCR in KD insects before and after EB treatment to clear the confusion and unambiguously demonstrate the results. Please note: such quantifications should be done for all the KD+EB experiments. Additionally, citing a few papers where such a rescue effect has been demonstrated in closely related insects will help in building confidence.

      We appreciate the reviewer’s suggestion to clarify the interaction between RNAi-mediated gene knockdown (KD) and EB treatment. To address this, we performed additional experiments measuring Kr-h1 expression via qPCR in dsKr-h1-injected insects before and after EB exposure.

      The results (now Figure 3–figure supplement 4) show that:

      (1) EB did not rescue *Kr-h1* suppression at 24h post-treatment (*p* > 0.05).

      (2) Partial recovery of fecundity occurred later (Figure 3M), likely due to:

      a) Degradation of dsRNA over time, reducing KD efficacy (Liu et al., 2010).

      b) Indirect effects of EB (e.g., hormonal/metabolic reprogramming) compensating for residual Kr-h1 suppression.

      Please see line 441-453: “Next, we investigated whether EB treatment could rescue the dsRNA-mediated gene silencing effect. To address this, we selected the Kr-h1 gene and analyzed its expression levels after EB treatment. Our results showed that Kr-h1 expression was suppressed by ~70% at 72 h post-dsRNA injection. However, EB treatment did not significantly rescue Kr-h1 expression in gene knock down insects (*p* > 0.05) at 24h post-EB treatment (Figure 3-figure supplement 4). While dsRNA-mediated Kr-h1 suppression was robust initially, its efficacy may decline during prolonged experiments. This aligns with reports in BPH, where effects of RNAi gradually diminish beyond 7 days post-injection (Liu et al., 2010a). The late-phase fecundity increase might reflect partial Kr-h1 recovery due to RNAi degradation, allowing residual EB to weakly stimulate reproduction. In addition, the physiological impact of EB (e.g., neurotoxicity, hormonal modulation) could manifest via compensatory feedback loops or metabolic remodeling.”

      (10) Not a very convincing argument. Besides, without a scale bar, it is hard for the reviewers to judge the size of the organism. Whole-body measurements of JH synthesis enzymes will remain quite a drawback for the paper.

      In response to your suggestion, we have also included images with scale bars (see Author response image 1 below). The images show that the head region is difficult to separate from the brown thoracic sclerite region. Furthermore, the anatomical position of the corpora allata in brown planthoppers has never been reported, making dissection uncertain and highly challenging. To address this, we are now attempting to use Drosophila as a model to investigate how EB regulates JH synthesis and reproduction.

      Author response image 1. This illustration provides a visual representation of the brown planthopper (BPH), a major rice pest.

      (11) "The phenomenon reported was specific to BPH and not found in other insects. This limits the implications of the study". This argument still holds. Combined with extreme species specificity, the general effect that EB causes brings into question the molecular specificity that the authors claim about the mode of action.

      We acknowledge that the specificity of the phenomenon to BPH may limit its broader implications, but we would like to emphasize that this study provides important insights into the unique biological mechanisms in BPH, a pest of significant agricultural importance. The molecular specificity we described in the manuscript is based on rigorous experimental evidence. We believe that it contributes valuable knowledge for understanding the interaction between external factors such as EB and BPH, and the resurgence of pests. We hope that this study will inspire further research into the mechanisms underlying similar phenomena in other insects, thereby broadening our understanding of insect biology. Since EB also has an effect on fecundity in Drosophila, albeit opposite to that in BPH (Fig. 1 suppl. 2), it seems likely that EB actions may be of more general interest in insect reproduction.

      (12) The authors have added a few lines in the discussion but it does not change the overall design of the experiments. In this scenario, they should infer the performed experiments rationally without over interpretation. Currently, many of the claims that the authors are making are unsubstantiated. As a result of the first review process, the authors have acknowledged the discrepancies, but they have failed to alter their interpretations accordingly.

      We appreciate your concern regarding the experimental design and the need for rational inference without overinterpretation. In response, we would like to clarify that our discussion is based on the experimental data we have collected. We acknowledge that our study focuses on BPH and the specific effects of EB, and while we agree that broader generalizations require further research, we believe the new findings we present are valid and contribute to the understanding of this specific system.

      We also acknowledge the discrepancies you mentioned and have carefully considered your suggestions. In this revised version, we believe our interpretations are reasonable and consistent with the data, and we have adjusted our discussion to better reflect the scope of our findings. We hope that these revisions address your concerns. Thank you again for your constructive feedback.

      ADDITIONAL POINTS

      (1) Only one experiment was performed with Abamectin. No dosage titration was done for this compound, or at least none is provided in the manuscript. Including this result will confuse readers, while removing it does not affect the manuscript at all. My suggestion would be to remove this result.

      We acknowledge that the abamectin experiment lacks dose-titration details and that its standalone presentation could lead to confusion. However, we respectfully request to retain these results for the following reasons:

      Class-Specific Mechanism Validation:

      Abamectin and emamectin benzoate (EB) are both macrocyclic lactones targeting glutamate-gated chloride channels (GluCls). The observed similarity in their effects on BPH fecundity (e.g., Figure 1—figure supplement 1B) supports the hypothesis that GluCl modulation, rather than compound-specific off-target effects, drives the reproductive enhancement. This consistency strengthens the mechanistic argument central to our study.

      (2) The section "The impact of EB treatment on BPH reproductive fitness" is poorly described. This needs elaboration. A line or two should be included to describe why the parameters chosen to decide reproductive fitness were selected in the first place. I see that the definition of brachypterism has undergone a change from the first version of the manuscript. Can you provide an explanation for that? Also, there is no rationale behind inclusion of statements on insulin at this stage. The authors have not investigated insulin. Including that here will confuse readers. This can be added in the discussion though.

      Thank you for your suggestion. We have added an explanation regarding the primary consideration of evaluating reproductive fitness. In the interaction between sublethal doses of insecticides and pests, reproductive fitness is a key factor, as it accurately reflects the potential impact of insecticides on pest control in the field. Among the reproductive fitness parameters, factors such as female Nilaparvata lugens body weight, lifespan, and brachypterous ratio (as short-winged N. lugens exhibit higher oviposition rates than long-winged individuals) are critical determinants of reproductive success. Therefore, we comprehensively assessed the effects of EB on these parameters to elucidate the primary mechanism by which EB influences reproduction. We sincerely appreciate your constructive feedback.

      (3) "EB promotes ovarian maturation in BPH" this entire section needs to be rewritten and attention should be paid to the sequence of experiments described.

      Thank you for your suggestion. Based on your recommendation, we have rewritten this section (lines 267–275) and adjusted the sequence of experimental descriptions to improve the structural clarity of this part.

      (4) Figure 3N is outright wrong and should be removed or revised.

      In accordance with your recommendation, we have removed the figure.

      (5) When you are measuring hormonal titers, it is important to mention explicitly whether you are measuring hemolymph titer or whole body.

      We believe we have explicitly stated in the Methods section (line 1013) that we measured whole-body hormone titers. However, we have now also added this information to the figure legends.

      (6)  EB induces JH biosynthesis through the peptidergic AstA/AstAR signaling pathway- this section needs attention at multiple points. Please check.

      We acknowledge that direct evidence for EB-AstA/AstAR interaction is limited and have framed these findings as a hypothesis for future validation.

      References

      Liu, S., Ding, Z., Zhang, C., Yang, B., Liu, Z., 2010. Gene knockdown by intro-thoracic injection of double-stranded RNA in the brown planthopper, Nilaparvata lugens. Insect Biochem. Mol. Biol. 40, 666-671

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) Summary:

      The authors note that it is challenging to perform diffusion MRI tractography consistently in both humans and macaques, particularly when deep subcortical structures are involved. The scientific advance described in this paper is effectively an update to the tracts that the XTRACT software supports. The claims of robustness are based on a very small selection of subjects from a very atypical dMRI acquisition (n=50 from HCP-Adult) and an even smaller selection of subjects from a more typical study (n=10 from ON-Harmony).

      Strengths:

      The changes to XTRACT are soundly motivated in theory (based on anatomical tracer studies) and practice (changes in seeding/masking for tractography), and I think the value added by these changes to XTRACT should be shared with the field. While other bundle segmentation software typically includes these types of changes in release notes, I think papers are more appropriate.

      We would like to thank the reviewer for their assessment and we appreciate the comments for improving our manuscript. We have added new results, sampling from a larger cohort with a typical dMRI protocol (N=50 from UK Biobank), as well as showcasing examples from individual subject reconstructions (Supplementary figures S6, S7). We also demonstrate comparisons against another approach that has been proposed for extracting parts of the cortico-striatal bundle in a bundle segmentation fashion, as the reviewer suggests (see comment and Author response image 1 below). 

      We would also like to take the opportunity to summarise the novelty of our contributions, as detailed in the Introduction, which we believe extend beyond a mere software update; this is a byproduct of this work rather than the aim.

      i) We devise for the first time standard-space protocols for 21 challenging cortico-subcortical bundles for both human and macaque and we interrogate them in a comprehensive manner.

      ii) We demonstrate robustness of these protocols using criteria grounded on neuroanatomy, showing that tractography reconstructions follow topographical principles known from tracers both in WM and GM and for both species. We also show that these protocols capture individual variability as assessed by respecting family structure in data from the HCP twins.

      iii) We use high-resolution dMRI data (HCP and post-mortem macaque) to showcase feasibility of these reconstructions, and we show that reconstructions are also plausible with more conventional data, such as the ones from the UK Biobank.

      iv) We further showcase robustness and the value of cross-species mapping by using these tractography reconstructions to predict known homologous grey matter (GM) regions across the two species, both in cortex and subcortex, on the basis of similarity of grey matter areal connection patterns to the set of proposed white matter bundles.

      Weaknesses

      (2) The demonstration of the new tracts does not include a large number of carefully selected scans and is only compared to the prior methods in XTRACT. The small n and limited statistical comparisons are insufficient to claim that they are better than an alternative. Qualitatively, this method looks sound.

      We appreciate the suggestion for a larger sample size, so we performed the same analysis using 50 randomly drawn UK Biobank subjects, instead of ON-Harmony, matching the N=50 randomly drawn HCP subjects (detailed explanation in the comment below, Main text Figure 4A; Supplementary Figure S4). We also generated results using the full set of N=339 HCP unrelated subjects (Supplementary Figure S5 compares 10, 50 and 339 unrelated HCP subjects). We provide further details in the relevant point (3) below.

      With regard to comparisons to other methods, there are not really many analogous approaches that we can compare against. To our knowledge there are no previous cross-species, standard-space tractography protocols for the tracts we considered in this study (including Muratoff, amygdalofugal, different parts of extreme and external capsules, along with their neighbouring tracts). We therefore i) directly compared against independent neuroanatomical knowledge and patterns (Figures 2, 3, 5), ii) confirmed that patterns against data quality and individual variability that the new tracts demonstrate are similar to patterns observed for the more established cortical tracts (Figure 4), iii) indirectly assessed efficacy by performing a demanding task, such as homologue identification on the basis of the tracts we reconstruct (Figures 6, 7).

      We need to point out that our approach is not “bundle segmentation”, in the sense of “data-driven” approaches that cluster streamlines into bundles following full-brain tractography. The latter is different in spirit and assigns a label to each generated streamline; as full-brain tractography is challenging (Maier-Hein, Nature Comms 2017), we instead follow the approach of imposing anatomical constraints to mitigate some of these challenges, as suggested in (Maier-Hein, 2017).

      Nevertheless, we used TractSeg (one of the few alternatives that considers corticostriatal bundles) to perform some comparisons. The Author response image below shows average path distributions across 10 HCP subjects for a few bundles that we also reconstruct in our paper (no temporal part of the striatal bundle is generated by TractSeg). We can observe that the output for each tract is highly overlapping across subjects, indicating that not much individual variability is captured. We also see reduced specificity in the connectivity end-points of the bundles.

      Author response image 1.

      Comparison between 10-subject averages for example subcortical tracts using TractSeg and XTRACT. We chose example bundles shared between our set and TractSeg. Per subject, TractSeg produces a binary mask rather than a path distribution per tract. Furthermore, the mask is highly overlapping across subjects. Where direct correspondence was not possible, we found the closest matching tract. Specifically, we used ST_PREF for StBf, and merged ST_PREC with ST_POSTC to match StBm. There was no correspondence for the temporal part of StB.

      We subsequently performed the twinness test using both TractSeg and XTRACT (Author response image 2), as a way to assess whether aspects of individual variability can be captured. Due to heritability of brain organisation features, we anticipate that monozygotic twins have more similar tract reconstructions compared to dizygotic twins and subsequently non-twin siblings. This pattern is reproduced using our proposed approach, but not using TractSeg, which provides a rather flat pattern.

      Author response image 2.

      Violin plots of the mean pairwise Pearson’s correlations across tracts between 72 monozygotic (MZ) twin pairs, 72 dizygotic (DZ) twin pairs, 72 non-twin sibling pairs, and 72 unrelated subject pairs from the Human Connectome Project, using TractSeg (left) and XTRACT (right). About 12 cortico-subcortical tracts were considered, as closely matched as possible between the two approaches. For TractSeg we considered: 'CA', 'FX', 'ST_FO', 'ST_M1S1' (merged ‘ST_PREC’ and ‘ST_POSTC’ to approximate the sensorimotor part of our striatal bundle), 'ST_OCC', 'ST_PAR', 'ST_PREF', 'ST_PREM', 'T_M1S1' (merged ‘T_PREC’ and ‘T_POSTC’ to approximate the sensorimotor part of our striatal bundle), 'T_PREF', 'T_PREM', 'UF'. For XTRACT we considered: 'ac', 'fx', 'StB<sub>f</sub>', 'StB<sub>m</sub>', 'StB<sub>p</sub>', 'StB<sub>t</sub>', 'EmC<sub>f</sub>', 'EmC<sub>p</sub>', 'EmC<sub>t</sub>', 'MB', 'amf', 'uf'. Showing the mean (μ) and standard deviation (σ) for each group. There were no significant differences between groups using TractSeg.
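
      The similarity measure behind this twinness comparison (mean pairwise Pearson's correlation across tracts, summarised per group) can be sketched as follows. This is a minimal illustration and not the analysis code used for the figure: the function names, the dictionary layout (tract name mapped to a flattened standard-space map) and the use of the sample standard deviation are all assumptions made for the example.

```python
import numpy as np


def mean_tract_correlation(subj_a, subj_b):
    """Mean Pearson correlation across tracts for one pair of subjects.

    subj_a, subj_b: dicts mapping a tract name to a 1D array holding that
    subject's flattened tract map in a common standard space (assumed layout).
    Only tracts present in both subjects are compared.
    """
    shared = sorted(set(subj_a) & set(subj_b))
    per_tract = [np.corrcoef(subj_a[t], subj_b[t])[0, 1] for t in shared]
    return float(np.mean(per_tract))


def group_summary(pairs):
    """Mean and sample standard deviation of pair similarity for one group.

    pairs: list of (subject, subject) tuples, e.g. all MZ twin pairs.
    """
    values = np.array([mean_tract_correlation(a, b) for a, b in pairs])
    return float(values.mean()), float(values.std(ddof=1))
```

      Calling `group_summary` once per group (MZ, DZ, non-twin siblings, unrelated) gives the per-group mean and standard deviation; a decreasing mean across those groups is the pattern expected if individual variability is captured.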

      Taken together, these results indicate as a minimum that the different approaches have potentially different aims. Their different behaviour across the two approaches can be desirable and beneficial for different applications (for instance WM ROI segmentation vs connectivity analysis) but makes it challenging to perform like-to-like comparisons.

      (3) “Subject selection at each stage is unclear in this manuscript. On page 5 the data are described as "Using dMRI data from the macaque (𝑁 = 6) and human brain (𝑁 = 50)". Were the 50 HCP subjects selected to cover a range of noise levels or subject head motion? Figure 4 describes 72 pairs for each of monozygotic, dizygotic, non-twin siblings, and unrelated pairs - are these treated separately? Similarly, NH had 10 subjects, but each was scanned 5 times. How was this represented in the sample construction?”

      We appreciate the suggestions and we agree that some of the choices in terms of group sizes may have been confusing. The short answer is that we did not perform any subject selection; subjects were randomly drawn from what we had available. The 72 twin pairs are simply the maximum number of monozygotic twin pairs available in the HCP cohort, so we used 72 pairs in all categories to match this number in these specific tests. The N=6 animals are good-quality post-mortem dMRI data that were acquired in the past and that we cannot easily expand. For the rest of the points, we have now made the following changes:

      We have replaced our comparison to the ON-Harmony dataset (10 subjects) with a comparison to 50 unrelated UK Biobank subjects (to match the 50 unrelated HCP subject cohort used throughout). Updated results can be seen in Figure 4A and Supplementary Figure S4. This allows a comparison of tractography reconstruction between high quality and more conventional quality data for the same N.

      We looked at QC metrics to ensure our chosen cohorts were representative of the full cohorts we had available. The N=50 unrelated HCP cohort and N=50 unrelated UK Biobank cohort we used in the study captured well the range of the full 339 unrelated HCP cohort and N=7192 UK Biobank cohort in terms of absolute/relative motion (Author response image 3A and 3B respectively). A similar pattern was observed in terms of SNR and CNR ranges (Author response image 4).

      We generated tractography reconstructions for single subjects, corresponding to the 10th percentile (P<sub>10</sub>), median (P<sub>50</sub>) and 90th percentile (P<sub>90</sub>) of the distributions with respect to similarity to the cohort average maps. These are now shown in Supplementary Figures S6, S7. We also checked the QC metrics for these single subjects and confirmed that average absolute subject motion was highest for the P<sub>10</sub>, followed by the P<sub>50</sub> and lowest for the P<sub>90</sub> subject, capturing a range of within-cohort data quality.
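
      One way such representative subjects could be picked is sketched below. This is only an illustration under stated assumptions: the response does not specify the similarity metric, so Pearson correlation with the cohort mean map is assumed here, and the function name, the array layout (subjects by voxels) and the nearest-to-percentile selection rule are hypothetical.

```python
import numpy as np


def pick_representative_subjects(subject_maps, percentiles=(10, 50, 90)):
    """Pick the subjects closest to given percentiles of similarity.

    subject_maps: array of shape (n_subjects, n_voxels), one flattened tract
    map per subject in a common space (assumed layout).
    Returns (picks, sims): picks maps each percentile to a subject index,
    sims holds every subject's correlation with the cohort mean map.
    """
    maps = np.asarray(subject_maps, dtype=float)
    cohort_mean = maps.mean(axis=0)
    # Similarity of each subject to the cohort average (assumed: Pearson r).
    sims = np.array([np.corrcoef(m, cohort_mean)[0, 1] for m in maps])
    picks = {}
    for p in percentiles:
        target = np.percentile(sims, p)
        # Subject whose similarity lies nearest to the percentile value.
        picks[p] = int(np.argmin(np.abs(sims - target)))
    return picks, sims
```

      Under these assumptions, the P<sub>10</sub> pick is a subject whose map is among the least similar to the cohort average, and the P<sub>90</sub> pick is among the most similar.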

      We generated reconstructions for an even larger HCP cohort (all 339 unrelated HCP subjects) and these look very similar to the N=50 reconstructions (Supplementary Figure S5).

      Author response image 3.

      Subsets chosen from the HCP and UKB reflect similar range of average motion (relative and absolute) to the corresponding full cohorts. (A) Absolute and relative motion comparison between N=50 and N=339 unrelated HCP subjects. (B) Absolute and relative motion comparison between N=50 and N=7192 super-healthy UKB subjects.  

      Author response image 4.

      Average SNR and CNR values show similar range between the N=50 UKB subset and the full UK Biobank cohort of N=7192.

      (4) In the paper, the authors state "the mean agreement between HCP and NH reconstructions was lower for the new tracts, compared to the original protocols (𝑝 < 10^−10). This was due to occasionally reconstructing a sparser path distribution, i.e., slightly higher false negative rate," - how can we know this is a false negative rate without knowing the ground truth?

      We are sorry for the confusing terminology, which we have now corrected. Indeed, we cannot call it a false negative; what we meant is that reconstructions from lower resolution data for these bundles ended up being in general sparser than the ones from the high-resolution data, potentially missing parts of the tract. We have now revised the text accordingly.

      Reviewer #2 Public Review:

      (5) Summary:

      In this article, Assimopoulos et al. expand the FSL-XTRACT software to include new protocols for identifying cortical-subcortical tracts with diffusion MRI, with a focus on tracts connecting to the amygdala and striatum. They show that the amygdalofugal pathway and divisions of the striatal bundle/external capsule can be successfully reconstructed in both macaques and humans while preserving large-scale topographic features previously defined in tract tracing studies. The authors set out to create an automated subcortical tractography protocol, and they accomplished this for a subset of specific subcortical connections for users of the FSL ecosystem.

      Strengths:

      A main strength of the current study is the translation of established anatomical knowledge to a tractography protocol for delineating cortical-subcortical tracts that are difficult to reconstruct. Diffusion MRI-based tractography is highly prone to false positives; thus, constraining tractography outputs by known anatomical priors is important. Key additional strengths include 1) the creation of a protocol that can be applied to both macaque and human data; 2) demonstration that the protocol can be applied to high quality data (3 shells, > 250 directions, 1.25 mm isotropic, 55 minutes) and lower quality data (2 shells, 100 directions, 2 mm isotropic, 6.5 minutes); and 3) validation that the anatomy of cortical-subcortical tracts derived from the new method is more similar in monozygotic twins than in siblings and unrelated individuals.

      We thank the Reviewer for the globally positive evaluation of this work and the pertinent comments that have helped us to improve the paper.

      Weaknesses

      (6) Although this work validates the general organizational location and topographic organization of tractography-derived cortical-subcortical tracts against prior tract tracing studies (a clear strength), the validation is purely visual and thus only qualitative. Furthermore, it is difficult to assess how the current XTRACT method may compare to currently available tractography approaches to delineating similar cortical-subcortical connections. Finally, it appears that the cortical-subcortical tractography protocols developed here can only be used via FSL-XTRACT (yet not with other dMRI software), somewhat limiting the overall accessibility of the method.

      We agree that a more quantitative comparison against gold standard tracing data would be ideal. However, there are practical challenges that prohibit such a comparison at this stage: i) Access to data. There are no quantifiable, openly shared, large scale/whole brain tracing data available. The Markov study provided the only openly available weighted connectivity matrices measured by tracers in macaques (Markov, Cereb Cortex 2014), which are only cortico-cortical and do not provide the white matter routes; they only quantify the relative contrast in connection terminals. ii) 2D microscopy vs 3D tractography. The vast majority of tracing data one can find in neuroanatomy labs is on 2D microscopy slices with restricted field of view, which is also the case for the data we had access to for this study. This significantly complicates like-to-like comparisons against 3D whole-brain tractography reconstructions. iii) Quantifiability is tricky even in the case of gold standard axonal tracing, as it depends on nuisance factors, e.g. injection site, injection size, injection uniformity and coverage, which confound the gold-standard measurements, but are not relevant for tractography. For these reasons, a number of high-profile NIH BRAIN CONNECTS Centres (for instance https://connects.mgh.harvard.edu/, https://mesoscaleconnectivity.org/) are resourced to address these challenges at scale in the coming years and provide the tools to the community to perform such quantitative comparisons in the future.

      In terms of comparison with other approaches, we have performed new tests and detail a response to a similar comment (2) from Reviewer 1.

      Finally, our protocols have been FSL-tested, but have nothing that is FSL specific. We cannot speak of performance when used with other tools, but there is nothing that prohibits translation of these standard space protocols to other tools. In fact, the whole idea behind XTRACT was to generate an approach open to external contributions for bundle-specific delineation protocols, both for humans and for non-human species. A number of XTRACT extensions have been published over the last 5 years for other NHP species (Roumazeilles et al. (2020); Bryant et al. (2020); Wang et al. (2025)) and similar approaches have been used in commercial packages (Boshkovski et al, 2106, ISMRM 2022).

      Recommendations To the Authors:

      (7) Superiority of the FSL-XTRACT approach to delineating cortical-subcortical tracts. The Introduction of the article describes how "Tractography protocols for white matter bundles that reach deeper subcortical regions, for instance the striatum or the amygdala, are more difficult to standardize" due to the size, proximity, complexity, and bottlenecks associated with cortical-subcortical tracts. It would be helpful for the authors to better describe how the analytic approach adopted here overcomes these various challenges. What does the present approach do differently than prior efforts to examine cortical-subcortical connectivity?

      There have not been many prior efforts to standardise cortico-subcortical connectivity reconstructions, as we overview in the Introduction. As outlined in (Schilling et al. (2020), https://doi.org/10.1007/s00429-020-02129-z), tractography reconstructions can be highly accurate if we guide them using constraints that dictate where pathways are supposed to go and where they should not go. This is the philosophy behind XTRACT and all the proposed protocols, which provide neuroanatomical constraints across different bundles. At the same time these constraints are relatively coarse so that they are species-generalisable. We have clarified that in the Discussion. The approach we took was to first identify anatomical constraints from neuroanatomy literature for each tract of interest independently, derive and test these protocols in the macaque, and then optimise in an iterative fashion until the protocols generalise well to humans and until, when considering groups of bundles, the generated reconstructions can follow topographical principles known from tract tracing literature. This process took years in order to perform these iterations as meticulously as we could. We have modified the first sections in Methods to reflect this better (3rd paragraph of 1st Methods section), as well as modified the third and second to last paragraphs of the Introduction (“We propose an approach that addresses these challenges…”).

      (8) Relatedly, it is difficult to fully evaluate the utility of the current approach to dissecting cortical-subcortical tracts without a qualitative or quantitative comparison to approaches that already exist in the field. Can the authors show that (or clarify how) the FSL-XTRACT approach is similar to - or superior to - currently available methods for defining cortical-striatal and amygdalofugal tracts (e.g., methods they cite in the Introduction)?

      From the limited similar approaches that exist, we did perform some comparisons against TractSeg, please see Reply to Comment 2 from Reviewer 1. We have also expanded the relevant text in the introduction to clarify the differences:

      “…However, these either utilise labour-intensive single-subject protocols (22,26), are not designed to be generalisable across species (42, 43), or are based mostly on geometrically-driven parcellations that do not necessarily preserve topographical principles of connections (40). We propose an approach that addresses these challenges and is automated, standardised, generalisable across two species and includes a larger set of cortico-subcortical bundles than considered before, yielding tractography reconstructions that are driven by neuroanatomical constraints.”

      (9) Future applications of the tractography protocol:

      It would be helpful for the authors to describe the contexts in which the automated tractography approach developed here can (and cannot) be applied in future studies. Are future applications limited to diffusion data that has been processed with FSL's BEDPOSTX and PROBTRACKX? Can FSL-XTRACT take in diffusion data modelled in other software (e.g., with CSD in mrtrix or with GQI in DSI Studio)? Can the seed/stop/target/exclusion ROIs be applied to whole-brain tractography generated in other software? Integration with other software suites would increase the accessibility of the new tract dissection protocols.

      We have added some text in the Discussion to clarify this point. Our protocols have been FSL-tested, but have nothing that is FSL specific. We cannot speak of performance of other tools, but there is nothing that prohibits translation of these standard space protocols to other tools. As described before, the protocols are recipes with anatomical constraints including regions the corresponding white matter pathways connect to and regions they do not, constructed with cross-species generalisability in mind. In fact a number of other packages (even commercial) have adopted the XTRACT protocols with success in the past, so we do not see anything in principle that prohibits these new protocols from being similarly adopted.

      We cannot comment on the protocols’ relevance for segmenting whole-brain tractograms, as these can induce more false positives than tractography reconstructions from smaller seed regions and may require stricter exclusions.

      (10) It was great to see confirmation that the XTRACT approach can be successfully applied in both high-quality diffusion data from the HCP and in the ON-Harmony data. Given the somewhat degraded performance in the lower quality dataset (e.g., Figure 4A), can the authors speak to the minimum data requirements needed to dissect these new cortical-subcortical tracts? Will the approach work on single-shell, low b data? Is there a minimum voxel resolution needed? Which tracts are expected to perform best and worst in lower-quality data?

      Thank you for these comments. Although we have not really tried lower (spatial and angular) resolution data, given the proximity of the tracts considered, as well as the small size of some bundles, we would not recommend resolutions lower than those of the UK Biobank protocol. In general, we would consider the UK Biobank protocol (2mm, 2 shells) as the minimum, and any modern clinical scanner can achieve this in 6-8 minutes. We hence evaluated performance from high quality HCP to lower quality UK Biobank data, covering a considerable range (scan time from 55 minutes down to 6 minutes).

      In terms of which tract reconstructions were more reproducible for UKBiobank data, the tracts with lowest correlations across subjects (Figure 4) were the anterior commissure (AC) and the temporal part of the Extreme Capsule (EmC<sub>t</sub>), while the highest correlations were for the Muratoff Bundle (MB) and the temporal part of the Striatal Bundle (StB<sub>t</sub>). Interestingly, for the HCP data, the temporal part of the Extreme Capsule (EmC<sub>t</sub>) and the Muratoff Bundle were also the tracts with the lowest/highest correlations, respectively. Hence, certain tract reconstructions were consistently more variable than others across subjects, which may hint to also being more challenging to reconstruct. We have now clarified these aspects in the corresponding Results section. 

      (11) Anatomical validation of the new cortical-subcortical tracts

      I really appreciated the use of prior tract tracing findings to anatomically validate the cortical-subcortical tractography outputs for both the cortical-striatal and amygdalofugal tracts. It struck me, however, that the anatomical validation was purely qualitative, focused on the relative positioning or the topographical organization of major connections. The anatomical validation would be strengthened if profiles of connectivity between cortical regions and specific subcortical nuclei or subcortical subdivisions could be quantitatively compared, if at all possible. Can the differential connectivity shown visually for the putamen in Figure 3 be quantified for the tract tracing data and the tractography outputs? Does the amygdalofugal bundle show differential/preferential connectivity across amygdala nuclei in tract tracing data, and is this seen in tractography?

      We appreciate the comment; please see our reply to your comment 6 above. In addition to the challenges described there, we do not have access to terminal fields other than in the striatum, and these are 2D, so we make a qualitative comparison of the relevant connectivity contrasts. We expect that a number of currently ongoing high-profile BRAIN CONNECTS Centres (such as the LINC and the CMC) will be addressing such challenges in the coming years and will provide the tools and data to the community to perform such quantitative comparisons at scale.

      (12) I believe that all visualizations of the macaque and human tractography showed group-averaged maps. What do these tracts look like at the individual level? Understanding individual-level performance and anatomical variation is important, given the Discussion paragraph on using this method to guide neuromodulation.

      We now demonstrate some representative examples of individual subject reconstructions in Supplementary Figures S6, S7, ranking subjects by the average agreement of individual tract reconstructions to the mean and depicting the 10th percentile, median and 90th percentile of these subjects. We have also shown more results in Author response images 1-2, generated by TractSeg, to indicate how a different bundle segmentation approach would handle individual variability compared to our approach.
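      The selection of representative subjects described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the toy scores, and the nearest-rank percentile convention are all assumptions made for the example.

```python
def pick_representatives(agreement_by_subject, percentiles=(10, 50, 90)):
    """Return the subject ID sitting at each requested percentile.

    agreement_by_subject maps a subject ID to its average agreement with
    the group mean (for example, a mean correlation across tracts).
    """
    # Rank subjects from lowest to highest average agreement.
    ranked = sorted(agreement_by_subject, key=agreement_by_subject.get)
    n = len(ranked)
    picks = {}
    for p in percentiles:
        # Nearest-rank index (an assumed convention; others may shift
        # the pick by one subject).
        idx = min(n - 1, max(0, round(p / 100 * (n - 1))))
        picks[p] = ranked[idx]
    return picks


# Hypothetical agreement scores for 11 subjects.
scores = {f"sub-{i:02d}": 0.50 + 0.04 * i for i in range(11)}
reps = pick_representatives(scores)
```

      Showing the subjects at the 10th percentile, median and 90th percentile gives a view of poor, typical and good individual reconstructions rather than only the best cases.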

      (13) Connectivity-based comparisons across species:

      Figures 5 and 6 of the manuscript show that, as compared to using only cortico-cortical XTRACT tracts, using the full set of XTRACT tracts (with new cortical-subcortical tracts) allows for more specific mapping of homologous subcortical and cortical regions across humans and macaques. Is it possible that this result is driven by the fact that the "connectivity blueprints" for the subcortex did not use an intermediary GM x WM matrix to identify connection patterns, whereas the connectivity blueprints for the cortex did? I was surprised that a whole brain GM x WM connectivity matrix was used in the cortical connectivity mapping procedure, given known problems with false positives etc., when doing whole brain tractography - especially after such anatomical detail was considered when deriving the original tracts. Perhaps the intermediary step lowers connectivity specificity and accuracy overall (as per Figure 9), accounting for the poorer performance for cortico-cortical tracts?

      The point is well taken; however, it cannot drive the results in Figures 5 and 6. Before explaining this further, let us clarify the rationale for using the GMxWM connectivity matrix, which we have published quite extensively in the past for cortico-cortical connections (Mars, eLife 2018 - Warrington, Neuroimage 2020 - Roumazeilles, PLoS Biology 2020 - Warrington, Science Advances 2022 – Bryant, J Neuroscience 2025).

      Having established the bodies of the tract using the XTRACT protocols, we use this intermediate step of multiplying with a GM x WM connectivity matrix to estimate the grey matter projections of the tracts. The most obvious approach of tracking towards the grey matter (i.e. simply finding where tracts intersect GM) has the problem that one moves through bottlenecks in the cortical gyrus, after which fibres fan out. Most tractography algorithms have problems resolving this fanning. We therefore take the opposite approach of tracking from the grey matter surface towards the white matter (GMxWM connectivity matrix), thus following the direction in which the fibres are expected to merge, rather than to fan out. We then multiply the GMxWM tractogram with that of the body of the tract to identify the grey matter endpoints of the tract. This avoids some of the major problems associated with tracking towards the surface. In fact, using this approach improves connectivity specificity towards the cortex, rather than the opposite. We provide some indicative results here for a few tracts:

      Author response image 5.

      Connectivity profiles for example cortico-cortical tracts with and without using the intermediary GMxWM matrix. Tracts considered are the Superior Longitudinal Fasciculus 1 (SLF<sub>1</sub>), Superior Longitudinal Fasciculus 2 (SLF<sub>2</sub>), the Frontal Aslant (FA) and the Inferior Fronto-Occipital Fasciculus (IFO). We see that the surface connectivity patterns without using the GMxWM intermediary matrix are more diffuse (effect of “fanning out” gyral bias), with reduced specificity, compared to when using the GMxWM matrix.
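      The multiplication step described in this response can be sketched on toy data. This is a minimal illustration under stated assumptions, not the XTRACT implementation: the function name, the list-based matrices and the normalisation to unit sum are hypothetical choices for the example.

```python
def cortical_projection(gm_by_wm, tract_wm):
    """Multiply a GM x WM connectivity matrix with a tract's WM map.

    gm_by_wm: one row per grey matter vertex; each row holds connectivity
              from that vertex (seeded at the surface) to every WM voxel.
    tract_wm: the tract body's path-distribution value per WM voxel.
    Returns one value per grey matter vertex.
    """
    raw = [sum(g * t for g, t in zip(row, tract_wm)) for row in gm_by_wm]
    total = sum(raw)
    # Normalising to unit sum is an assumption made for this sketch.
    return [r / total for r in raw] if total > 0 else raw


# Toy example: 3 cortical vertices, 4 white matter voxels.
gm_by_wm = [
    [1.0, 0.0, 0.0, 0.0],  # vertex 0 reaches only WM voxel 0
    [0.0, 1.0, 1.0, 0.0],  # vertex 1 reaches WM voxels 1 and 2
    [0.0, 0.0, 0.0, 1.0],  # vertex 2 reaches only WM voxel 3
]
tract_body = [0.0, 2.0, 1.0, 1.0]  # the tract does not pass through voxel 0
profile = cortical_projection(gm_by_wm, tract_body)
```

      Vertex 0 receives no weight because the white matter it reaches from the surface is not part of the tract body, which is how the surface-seeded matrix restricts the estimated endpoints.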

      Tracking to/from subcortical nuclei does not have the same tractography challenges as tracking towards the cortex and in fact we found that using the intermediary GMxWM matrix is less favourable for subcortex (Figure 9), which is why we opted for not using it. 

      Regardless of how cortical and subcortical connectivity patterns are obtained, the results in Figures 5 and 6 utilise only cortical connectivity patterns. Hence, no matter which tracts are considered (cortico-cortical or cortico-subcortical) to build the connectivity patterns, these results have always been obtained using the intermediate step of multiplying with the GMxWM connectivity matrix (i.e. it is not the case that cortical features are obtained with the intermediate step and subcortical features without; all of them have the intermediate step applied, as the connectivity patterns consist of cortical endpoints). Figure 9 is only applicable to subcortical endpoints, which play no role in the comparisons shown in Figures 5 and 6. We hope this clarifies this point.

      (14) Methodological clarifications:

      The Methods describe how anatomical masks used in tractography were delineated in standard macaque space and then translated to humans using "correspondingly defined landmarks". Can the authors elaborate as to how this translation from macaques to humans was accomplished?

      For a given tract, our process for building a protocol involved looking into the wider anatomical literature, including the standard white matter atlas of Schmahmann and Pandya (2006) and numerous anatomy papers that are referenced in the protocol description, to determine the expected path the tract was meant to take in white matter and which cortical and subcortical regions are connected. This helped us define constraints and subsequently the corresponding masks. The masks were created through the combination of hand-drawn ROIs and standard space atlases. We firstly started with the macaque where tracer literature is more abundant, but, importantly, our protocol definitions have been designed such that the same protocol can be applied to the human and macaque brain. All choices were made with this aspect in mind, hence corresponding landmarks between the two brains were considered in the mask definition (for instance “the putamen”, “a sub-commissural white matter mask”, the “whole frontal pole” etc, as described in the protocol descriptions).

      The protocols have not been created by a single expert but have been collated from multiple experts (co-authors SA, SW, DF, KB, SH, SS drove this aspect) and the final definitions have been agreed upon by the authors. 

      (15) The article heavily utilizes spatial path distribution maps/normalized path distributions, yet does not describe precisely what these are and how they were generated. Can the authors provide more detail, along with the rationale for using these with Pearson's correlations to compare tracts across subjects (as opposed to, e.g., overlap sensitivity/specificity or the Jaccard coefficient)?

      We have now clarified in the text how these maps are generated, particularly when they are compared using correlation values. We tried Jaccard indices on binarized masks of the tracts and these gave similar trends to the correlations reported in Figure 4 (i.e. higher similarities within than across cohorts). However, we feel that correlations are better suited than Jaccard indices, as the latter assume binary masks and therefore focus on spatial overlap while ignoring the actual values of the path distributions. We hence kept correlations in the paper.
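      The difference between the two similarity measures can be illustrated on toy path distributions. This is a sketch under assumptions: the five-voxel maps and the binarisation threshold of zero are hypothetical and are not the values used in the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def jaccard(x, y, threshold=0.0):
    """Jaccard index of the two maps after binarising at `threshold`."""
    a = [v > threshold for v in x]
    b = [v > threshold for v in y]
    inter = sum(p and q for p, q in zip(a, b))
    union = sum(p or q for p, q in zip(a, b))
    return inter / union if union else 0.0


# Two maps covering the same voxels but peaking in different places.
tract_a = [0.0, 0.1, 0.8, 0.1, 0.0]
tract_b = [0.0, 0.8, 0.1, 0.1, 0.0]
overlap = jaccard(tract_a, tract_b)
corr = pearson(tract_a, tract_b)
```

      Here the binarised masks overlap perfectly while the correlation is slightly negative, because the Jaccard index discards where each distribution concentrates its values.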

      Reviewing Editor Comments

      “The reviewers had broadly convergent comments and were enthusiastic about the work. As further detailed by Reviewer 3 (see below), if the authors choose to pursue revisions, there are several elements that have the potential to enhance impact.”

      Thank you, we have replied accordingly and aimed to address most of the comments of the Reviewers.   

      “Comparison to existing methods. How does this approach compare to other approaches cited by the authors?”

      Please see replies to Comment 2 of Reviewer 1 and Comment 7 of Reviewer 2. Briefly, we have now generated new results and clarified aspects in the text. 

      “Minimum data requirements. How broadly can this approach be used across scan variation? How does this impact data from individual participants? Displaying individual participants may help, in addition to group maps.”

      Please see replies to Comment 10 of Reviewer 2 on minimum data requirements and individual participants, as well as to Comment 3 of Reviewer 1 on the actual groups considered. Briefly, we have generated new figures and regenerated results using UKBiobank data.

      “Software. What are the software requirements? Is the approach interoperable with other methods?”

      Please see Reply to Comment 9 of Reviewer 2. Our protocols can be used to guide tractography using other types of data as they comprise of guiding ROIs for a given tract. So, although we have not tested them beyond FSL-XTRACT, we believe they can be useful with other tractography packages as well, as there is nothing FSL-specific in these anatomically-informed recipes. 

      “Comparisons with tract tracing. To the degree possible, quantitative comparisons with tract tracing data would bolster confidence in the method.”

      Please see Replies to Comments 6 and 11 of Reviewer 2. Briefly, we appreciate the comment and it is something we would love to do, but there are no data readily available that would allow such quantitative comparison in a meaningful way. This is a known challenge in the tractography field, which is why NIH has invested in two 5-year Centres to address it. Our approach will provide a solid starting point for optimising and comparing further cortico-subcortical tractography reconstructions against microscopy and tracers in the same animal and at scale.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this study, Gu et al. employed novel viral strategies, combined with in vivo two-photon imaging, to map the tone response properties of two groups of cortical neurons in A1. The thalamocortical recipient (TR neurons) and the corticothalamic (CT neurons). They observed a clear tonotopic gradient among TR neurons but not in CT neurons. Moreover, CT neurons exhibited high heterogeneity of their frequency tuning and broader bandwidth, suggesting increased synaptic integration in these neurons. By parsing out different projecting-specific neurons within A1, this study provides insight into how neurons with different connectivity can exhibit different frequency response-related topographic organization.

      Strengths:

      This study reveals the importance of studying neurons with projection specificity rather than layer specificity since neurons within the same layer have very diverse molecular, morphological, physiological, and connectional features. By utilizing a newly developed rabies virus CSN-N2c GCaMP-expressing vector, the authors can label and image specifically the neurons (CT neurons) in A1 that project to the MGB. To compare, they used an anterograde trans-synaptic tracing strategy to label and image neurons in A1 that receive input from MGB (TR neurons).

      Weaknesses:

      Perhaps as cited in the introduction, it is well known that tonotopic gradient is well preserved across all layers within A1, but I feel if the authors want to highlight the specificity of their virus tracing strategy and the populations that they imaged in L2/3 (TR neurons) and L6 (CT neurons), they should perform control groups where they image general excitatory neurons in the two depths and compare to TR and CT neurons, respectively. This will show that it's not their imaging/analysis or behavioral paradigms that are different from other labs. 

      We thank the reviewer for these constructive suggestions. As recommended, we have performed control experiments that imaged the general excitatory neurons in superficial layers (shown below), and the results showed a clear tonotopic gradient, which was consistent with previous findings (Bandyopadhyay et al., 2010; Romero et al., 2020; Rothschild et al., 2010; Tischbirek et al., 2019), thereby validating the reliability of our imaging/analysis approach. The results are presented in a new supplemental figure (Figure 2- figure supplementary 3).

      Related publications:

      (1) Gu M, Li X, Liang S, Zhu J, Sun P, He Y, Yu H, Li R, Zhou Z, Lyu J, Li SC, Budinger E, Zhou Y, Jia H, Zhang J, Chen X. 2023. Rabies virus-based labeling of layer 6 corticothalamic neurons for two-photon imaging in vivo. iScience 26: 106625. DOI: https://doi.org/10.1016/j.isci.2023.106625, PMID: 37250327

      (2) Bandyopadhyay S, Shamma SA, Kanold PO. 2010. Dichotomy of functional organization in the mouse auditory cortex. Nat Neurosci 13: 361-8. DOI: https://doi.org/10.1038/nn.2490, PMID: 20118924

      (3) Romero S, Hight AE, Clayton KK, Resnik J, Williamson RS, Hancock KE, Polley DB. 2020. Cellular and Widefield Imaging of Sound Frequency Organization in Primary and Higher Order Fields of the Mouse Auditory Cortex. Cerebral Cortex 30: 1603-1622. DOI: https://doi.org/10.1093/cercor/bhz190, PMID: 31667491

      (4) Rothschild G, Nelken I, Mizrahi A. 2010. Functional organization and population dynamics in the mouse primary auditory cortex. Nat Neurosci 13: 353-60. DOI: https://doi.org/10.1038/nn.2484, PMID: 20118927

      (5) Tischbirek CH, Noda T, Tohmi M, Birkner A, Nelken I, Konnerth A. 2019. In Vivo Functional Mapping of a Cortical Column at Single-Neuron Resolution. Cell Rep 27: 1319-1326 e5. DOI: https://doi.org/10.1016/j.celrep.2019.04.007, PMID: 31042460

      Figures 1D and G, the y-axis is Distance from pia (%). I'm not exactly sure what this means. How does % translate to real cortical thickness?

      We thank the reviewer for this question. The distance of labeled cells from pia was normalized to the entire distance from pia to the L6/WM border for each mouse, following a previous study (Chang and Kawai, 2018). For all mice tested, the entire distance from pia to the L6/WM border was 826.5 ± 23.4 µm (in the range of 752.9 to 886.1).
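      The normalisation described above amounts to a simple ratio per mouse. The sketch below is illustrative only: the function name and the example cell depth are hypothetical, and both distances are assumed to be given in the same length unit.

```python
def normalized_depth(depth, pia_to_wm):
    """Express a cell's distance from pia as a percentage of the full
    pia-to-L6/WM-border distance measured in the same mouse."""
    if pia_to_wm <= 0:
        raise ValueError("pia-to-WM distance must be positive")
    return 100.0 * depth / pia_to_wm


# A hypothetical cell halfway through a cortex of thickness 826.5.
halfway = normalized_depth(413.25, 826.5)
```

      Expressing depth as a percentage makes cells comparable across mice whose total cortical thickness differs.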

      Related publications:

      Chang M, Kawai HD. 2018. A characterization of laminar architecture in mouse primary auditory cortex. Brain Structure and Function 223: 4187-4209. DOI: https://doi.org/10.1007/s00429-018-1744-8, PMID: 30187193

      For Figure 2G and H, is each circle a neuron or an animal? Why are they staggered on top of each other on the x-axis? If the x-axis is the distance from caudal to rostral, each neuron should have a different distance? Also, it seems like it's because Figure 2H has more circles, which is why it has more variation, thus not significant (for example, at 600 or 900um, 2G seems to have fewer circles than 2H). 

      We sincerely appreciate the reviewer’s careful attention to the details of our figures. Each circle in Figures 2G and 2H represents an individual imaging focal plane, with planes drawn from different animals. The median BF of some focal planes may be similar, leading to partial overlap. In the regions where overlap occurs, the brightness of the circles is additive.

      Since fewer CT neurons, compared to TR neurons, responded to pure tones within each focal plane, as shown in Figure 2- figure supplementary 2, a larger number of focal planes were imaged to ensure a consistent and robust analysis of the pure tone response characteristics. The higher variance and lack of correlation in CT neurons is a key biological finding, not an artifact of sample size. The data clearly show a wide spread of median BFs at any given location for CT neurons, a feature absent in the TR population.

      Similarly, in Figures 2J and L, why are the circles staggered on the y-axis now? And is each circle now a neuron or a trial? It seems they have many more circles than Figure 2G and 2H. Also, I don't think doing a correlation is the proper stats for this type of plot (this point applies to Figures 3H and 3J).

      We regret any confusion this may have caused. Figure 2 illustrates the tonotopic gradient of CT and TR neurons at different scales. Specifically, Figures 2E-H present the imaging data at the level of focal planes (23 focal planes in Figure 2G, 40 focal planes in Figure 2H), whereas Figures 2I-L provide a more detailed view at the single-cell level (481 neurons in Figure 2J, 491 neurons in Figure 2L). Figures 2J and 2L therefore do have more circles than Figures 2G and 2H. The analysis at these varying scales consistently reveals the presence of a tonotopic gradient in TR neurons, whereas such a gradient is absent in CT neurons.

      We used Pearson correlation as a standard and direct method to quantify the linear relationship between a neuron's anatomical position and its frequency preference, which is widely used in the field to provide a quantitative measure (R-value) and a significance level (p-value) for the strength of a tonotopic gradient. The same statistical logic applies to testing for spatial gradients in local heterogeneity in Figure 3. We are confident that this is an appropriate and informative statistical approach for these data.
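      The gradient measure described above can be sketched as a correlation between position and best frequency. This is a minimal sketch with assumptions: BF is converted to octaves (log2 of kHz), the positions and BFs are toy values, and the significance test (p-value) is omitted.

```python
from math import log2, sqrt


def tonotopy_r(positions, best_freqs_khz):
    """Pearson r between position along an axis and BF in octaves."""
    octaves = [log2(f) for f in best_freqs_khz]
    n = len(positions)
    mean_p = sum(positions) / n
    mean_o = sum(octaves) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(positions, octaves))
    var_p = sum((p - mean_p) ** 2 for p in positions)
    var_o = sum((o - mean_o) ** 2 for o in octaves)
    return cov / sqrt(var_p * var_o)


# Hypothetical positions along the caudal-to-rostral axis.
positions = [0.0, 300.0, 600.0, 900.0]
ordered = tonotopy_r(positions, [2, 4, 8, 16])    # BF rises with position
scattered = tonotopy_r(positions, [8, 2, 16, 4])  # same BFs, no spatial order
```

      In this toy case the ordered map gives r close to 1 and the scattered map gives r close to 0, although both contain the same set of best frequencies.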

      What does the inter-quartile range of BF (IQRBF, in octaves) imply? What's the interpretation of this analysis? I am confused as to why TR neurons show high IQR in HF areas compared to LF areas, which means homogeneity among TR neurons (lines 213 - 216). On the same note, how is this different from the BF variability?  Isn't higher IQR equal to higher variability?

      We thank the reviewer for raising this important point. IQRBF is a measure of local tuning heterogeneity: it quantifies the diversity of BFs among neighboring neurons. A small IQRBF means neighbors are similarly tuned (an orderly, homogeneous map), while a large IQRBF means neighbors have very different BFs (a disordered, heterogeneous map) (Winkowski and Kanold, 2013; Zeng et al., 2019).

      From the BF position reconstruction of all TR neurons (Figure 2I), most TR neurons in the high-frequency (HF) region respond to high-frequency sounds, but some respond to low frequencies such as 2 kHz, which contributes to the high IQR in HF areas. This does not contradict our main conclusion that the TR population is significantly more homogeneous than the CT population. BF variability represents the stability of a neuron's BF over time, while IQR represents the variability of BF among different neurons within a certain range (Chambers et al., 2023).
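      The IQRBF measure discussed above can be sketched for one group of neighboring neurons. This is an illustration under assumptions: the function name and toy BFs are hypothetical, the quartile convention (`method="inclusive"`) is a choice made for the example, and how neighbors are selected is not specified here.

```python
from math import log2
from statistics import quantiles


def iqr_bf_octaves(best_freqs_khz):
    """Inter-quartile range of BF, in octaves, for one neighborhood."""
    octaves = [log2(f) for f in best_freqs_khz]
    q1, _median, q3 = quantiles(octaves, n=4, method="inclusive")
    return q3 - q1


# Neighbors tuned to the same frequency: a locally homogeneous map.
homogeneous = iqr_bf_octaves([4, 4, 4, 4, 4])
# Neighbors spread over four octaves: a locally heterogeneous map.
heterogeneous = iqr_bf_octaves([2, 4, 8, 16, 32])
```

      Unlike BF variability over time, this quantity is computed across different neurons at a single time point.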

      Related publications:

      (1) Chambers AR, Aschauer DF, Eppler JB, Kaschube M, Rumpel S. 2023. A stable sensory map emerges from a dynamic equilibrium of neurons with unstable tuning properties. Cerebral Cortex 33: 5597-5612. DOI: https://doi.org/10.1093/cercor/bhac445, PMID: 36418925

      (2) Winkowski DE, Kanold PO. 2013. Laminar transformation of frequency organization in auditory cortex. Journal of Neuroscience 33: 1498-508. DOI: https://doi.org/10.1523/JNEUROSCI.3101-12.2013, PMID: 23345224

      (3) Zeng HH, Huang JF, Chen M, Wen YQ, Shen ZM, Poo MM. 2019. Local homogeneity of tonotopic organization in the primary auditory cortex of marmosets. Proceedings of the National Academy of Sciences of the United States of America 116: 3239-3244. DOI: https://doi.org/10.1073/pnas.1816653116, PMID: 30718428

      Figure 4A-B, there are no clear criteria on how the authors categorize V, I, and O shapes. The descriptions in the Methods (lines 721 - 725) are also very vague.

      We apologize for the initial vagueness and have replaced the descriptions in the Methods section. “V-shaped”: Neurons whose FRAs show decreasing frequency selectivity with increasing intensity. “I-shaped”: Neurons whose FRAs show constant frequency selectivity with increasing intensity. “O-shaped”: Neurons responsive to a small range of intensities and frequencies, with the peak response not occurring at the highest intensity level.

      To provide better visual intuition, we show multiple representative examples of each FRA type for both TR and CT neurons below. We are confident that these provide the necessary clarity and reproducibility for our analysis of receptive field properties.

      Author response image 1.

      Different FRA types within the dataset of TR and CT neurons. Each row shows 6 representative FRAs from a specific type. Types are V-shaped (‘V'), I-shaped (‘I’), and O-shaped (‘O’). The X-axis represents 11 pure tone frequencies, and the Y-axis represents 6 sound intensities.

      Reviewer #2 (Public Review):

      Summary:

      Gu and Liang et al. investigated how auditory information is mapped and transformed as it enters and exits an auditory cortex. They use anterograde transsynaptic tracers to label and perform calcium imaging of thalamorecipient neurons in A1 and retrograde tracers to label and perform calcium imaging of corticothalamic output neurons. They demonstrate a degradation of tonotopic organization from the input to output neurons.

      Strengths:

      The experiments appear well executed, well described, and analyzed.

      Weaknesses:

      (1) Given that the CT and TR neurons were imaged at different depths, the question as to whether or not these differences could otherwise be explained by layer-specific differences is still not 100% resolved. Control measurements would be needed either by recording (1) CT neurons in upper layers, (2) TR in deeper layers, (3) non-CT in deeper layers and/or (4) non-TR in upper layers.

      We appreciate these constructive suggestions. To address this, we performed new experiments and analyses.

      Comparison of TR neurons across superficial layers: we analyzed our existing TR neuron dataset to see if response properties varied by depth within the superficial layers. We found no significant differences in the fraction of tuned neurons, field IQR, or maximum bandwidth (BWmax) between TR neurons in L2/3 and L4. This suggests a degree of functional homogeneity within the thalamorecipient population across these layers. The results are presented in new supplemental figures (Figure 2- figure supplementary 4).

      Necessary control experiments.

      (1) CT neurons in upper layers. CT neurons are thalamic projection neurons that only exist in the deeper cortex, so CT neurons do not exist in upper layers (Antunes and Malmierca, 2021).

      (2) TR neurons in deeper layers. As we mentioned in the manuscript, due to high-titer AAV1-Cre virus labeling controversy (anterograde and retrograde labelling both exist), it is challenging to identify TR neurons in deeper layers.

      (3) non-CT in deeper layers and/or (4) non-TR in upper layers.

      To directly test if projection identity confers distinct functional properties within the same cortical layers, we performed the crucial control of comparing TR neurons to their neighboring non-TR neurons. We injected AAV1-Cre in MGB and a Cre-dependent mCherry into A1 to label TR neurons red. We then co-injected AAV-CaMKII-GCaMP6s to label the general excitatory population green.  In merged images, this allowed us to functionally image and directly compare TR neurons (yellow) and adjacent non-TR neurons (green). We separately recorded the responses of these neurons to pure tones using two-photon imaging. The results show that TR neurons are significantly more likely to be tuned to pure tones than their neighboring non-TR excitatory neurons. This finding provides direct evidence that a neuron's long-range connectivity, and not just its laminar location, is a key determinant of its response properties. The results are presented in new supplemental figures (Figure 2- figure supplementary 5).

      Related publications:

      Antunes FM, Malmierca MS. 2021. Corticothalamic Pathways in Auditory Processing: Recent Advances and Insights From Other Sensory Systems. Front Neural Circuits 15: 721186. DOI: https://doi.org/10.3389/fncir.2021.721186, PMID: 34489648

      (2) What percent of the neurons at the depths are CT neurons? Similar questions for TR neurons?

      We thank the reviewer for the comments. We performed histological analysis on brain slices from our experimental animals to quantify the density of these projection-specific populations. Our analysis reveals that CT neurons constitute approximately 25.47%\22.99%–36.50% of all neurons in Layer 6 of A1. In the superficial layers (L2/3 and L4), TR neurons comprise approximately 10.66%\10.53%–11.37% of the total neuronal population.

      Author response image 2.

      The fraction of CT and TR neurons. (A) Boxplots showing the fraction of CT neurons. N = 11 slices from 4 mice. (B) Boxplots showing the fraction of TR neurons. N = 11 slices from 4 mice.

      (3) V-shaped, I-shaped, or O-shaped is not an intuitively understood nomenclature, consider changing. Further, the x/y axis for Figure 4a is not labeled, so it's not clear what the heat maps are supposed to represent.

      The terms "V-shaped," "I-shaped," and "O-shaped" are an established nomenclature in the auditory neuroscience literature for describing frequency response areas (FRAs), and we use them for consistency with prior work. V-shaped: Neurons whose FRAs show decreasing frequency selectivity with increasing intensity. I-shaped: Neurons whose FRAs show constant frequency selectivity with increasing intensity. O-shaped: Neurons responsive to a small range of intensities and frequencies, with the peak response not occurring at the highest intensity level.

      (Rothschild et al., 2010). We have included a more detailed description in the Methods.

      The X-axis represents 11 pure tone frequencies, and the Y-axis represents 6 sound intensities. So, the heat map represents the FRA of neurons in A1, reflecting the responses for different frequencies and intensities of sound stimuli. In the revised manuscript, we have provided clarifications in the figure legend.

      (4) Many references about projection neurons and cortical circuits are based on studies from visual or somatosensory cortex. Auditory cortex organization is not necessarily the same as other sensory areas. Auditory cortex references should be used specifically, and not sources reporting on S1, and V1.

      We thank the reviewers for their valuable comments. We have made a concerted effort to ensure that claims about cortical circuit organization are supported by findings specifically from the auditory cortex wherever possible, strengthening the focus and specificity of our discussion.

      Reviewer #3 (Public Review):

      Summary:

      The authors performed wide-field and 2-photon imaging in vivo in awake head-fixed mice, to compare receptive fields and tonotopic organization in thalamocortical recipient (TR) neurons vs corticothalamic (CT) neurons of mouse auditory cortex. TR neurons were found in all cortical layers while CT neurons were restricted to layer 6. The TR neurons at nominal depths of 200-400 microns have a remarkable degree of tonotopy (as good if not better than tonotopic maps reported by multiunit recordings). In contrast, CT neurons were very heterogenous in terms of their best frequency (BF), even when focusing on the low vs high-frequency regions of the primary auditory cortex. CT neurons also had wider tuning.

      Strengths:

      This is a thorough examination using modern methods, helping to resolve a question in the field with projection-specific mapping.

      Weaknesses:

      There are some limitations due to the methods, and it's unclear what the importance of these responses are outside of behavioral context or measured at single timepoints given the plasticity, context-dependence, and receptive field 'drift' that can occur in the cortex.

      (1) Probably the biggest conceptual difficulty I have with the paper is comparing these results to past studies mapping auditory cortex topography, mainly due to differences in methods. Conventionally, the tonotopic organization is observed for characteristic frequency maps (not best frequency maps), as tuning precision degrades and the best frequency can shift as sound intensity increases. The authors used six attenuation levels (30-80 dB SPL) and reported that the background noise of the 2-photon scope is <30 dB SPL, which seems very quiet. The authors should at least describe the sound-proofing they used to get the noise level that low, and some sense of noise across the 2-40 kHz frequency range would be nice as a supplementary figure. It also remains unclear just what the 2-photon dF/F response represents in terms of spikes. Classic mapping using single-unit or multi-unit electrodes might be sensitive to single spikes (as might be emitted at characteristic frequency), but this might not be as obvious for Ca2+ imaging. This isn't a concern for the internal comparison here between TR and CT cells as conditions are similar, but is a concern for relating the tonotopy or lack thereof reported here to other studies.

      We sincerely thank the reviewer for the thoughtful evaluation of our manuscript and for your positive assessment of our work.

      (1)  Concern regarding Best Frequency (BF) vs. Characteristic Frequency (CF)

      Our use of BF, defined as the frequency eliciting the highest response averaged across all sound levels, is a standard and practical approach in 2-photon Ca²⁺ imaging studies (Issa et al., 2014; Rothschild et al., 2010; Schmitt et al., 2023; Tischbirek et al., 2019). This method is well-suited for functionally characterizing large numbers of neurons simultaneously, where determining a precise firing threshold for each individual cell can be challenging.

      (2) Concern regarding background noise of the 2-photon setup

      We have expanded the Methods section ("Auditory stimulation") to include a detailed description of the sound-attenuation strategies used during the experiments. We used a custom-built, double-walled sound-proof enclosure lined with wedge-shaped acoustic foam to reduce external noise interference. These measures ensured that auditory stimuli were delivered under controlled, low-noise conditions, improving the reliability of the neural response measurements.

      (3) Concern regarding the relationship between dF/F and spikes

      While Ca²⁺ signals are an indirect and filtered representation of spiking activity, they are a powerful tool for assessing the functional properties of genetically-defined cell populations. As you note, the properties and limitations of Ca²⁺ imaging apply equally to both the TR and CT neuron groups we recorded. Therefore, the profound difference we observed—a clear tonotopic gradient in one population and a lack thereof in the other—is a robust biological finding and not a methodological artifact.

      Related publications:

      (1) Issa JB, Haeffele BD, Agarwal A, Bergles DE, Young ED, Yue DT. 2014. Multiscale optical Ca2+ imaging of tonal organization in mouse auditory cortex. Neuron 83: 944-59. DOI: https://doi.org/10.1016/j.neuron.2014.07.009, PMID: 25088366

      (2) Rothschild G, Nelken I, Mizrahi A. 2010. Functional organization and population dynamics in the mouse primary auditory cortex. Nat Neurosci 13: 353-60. DOI: https://doi.org/10.1038/nn.2484, PMID: 20118927

      (3) Schmitt TTX, Andrea KMA, Wadle SL, Hirtz JJ. 2023. Distinct topographic organization and network activity patterns of corticocollicular neurons within layer 5 auditory cortex. Front Neural Circuits 17: 1210057. DOI: https://doi.org/10.3389/fncir.2023.1210057, PMID: 37521334

      (4) Tischbirek CH, Noda T, Tohmi M, Birkner A, Nelken I, Konnerth A. 2019. In Vivo Functional Mapping of a Cortical Column at Single-Neuron Resolution. Cell Rep 27: 1319-1326.e5. DOI: https://doi.org/10.1016/j.celrep.2019.04.007, PMID: 31042460

      (2) It seems a bit peculiar that while 2721 CT neurons (N=10 mice) were imaged, less than half as many TR cells were imaged (n=1041 cells from N=5 mice). I would have expected there to be many more TR neurons even mouse for mouse (normalizing by number of neurons per mouse), but perhaps the authors were just interested in a comparison data set and not being as thorough or complete with the TR imaging?

      As shown in Figure 2-figure supplement 2, a much higher fraction of TR neurons was "tuned" to pure tones (46% of 1041 neurons) compared to CT neurons (only 18% of 2721 neurons). To obtain a statistically robust and comparable number of tuned neurons for our core analysis (481 tuned TR neurons vs. 491 tuned CT neurons), it was necessary to sample a larger total population of CT neurons, which required imaging from more animals.
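
      The sampling argument can be checked with simple arithmetic. The sketch below uses only the counts quoted in the response; the helper `cells_needed` is a hypothetical name introduced here to estimate how many cells would have to be imaged to reach a target number of tuned neurons at a given tuned fraction.

```python
import math

# Counts quoted in the response above.
tr_total, tr_tuned = 1041, 481
ct_total, ct_tuned = 2721, 491

tr_fraction = tr_tuned / tr_total  # about 0.46
ct_fraction = ct_tuned / ct_total  # about 0.18


def cells_needed(target_tuned, tuned_fraction):
    """Cells to image so that, on average, target_tuned of them are tuned."""
    return math.ceil(target_tuned / tuned_fraction)


# CT cells needed to match the number of tuned TR neurons.
ct_needed = cells_needed(tr_tuned, ct_fraction)
# How many times more likely a TR neuron is to be tuned than a CT neuron.
fold_difference = tr_fraction / ct_fraction
```

      With these numbers, matching the 481 tuned TR neurons requires imaging roughly 2,700 CT neurons, which is consistent with the larger CT sample described by the authors.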

      (3) The authors' definitions of neuronal response type in the methods need more quantitative detail. The authors state: "'Irregular' neurons exhibited spontaneous activity with highly variable responses to sound stimulation. 'Tuned' neurons were responsive neurons that demonstrated significant selectivity for certain stimuli. 'Silent' neurons were defined as those that remained completely inactive during our recording period (> 30 min). For tuned neurons, the best frequency (BF) was defined as the sound frequency associated with the highest response averaged across all sound levels." The authors need to define what their thresholds are for 'highly variable', 'significant', and 'completely inactive'. Is best frequency the most significant response, the global max (even if another stimulus evokes a very close amplitude response), etc.?

      We appreciate the reviewer's suggestions. We have added a more detailed description to the Methods.

      Tuned neurons: A responsive neuron was further classified as "Tuned" if its responses showed significant frequency selectivity. We determined this using a one-way ANOVA on the neuron's response amplitudes across all tested frequencies (at the sound level that elicited the maximal response). If the ANOVA yielded a p-value < 0.05, the neuron was considered "Tuned".

      Irregular neurons: Responsive neurons that did not meet the statistical criterion for being "Tuned" (i.e., ANOVA p-value ≥ 0.05) were classified as "Irregular". This provides a clear, mutually exclusive category for sound-responsive but broadly tuned or non-selective cells.

      Silent neurons: Neurons that were not responsive were classified as "Silent", i.e. cells that showed no significant stimulus-evoked activity during the entire recording session.

      Best frequency (BF): The frequency that elicited the maximal mean response, averaged across all sound levels.
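
      The classification rules above amount to a small decision procedure, sketched here in Python. The function name `classify_neuron` and its arguments are hypothetical. The sketch assumes that responsiveness has already been decided by a separate test (the criterion for "responsive" is not specified in this response) and that the ANOVA p-value is supplied by the caller, for instance from a routine such as `scipy.stats.f_oneway`.

```python
def classify_neuron(responsive, anova_p=None, alpha=0.05):
    """Assign one of the three mutually exclusive response types.

    responsive: outcome of a separate responsiveness test (assumed given).
    anova_p:    p-value of a one-way ANOVA across tested frequencies,
                required only for responsive neurons.
    alpha:      significance threshold; 0.05 as stated in the response.
    """
    if not responsive:
        return "Silent"
    if anova_p is None:
        raise ValueError("responsive neurons need an ANOVA p-value")
    # p < alpha means significant frequency selectivity.
    return "Tuned" if anova_p < alpha else "Irregular"


labels = [
    classify_neuron(False),      # no stimulus-evoked activity
    classify_neuron(True, 0.01), # responsive and frequency-selective
    classify_neuron(True, 0.30), # responsive but not selective
]
```

      Because the "Tuned" criterion is a strict inequality, a neuron with a p-value of exactly 0.05 falls into the "Irregular" category under this reading.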

      To provide greater clarity, we show examples in the following figure.

      Author response image 3.

      Reviewer #1 (Recommendations For The Authors):

      (1) A1 and AuC were used interchangeably in the text.

      Thank you for pointing out this issue. Our terminological strategy was to remain faithful to the original terms used in the literature we cite, where "AuC" is often used more broadly. In the revised manuscript, we have performed a careful edit to ensure that we use the specific term "A1" (primary auditory cortex) when describing our own results and recording locations, which were functionally and anatomically confirmed.

      (2) Grammar mistakes throughout.

      We are grateful for the reviewer’s suggested improvement to our wording. The entire manuscript has undergone a thorough professional copyediting process to correct all grammatical errors and improve overall readability.

      (3) The discussion should talk more about how/why L6 CT neurons don't possess the tonotopic organization and what are the implications. Currently, it only says 'indicative of an increase in synaptic integration during cortical processing'...

      Thanks for this suggestion. We have substantially revised and expanded the Discussion section to explore the potential mechanisms and functional implications of the lack of tonotopy in L6 CT neurons.

      Broad pooling of inputs: We propose that the lack of tonotopy is an active computation, not a passive degradation. CT neurons likely pool inputs from a wide range of upstream neurons with diverse frequency preferences. This broad synaptic integration, reflected in their wider tuning bandwidth, would actively erase the fine-grained frequency map in favor of creating a different kind of representation.

      A shift from topography to abstract representation: This transformation away from a classic sensory map may be critical for the function of corticothalamic feedback. Instead of relaying "what" frequency was heard, the descending signal from CT neurons may convey more abstract, higher-order information, such as the behavioral relevance of a sound, predictions about upcoming sounds, or motor-related efference copy signals that are not inherently frequency-specific.

      Modulatory role of the descending pathway: The descending A1-to-MGB pathway is often considered to be modulatory, shaping thalamic responses rather than driving them directly. A modulatory signal designed to globally adjust thalamic gain or selectivity may not require, and may even be hindered by, a fine-grained topographical organization.

      Reviewer #2 (Recommendations For The Authors):

      (1) Given that the CT and TR neurons were imaged at different depths, the question as to whether or not these differences could otherwise be explained by layer-specific differences is still not 100% resolved. Control measurements would be needed by recording (1) CT neurons in upper layers, (2) TR neurons in deeper layers, (3) non-CT neurons in deeper layers, and/or (4) non-TR neurons in upper layers.

      We appreciate these constructive suggestions. To address this, we performed new experiments and analyses.

      Comparison of TR neurons across superficial layers: We analyzed our existing TR neuron dataset to see if response properties varied by depth within the superficial layers. We found no significant differences in the fraction of tuned neurons, field IQR, or maximum bandwidth (BWmax) between TR neurons in L2/3 and L4. This suggests a degree of functional homogeneity within the thalamorecipient population across these layers.

      Necessary control experiments.

      (1) CT neurons in upper layers. CT neurons are thalamic projection neurons that only exist in the deeper cortex, so CT neurons do not exist in upper layers (Antunes and Malmierca, 2021).

      (2) TR neurons in deeper layers. As we mentioned in the manuscript, high-titer AAV1-Cre can label neurons both anterogradely and retrogradely, so it is challenging to identify TR neurons unambiguously in deeper layers.

      (3) non-CT in deeper layers and/or (4) non-TR in upper layers.

      To directly test if projection identity confers distinct functional properties within the same cortical layers, we performed the crucial control of comparing TR neurons to their neighboring non-TR neurons. We injected AAV1-Cre into the MGB and a Cre-dependent mCherry into A1 to label TR neurons red. We then co-injected AAV-CaMKII-GCaMP6s to label the general excitatory population green. In merged images, this allowed us to functionally image and directly compare TR neurons (yellow) and adjacent non-TR neurons (green). We separately recorded the responses of these neurons to pure tones using two-photon imaging. The results show that TR neurons are significantly more likely to be tuned to pure tones than their neighboring non-TR excitatory neurons. This finding provides direct evidence that a neuron's long-range connectivity, and not just its laminar location, is a key determinant of its response properties.

      Related publications:

      Antunes FM, Malmierca MS. 2021. Corticothalamic Pathways in Auditory Processing: Recent Advances and Insights From Other Sensory Systems. Front Neural Circuits 15: 721186. DOI: https://doi.org/10.3389/fncir.2021.721186, PMID: 34489648

      (3) V-shaped, I-shaped, or O-shaped is not an intuitively understood nomenclature, consider changing. Further, the x/y axis for Figure 4a is not labeled, so it's not clear what the heat maps are supposed to represent.

      The terms "V-shaped," "I-shaped," and "O-shaped" are established nomenclature in the auditory neuroscience literature for describing frequency response areas (FRAs), and we use them for consistency with prior work (Rothschild et al., 2010). V-shaped: Neurons whose FRAs show decreasing frequency selectivity with increasing intensity. I-shaped: Neurons whose FRAs show constant frequency selectivity with increasing intensity. O-shaped: Neurons responsive to a small range of intensities and frequencies, with the peak response not occurring at the highest intensity level. We have included a more detailed description in the Methods.

      The X-axis represents the 11 pure tone frequencies, and the Y-axis represents the 6 sound intensities. Each heat map therefore shows the FRA of an A1 neuron, i.e. its response to every tested combination of sound frequency and intensity. In the revised manuscript, we have provided clarifications in the figure legend.
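
      To make the FRA layout and the V/I/O nomenclature concrete, here is a rough Python heuristic. It is a sketch under stated assumptions, not the authors' classification procedure: the function `fra_shape`, the response `threshold`, and the `widen_by` cutoff are all hypothetical, and the rows of the matrix are assumed to be ordered from the lowest to the highest sound level.

```python
def fra_shape(fra, threshold, widen_by=2):
    """Heuristically label an FRA as 'V', 'I', 'O', or 'unresponsive'.

    fra[i][j]: response at sound level i (ascending) and frequency j.
    threshold: response above which a frequency counts as driven (assumed).
    widen_by:  minimum growth in bandwidth, in frequency steps, from the
               lowest to the highest responsive level to call it 'V' (assumed).
    """
    # Bandwidth at each level = number of frequencies driving the neuron.
    bandwidths = [sum(r > threshold for r in row) for row in fra]
    active = [i for i, bw in enumerate(bandwidths) if bw > 0]
    if not active:
        return "unresponsive"
    top = len(fra) - 1
    # Level holding the largest single response (first one on ties).
    peak_level = max(range(len(fra)), key=lambda i: max(fra[i]))
    # O-shaped: peak below the loudest level and no response at the loudest level.
    if peak_level < top and bandwidths[top] == 0:
        return "O"
    # V-shaped: tuning broadens (selectivity decreases) as intensity rises.
    if bandwidths[active[-1]] - bandwidths[active[0]] >= widen_by:
        return "V"
    # Otherwise the bandwidth is roughly constant across levels.
    return "I"


# Toy FRAs (hypothetical), 4 levels x 5 frequencies, lowest level first.
v_fra = [[0, 0, 0, 0, 0], [0, 0, 1, 0, 0], [0, 1, 1, 1, 0], [1, 1, 1, 1, 1]]
i_fra = [[0, 0, 0, 0, 0], [0, 0, 1, 0, 0], [0, 0, 1, 0, 0], [0, 0, 1, 0, 0]]
o_fra = [[0, 0, 0, 0, 0], [0, 0, 1, 0, 0], [0, 1, 1, 0, 0], [0, 0, 0, 0, 0]]
shapes = [fra_shape(f, threshold=0.5) for f in (v_fra, i_fra, o_fra)]
```

      The same function applies unchanged to a 6 x 11 matrix such as the heat maps described above; only the thresholds would need to be chosen to suit real dF/F data.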

      (4) Many references about projection neurons and cortical circuits are based on studies from visual or somatosensory cortex. Auditory cortex organization is not necessarily the same as other sensory areas. Auditory cortex references should be used specifically, and not sources reporting on S1, V1.

      We thank the reviewers for their valuable comments. We have made a concerted effort to ensure that claims about cortical circuit organization are supported by findings specifically from the auditory cortex wherever possible, strengthening the focus and specificity of our discussion.

      Reviewer #3 (Recommendations For The Authors):

      I suggest showing some more examples of how different neurons and receptive field properties were quantified and statistically analyzed. Especially in Figure 4, but really throughout.

      We thank the reviewer for this valuable suggestion. To provide greater clarity, we have added more examples in the following figure.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary 

      The authors describe a method for gastruloid formation using mouse embryonic stem cells (mESCs) to study YS and AGM-like hematopoietic differentiation. They characterise the gastruloids during nine days of differentiation using a number of techniques including flow cytometry and single-cell RNA sequencing. They compare their findings to a published data set derived from E10-11.5 mouse AGM. At d9, gastruloids were transplanted under the adrenal gland capsule of immunocompromised mice to look for the development of cells capable of engrafting the mouse bone marrow. The authors then applied the gastruloid protocol to study overexpression of Mnx1 which causes infant AML in humans.

      In the introduction, the authors define their interpretation of the different waves of hematopoiesis that occur during development. 'The subsequent wave, known as definitive, produces: first, oligopotent erythro-myeloid progenitors (EMPs) in the YS (E8-E8.5); and later myelo-lymphoid progenitors (MLPs - E9.5-E10), multipotent progenitors (MPPs - E10-E11.5), and hematopoietic stem cells (HSCs - E10.5-E11.5), in the aorta-gonad-mesonephros (AGM) region of the embryo proper.' Herein they designate the yolk sac-derived wave of EMP hematopoiesis as definitive, according to convention, although paradoxically it does not develop from intra-embryonic mesoderm or give rise to HSCs.

      Our definition of primitive and definitive waves is widely used in the field (e.g. PMID: 18204427; PMID: 28299650; PMID: 33681211). Definitive haematopoiesis, encompassing EMP, MLP, MPP and HSC, highlights their origin from haemogenic endothelium, generation of mature cells with adult characteristics from progenitors with multilineage potential and direct and indirect developmental contributions to the intra-embryonic and time-restricted generation of HSCs. 

      General comments 

      The authors make the following claims in the paper: 

      (1) The development of a protocol for hemogenic gastruloids (hGx) that recapitulates YS and AGM-like waves of blood from HE.

      (2) The protocol recapitulates both YS and EMP-MPP embryonic blood development 'with spatial and temporal accuracy'.

      (3) The protocol generates HSC precursors capable of short-term engraftment in an adrenal niche.

      (4) Overexpression of MNX1 in hGx transforms YS EMP to 'recapitulate patient transcriptional signatures'.

      (5) hGx is a model to study normal and leukaemic embryonic hematopoiesis. 

      There are major concerns with the manuscript. The statements and claims made by the authors are not supported by the data presented, data is overinterpreted, and the conclusions cannot be justified. Furthermore, the data is presented in a way that makes it difficult for the reader to follow the narrative, causing confusion. The authors have not discussed how their hGx compares to the previously published mouse embryoid body protocols used to model early development and hematopoiesis.

      Specific points

      (1) It is claimed that HGxs capture cellularity and topography of developmental blood formation. The hGx protocol described in the manuscript is a modification of a previously published gastruloid protocol (Rossi et al 2022). The rationale for the protocol modifications is not fully explained or justified. There is a lack of novelty in the presented protocol as the only modifications appear to be the inclusion of Activin A and an extension of the differentiation period from 7 to 9 days of culture. No direct comparison has been made between the two versions of gastruloid differentiation to justify the changes.

      The Reviewer paradoxically claims that the protocol is not novel and that it differs from a previous publication in at least 2 ways – the patterning pulse and the length of the protocol. Of these, the patterning pulse is key. As documented in Fig. 1S1, we cannot obtain Flk1-GFP expression in the absence of Activin A (Fig. 1S1A), and the concentration of Activin A scales activity of the Flk1 locus (Fig. 1S1B). Expression of Flk1 is a fundamental step in haemato-endothelial specification and, accordingly, we do not see CD41 or CD45+ cells in the absence of Activin A. Furthermore, these markers also titrate with the dose of Activin A (in Fig. 1S1B).

      Also, in our hands, there is a clear time-dependent progression of marker expression, with sequential acquisition of CD41 and CD45, with the latter not detectable until 192h (Fig. 1C-D), another key difference relative to the Rossi et al (2022) protocol. We suggest, and present further evidence for in this rebuttal and the revised manuscript, that the 192h-timepoint captures the onset of AGM-like haematopoiesis. We have edited the manuscript to clarify the differences and novelty in our protocol (lines 132-143) and provided a more detailed comparison with the report from Rossi et al. (2022) in the Discussion (lines 574-586).

      The inclusion of Activin A at high concentration at the beginning of differentiation would be expected to pattern endoderm rather than mesoderm. BMP signaling is required to induce Flk1+ mesoderm, even in the presence of Wnt.

      Again, we call the Reviewer’s attention to Fig. 1S1A which clearly shows that Activin A (with no BMP added) is required for induction of Flk1 expression, in the presence of Wnt. Activin A in combination with Wnt, is used in other protocols of haemato-endothelial differentiation from pluripotent cells, with no BMP added in the same step of patterning and differentiation (PMID: 39227582; PMID: 39223325). In the latter protocol, we also call the Reviewer’s attention to the fact that a higher concentration of Activin A precludes the need for BMP4 addition. Finally, one of us has recently reported that Activin A, on its own, will induce Flk1, as well as other anterior mesodermal progenitors (https://www.biorxiv.org/content/10.1101/2025.01.11.632562v1). In addressing the Reviewer’s concerns with the dose of Activin A used, we titrated its concentration against activation of Flk1, confirming optimal Flk1-GFP expression at the 100ng/ml dose used in the manuscript. We have included this data in the manuscript in Figure 1S1B.

      FACS analysis of the hGx during differentiation is needed to demonstrate the co-expression of Flk1-GFP and lineage markers such as CD34 to indicate patterning of endothelium from Flk1+ mesoderm. The FACS plots in Fig. 1 show C-Kit expression but very little VE-cadherin which suggests that CD34 is not induced. Early endoderm expresses C-Kit, CXCR4, and Epcam, but not CD34 which could account for the lack of vascular structures within the hGx as shown in Fig. 1E.

      We were surprised by the Reviewer’s comment that there are no endothelial structures in our haemogenic gastruloids. The presence of a Flk1-GFP+ network is visible in the GFP images in Fig. 1B, from 144h onwards, and is detailed in the revised Fig. 2A, which shows overlap between Flk1-GFP and the endothelial marker CD31. In addition, our single-cell RNA-seq data, included in the manuscript, confirms the presence of endothelial cells with a developing endothelial, including arterial, programme. This is now presented in the revised Fig. 3B-D of the manuscript, which updates a representation in the original manuscript. In contrast with the Reviewer’s claims that no endothelial cells are formed, the data show that Kdr (Flk1)+ cells co-express Cdh5/VE-Cadherin and indeed Cd34, attesting to the presence of an endothelial programme. Arterial markers Efnb2, Flt1, and Dll4 are present. A full-blown programme, which also includes haemogenic markers including Sox17, Esam, Cd44 and Mecom, is clear at early (144h) and, particularly, at late (192h) timepoints in cells sorted on detection of surface C-Kit (Fig. 3B-E in the manuscript). To address the specific point by the Reviewer, we also document co-expression of Flk1-GFP, CD34 and/or CD31 by flow cytometry (Fig. 2S1A-B in the revised manuscript).

      To summarise new and revised data in the manuscript in relation to this point:

      Immunofluorescence staining showing the Flk1-GFP-defined vascular network in Figure 1E and co-expression of endothelial marker CD31 in Figure 2A. In text: lines 159-163; 178-180.

      Flow cytometry analysis of co-expression of Flk1-GFP with CD31 and CD34 in Figure 2S1A-D, including controls. In text: 180-187.

      Real-time quantitative (q)PCR analysis showing time-dependent expression of haemato-endothelial and arterial markers in Figure 2F (specifically Dll4 and Mecom). In text: 200-209.

      An improved representation of our scRNA-seq data highlighting key haemato-endothelial markers in Figure 3B-D. In text: 268-304

      (2) The protocol has been incompletely characterised, and the authors have not shown how they can distinguish between either wave of Yolk Sac (YS) hematopoiesis (primitive erythroid/macrophage and erythro-myeloid EMP) or between YS and intraembryonic Aorta-Gonad-Mesonephros (AGM) hematopoiesis. No evidence of germ layer specification has been presented to confirm gastruloid formation, organisation, and functional ability to mimic early development. Furthermore, differentiation of YS primitive and YS EMP stages of development in vitro should result in the efficient generation of CD34+ endothelial and hematopoietic cells. There is no flow cytometry analysis showing the kinetics of CD34 cell generation during differentiation. Benchmarking the hGx against developing mouse YS and embryo data sets would be an important verification. 

      The Reviewer is correct that we have not provided detailed characterisation of the different germ layers, as this was not the focus of the study. In that context, we were surprised by the earlier comment assuming co-expression of C-Kit, Cxcr4 and Epcam, which we did not show, while overlooking the endothelial programme reiterated above, which we have presented. Given our focus on haemato-endothelial specification, we have started the single-cell RNA-seq characterisation of the haemogenic gastruloid at 120h and have not looked specifically at earlier timepoints of embryo patterning. This said, we show the presence of neuroectodermal cells in cluster 9; on the other hand, cluster 7 includes hepatoblast-like cells, denoting endodermal specification (Supplementary File S2). However, in the absence of earlier timepoints and given the bias towards mesodermal specification, we expect that specification of ectodermal and endodermal programmes may be incomplete. 

      In respect of the contention regarding the capture of YS-like and AGM-like haematopoiesis, we had presented evidence in the original version of the manuscript that haemogenic cells generated during gastruloid differentiation, particularly at late 192h and 216h timepoints, project onto highly purified C-Kit+ CD31+ Gfi1-expressing cells from mouse AGM (PMID: 38383534), providing support for at least partial recapitulation of the corresponding developmental stage. These projections are represented in Fig. 4A, right and 4S1C of the revised manuscript. In distinguishing between YS-like and AGM-like haematopoiesis, we call the Reviewer’s attention to the replotting of the single-cell RNA-seq data already in the manuscript, which we provided in response to point 1 (Fig. 3B-D and 3S2B), which highlights an increase in Sox17, but not Sox18, expression in the 192h haemogenic endothelium, which suggests an association with AGM haematopoiesis (PMID: 20228271). A significant association of Cd44 and Procr expression with the same time-point (Fig. 3B-D in the manuscript), further supports an AGM-like endothelial-to-haematopoietic transition at the 192h timepoint. We have re-analysed the scRNA-seq data to better represent the expression of these markers in Fig. 3A-E and 3S2B. We agree that it remains challenging to identify markers exclusive to AGM haematopoiesis, which is operationally equated with generation of transplantable haematopoietic stem cells. While HSC generation is a key event characteristic of the AGM, not all AGM haematopoiesis corresponds to HSCs, an important point in evaluating the data presented in the manuscript, and one that is acknowledged by us. The main text has been edited to clarify the experiments pertaining to distinguishing AGM and YS haematopoiesis, which are detailed in lines 180-187, 200-221, 268-304, and 315-356.

      Following on the Reviewer’s comments about Cd34, we also inspected co-expression of Cd34 with Cd41 and Cd45, the latter co-expression present in, although not necessarily exclusive to, AGM haematopoiesis. Reassuringly, we observed clear co-expression with both markers (Author response image 1), in addition to a CD41+CD34- population, which likely reflects YS EMP-independent erythropoiesis. Flow cytometry analysis of co-expression of CD31 and CD34 in CD41+ and CD45+ populations at 144h and 216h timepoints has been included in Fig. 2B-D, Fig. 2S1A-D, including controls. In text: 180-187. We have earlier on in the rebuttal highlighted the fact that marker expression is responsive to the levels of Activin A used in the patterning pulse, with the 100ng/ml Activin A used in our protocol superior to 75ng/ml.

      Author response image 1.

      Association of CD34 with CD41 and CD45 expression is Activin A-responsive and supports the presence of definitive haematopoiesis. A. Flow cytometry analysis of CD34 and CD41 expression in 216h-haemogenic gastruloids; two doses of Activin A were used in the patterning pulse with CHIR99021 between 48-72h. FMO controls shown. B. Flow cytometry analysis of CD34 and CD45 at 216h in the same experimental conditions.

      Given the centrality of this point in comments by all the Reviewers, we have conducted projections of our single-cell RNA-seq data against two studies which (1) capture arterial and haemogenic specification in the para-splanchnopleura (pSP) and AGM region between E8.0 and E11 (Hou et al, PMID: 32203131), and (2) uniquely capture YS, AGM and FL progenitors and the AGM endothelial-to-haematopoietic transition (EHT) in the same scRNA-seq dataset (Zhu et al, PMID: 32392346). Focusing the analysis on the subsets of haemogenic gastruloid cells sorted as CD41+ (144h), C-Kit+ (144h and 192h) and CD45+ (192h and 216h) (now represented in Fig. 3A, and projected onto the studies in Fig. 4A), we show:

      (1) That a subset of haemato-endothelial cells from haemogenic gastruloids at 144h to 216h project onto intra-embryonic cells spanning E8.25 to E10 (revised Fig. 4A left and 4S1A). This is in agreement with our original interpretation that 216h are no later than the MPP/pre-HSC state of embryonic development, requiring further maturation to generate engrafting progenitors. We have nevertheless removed specific references to pre-HSC, and instead referred to HSPC/progenitors.

      (2) That haemogenic gastruloids contain YS-like (including EMP-like) and AGM-like haematopoietic cells (Fig. 4A centre and 4S1B). Significantly, some of the cells, particularly C-Kit-sorted cells with a candidate endothelial and HE-like signature, project onto AGM pre-HE and HE, as well as IAHC. Some 144h CD41+ and 192h CD45+ cells also project onto IAHC, suggesting that YS-like and AGM-like programmes arise independently and with partial time-dependent organisation in the haemogenic gastruloid model. Later, predominantly 216h cells, have characteristics of MPP/LMPP-like cells from the FL, suggesting a progenitor wave of differentiation.

      Altogether, the data support the notion that haemogenic gastruloids capture YS and AGM haematopoiesis until E10, as suggested by us in the manuscript. This re-analysis of the scRNA-seq data, which was indeed prompted by challenging and insightful comments from the Reviewers, has been incorporated in the manuscript as described above and further listed here:

      Re-clustering and highlights of specific markers in our scRNA-seq data in Figure 3A-E. In text: 268-304.

      Projections to mouse embryo datasets in Figure 4A (Figure 4S1A-C; Supplementary File 3). In text: 315-356. 

      Single-cell RNA sequencing was used to compare hGx with mouse AGM. The authors incorrectly conclude that ' ..specification of endothelial and HE cells in hGx follows with time-dependent developmental progression into putative AGM-like HE..' And, '...HE-projected hGx cells.......expressed Gata2 but not Runx1, Myb, or Gfi1b..' Hemogenic endothelium is defined by the expression of Runx1 and Gfi1b is downstream of Runx1.

      As a hierarchy of regulation, Gata2 precedes and drives Runx1 expression at the specification of HE (PMID: 17823307; PMID: 24297996), while Runx1 drives the EHT, upstream of Gfi1b in haematopoietic clusters (PMID: 34517413). Please note that the text segment the Reviewer refers to has been removed from the manuscript, as the analysis is no longer solely focused on projection to Thambyrajah et al (2024) data, and instead gained significantly from the projections on to the Hou et al (2020) and Zhu et al (2020) studies, as detailed above.

      (3) The hGx protocol 'generates hematopoietic SC precursors capable of short-term engraftment' is not supported by the data presented. Short-term engraftment would be confirmed by flow cytometric detection of hematopoietic cells within the recipient bone marrow, spleen, thymus, and peripheral blood that expressed the BFP transgene. This analysis was not provided. PCR detection of transcripts, following an unspecified number of amplification cycles, as shown in Figure 3G (incorrectly referred to as Figure 3F in the legend) is not acceptable evidence for engraftment.

      We provide the full flow cytometry analysis of spleen engraftment in the 5 mice which received implantation of 216h-haemogenic gastruloids in the adrenal gland and were analysed at 4 weeks; an additional (control) animal received adrenal injection of PBS (Fig. 4B-D in the revised manuscript). In this experiment, the bone marrow collection was limiting, and material was prioritised for PCR (Fig. 4C and full gels in 4S2C in the revised manuscript).

      We had previously provided only representative plots of flow cytometry analysis of bone marrow and spleen, which were chosen conservatively and which we described as low-level engraftment. The analysis was meant to complement the genomic DNA PCR, where detection was present in only some of the replicates tested per animal. On this note, we confirm that PCR analysis used a conventional 40 cycles; the sensitivity had already been shown in the earlier version of the manuscript and is again represented in Fig. 4S2B. We argue that the low level of cytometric and molecular engraftment at 4 weeks, from haemogenic gastruloid-derived progenitors that have not progressed beyond a stage equivalent to E10 (Fig. 4A and Supplementary File 3 in the revised manuscript from scRNA-seq projections), and that we have described as requiring additional maturation in vivo, is not surprising. Indeed, as previously shown and now repeated in Fig. 2B-E (controls in Fig. 2S1E-G) in the revised manuscript, no more than 7 CD45+CD144+ multipotent cells are present per haemogenic gastruloid. We are only able to implant 3 haemogenic gastruloids in the adrenal gland of each transplanted animal.

      We have rephrased Results and Discussion in lines 359-415 and 588-621, respectively, to rectify the nature of the engraftment, which we now attribute more generically to progenitors, also in light of the developmental time we could capture in the gastruloids prior to implantation.

      Transplanted hGx formed teratoma-like structures, with hematopoietic cells present at the site of transplant only analysed histologically. Indeed, the quality of the images provided does not provide convincing validation that donor-derived hematopoietic cells were present in the grafts.

      As stated in the text, the images are meant to illustrate that the haemogenic gastruloids developed in situ. Further analysis motivated by the Reviewers’ comments, and indeed a subsequent experiment with analysis of engraftment at a later timepoint of 8 weeks (revised Fig. 4E and 4 S2F-G), did not show a direct correspondence between engraftment and in vivo development or expansion, although this occurs in some cases. To be clearer, the observation of donor-derived blood cells in the implanted haemogenic gastruloids would not correspond to engraftment, as we have amply demonstrated that they have generated blood cells in vitro. There is no evidence that there are remaining pluripotent cells in the haemogenic gastruloid after 9 days of differentiation, and it is therefore not clear that the structures observed are teratomas. We specifically comment on this point in the revised manuscript (lines 601-607).

      There is no justification for the authors' conclusion that '... the data suggest that 216h hGx generate AGM-like pre-HSC capable of at least short-term multilineage engraftment upon maturation...'. Indeed, this statement is in conflict with previous studies demonstrating that pre-HSCs in the dorsal aorta of the mouse embryo are immature and actually incapable of engraftment.

      We have clearly stated that we do not see haematopoietic engraftment through transplantation of dissociated haemogenic gastruloids, which reach the E10 state containing pre-HSC (revised Fig 4A, 4S1A and Supplementary File 3). Instead, we observed rare myelo-erythroid (revised Fig. 4S2F-G) and myelo-lymphoid (revised Fig. 4E) engraftment upon in vivo maturation of haemogenic gastruloids with preserved 3D organisation. These statements are not contradictory. Nevertheless, we have now more cautiously attributed engraftment to the presence of progenitors as a generic designation, and not to pre-HSC (lines 412-414 and 588-592 in the revised manuscript).

      The statement '...low-level production of engrafting cells recapitulates their rarity in vivo, in agreement with the embryo-like qualities of the gastruloid system....' is incorrect. Firstly, no evidence has been provided to show the hGx has formed a dorsal aorta facsimile capable of generating cells with engrafting capacity. Secondly, although engrafting cells are rare in the AGM, approximately one per embryo, they are capable of robust and extensive engraftment upon transplantation.

      As indicated above, the statement in lines 412-414 now reads “Engraftment is erythromyeloid at 4 weeks and lympho-myeloid at 8 weeks, reflecting different classes of progenitors, putatively of YS-like and AGM-like affiliation.” To be clear, with our original statement we meant to highlight that the production of definitive AGM-like haematopoietic progenitors (not all of which are engrafting) in haemogenic gastruloids does not correspond to non-physiological single-lineage programming. We did not and do not claim that we achieved production of HSC, which would be long-term engrafting.

      (4) Expression of MNX1 transcript and protein in hematopoietic cells in MNX1 rearranged acute myeloid leukaemia (AML) is one cause of AML in infants. In the hGX model of this disease, Mnx1 is overexpressed in the mESCs that are used to form gastruloids. Mnx1 overexpression seems to confer an overall growth advantage on the hGx and increase the serial replating capacity of the small number of hematopoietic cells that are generated. The inefficiency with which the hGx model generates hematopoietic cells makes it difficult to model this disease. The poor quality of the cytospin images prevents accurate identification of cells. The statement that the kit-expressing cells represent leukemic blast cells is not sufficiently validated to support this conclusion. What other stem cell genes are expressed? Surface kit expression also marks mast cells, frequently seen in clonogenic assays of blood cells. Flow cytometric and gene expression analyses using known markers would be required.

      The haemogenic gastruloid model generates haematopoietic and haemato-endothelial cells. MNX1 expands C-Kit+ cells at 144h, which we show to have a haemato-endothelial signature (see revised Fig. 3A-E, Supplementary File 2). We have added additional flow cytometry data showing that the replating cells from MNX1 express CD31 (Figure 6S1A-B).

      Serial replating of CFC assays is a conventional in vitro assay of leukaemia transformation. Critically, colony replating is not maintained in EV control cells, attesting to the transformation potential of MNX1. Although we have not fully traced the cellular hierarchy of MNX1-driven transformation in the haemogenic gastruloid system, the in vitro replating expands a C-Kit+ cell (revised Fig. 6E), which reflects the surface phenotype of the leukaemia, also recapitulated in the mouse model initiated by MNX1-overexpressing FL cells. Importantly, it recapitulates the transcriptional profile of MNX1 leukaemia patients (revised Fig. 7C), which is uniquely expressed by MNX1 144h and replated colony cells, but not by MNX1 216h gastruloid cells, arguing against a generic signature of MNX1 overexpression (revised Fig. 7B). Moreover, the MNX1-transformation of haemogenic gastruloid cells is superior to the FL leukaemia model at capturing the unique transcriptional features of MNX1-driven leukaemia, distinct from other forms of AML in the same age group (Fig 7 S1D-F). It is possible that this corresponds to a pre-leukaemia event, and we will explore this in future studies, which are beyond the proof-of-principle nature of this paper.

      (5) In human infant MNX1 AML, the mutation is thought to arise at the fetal liver stage of development. There is no evidence that this developmental stage is mimicked in the hGx model.

      We never claim that the haemogenic gastruloid model mimics the foetal liver. We propose that susceptibility to MNX1 is at the HE-to-EMP transition. Moreover, and importantly, contrary to the Reviewer’s statement, there is no evidence in the literature that the mutation arises in the foetal liver stage, just that the mutation arises before birth (PMID: 38806630), which is different. In a mouse model of MNX1 overexpression, the authors achieve leukaemia engraftment upon MNX1 overexpression in foetal liver, but not in bone marrow cells (PMID: 37317878). This is in agreement with a vulnerability of embryonic / foetal, but not adult, cells to the MNX1 expression caused by the translocation. However, haematopoietic cells in the foetal liver originate from YS and AGM precursors, so the origin of the MNX1-susceptible cells can be in those locations, rather than the foetal liver itself.

      Reviewer #2 (Public review):

      Summary: 

      In this manuscript, the authors develop an exciting new hemogenic gastruloid (hGX) system, which they claim reproduces the sequential generation of various blood cell types. The key advantage of this cellular system would be its potential to more accurately recapitulate the spatiotemporal emergence of hematopoietic progenitors within their physiological niche compared to other available in vitro systems. The authors present a large set of data and also validate their new system in the context of investigating infant leukemia. 

      Strengths: 

      The development of this new in vitro system for generating hematopoietic cells is innovative and addresses a significant drawback of current in vitro models. The authors present a substantial dataset to characterize this system, and they also validate its application in the context of investigating infant leukemia. 

      Weaknesses: 

      The thorough characterization and full demonstration that the cells produced truly represent distinct waves of hematopoietic progenitors are incomplete. The data presented to support the generation of late yolk sac (YS) progenitors, such as lymphoid cells, and aortic-gonad-mesonephros (AGM)-like progenitors, including pre-hematopoietic stem cells (pre-HSCs), by this system are not entirely convincing. Given that this is likely the manuscript's most crucial claim, it warrants further scrutiny and direct experimental validation. Ideally, the identity of these progenitors should be further demonstrated by directly assessing their ability to differentiate into lymphoid cells or fully functional HSCs. Instead, the authors primarily rely on scRNA-seq data and a very limited set of markers (e.g., Ikzf1 and Mllt3) to infer the identity and functionality of these cells. Many of these markers are shared among various types of blood progenitors, and only a well-defined combination of markers could offer some assurance of the lymphoid and pre-HSC nature of these cells, although this would still be limited in the absence of functional assays.

      The identification of a pre-HSC-like CD45⁺CD41⁻/lo C-Kit⁺VE-Cadherin⁺ cell population is presented as evidence supporting the generation of pre-HSCs by this system, but this claim is questionable. This FACS profile may also be present in progenitors generated in the yolk sac such as early erythromyeloid progenitors (EMPs). It is only within the AGM context, and in conjunction with further functional assays demonstrating the ability of these cells to differentiate into HSCs and contribute to long-term repopulation, that this profile could be strongly associated with pre-HSCs. In the absence of such data, the cells exhibiting this profile in the current system cannot be conclusively identified as true pre-HSCs.

      We present 2 additional pieces of evidence to support our claims that we capture YS and AGM stages of haematopoietic development.

      (I) In the new Figures 4A and 4 S1A-C and Supplementary File 3 in the revised manuscript, we project our single-cell RNA-seq data onto (1) developing intra-embryonic pSP and AGM between E8 and E11 (Fig. 4A left, 4S1A) and (2) a single-cell RNA-seq study of HE development which combines haemogenic and haematopoietic cells from the YS, the developing HE and IAHC in the AGM, and FL (Fig. 4A centre, 4S1B). Our data map to E8.25-E10, and capture YS EMP and erythroid and myeloid progenitors, as well as AGM pre-HE, HE and IAHC, with some cells matching HSPC and LMPP, as suggested by the projection onto the Thambyrajah et al data set (already presented in the previous version of the manuscript, and now in Fig. 4A right and 4 S1C). The projection of the scRNA-seq data is presented in lines 314-355 of the revised manuscript. The scRNA-seq data itself was refocused on haemato-endothelial programmes as presented in the revised Fig. 3A-E, described in lines 267-303.

      (II) Given the difficulty in finding markers that specifically associate with AGM haematopoiesis, we inspected the possibility of capturing different regulatory requirements at different stages of gastruloid development mirroring differential effects in the embryo. Polycomb EZH2 is specifically required for EMP differentiation in the YS, but does not affect AGM-derived haematopoiesis; it is also not required for primitive erythroid cells (PMID: 29555646; PMID: 34857757). We treated haemogenic gastruloids from 120h onwards with either DMSO (0.05%) or GSK126 (0.5uM), and inspected the cellularity of gastruloids at 144h, which we equate with YS-EMP, and 216h – putatively AGM haematopoiesis. We show that EZH2 inhibition / GSK126 treatment specifically reduces %CD41+ cells at 144h, but does not reduce %CD41+ or %CD45+ cells at 216h. We have included this experiment in the manuscript in Fig. 2 S2B-C (in text: 209-221).

      These data, together with the scRNA-seq projections described, provide evidence for our claim that 144h haemogenic gastruloids capture YS EMPs, while CD41+ and CD45+ cells isolated at 216h reflect AGM progenitors. We cannot draw conclusions about the functional nature of the AGM cells from this experiment. The main text has been edited to clarify the experiments pertaining to distinguishing AGM and YS haematopoiesis (lines 180-187; 200-221; 268-304; 315-356).

      The engraftment data presented are also not fully convincing, as the observed repopulation is very limited and evaluated only at 4 weeks post-transplantation. The cells detected after 4 weeks could represent the progeny of EMPs that have been shown to provide transient repopulation rather than true HSCs. 

      In the original version of the manuscript, we stated that there is low-level engraftment and did not claim to have generated HSC. Instead, we described cells with short-term engraftment potential. We agree with the Reviewer that the cells we show in the manuscript at 4 weeks could be EMPs (revised Fig. 4B-E and 4 S2D-G). Additionally, we now have 8-week analysis of implant recipients, in which we again observed low-level multi-lineage engraftment of the recipient bone marrow in 1 of 3 recipients (revised Fig. 4B-E and 4S2F-H). This engraftment is myeloid-lymphoid and therefore likely to have originated in a later progenitor. To be clear, we do not claim that this corresponds to the presence of HSC. It nevertheless supports the maturation of progenitors with engraftment potential. The limited material was prioritised for flow cytometry staining, which did not allow PCR analysis. We rephrased Results and Discussion in lines 359-414 and 588-621, respectively, to rectify the nature of the engraftment.

      Reviewer #3 (Public review):  

      In this study, the authors employ a mouse ES-derived "hemogenic gastruloid" model which they generated and which they claim to be able to deconvolute YS and AGM stages of blood production in vitro. This work could represent a valuable resource for the field. However, in general, I find the conclusions in this manuscript poorly supported by the data presented. Importantly, it isn't clear what exactly are the "YS" and the "AGM"-like stages identified in the culture and where is the data that backs up this claim. In my opinion, the data in this manuscript lack convincing evidence that can enable us to identify what kind of hematopoietic progenitor cells are generated in this system. Therefore, the statement that "our study has positioned the MNX1-OE target cell within the YS-EMP stage (line 540)" is not supported by the evidence presented in this study. Overall, the system seems to be very preliminary and requires further optimization before those claims can be made.

      Specific comments below: 

      (1) The flow cytometric analysis of gastruloids presented in Figure 1 C-D is puzzling. There is a large % of C-Kit+ cells generated, but few VE-Cad+ Kit+ double positive cells. Similarly, there are many CD41+ cells, but very few CD45+ cells, which one would expect to appear toward the end of the differentiation process if blood cells are actually generated. It would be useful to present this analysis as consecutive gating (i.e. evaluating CD41 and CD45 within VE-Cad+ Kit+ cells, especially if the authors think that the presence of VE-Cad+ Kit+ cells is suggestive of EHT). The quantification presented in D is misleading as the scale of each graph is different.

      Fig. 1C-D provides an overview of haemogenic markers during the timecourse of haemogenic gastruloid differentiation, and does indeed show a late up-regulation of CD45, as the Reviewer points out would be expected. The percentage of CD45+ cells is indeed low. However, we should point out that the haemogenic gastruloid protocol, although biased towards mesodermal outputs, does not aim to achieve pure haematopoietic specification, but rather to place it in its embryo-like context. We disagree that the scale is misleading: it is necessary for representing the data in a way that is interpretable by the reader, and we made sure from the outset that the gates (in C) are truly representative and annotated, as are the plot axes (in D). Consecutive gating at the 216h timepoint is shown and quantified in Fig. 2S1D-F, or, using the alternative consecutive gating suggested by the Reviewer, in Author response image 2 below. At the request of Reviewer 1, we also analysed CD31 and CD34 within CD41 and CD45 populations, again as validation of the emergent haematopoietic character of the cells obtained. This new analysis is shown in revised Fig. 2B, quantified in 2C.

      Author response image 2.

      Flow cytometry analysis of VE-cadherin+ cells in haemogenic gastruloids at 216h of the differentiation protocol, probing co-expression of CD45, CD41 and C-Kit.

      (2) The imaging presented in Figure 1E is very unconvincing. C-Kit and CD45 signals appear as speckles and not as membrane/cell surfaces as they should. This experiment should be repeated and nuclear stain (i.e. DAPI) should be included.

      We included the requested immunofluorescence staining in Figure 1E (216h). We also show the earlier timepoint of 192h here as Author response image 3. In text: lines 158-162.

      Author response image 3.

      Confocal images of haematopoietic production in haemogenic gastruloids. Wholemount, cleared haemogenic gastruloids were stained for CD45 (pseudo-coloured red) and C-Kit antigens (pseudo-coloured yellow) with indirect staining, as described in the manuscript. Flk1-GFP signal is shown in green. Nuclei are contrasted with DAPI. (A) 192h. (B) 216h.

      (3) Overall, I am not convinced that hematopoietic cells are consistently generated in these organoids. The authors should sort hematopoietic cells and perform May-Grunwald Giemsa stainings as they did in Figure 6 to confirm the nature of the blood cells generated.

      The data are reproducible and complemented by functional assays shown in revised Fig. 2D-E, which clearly demonstrate haematopoietic output. The single-cell RNA-seq data also show expression of a haematopoietic programme, which we have complemented with biologically independent qRT-PCR analysis of the expression of key endothelial and haematopoietic marker and regulatory genes (revised Fig. 2F; in text: 200-209). As requested, we include Giemsa-Wright’s stained cytospins obtained at 216h to illustrate haematopoietic output. These are shown in revised Fig. 2S2A, in text: lines 194-199. Inevitably, the cytospins will be inconclusive as to the presence of endothelial-to-haematopoietic transition or the generation of haematopoietic stem/progenitor cells, as these cells do not have a distinctive morphology.

      (4) The scRNAseq in Figure 2 is very difficult to interpret. Specific points related to this: - Cluster annotation in Figure 2a is missing and should be included. 

      Why do the heatmaps show the expression of genes within sorted cells? Couldn't the authors show expression within clusters of hematopoietic cells as identified transcriptionally (which ones are they? See previous point)? Gene names are illegible.

      I see no expression of Hlf or Myb in CD45+ cells (Figure 2G). Hlf is not expressed by any of the populations examined (panels E, F, G). This suggests no MPP or pre-HSC are generated in the culture, contrary to what is stated in lines 242-245 (PMID 31076455 and 34589491). Later on, it is again stated that "hGx cells... lacked detection of HSC genes like Hlf, Gfi1, or Hoxa9" (lines 281-283). To me, this is proof of the absence of AGM-like hematopoiesis generated in those gastruloids.

      For a combination of logistic and technical reasons, we performed single-cell RNA-seq using the Smart-Seq2 platform, which is inherently low throughput. We overcame the issue of cell coverage by complementing whole-gastruloid transcriptional profiling at successive time-points with sorting of subpopulations of cells based on individual markers documented in Fig. 1. We clearly stated which platform was used as well as the number and type of cells profiled (Fig. 3S1 and lines 226-241 of the revised manuscript), and our approach is standard. Following suggestions of the Reviewers to further focus our analysis on the haemogenic cellular differentiation within the gastruloids, we revised the presentation of the scRNA-seq data to now provide UMAP projections with representation and quantification of individual genes, including the ones queried by the Reviewer in Fig. 3 and respective supplements. Specifically, re-clustering and highlighting of specific markers are shown in Figure 3A-D and presented in lines 267-303 of the revised manuscript. Complementary independent real-time quantitative (q)PCR analysis showing time-dependent expression of endothelial and haematopoietic markers is now in Figure 2F. In text: 200-208.

      (5) Mapping of scRNA-Seq data onto the dataset by Thambyrajah et al. is not proof of the generation of AGM HE. The dataset they are mapping to only contains AGM cells, therefore cells do not have the option to map onto something that is not AGM. The authors should try mapping to other publicly available datasets also including YS cells.

      We have done this and the data are presented in Figure 4A (Figure 4S1A) and Supplementary File. In text: 314-355. As detailed in response to Reviewer 1, we have conducted projections of our single-cell RNA-seq data against two studies which (1) capture arterial and haemogenic specification in the para-splanchnopleura (pSP) and AGM region between E8.0 and E11 (Hou et al, PMID: 32203131) (revised Fig. 4A and 4 S1A), and (2) uniquely capture YS, AGM and FL progenitors and the AGM endothelial-to-haematopoietic transition (EHT) in the same scRNA-seq dataset (Zhu et al, PMID: 32392346) (revised Fig. 4A and 4 S1B). Specifically, in answer to the Reviewers’ point, we show that different subsets of haemogenic gastruloid cells sorted on haemogenic surface markers C-Kit, CD41 and CD45 cluster onto pre-HE and HE, intra-aortic clusters and FL progenitor compartments, and onto YS EMP and erythroid and myeloid progenitors. This lends support to our claim that the haemogenic gastruloid system specifies both YS-like and AGM-like cells. Please note that we now do point out that some CD41+ cells at 144h project onto IAC, as do cells at the later timepoints, suggesting that AGM-like and YS-EMP-like waves may overlap at the 144h timepoint (lines…). In the future, we will address the specific location of these cells, but that corresponds to a large-scale spatial transcriptomics analysis requiring extensive optimisation for section capture, which is beyond the scope of this manuscript and this revision.

      (6) Conclusions in Figure 3, named "hGx specify cells with preHSC characteristics" are not supported by the data presented here. Again, I am not convinced that hematopoietic cells can be efficiently generated in this system, and certainly not HSCs or pre-HSCs.

      We have provided evidence in the original manuscript, and now through additional experiments, that there is haematopoietic specification, including of progenitor cells, in the haemogenic gastruloid system. Molecular markers are shown in revised Fig. 2F and Fig. 3 and supplements; CFC assays are shown in revised Fig. 2D-E; cytospins are in revised Fig. 2 S2A; further analysis of 4-week implants and new analysis of 8-week implants (discussed below) are in revised Fig. 4 B-D and Fig. 4 S2 and we discussed the new scRNA-seq projections above. Importantly, we have never claimed, and again do not, that haemogenic gastruloids generate HSC. We accept the Reviewer’s comment that we have not provided sufficient evidence for the specification of pre-HSC-like cells and accordingly now refer more generically and conservatively to progenitors.

      FACS analysis in 3A is again very unconvincing. I do not think the population identified as C-Kit+ CD144+ is real. Also, why not try gating the other way around, as commonly done (e.g. VE-Cad+ Kit+ and then CD41/CD45)?

      Our gating strategy is not unconventional: we gated from the more populated gate onto the less abundant one to ensure that the results are numerically more robust. In the case of haemogenic gastruloids, unlike the AGM preparations the Reviewer may be referring to, CD41+ and CD45+ cells are more abundant as there is no circulation of more differentiated haematopoietic cells away from the endothelial structures. This said, we did perform the gating as suggested (Rev Fig. 2), indeed confirming that most VE-cad+ Kit+ cells are CD45+. Interestingly, VE-cad+ Kit- cells are predominantly CD41+, reinforcing the haematopoietic nature of these cells.

      The authors must have tried really hard, but the lack of short- or long-engraftment in a number of immunodeficient mouse models (lines 305-313) really suggests that no blood progenitors are generated in their system. I am not familiar with the adrenal gland transplant system, but it seems like a very non-physiological system for trying to assess the maturation of putative pre-HSCs. The data supporting the engraftment of these mice, essentially seen only by PCR and in some cases with a very low threshold for detection, are very weak, and again unconvincing. It is stated that "BFP engraftment of the Spl and BM by flow cytometry was very low level albeit consistently above control (Fig. S4E)" (lines 337-338). I do not think that two dots in a dot plot can be presented as evidence of engraftment.

      We have presented the data with full disclosure and do not deny that the engraftment achieved is low-level and short-term, indicating incomplete maturation of definitive haematopoietic progenitors in the current haemogenic gastruloid system. Indeed, not wanting to overstate the finding, we were deliberately conservative in our representative flow cytometry plots and focused on the PCR for sensitivity. We now present the full flow cytometry analysis for spleen, where we preserved more cells after the genomic DNA extraction (revised Fig. 4C), and call the Reviewer’s attention to the fact that detection of BFP+ cells by PCR and flow cytometry in the recipient animals is consistent between the 2 methods (revised Fig. 4C and D; full gels previously presented now in Fig. 4S2C; sensitivity analysis was also previously available and is now in Fig. 4S2B). In addition, we have now also been able to detect low-level myelo-lymphoid engraftment in the bone marrow and spleen 8 weeks after adrenal implantation, again suggesting the presence of a small number of definitive haematopoietic progenitors that potentially mature from the 3 haemogenic gastruloids implanted (Fig. 4E and 4 S2F-G in the revised manuscript). We rephrased Results and Discussion at lines 359-414 and 589-621, respectively, to rectify the nature of the engraftment, which we attribute to progenitors.

      (7) Given the above, I find that the foundations needed for extracting meaningful data from the system when perturbed are very shaky at best. Nevertheless, the authors proceed to overexpress MNX1 by LV transduction, a system previously shown to transform fetal liver cells, mimicking the effect of the t(7;12) AML-associated translocation. Comments on this section:

      The increase in the size of the organoid when MNX1 is expressed is a very unspecific finding and not necessarily an indication of any hematopoietic effect of MNX1 OE.

      We agree with the Reviewer on this point; it is nevertheless a reproducible observation which we thought relevant to describe for completeness and data reproducibility.

      The mild increase of cKit+ cells (Figure 4E) at the 144hr timepoint and the lack of any changes in CD41+ or CD45+ cells suggests that the increase in Kit+ cells % is not due to any hematopoietic effect of MNX1 OE. No hematopoietic GO categories are seen in RNA seq analysis, which supports this interpretation. Could it be that just endothelial cells are being generated?

      The Reviewer is correct that the MNX1-overexpressing cells have a strong endothelial signature, which is present in patients (revised Fig. 5A). We investigated a potential link with C-Kit by staining cells from the replating colonies during the process of in vitro transformation with CD31. We observed that 40-50% of C-Kit+ cells (20-30% total colony cells) co-expressed CD31, at least at early plating. These cells co-exist with haematopoietic cells, namely Ter119+ cells, as expected from the YS-like erythroid and EMP-like affiliation of haematopoietic output from 144h-haemogenic gastruloids. These data are included in Fig. 6S1A-B (in text 506-507) of the revised manuscript.

      (8) There seems to be a relatively convincing increase in replating potential upon MNX1-OE, but this experiment has been poorly characterized. What type of colonies are generated? What exactly is the "proportion of colony forming cells" in Figures 5B-D? The colony increase is accompanied by an increase in Kit+ cells; however, the flow cytometry analysis has not been quantified.

      Given the inability to replate control EV cells, there is not a population to compare with in terms of quantification. The level of C-Kit+ represented in Fig. 6E of the revised manuscript is achieved at plate 2 or 3 (depending on the experiment), both of which are significantly enriched for colony-forming cells relative to control (revised Fig. 6B, D).  

      (9) Do hGx cells engraft upon MNX1-OE? This experiment, which appears not to have been performed, is essential to conclude that leukemic transformation has occurred.

      For the purpose of this study, we are satisfied with confirmation of in vitro transformation potential of MNX1 haemogenic gastruloids, which can be used for screening purposes. Although interesting, in vivo leukaemia engraftment from haemogenic gastruloids is beyond the scope of this study.

      Reviewer #2 (Recommendations for the authors):

      (1) Minor comments

      (a) I find the denomination "hGx" very confusing as it would suggest that these gastruloids are human, whereas, in fact, they are murine.

      We agree with the Reviewer on the confusing nomenclature and have edited the manuscript to call them “haemGx” instead.

      (b) I find the presence of mast cells in CFC of MNX1-OE cultures very puzzling as this does not bear any resemblance to human leukemia.

      We detect an enrichment of mast cell transcriptional programmes, as defined by the cell type repositories. While mast cells do not represent the leukaemic cells in patients, this ontology is likely to reflect the developmental stage and origin of the progenitors affected by MNX1.

      (2) I have a few suggestions to improve figures and tables clarity, to help readers better follow the data presented.

      (a) To enhance readability, it would be beneficial to highlight the genes mentioned in the text within the scRNA-seq figures. Many figures currently display over 30-40 genes in small font sizes, making it difficult to quickly locate specific genes discussed in the text. Additionally, implementing a color-coding system to categorize these genes according to their proposed lineages would improve clarity and organization.

      We have now performed major re-organisation and re-analyses of the scRNA-seq data, which we believe has improved the readability and clarity of the corresponding sections of the manuscript.

      (b) The data presented in Supplementary Table 1, along with other supplementary tables, are challenging to interpret due to insufficient annotations. Enhancing these tables with clearer and more detailed annotations would significantly improve clarity and aid readers in understanding the supplementary materials.

      Descriptive text has been added to accompany each Supplementary File to aid in understanding the results reported therein.

      Reviewer #3 (Recommendations for the authors):

      In addition to what was written in the public review, I would suggest the authors simplify and shorten the text. Currently, a lot of unnecessary detail is included which makes the story very hard to follow. Moreover, the authors should modify the figures to make them more comprehensible, especially for RNA-seq data.

      We have significantly re-arranged and shortened parts of the manuscript, particularly by focusing the Discussion. Results presentation has also been improved through additional analysis and graphic representation of the scRNA-seq data, which we believe has improved the readability and clarity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #2 (Public review)

      In this manuscript, Weiguang Kong et al. investigate the role of immunoglobulin M (IgM) in antiviral defense in the teleost largemouth bass (Micropterus salmoides). The study employs an IgM depletion model, viral infection experiments, and complementary in vitro assays to explore the role of IgM in systemic and mucosal immunity. The authors conclude that IgM is crucial for both systemic and mucosal antiviral defense, highlighting its role in viral neutralization through direct interactions with viral particles. The study's findings have theoretical implications for understanding immunoglobulin function across vertebrates and practical relevance for aquaculture immunology.

      Strengths:

      The manuscript applies multiple complementary approaches, including IgM depletion, viral infection models, and histological and gene expression analyses, to address an important immunological question. The study challenges established views that IgT is primarily responsible for mucosal immunity, presenting evidence for a dual role of IgM at both systemic and mucosal levels. If validated, the findings have evolutionary significance, suggesting the conserved role of IgM as an antiviral effector across jawed vertebrates for over 500 million years. The practical implications for vaccine strategies targeting mucosal immunity in fish are noteworthy, addressing a key challenge in aquaculture.

      Weaknesses:

      Several conceptual and technical issues undermine the strength of the evidence:

      Monoclonal Antibody (MoAb) Validation: The study relies heavily on a monoclonal antibody to deplete IgM, but its specificity and functionality are not adequately validated. The epitope recognized by the antibody is not identified, and there is no evidence excluding cross-reactivity with other isotypes. Mass spectrometry, immunoprecipitation, or Western blot analysis using tissue lysates with varying immunoglobulin expression levels would strengthen the claim of IgM-specific depletion.

      IgM Depletion Kinetics: The rapid depletion of IgM from serum and mucus (within one day) is unexpected and inconsistent with prior literature. Additional evidence, such as Western blot analyses comparing treated and control fish, is necessary to confirm this finding.

      Novelty of Claims: The manuscript claims a novel role for IgM in viral neutralization, despite extensive prior literature demonstrating this role in fish. This overstatement detracts from the contribution of the study and requires a more accurate contextualization of the findings.

      Support for IgM's Crucial Role: The mortality data following IgM depletion do not fully support the claim that IgM is indispensable for antiviral defense. The survival of IgM-depleted fish remains high (75%) compared to non-primed controls (~50%), suggesting that other immune components may compensate for IgM loss.

      Presentation of IgM Depletion Model: The study describes the IgM depletion model as novel, although similar models have been previously published (e.g., Ding et al., 2023). This should be clarified to avoid overstating its novelty.

      While the manuscript attempts to address an important question in teleost immunology, the current evidence is insufficient to fully support the authors' conclusions. Addressing the validation of the monoclonal antibody, re-evaluating depletion kinetics, and tempering claims of novelty would strengthen the study's impact. The findings, if rigorously validated, have important implications for understanding the evolution of vertebrate immunity and practical applications in fish health management.

      This work is of interest to immunologists, evolutionary biologists, and aquaculture researchers. The methodological framework, once validated, could be valuable for studying immunoglobulin function in other non-model organisms and for developing targeted vaccine strategies. However, the current weaknesses limit its broader applicability and impact.

      We would like to thank the Reviewer for the helpful comments. As the reviewer suggested, we verified the specificity of anti-bass IgM MoAb using multiple well-established experimental approaches, including mass spectrometry analysis, western blot, flow cytometry, and in vivo IgM depletion models. Additionally, we included western blot analyses to further confirm the IgM depletion kinetics. Moreover, we carefully revised any overstated claims in the original manuscript and incorporated the valuable suggestions of the reviewer in the Introduction and Discussion sections to enhance the clarity and rigor of our work.

      Reviewer #1 (Recommendations for the authors):

      (1) Experiments and Data Validation:

      Monoclonal Antibody Validation:

      Provide detailed validation of the monoclonal antibody (MoAb) used for IgM depletion. Perform immunoprecipitation followed by mass spectrometry to confirm the specificity of the MoAb and identify any off-target interactions. Conduct Western blot analysis using tissue lysates with varying IgM, IgT, and IgD expression to demonstrate specificity. Include controls, such as a group treated with a control antibody of the same isotype, to confirm the depletion specificity and effects. Present data on the binding site of the MoAb and confirm it targets IgM.

      We thank the reviewer for this constructive comment and have carried out a comprehensive validation of anti-bass IgM monoclonal antibody (MoAb).

      Validation of anti-bass IgM MoAb by Mass Spectrometry

      To validate the specificity of anti-bass IgM MoAb, target proteins were immunoprecipitated from bass serum using IgM MoAb-coupled CNBr-activated Sepharose 4B beads, followed by mass spectrometry analysis to verify exclusive IgM heavy-chain identification (Figure 3–figure supplement 1A). Quantitative mass spectrometry verified the antibody’s specificity, with IgM heavy-chain peptides representing 97.3% of total signal, indicating negligible off-target reactivity. This high target specificity was further supported by the absence of detectable cross-reactivity to IgT/IgD (Figure 3–figure supplement 1B). Moreover, the 72% sequence coverage (Figure 3–figure supplement 1C) and confirmed LC-MS/MS spectra of IgM peptides (Figure 3–figure supplement 1D) further validated target selectivity.

      Validation of anti-bass IgM MoAb by western blot and flow cytometry

      We compared the anti-bass IgM MoAb with an isotype control (mouse IgG1) in serum immunoblots under both non-reducing and reducing conditions. The western blot results showed that the developed MoAb bound specifically to IgM in largemouth bass serum. Owing to the structural diversity of fish IgM isoforms, denatured non-reducing electrophoresis typically yields multiple bands with varying molecular weights (Rombout et al., 1993; Ye et al., 2010). Immunoblot analysis revealed multiple bands with varying molecular weights under non-reducing conditions, with the main band ranging from 700 to 800 kDa, and a distinct ~70 kDa band under reducing conditions (Figure 3–figure supplement 2A). Notably, the isotype control showed no detectable bands under either non-reducing or reducing conditions (Figure 3–figure supplement 2A). Additionally, we analyzed tissue lysates from various sources (i.e., spleen, skin, gill, and gut) and observed consistently recognized bands at identical positions and sizes, whereas the isotype control showed no detectable bands (Figure 3–figure supplement 2B-F).

      Next, we performed flow cytometry analysis to confirm antibody specificity. In largemouth bass head kidney leukocytes, IgM<sup>+</sup> B cells accounted for 28.56% of the population, compared to only 0.41% for the isotype control (Figure 3–figure supplement 2G). Following flow sorting of negative and positive cell populations, we extracted RNA from equal cell numbers. Gene expression analysis revealed high expression of IgM and IgD in the positive population, while IgT and T cell markers were absent (Figure 3–figure supplement 2H and I). These results collectively demonstrate that the monoclonal antibody specifically targets largemouth bass IgM.

      Validation of the depletion specificity and effects using an isotype-matched control antibody

      Largemouth bass (~3 to 5 g) were intraperitoneally injected with 300 µg of mouse anti-bass IgM monoclonal antibody (MoAb, clone 66, IgG1) or an isotype control (mouse IgG1, Abclonal, China). The concentration of IgM in the serum and gut mucus from these MoAb-treated fish was measured by western blot. Our results indicated that anti-bass IgM treatment led to a marked reduction in IgM protein levels in serum (Author response image 1A) and gut mucus (Author response image 1B) from day 1 post-treatment, in contrast to control fish treated with an isotype-matched control antibody.

      Author response image 1.

      Validation of the depletion specificity and effects using an isotype-matched control antibody. (A, B) Depletion of IgM from the serum (A) or gut mucus (B) of control or IgM‐depleted fish was detected by western blot. Iso: Isotype group; Dep: IgM‐depleted group.

      We fully agree with the reviewer that epitope characterization would further validate and elucidate the specificity of IgM MoAb. In the present study, we have demonstrated the antibody's IgM-specific binding through multiple classic experimental methods: (1) mass spectrometry analysis, (2) western blot analysis, (3) flow cytometry analysis, and (4) in vivo IgM depletion models. These results collectively support the conclusion that our MoAb specifically targets IgM. We feel that conformational epitope mapping, which requires structural biology approaches, is beyond the scope of this work, although future studies should address it in detail.

      Kinetics of IgM Depletion:

      Provide additional evidence for the observed rapid depletion of IgM from serum and mucus within one day, as this is inconsistent with previous findings. Include Western blot results to confirm IgM depletion kinetics.

      Thanks for the reviewer’s suggestion. Previous studies have demonstrated significant differences in the depletion efficiency and persistence of IgM<sup>+</sup> B cells between warm-water and cold-water fish species. In Nile tilapia (Oreochromis niloticus), a warm-water species, administration of 20 µg of anti-IgM antibody resulted in a near-complete depletion of IgM<sup>+</sup> B cells within 9 days (Li et al., 2023). In contrast, rainbow trout (Oncorhynchus mykiss), a cold-water species, required significantly higher doses (200–300 µg) to achieve similar depletion, which persisted in both blood and gut from week 1 up until week 9 post-depletion treatment (Ding et al., 2023). In this study, we investigated largemouth bass (Micropterus salmoides), a warm-water freshwater species. Administration of 300 μg of IgM antibody resulted in rapid IgM+ B cell depletion from serum and mucus within one day, indicating that the rapid depletion kinetics may be attributed to the combined effects of the elevated antibody dose and the species-specific immunological characteristics. Moreover, we provide a western blot analysis of serum and mucus after IgM depletion as shown in Figure 5–figure supplement 1G and H.

      Neutralizing Capacity Assays:

      Discuss the potential role of complement or other serum/mucus factors in the neutralization assays. Consider performing neutralization assays that isolate viruses, antibody, and target cells to assess the specific role of IgM.

      We thank the reviewer for the insightful suggestion regarding the potential influence of complement and other serum/mucus factors in our neutralization assays. We regret that the lack of clarity in our methodological description caused a misunderstanding. In fact, prior to performing the virus neutralization assays, serum and mucus samples were heat-inactivated at 56 °C to eliminate potential complement interference. We have now added a description of the heat-inactivation of serum and mucus samples to the revised manuscript (Lines 727-729). Moreover, our results showed that selective IgM depletion from high LMBV-specific IgM titer mucus and serum samples resulted in significantly increased viral loads and enhanced cytopathic effects (CPE), while no significant difference was observed compared to the control group (shown in Figure 6 of the manuscript).

      To further rule out complement or other factors, we purified IgM from serum and gut mucus of 42DPI-S fish for neutralization assays. Briefly, anti-bass IgM MoAb was coupled to CNBr-activated Sepharose 4B beads and used for purification of IgM from both serum and gut mucus of 42DPI-S fish. After that, 100 µL of LMBV (1 × 10<sup>4</sup> TCID<sub>50</sub>) in MEM was incubated with PBS and purified IgM (100 µg/mL) at 28 °C for 1 hour, and the mixtures were then applied to infect EPC cells. Medium or bass IgM was added to EPC cells as controls. We added the new text to the Materials and methods of the revised manuscript (Lines 735-741). Our results showed a significant reduction in both LMBV-MCP gene expression and protein levels in EPC cells treated with purified IgM from serum (Figure 6–figure supplement 2A, C, and D) or gut mucus (Figure 6–figure supplement 2B, E, and F). Moreover, significantly lower CPE was observed in the IgM-treated group, while no CPE was observed in the medium and bass IgM groups (Figure 6–figure supplement 2G). Collectively, these findings strongly suggest that neutralization is a potential mechanism of IgM, which serves as a key molecule in adaptive immunity against viral infection. We have incorporated these new findings in the Results section of the revised manuscript (Lines 382-388).

      IgT Depletion Model:

      To fully establish the role of IgM and IgT in antiviral defense, consider including an experimental group where IgT is depleted.

      Thanks for the reviewer’s suggestion. The role of IgT in mucosal antiviral immunity in teleost fish has been reported in our previous studies (Yu et al., 2022). However, this study primarily investigates the antiviral function of IgM in systemic and mucosal immunity and further analyzes the mechanisms of viral neutralization. In future research, we plan to establish an IgT and IgM double-depletion/knockout model to further elucidate their specific roles in antiviral immune defense.

      (2) Writing and Presentation:

      Introduction:

      Replace the cited review article on IgT absence with original research articles (e.g., Bradshaw et al., 2020; Györkei et al., 2024) to strengthen the context.

      Thank you for your valuable suggestion. We have changed the text in the revised manuscript (Lines 45-50) to “Notably, while IgT has been identified in the majority of teleost species, genomic analyses reveal its absence in some species, such as medaka (Oryzias latipes), channel catfish (Ictalurus punctatus), Atlantic cod (Gadus morhua), and turquoise killifish (Nothobranchius furzeri) (Bengtén et al., 2002; Bradshaw et al., 2020; Magadán-Mompó et al., 2011; Györkei et al., 2024).”

      Highlight the evolutionary contrast between the presence of the J chain in older cartilaginous fishes and amphibians and its loss in teleosts. Relevant references include Hagiwara et al., 1985, and Hohman et al., 2003.

      Thank you for your valuable suggestion. We have added the relevant description in the revised manuscript (Lines 61-66): “Interestingly, the assembly mechanism of IgM exhibits significant evolutionary variation across vertebrate lineages. In cartilaginous fishes and tetrapods, IgM is secreted as a J chain-linked pentamer, which may enhance multivalent antigen recognition (Hagiwara et al., 1985; Hohman et al., 2003). By contrast, teleosts have undergone J chain gene loss, resulting in the stable formation of tetrameric IgM (Bromage et al., 2004).”

      Acknowledge prior studies demonstrating the viral neutralization role of teleost IgM (e.g., Castro et al., 2021; Chinchilla et al., 2013). Avoid overstating the novelty of findings.

      Thanks for the reviewer’s suggestion. Here, we revised the related description: “More crucially, our study provides further insight into the role of sIgM in viral neutralization and firstly clarified the mechanism through which teleost sIgM blocks viral infection by directly targeting viral particles. From an evolutionary perspective, our findings indicate that sIgM in both primitive and modern vertebrates follows conserved principles in the development of specialized antiviral immunity.” in the revised manuscript (Lines 20-25) and “To the best of our knowledge, our study provides new insights into the role of sIgM in viral neutralization, suggesting a potential function of sIgM in combating viral infections.” in the revised manuscript (Lines 536-538).

      Clarify terms such as "primitive IgM" and avoid misleading evolutionary language (e.g., VLRs are not "candidates"; they mediate adaptive responses).

      Thanks for the reviewer’s suggestion. We revised the description of primitive IgM to read “From an evolutionary perspective, our findings indicate that sIgM in both primitive and modern vertebrates follows conserved principles in the development of specialized antiviral immunity.” in the revised manuscript (Lines 23-25) and “our findings suggest that sIgM in both primitive and modern vertebrates utilize conserved mechanisms in response to viral infections” in the revised manuscript (Lines 574-575). Moreover, we deleted the description of VLRs as "candidates" and rewrote the relevant sentence in the revised manuscript (Lines 37-39) as “Agnathans, the most ancient vertebrate lineage, do not possess bona fide Ig but have variable lymphocyte receptors (VLRs) capable of mediating adaptive immune responses (Flajnik, 2018).”

      Results and Discussion:

      Address inconsistencies between data and claims, such as the statement that IgM plays a "crucial role" in protection against LMBV, which is not fully supported by mortality data.

      Thank you for your insightful comment. We have carefully reviewed our data and revised the language throughout the manuscript to ensure that our claims are fully consistent with the mortality data. We have changed the description “IgM plays a crucial role in protection against LMBV” to “plays a role” (Line 119), “sIgM participates in” (Line 127), and “contributes to immune protection” (Line 507) to more accurately reflect the mortality data.

      Revise the model in Figure 8 to reflect the concerns raised regarding proliferation data, the role of IgM in protective resistance, and the potential contributions of complement in neutralization assays.

      Thank you for your insightful comment. We have added the raised concerns regarding “the viral proliferation data and the role of IgM in protective resistance” in Figure 8 (shown below). Meanwhile, we added relevant descriptions in the figure legends of the revised manuscript (Lines 587-592) as “Upon secondary LMBV infection, plasma cells produce substantial quantities of LMBV-specific IgM. Critically, these virus-specific sIgM from both mucosal and systemic sources has the ability to neutralize the virus by directly binding viral particles and blocking host cell entry, thereby effectively reducing the proliferation of viruses within tissues. Consequently, the IgM-mediated neutralization confers protection against LMBV-induced tissue damage and significantly reduced mortality during secondary infection.”

      However, we did not add the potential function of complement in neutralization to Figure 8, for two reasons: (1) heat-inactivation of serum and mucus samples at 56°C prior to neutralization assays effectively abolished complement activity, and (2) purified IgM from both serum and gut mucus demonstrated comparable neutralization capacity, confirming IgM-dependent mechanisms independent of complement.

      Provide a comparative analysis with other vertebrate models to strengthen the evolutionary implications of findings.

      Thank you for your insightful comment. We have added comparative analyses across additional vertebrate models in the discussion of the revised manuscript to enhance the evolutionary perspective of our findings. The details are as follows:

      “Virus-specific IgM production has been well-documented in reptiles, birds, and mammals upon viral infection (Dascalu et al., 2024; Harrington et al., 2021; Hetzel et al., 2021; Neul et al., 2017). While current evidence confirms the capacity of cartilaginous fish and amphibians to mount specific IgM responses against bacterial pathogens and immune antigens (Dooley and Flajnik, 2005; Ramsey et al., 2010), the potential for viral induction of analogous IgM-mediated immunity in these species remains unresolved.” in the revised manuscript (Lines 498-504) and “Extensive studies in endotherms (birds and mammals) have demonstrated that specific IgM contributes to viral resistance by neutralizing viruses (Baumgarth et al., 2000; Diamond et al., 2013; Ku et al., 2021; Hagan et al., 2016; Singh et al., 2022). In contrast, the neutralizing activity of IgM in amphibians and reptiles remains largely unexplored. Although viral infections have been shown to induce neutralizing antibodies in Chinese soft-shelled turtles (Pelodiscus sinensis) (Nie and Lu, 1999), the specific Ig isotypes mediating this response have yet to be elucidated. In teleost fish, IgM has been shown to possess viral neutralizing activity similar to that observed in endotherms (Castro et al., 2013; Ye et al., 2013). Furthermore, our recent work demonstrated that secretory IgT (sIgT) in rainbow trout (Oncorhynchus mykiss) can neutralize viruses, significantly reducing susceptibility to infection (Yu et al., 2022). However, whether IgM in teleost fish possesses the antiviral neutralizing capacity necessary for fish to resist reinfection remains poorly understood.” in the revised manuscript (Lines 521-534).

      Include a description of the Western blot procedure shown in Figures 7D and 7F in the Methods section.

      Thank you for your suggestion. A detailed protocol for the western blot experiments presented in Figures 7D and 7F has been added to the Methods section (Western Blot Analysis) in the revised manuscript (Lines 684-687). The details are as follows: Gut mucus, serum, and cell samples were analyzed by western blot as described by Yu et al. (2022). Briefly, the samples were separated using 4%–15% SDS-PAGE Ready Gel (Thermo Fisher Scientific, USA) and subsequently transferred to Sequi-Blot polyvinylidene fluoride (PVDF) membranes (Bio-Rad, USA). The membranes were blocked using 8% skim milk for 2 hours and then incubated with monoclonal antibody (MoAb). For IgM concentration detection, the membranes were incubated with mouse anti-bass IgM MoAb (clone 66, IgG1, 1 μg/mL) and then with HRP goat-anti-mouse IgG (Invitrogen, USA) for 1 hour. IgM concentrations were determined by comparing the signal strength values to a standard curve generated with known amounts of purified bass IgM. For neutralizing effect detection, the membranes were incubated with mouse anti-LMBV MCP MoAb (4A91E7, 1 μg/mL) followed by incubation with HRP goat-anti-mouse IgG (Invitrogen, USA) for 1 hour. β-actin was used as a reference protein to normalize differences between samples. Immunoblots were scanned using the GE Amersham Imager 600 (GE Healthcare, USA) with ECL solution (EpiZyme, China).

      Ensure all figures are labeled appropriately (e.g., replace "Morality" with "Mortality" in Figure 5A).

      Thanks for bringing this to our attention. We have corrected the label in Figure 5A (shown below) and reviewed all figures to ensure that they are appropriately labeled.

      (3) Minor Corrections:

      Line 117: Correct the typo "across both both."

      Thanks for bringing this to our attention. We have changed “across both both” to “across both” in the revised manuscript (Line 119).

      Line 203: Revise to "IgM plays a role (not crucial role)."

      Thank you for your valuable suggestion. We have modified the description of IgM's role from “crucial” to “plays a role” to better align with our experimental findings in the revised manuscript (Line 202).

      Line 684: Correct the typo "given an intravenous injection with 200 μg."

      Thanks for bringing this to our attention. We have corrected the phrase to “given an intravenous injection with 200 μg” in the revised manuscript (Line 700-701).

      Line 686: Fix the sentence fragment "previously. EdU+ cells."

      Thank you for your careful review. We have revised the sentence fragment for clarity in the revised manuscript (Lines 702-703).

      Abstract and other sections: Adjust language to remove claims of novelty unsupported by data, particularly regarding the role of IgM in viral neutralization.

      Thank you for your constructive feedback. We have thoroughly reviewed and revised the language throughout the abstract and other sections to remove any unsupported claims of novelty, particularly regarding the role of IgM in viral neutralization in the revised manuscript (Lines 20-25).

      (4) Technical Details:

      Verify data availability, including raw data and analysis scripts, in line with eLife's data policies. Include detailed descriptions of all methods, particularly those involving Western blot analysis and antibody validation.

      Thank you for your suggestion. We added a data availability statement covering the raw data and analysis scripts: “The raw RNA sequencing data have been deposited in the NCBI Sequence Read Archive under BioProject accession number PRJNA1254665. The mass spectrometry proteomics data have been deposited to the iProX platform with the dataset identifier IPX0011847000.” in the revised manuscript (Lines 808-811).

      (5) Ethical and Policy Adherence:

      Confirm compliance with ethical standards for animal use and antibody development. Ensure proper citation of all referenced works and accurate reporting of prior findings.

      Thank you for your valuable comment. We confirm that our study fully complies with ethical standards for animal use and antibody development. Additionally, we have carefully reviewed the manuscript to ensure that all referenced works are properly cited and that prior findings are accurately reported.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Overall, the conclusions of the paper are mostly supported by the data but may be overstated in some cases, and some details are also missing or not easily recognizable within the figures. The provision of additional information and analyses would be valuable to the reader and may even benefit the authors' interpretation of the data. 

      We thank the reviewer for the thoughtful and constructive feedback. We are pleased that the reviewer found the overall conclusions of our paper to be well supported by the data, and we appreciate the suggestions for improving figure clarity and interpretive accuracy. Below, we address each point with corresponding revisions.

      The conclusion that DREADD expression gradually decreases after 1.5-2 years is only based on a select few of the subjects assessed; in Figure 2, it appears that only 3 hM4Di cases and 2 hM3Dq cases are assessed after the 2-year timepoint. The observed decline appears consistent within the hM4Di cases, but not for the hM3Dq cases (see Figure 2C: the AAV2.1-hSyn-hM3Dq-IRES-AcGFP line is increasing after 2 years.) 

      We agree that our interpretation should be stated more cautiously, given the limited number of cases assessed beyond the two-year timepoint. In the revised manuscript, we have clarified in the Results that the observed decline is based on a subset of animals. We have also included a text stating that while a consistent decline was observed in hM4Di-expressing monkeys, the trajectory for hM3Dq expression was more variable with at least one case showing an increased signal beyond two years.

      Revised Results section:

      Line 140: “hM4Di expression levels remained stable at peak levels for approximately 1.5 years, followed by a gradual decline observed in one case after 2.5 years, and after approximately 3 years in the other two cases (Figure 2B, a and e/d, respectively). Compared with hM4Di expression, hM3Dq expression exhibited greater post-peak fluctuations. Nevertheless, it remained at ~70% of peak levels after about 1 year. This post-peak fluctuation was not significantly associated with the cumulative number of DREADD agonist injections (repeated-measures two-way ANOVA, main effect of activation times, F<sub>(1,6)</sub> = 5.745, P = 0.054). Beyond 2 years post-injection, expression declined to ~50% in one case, whereas another case showed an apparent increase (Figure 2C, c and m, respectively).”

      Given that individual differences may affect expression levels, it would be helpful to see additional labels on the graphs (or in the legends) indicating which subject and which region are being represented for each line and/or data point in Figure 1C, 2B, 2C, 5A, and 5B. Alternatively, for Figures 5A and B, an accompanying table listing this information would be sufficient. 

      We thank the reviewer for these helpful suggestions. In response, we have revised the relevant figures (Fig. 1C, 2B, 2C, and 5) as noted in the “Recommendations for the authors”, including simplifying visual encodings and improving labeling. We have also updated Table 2 to explicitly indicate the animal ID and brain regions associated with each data point shown in the figures.

      While the authors comment on several factors that may influence peak expression levels, including serotype, promoter, titer, tag, and DREADD type, they do not comment on the volume of injection. The range in volume used per region in this study is between 2 and 54 microliters, with larger volumes typically (but not always) being used for cortical regions like the OFC and dlPFC, and smaller volumes for subcortical regions like the amygdala and putamen. This may weaken the claim that there is no significant relationship between peak expression level and brain region, as volume may be considered a confounding variable. Additionally, because of the possibility that larger volumes of viral vectors may be more likely to induce an immune response, which the authors suggest as a potential influence on transgene expression, not including volume as a factor of interest seems to be an oversight. 

      We thank the reviewer for raising this important issue. We agree that injection volume could act as a confounding variable, particularly since larger volumes were used only in handheld cortical injections. This overlap makes it difficult to disentangle the effect of volume from those of brain region or injection method. Moreover, data points associated with these larger volumes also deviated when volume was included in the model.

      To address this, we performed a separate analysis restricted to injections delivered via microinjector, where a comparable volume range was used across cases. In this subset, we included injection volume as an additional factor in the model and found that volume did not significantly impact peak expression levels. Instead, the presence of co-expressed protein tags remained a significant predictor, while viral titer no longer showed a significant effect. These updated results have replaced the originals in the revised Results section and in the new Figure 5. We have also revised the Discussion to reflect these updated findings.

      The authors conclude that vectors encoding co-expressed protein tags (such as HA) led to reduced peak expression levels, relative to vectors with an IRES-GFP sequence or with no such element at all. While interesting, this finding does not necessarily seem relevant for the efficacy of long-term expression and function, given that the authors show in Figures 1 and 2 that peak expression (as indicated by a change in binding potential relative to non-displaced radioligand, or ΔBPND) appears to taper off in all or most of the constructs assessed. The authors should take care to point out that the decline in peak expression should not be confused with the decline in longitudinal expression, as this is not clear in the discussion; i.e. the subheading, "Factors influencing DREADD expression," might be better written as, "Factors influencing peak DREADD expression," and subsequent wording in this section should specify that these particular data concern peak expression only. 

      We appreciate this important clarification. In response, we have revised the title to "Protein tags reduce peak DREADD expression levels" in the Results section and “Factors influencing peak DREADD expression levels” in the Discussion section. Additionally, we specified that our analysis focused on peak ΔBP<sub>ND</sub> values around 60 days post-injection. We have also explicitly distinguished these findings from the later-stage changes in expression seen in the longitudinal PET data in both the Results and Discussion sections.

      Reviewer #1 (Recommendations for the authors):

      (1) Will any of these datasets be made available to other researchers upon request?

      All data used to generate the figures have been made publicly available via our GitHub repository (https://github.com/minamimoto-lab/2024-Nagai-LongitudinalPET.git). This has been stated in the "Data availability" section in the revised manuscript.

      (2) Suggested modifications to figures:

      a) In Figures 2B and C, the inclusion of "serotype" as a separate legend with individual shapes seems superfluous, as the serotype is also listed as part of the colour-coded vector

      We agree that the serotype legend was redundant since this information is already included in the color-coded vector labels. In response, we have removed the serotype shape indicators and now represent the data using only vector-construct-based color coding for clarity in Figure 2B and C.

      b) In Figures 3A and B, it would be nice to see tics (representing agonist administration) for all subjects, not just the two that are exemplified in panels C-D and F-H. Perhaps grey tics for the non-exemplified subjects could be used.

      In response, we have included black and white ticks to indicate every agonist administration across all subjects in Figure 3A and B, with the type of agonist clearly specified.

      c) In Figure 4C, a Nissl- stained section is said to demonstrate the absence of neuronal loss at the vector injection sites. However, if the neuronal loss is subtle or widespread, this might not be easily visualized by Nissl. I would suggest including an additional image from the same section, in a non-injected cortical area, to show there is no significant difference between the injected and non-injected region.

      To better demonstrate the absence of neuronal loss at the injection site, we have included an image from the contralateral, non-injected region of the same section for comparison (Fig. 4C).

      d) In Figure 5A: is it possible that the hM3Dq construct with a titer of 5×10^13 gc/ml is an outlier, relative to the other hM3Dq constructs used?

      We thank the reviewer for raising this important observation. To evaluate whether the high-titer construct represented a statistical outlier that might artifactually influence the observed trends, we performed a permutation-based outlier analysis. This assessment identified the point in question, as well as one additional case (titer 4.6×10^13 gc/ml, #255, L_Put), as significant outliers relative to the distribution of the dataset.

      Accordingly, we excluded these two data points from the analysis. Importantly, this exclusion did not meaningfully alter the overall trend or the statistical conclusions; specifically, the significant effect of co-expressed protein tags on peak expression levels remains robust. We have updated the Methods section to describe this outlier handling and added a corresponding note in the figure legend.

      Reviewer #2 (Public review): 

      Weaknesses 

      This study is a meta-analysis of several experiments performed in one lab. The good side is that it combined a large amount of data that might not have been published individually; the downside is that all things were not planned and equated, creating a lot of unexplained variances in the data. This was yet judiciously used by the authors, but one might think that planned and organized multicentric experiments would provide more information and help test more parameters, including some related to inter-individual variability, and particular genetic constructs. 

      We thank the reviewer for bringing this important point to our attention. We fully acknowledge that the retrospective nature of our dataset, compiled from multiple studies conducted within a single laboratory, introduces variability related to differences in injection parameters and scanning timelines. While this reflects the practical realities and constraints of long-term NHP research, we agree that more standardized and prospectively designed studies would better control such sources of variance. To address this, we have added the following statement to the "Technical consideration" section in Discussion:

      Lines 297, "This study included a retrospective analysis of datasets pooled from multiple studies conducted within a single laboratory, which inherently introduced variability across injection parameters and scan intervals. While such an approach reflects real-world practices in long-term NHP research, future studies, including multicenter efforts using harmonized protocols, will be valuable for systematically assessing inter-individual differences and optimizing key experimental parameters."

      Reviewer #2 (Recommendations for the authors):

      I just have a few minor points that might help improve the paper:

      (1) Figure 1C y-axis label: should add deltaBPnd in parentheses for clarity.

      We have added “ΔBP<sub>ND</sub>” to the y-axis label for clarity.

      The choice of a sigmoid curve is the simplest clear fit, but it doesn't really consider the presence of the peak described in the paper. Would there be a way to fit the dynamic including fitting the peak?

      We agree that using a simple sigmoid curve for modeling expression dynamics is a limitation. In response to this and a similar comment from Reviewer #3, we tested a double logistic function (as suggested) to see if it better represented the rise and decline pattern. However, as described below, the original simple sigmoid curve was a better fit for the data. We have included a discussion of this limitation of the analysis. See Reviewer #3 recommendations (2) for details.

      The colour scheme in Figure 1C should be changed to make things clearer, and maybe use another dimension (like dotted lines) to separate hM4Di from hM3Dq.

      We have improved the visual clarity of Figure 1C by modifying the color scheme to represent vector construct and using distinct line types (dashed for hM4Di and solid for hM3Dq data) to separate DREADD type.

      (2) Figure 2

      I don't understand how the referencing to 100 was made: was it by selecting the overall peak value or the peak value observed between 40 and 80 days? If the former then I can't see how some values are higher than the peak. If the second then it means some peak values occurred after 80 days and data are not completely re-aligned.

      We thank the reviewer for the opportunity to clarify this point. The normalization was based on the peak value observed between 40–80 days post-injection, as this window typically captured the peak expression phase in our dataset (see Figure 1). However, in some long-term cases where PET scans were limited during this period (e.g., with only one scan performed at day 40), it is possible that the actual peak occurred later. Therefore, instances where ΔBP<sub>ND</sub> values slightly exceeded the reference peak at later time points likely reflect this sampling limitation. We have clarified this methodological detail in the revised Results section to improve transparency.
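
      The normalization described above can be illustrated with a minimal sketch. The function name, the scan days, and the ΔBP<sub>ND</sub> values below are hypothetical and chosen only for illustration; they are not taken from the study. The sketch assumes each value is expressed as a percentage of the highest value observed inside the 40–80 day window, which shows how a later scan can exceed 100 when the window contains only an early scan.

```python
def normalize_to_window_peak(days, values, window=(40, 80)):
    """Express each value as a percentage of the peak observed inside `window` (days)."""
    lo, hi = window
    # Keep only the measurements taken inside the reference window.
    in_window = [v for d, v in zip(days, values) if lo <= d <= hi]
    if not in_window:
        raise ValueError("no scans fall inside the reference window")
    peak = max(in_window)
    if peak <= 0:
        raise ValueError("reference peak must be positive")
    return [100.0 * v / peak for v in values]


# Hypothetical scan days and delta-BP_ND values (illustrative only).
days = [20, 40, 100, 200]
dbp = [0.2, 0.5, 0.6, 0.4]

# Only the day-40 scan lies in the window, so it becomes the 100% reference;
# the day-100 scan then lands above 100%.
normalized = normalize_to_window_peak(days, dbp)
```

      In this toy case the day-100 value is larger than the in-window reference, which mirrors the sampling limitation described in the response.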

      The methods section mentions the use of CNO but this is not in the main paper which seems to state that only DCZ was used: the authors should clarify this

      Although DCZ was the primary agonist used, CNO and C21 were also used in a few animals (e.g., monkeys #153, #221, and #207) for behavioral assessments. We have clarified this in the Results section and revised Figure 3 to indicate the specific agonist used for each subject. Additionally, we have updated the Methods section to clearly specify the use and dosage of DCZ, CNO, and C21, to avoid any confusion regarding the experimental design.

      Reviewer #3 (Public review): 

      Minor weaknesses are related to a few instances of suboptimal phrasing, and some room for improvement in time course visualization and quantification. These would be easily addressed in a revision. These findings will undoubtedly have a very significant impact on the rapidly growing but still highly challenging field of primate chemogenetic manipulations. As such, the work represents an invaluable resource for the community.

      We thank the reviewer for the positive assessment of our manuscript and for the constructive suggestions. We address each comment in the following point-by-point responses and have revised the manuscript accordingly.

      Reviewer #3 (Recommendations for the authors):

      (1) Please clarify the reasoning behind restricting the analysis in Figure 1 to only 7 monkeys with subcortical AAV injection.

      We focused the analysis shown in Figure 1 on 7 monkeys with subcortical AAV injections that received comparable injection volumes. These data were primarily part of vector test studies, allowing for repeated PET scans within 150 days post-injection. In contrast, monkeys with cortical injections, including those with larger volumes, were allocated to behavioral studies and therefore were not scanned as frequently during the early phase. We will clarify this rationale in the Results section.

      (2) Figure 1: Not sure if a simple sigmoid is the best model for these, mostly peaking and then descending somewhat, curves. I suggest testing a more complex model, for instance, double logistic function of a type f(t) = a + b/(1+exp(-c*(t-d))) - e/(1+exp(-g*(t-h))), with the first logistic term modeling the rise to peak, and the second term for partial decline and stabilization

      We appreciate the reviewer’s thoughtful suggestion to use a double logistic function to better model both the rising and declining phases of the expression curve. In response to this and similar comments from Reviewer #1, we tested the proposed model and found that, while it could capture the peak and subsequent decline, the resulting fit appeared less biologically plausible (see below). Moreover, model comparison using BIC favored the original simple sigmoid model (BIC = 61.1 vs. 62.9 for the simple and double logistic models, respectively). This information has been included in the revised figure legend for clarity.

      Given these results, we retained the original simple sigmoid function in the revised manuscript, as it provides a sufficient and interpretable approximation of the early expression trajectory—particularly the peak expression-time estimation, which was the main purpose of this analysis. We have updated the Methods section to clarify our modeling and rationale as follows:

      Lines 530, "To model the time course of DREADD expression, we used a single sigmoid function, referencing past in vivo fluorescent measurements (Diester et al., 2011). Curve fitting was performed using least squares minimization. For comparison, a double logistic function was also tested and evaluated using the Bayesian Information Criterion (BIC) to assess model fit."
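
      As an illustration of the model comparison described here, the following is a minimal sketch rather than the authors' analysis code. It assumes a four-parameter single sigmoid, f(t) = a + b/(1+exp(-c*(t-d))), the seven-parameter double logistic proposed by the reviewer, and the common least-squares form of BIC, n*ln(RSS/n) + k*ln(n). The helper names, the data points, and the hand-picked parameter values are all hypothetical; no fitting is performed.

```python
import math


def sigmoid(t, a, b, c, d):
    """Single sigmoid: baseline a, amplitude b, slope c, midpoint d (assumed form)."""
    return a + b / (1.0 + math.exp(-c * (t - d)))


def double_logistic(t, a, b, c, d, e, g, h):
    """Double logistic from the review: a rise term minus a partial-decline term."""
    rise = b / (1.0 + math.exp(-c * (t - d)))
    decline = e / (1.0 + math.exp(-g * (t - h)))
    return a + rise - decline


def rss(model, params, ts, ys):
    """Residual sum of squares of `model` evaluated with `params`."""
    return sum((y - model(t, *params)) ** 2 for t, y in zip(ts, ys))


def bic_least_squares(rss_value, n_points, n_params):
    """BIC for Gaussian errors, up to an additive constant shared by both models."""
    return n_points * math.log(rss_value / n_points) + n_params * math.log(n_points)


# Hypothetical time points (days) and expression values, for illustration only.
ts = [10, 30, 50, 70, 100, 150]
ys = [0.05, 0.30, 0.62, 0.60, 0.55, 0.50]

# Hand-picked (not fitted) parameters; a real analysis would estimate these
# by least squares minimization.
single_params = (0.0, 0.58, 0.12, 30.0)
double_params = (0.0, 0.65, 0.12, 30.0, 0.12, 0.08, 90.0)

bic_single = bic_least_squares(rss(sigmoid, single_params, ts, ys), len(ts), 4)
bic_double = bic_least_squares(rss(double_logistic, double_params, ts, ys), len(ts), 7)
```

      Lower BIC is preferred. Because the double logistic carries three extra parameters, it must reduce the residual error enough to offset the larger penalty term, which is one way a simpler sigmoid can be favored on a small number of scans.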

      We also acknowledge that a more detailed understanding of post-peak expression changes will require additional PET measurements, particularly between 60- and 120-days post-injection, across a larger number of animals. We have included this point in the revised Discussion to highlight the need for future work focused on finer-grained modeling of expression decline:

      Lines 317, “Although we modeled the time course of DREADD expression using a single sigmoid function, PET data from several monkeys showed a modest decline following the peak. While the sigmoid model captured the early-phase dynamics and offered a reliable estimate of peak timing, additional PET scans—particularly between 60- and 120-days post-injection—will be essential to fully characterize the biological basis of the post-peak expression trajectories.”

      Author response image 1.

      (3) Figure 2: It seems that the individual curves are for different monkeys. I counted 7 in B and 8 in C, so why "across 11 monkeys"? Were there several monkeys with both hM4Di and hM3Dq? It does not look like that from Table 1. Generally, I would suggest associating specific animals from Tables 1 and 2 with the panels in Figures 1 and 2.

      Some animals received multiple vector types, leading to more curves than individual subjects. We have revised the figure legends and updated Table 2 to explicitly relate each curve with the specific animal and brain region.

      (4) I also propose plotting the average of (interpolated) curves across animals, to convey the main message of the figure more effectively.

      We agree that plotting the mean of the interpolated expression curves would help convey the group trend. We added averaged curves to Figure 2BC.

      (5) Similarly, in line 155 "We assessed data from 17 monkeys to evaluate ... Monkeys expressing hM4Di were assessed through behavioral testing (N = 11) and alterations in neuronal activity using electrophysiology (N = 2)..." - please explain how 17 is derived from 11, 2, 5 and 1. It is possible to glean from Table 1 that the calculation is 11 (including 2 with ephys) + 5 + 1 = 17, but it might appear as a mistake if one does not go deep into Table 1.

      We have clarified in both the text and Table 1 that some monkeys (e.g., #201 and #207) underwent both behavioral and electrophysiological assessments, resulting in the overlapping counts. Specifically, the dataset includes 11 monkeys for hM4Di-related behavior testing (two of which underwent electrophysiology testing), 5 monkeys assessed for hM3Dq with FDG-PET, and 1 monkey assessed for hM3Dq with electrophysiology, totaling 19 assessments across 17 monkeys. We have revised the Results section to make this distinction more explicit to avoid confusion, as follows:

      Lines 164, "Monkeys expressing hM4Di (N = 11) were assessed through behavioral testing, two of which also underwent electrophysiological assessment. Monkeys expressing hM3Dq (N = 6) were assessed for changes in glucose metabolism via [<sup>18</sup>F]FDG-PET (N = 5) or alterations in neuronal activity using electrophysiology (N = 1)."

      (6) Line 473: "These stock solutions were then diluted in saline to a final volume of 0.1 ml (2.5% DMSO in saline), achieving a dose of 0.1 ml/kg and 3 mg/kg for DCZ and CNO, respectively." Please clarify: the injection volume was always 0.1 ml? then it is not clear how the dose can be 0.1 ml/kg (for a several kg monkey), and why DCZ and CNO doses are described in ml/kg vs mg/kg?

      We thank the reviewer for pointing out this ambiguity. We apologize for the oversight and also acknowledge that we omitted mention of C21, which was used in a small number of cases. To address this, we have revised the “Administration of DREADD agonist” section of the Methods to clearly describe the preparation, the volume, and dosage for each agonist (DCZ, CNO, and C21) as follows:

      Lines 493, “Deschloroclozapine (DCZ; HY-42110, MedChemExpress) was the primary agonist used. DCZ was first dissolved in dimethyl sulfoxide (DMSO; FUJIFILM Wako Pure Chemical Corp.) and then diluted in saline to a final volume of 1 mL, with the final DMSO concentration adjusted to 2.5% or less. DCZ was administered intramuscularly at a dose of 0.1 mg/kg for hM4Di activation, and at 1–3 µg/kg for hM3Dq activation. For behavioral testing, DCZ was injected approximately 15 min before the start of the experiment unless otherwise noted. Fresh DCZ solutions were prepared daily.

      In a limited number of cases, clozapine-N-oxide (CNO; Toronto Research Chemicals) or Compound 21 (C21; Tocris) was used as an alternative DREADD agonist for some hM4Di experiments. Both compounds were dissolved in DMSO and then diluted in saline to a final volume of 2–3 mL, also maintaining DMSO concentrations below 2.5%. CNO and C21 were administered intravenously at doses of 3 mg/kg and 0.3 mg/kg, respectively.”

      (7) Figure 5A: What do regression lines represent? Do they show a simple linear regression (then please report statistics such as R-squared and p-values), or is it related to the linear model described in Table 3 (but then I am not sure how separate DREADDs can be plotted if they are one of the factors)?

      We thank the reviewer for the insightful question. In the original version of Figure 5A, the regression lines represented simple linear fits used to illustrate the relationship between viral titer and peak expression levels, based on our initial analysis in which titer appeared to have a significant effect without any notable interaction with other factors (such as DREADD type).

      However, after conducting a more detailed analysis that incorporated injection volume as an additional factor and excluded cortical injections and statistical outliers (as suggested by Reviewer #1), viral titer was no longer found to significantly predict peak expression levels. Consequently, we revised the figure to focus on the effect of reporter tag, which remained the most consistent and robust predictor in our model.

      In the updated Figure 5, we have removed the relationship between viral titer and expression level with regression lines.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations for the authors):

      Because many conclusions are drawn from overexpression studies and from a single cell line (HEK293), it is unclear how general these effects are. In particular, one of the main claims put forth in this manuscript is that of specificity, namely, that FZD5/8, and none of the other FZDs, are uniquely involved in this internalization and degradation. While there are examples of similar specificities, many of these examples can be attributed to a particular cellular context. Without demonstrating that this FZD5/8 specificity is observed in multiple cell lines and contexts, this point remains unconvincing and questionable. One way to address this point of criticism is to omit the word "specifically" in the title and soften the language concerning this idea throughout the manuscript.

      We appreciate your valuable comments and suggestions. We have removed the word “specifically” from the title and softened the language concerning this idea throughout the manuscript. Moreover, we performed new experiments to show that Wnt3a/5a induces FZD5/8 endocytosis and degradation and that IWP-2 treatment increases the cell surface levels of FZD5/8 in cell lines other than 293A (Figure 1-Figure supplement 1 and Figure 2-Figure supplement 1). These results indicate that Wnt-induced FZD5/8 endocytosis and degradation are not cell specific.

      The starting point for these studies is a survey of all 10 FZDs, V5-tagged and overexpressed in HEK293 cells. Here, the authors observed a decline in cell surface levels of only FZD5 and 8 in response to Wnt3a and Wnt5a. As illustrated in the immunoblot (Fig 1B), several FZDs were poorly expressed, including FZD1, 3, 6 and 9, which calls into question that only FZD5 and 8 were affected. Furthermore, total levels of FZD8 don't diminish appreciably, as claimed by the authors, and only FZD5 shows a subtle decline upon WNT treatment. All of these experiments are performed with overexpressed V5-tagged FZD proteins or with endogenously V5-tagged (KI) proteins, and it is possible that overexpression or tagging lead to potentially artifactual observations. Examining the effects of WNTs on FZD protein localization and levels need to be done with endogenously expressed, non-tagged FZDs. In this context, it is somewhat puzzling that the authors don't show such an experiment using the pan- and FZD5/8-specific antibodies, which they use in multiple experiments throughout the manuscript. With these available tools it should be possible to examine FZD levels at the cell surface in response to Wnt3a and Wnt5a, ideally in multiple cell lines.

      We appreciate your valuable comments and suggestions. Figure 1B shows the results of the follow-up study shown in Figure 1A. As shown in Figure 1A, we used flow cytometry analysis to detect the cell surface levels of stably expressed FZDs and found that Wnt3a/5a specifically reduced the levels of FZD5/8 on the cell surface, suggesting that Wnt3a/5a induces FZD5/8 endocytosis. As shown in Figure 1B and C, we performed immunoblotting to examine whether Wnt3a/5a-induced FZD5/8 internalization resulted in FZD5/8 degradation. Notably, most FZDs exhibit two bands on immunoblots, as also suggested by other published studies, and the upper bands represent the mature form that is fully glycosylated and presented to the cell surface (see also new Figure 2L), whereas the lower bands represent the immature form. Our results clearly indicated that Wnt3a/5a treatment reduced the levels of the mature forms of both FZD5 and FZD8, although the immunoblotting signals of the mature form of FZD8 (upper bands) were relatively weak. The immunoblotting signals of the other FZDs varied, and some of them (including FZD1, -3, -6 and -9) were relatively weak; however, according to the results in Figure 1A, all of the FZDs were expressed and present on the cell surface.

      Commercially available FZD5/8 antibodies, including those used in published studies, cannot detect endogenous FZD5/8 or can only recognize immature FZD5 in our hands, which is why we had to use the CRISPR-Cas9-based KI technique to introduce a V5 tag to FZD5 and FZD7. Notably, in the overexpression experiments, the V5 tag is on the amino terminus, and in the KI experiments, the V5 tag is on the carboxyl terminus of FZDs, which may minimize the potential artificial effects of the V5 tag on the immunoblotting assays.

      The monoclonal antibodies used in this study, such as anti-pan-FZD, anti-FZD5/8, and anti-FZD4 antibodies, are neutralizing antibodies that can compete with Wnt ligands to bind to the FZD CRD. These antibodies have been successfully used to detect the surface levels of FZDs via flow cytometry assays. However, as the binding affinity of the Wnt-FZD CRD is comparable to the binding affinity of the antibody-FZD, we were cautious in using these antibodies to detect the cell surface levels of FZDs when the cells were treated with Wnt3a/5a CM, which contains relatively high concentrations of Wnt3a/5a. As shown in Author response image 1, Wnt3a or Wnt5a treatment dramatically reduced the endogenous cell surface level of FZD5/8, as detected by flow cytometry using the anti-FZD5/8 antibody. However, in another experiment, HEK293A cells were first incubated with cold Wnt3a or Wnt5a CM at 4°C to minimize endocytosis and then analyzed via flow cytometry using the anti-FZD5/8 antibody. The results showed that Wnt3a/5a incubation reduced the flow cytometry signals, suggesting that Wnt3a/5a binding to FZD5/8 might interfere with antibody-FZD5/8 binding, although we cannot exclude the possibility that Wnt3a/5a may induce FZD5/8 endocytosis at 4°C (Author response image 1).

      Author response image 1.

      (A) HEK293A cells were treated with control, Wnt3a or Wnt5a CM for 2 hours at 37°C in a humidified incubator and were analyzed via flow cytometry using the anti-FZD5/8 antibody.

      (B) HEK293A cells were incubated with control, Wnt3a or Wnt5a CM for 1 h at 4°C and analyzed by flow cytometry using the anti-FZD5/8 antibody.

       

      Several experiments rely on gene-edited clonal cell lines, including knockouts of FZD5/8, RNF43/ZNRF3, and DVL. Gene knockouts were confirmed by genomic DNA sequencing and, for DVL and FZD5/8, by loss of protein expression. While these KO lines are powerful tools to study gene function, there is a concern for clonal variability. Each cell line may have acquired additional changes as a result of gene editing. In addition, there may be compensatory changes in gene expression as a consequence of the loss of certain genes. For example, expression of other FZDs may increase in FZD5/8 DKO cells. To address this critique, the authors should show that re-expression of the knocked-out genes rescues the observed effect. This is done in some instances (Fig 5E, G, H) but not in other instances, such as with the DVL TKO (Fig. 3). Since the authors assert that DVL is important for FZD internalization in the absence of WNT, but not for FZD internalization in the presence of WNT, this particular rescue experiment is important. This is a potentially important finding and it should be confirmed by re-expression of DVL in the TKO line. As an alternative, conditional knockdown using Tet-inducible shRNA expression could address concerns for clonal variability.

      We appreciate your valuable comments and suggestions. We re-expressed DVL2 in DVL TKO cells stably expressing V5-linker-FZD5 or V5-linker-FZD7. As shown in Figure 3G-K, re-expression of DVL2 rescued the decreased Wnt-independent endocytosis of FZD5 and FZD7 caused by DVL1/2/3 knockout.

      Given the significant differences in signaling activity by Wnt3a and Wnt5a, it is somewhat surprising that all experiments shown in this manuscript do not identify distinguishing features between Wnt3a and Wnt5a. In addition, it is unclear why the authors switch between Wnt3a and Wnt5a. For example, Figures 1C, 3G-J, 4C-D only use Wnt5a. In contrast, Figures 6E and H use Wnt3a, most likely because b-catenin stabilization is examined, an effect generally not observed with Wnt5a. The choice of which Wnt is examined/used appears to be somewhat arbitrary and the authors never provide any explanations for these choices. In the end, this type of inconsistency becomes puzzling when the authors present, quite convincingly, in Figure 7, that both Wnt3a and 5a promote an interaction between FZD5/8 and RNF43 through proximity biotin labeling.

      Although Wnt3a and Wnt5a are significantly different in triggering intracellular signaling pathways, both bind FZD5/8 and induce FZD5/8 endocytosis and degradation similarly. When FZD5 is stably overexpressed, Wnt5a has slightly stronger effects on inducing FZD5 endocytosis and degradation, possibly because the Wnt5a concentration may be higher than the Wnt3a concentration in our CM, which is why we used Wnt5a CM in some experiments when V5-FZD5 was overexpressed. In the revised manuscript, we used both Wnt3a and Wnt5a CM in the experiments as you suggested, as shown in Figure 1C, 3G-K and Figure 4-Figure supplement 1.

      Minor Points:

      Figure 3G and I: it is curious that individual cells are shown in the "0 h" samples, while the "Con 1 h" and "Wnt5a 1 h" show multiple cells with several making direct contact with each other. This is notable because the V5 staining at sites of cell-cell contact are quite distinct and variable between control and Wnt5a-treated and WT versus DVL TKO cells. Also, sub-cellular localization of FZD5 (V5 tag) puncta is quite distinct between Con and Wnt5a: puncta in Wnt5a-treated cells appear to be more plasma membrane proximal than in Con cells. These points may be easy to address by showing images of cells that are more similar with respect to cell number and density for each condition.

      Thank you for your suggestions. We repeated these experiments, added Wnt3a treatment, and adjusted the cell density. Images showing an individual cell were selected for presentation.

      Figure 5E: the following statement is confusing/misleading: "Furthermore, reintroducing ZNRF3 or RNF43 into ZRDKO cells efficiently restored the increase in cytosolic β-catenin levels, whereas the expression of RNF130 or RNF150, two structurally similar transmembrane E3 ubiquitin ligases, did not (Fig. 5E)." First, reintroduction of ZNRF3 or RNF43 restores cytosolic b-catenin levels; it does not restore the increase in b-catenin. Second, the claim that RNF130 fails to have this effect is not substantiated since it is barely expressed.

      Thank you for your suggestions and comments. We revised the wording to make the statement clearer. Notably, the expression level of RNF130 was relatively low compared with that of other E3 ligases, but RNF130 was expressed (Figure 5E, darker exposure) and could reduce the cell surface levels of FZDs, as shown in Figure 5G.

      Reviewer #2 (Recommendations for the authors):

      (1) Given their results the authors conclude that upregulation of Frizzled on the plasma membrane is not sufficient to explain the stabilization of beta-catenin seen in the ZNRF3/RNF43 mutant cells. This interpretation is sound, and they suggest in the discussion that ZNRF3/RNF43-mediated ubiquitination could serve as a sorting signal to sort endocytosed FZD to lysosomes for degradation and that absence or inhibition of this process would promote FZD recycling. This should be relatively easy to test using surface biotinylation experiments and would considerably strengthen the manuscript.

      Thank you for your valuable suggestions and comments. We performed cell surface biotinylation experiments in HEK293A FZD5KI cells, as shown in Figure 2L. The results indicated that Wnt3a or Wnt5a treatment induced the degradation of FZD5 on the cell surface, which was antagonized by cotreatment with RSPO1. We did not perform a more detailed endocytosis/recycling biotinylation experiment that requires complex reversible biotinylation and multiple washing steps because HEK293A cells are fragile in culture and not easy to handle. Furthermore, the results shown in Figure 4 indicate that knockout of ZNRF3/RNF43 or RSPO1 significantly blocked the degradation of internalized FZD5 and reduced the colocalization of internalized FZD5 with lysosomal markers, suggesting that Wnt3a/5a induced lysosomal degradation of FZD5 in the presence of ZNRF3/RNF43 and that the internalized FZD5 was most likely recycled back to the cell surface when ZNRF3/RNF43 was knocked out or inhibited by RSPO1.

      (2) The authors show that the FZD5 CRD domain is required for endocytosis since a mutant FZD5 protein in which the CRD is removed does not undergo endocytosis. This is perhaps not surprising since this is the site of Wnt binding, but the authors show that a chimeric FZD5CRD-FZD4 receptor can confer Wnt-dependent endocytosis to an otherwise endocytosis incompetent FZD4 protein. Since the linker region between the CRD and the first TM differs between FZD5 and FZD4, it would be interesting to understand whether the CRD specifically or the overall arrangement (such as the spacing) is the most important determinant.

      Our results in Figure 1D-H clearly show that the CRD of FZD5 specifically is both necessary and sufficient for Wnt3a/5a-induced FZD5 endocytosis: replacing the CRD of FZD5 with the CRD from either FZD4 or FZD7 completely abolished Wnt-induced endocytosis, whereas replacing the CRD of FZD4 or FZD7 with the FZD5 CRD was enough to confer Wnt-induced endocytosis.

      (3) I find it surprising that only FZD5 and FZD8 appear to undergo endocytosis or be stabilized at the cell surface upon ZNRF3/RNF43 knockout. Is this consistent with previous literature? Is that a cell-specific feature? These findings should be tested in a different cell line, with possibly different relative levels of ZNRF3 and RNF43 expression.

      Thank you for your comments and suggestions. Our finding that ZNRF3/RNF43 specifically regulates FZD5/8 degradation is consistent with recent published studies in which FZD5 is required for the survival of RNF43-mutant PDAC or colorectal cancer cells (Nature Medicine, 2017, PMID: 27869803) and FZD5 is required for the maintenance of intestinal stem cells (Developmental Cell, 2024, PMID: 39579768 and 39579769), and in both cases, FZDs other than FZD5/8 are also expressed but not sufficient to compensate for the function of FZD5. The mechanism by which Wnt3a/5a specifically induces FZD5/8 endocytosis and degradation is currently unknown and needs to be explored in the future. We speculate that Wnt binding to FZD5/8 may recruit another protein on the cell surface to specifically facilitate FZD5/8 endocytosis. On the other hand, we cannot exclude the possibility that Wnts other than Wnt3a/5a may induce the endocytosis and degradation of FZDs other than FZD5/8 since there are 19 Wnts and 10 FZDs in humans. Notably, several previous studies have suggested that ZNRF3/RNF43 may regulate the endocytosis and degradation of all FZDs without selectivity (such as Nature, 2012, PMID: 22575959; Nature, 2012, PMID: 22895187; Mol Cell, 2015, PMID: 25891077). However, their conclusions were drawn mostly on the basis of overexpression studies. According to the results shown in Figure 5E-H, overexpressing a membrane-tethered E3 ligase (such as ZNRF3, RNF43, RNF130, or RNF150) may nonspecifically degrade FZD proteins on the cell surface.

      Furthermore, in the revised manuscript, we showed that Wnt3a/5a induced FZD5/8 endocytosis and degradation in multiple cell lines, including Huh7, U2OS, MCF7, and 769P cells (Figure 1-Figure supplement 1 and Figure 2-Figure supplement 1), suggesting that these phenomena are not specific to 293A cells.

      (4) If FZD7 is not a substrate of ZNRF3/RNF43 and therefore is not ubiquitinated and degraded, how do the authors reconcile that its overexpression does not lead to elevated cytosolic beta-catenin levels in Figure 5B?

      We are currently not sure of the mechanism underlying this result. Considering that most FZDs are expressed in 293A cells, we do not know how much of the mature form of overexpressed FZD7 was presented to the plasma membrane.

      (5) For Figure 5B, it would be interesting if the authors could evaluate whether overexpression of FZD5 in the ZNRF3/RNF43 double knockout lines would synergize and lead to a further increase in cytosolic beta-catenin levels. As a control, if the substrate selectivity is clear, FZD7 overexpression in that line should not do anything.

      Thank you for your suggestion. We performed these experiments as suggested, and the results indicated that overexpressing FZD5 further increased cytosolic beta-catenin levels in ZRDKO cells, whereas FZD7 had no effect (Figure 6D).

      (6) In Figure 6G, the authors need to show cytosolic levels of beta-catenin in the absence of Wnt in all cases.

      We did not add Wnt CM in this experiment. RSPO1 activity, which relies on endogenous Wnt, has been well documented in previous studies.

      (7) Since the authors show that DVL is not involved in the Wnt- and ZNRF3-dependent endocytosis, they should repeat the proximity biotinylation experiment in Figure 7 in the DVL triple KO cells. This is an important experiment since previous studies showed that DVL was required for the ZNRF3/RNF43-mediated ubiquitination of FZD.

      Thank you for your valuable suggestions. As you suggested, we performed a proximity biotinylation experiment in DVL TKO cells, and the results showed that Wnt3a/5a could still induce the interaction of FZD5 and RNF43 in DVL TKO cells (Figure 7-figure supplement 1), suggesting that the Wnt-induced FZD5‒RNF43 interaction is DVL independent.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      In this manuscript, Roy et al. used the previously published deep transfer learning tool, DEGAS, to map disease associations onto single-cell RNA-seq data from bulk expression data. The authors performed independent runs of DEGAS using T2D or obesity status and identified distinct β-cell subpopulations. β-cells with high obese-DEGAS scores contained two subpopulations derived largely from either non-diabetic or T2D donors. Finally, immunostaining using human pancreas sections from healthy and T2D donors validated the heterogeneous expression and depletion of DLK1 in T2D islets.

      Strengths:

      (1) This is a meta-analysis of previously published scRNA-seq data using a deep transfer learning tool.

      (2) Identification of novel beta cell subclusters.

      (3) Identified a relatively innovative role of DLK1 in T2D disease progression.

      Thank you for your comments on the strengths of our work.

      Weaknesses:

      “There is little overlap between the DE list of bulk RNA-seq analysis in Figure 1D and 1E and the DE list of pseudo-bulk RNA-seq analysis of all cells in Figure S2C.”

      Thank you for pointing this out. To clarify, we did not perform pseudo-bulk analysis on the scRNAseq data. Instead, we used the Seurat FindClusterMarkers function to identify differentially enriched genes between T2D and ND single cells. Indeed, there are many significant genes in new Fig S2D (original S2C). There is some overlap between those data and the DEGs from bulk RNAseq data in Fig 1D, including IAPP, ENTPD3, and FFAR4. However, the limited overlap supports the notion that improved approaches are necessary to identify candidate DEGs from single cell data, as simply performing a comparison of T2D to ND of all β-cells may miss important genes or include many false positives. We have now added clarification to the text to highlight this point.

      The biological meaning of "beta cells had the lowest scores compared to other cell types" is not clear.

      The relatively lower T2D-DEGAS scores for beta cells overall compared to all other cell types (alpha cells, acinar cells, etc) likely reflects the fact that in T2D, beta cell-specific genes can be downregulated. This affects the DEGAS model which is reflected in the scores of all cells in the scRNAseq data. By subsetting the beta cells and replotting them on their own, we can analyze the relative differences in DEGAS scores between different subsets of beta cells. We have now amended the text to clarify, as follows:

      “We next mapped the T2D-association scores onto the single cells (Fig 3A). β-cells had a wide distribution of scores, possibly reflecting β-cell heterogeneity or altered β-cell gene expression after onset of T2D (Fig 3B).”

      The figures and supplemental figures were not cited following the sequence, which makes the manuscript very difficult to read. Some supplemental figures, such as Figures S1C-S1D, S2B-S2E, S3A-S3B, were not cited or mentioned in the text.

      We apologize for this oversight and have now amended the text to call out all figures/panels in order of first introduction.

      In Figure 7, the current resolution is too low to determine the localization of DLK1.

      We have confirmed that in our Adobe Illustrator file, each microscopy panel has a DPI of >600. We have also provided the highest quality TIFF file versions of our figure set. We hope the reviewer will have access to download the high-quality TIFF file for Fig 7 if possible, or the editorial staff can provide it.

      As a result of addressing the critiques, we identified CDKN1C as another promising candidate enriched in the β<sup>T2D-DEGAS</sup> and β<sup>obese-DEGAS</sup> subpopulations of β-cells. We found that CDKN1C is heterogeneously expressed at the protein level in β-cells and that it is increased in T2D in agreement with the DEGAS predictions. We have amended the manuscript to highlight CDKN1C more prominently while still discussing DLK1. DLK1 is very interesting, but exhibits greater donor to donor variability in its alterations in T2D.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Gitanjali Roy et al. applies deep transfer learning (DEGAS) to assign patient-level disease attributes (metadata) to single cells of T2D and non-diabetic patients, including obese patients. This led to the identification of a singular cluster of T2D-associated β-cells; and two subpopulations of obese- β-cells derived from either non-diabetic or T2D donors. The objective was to identify novel and established genes implicated in T2D and obesity. Their final goal is to validate their findings at the protein level using immunohistochemistry of pancreas tissue from non-diabetic and T2D organ donors.

      Strengths:

      This paper is well-written, and the findings are relevant for β-cell heterogeneity in T2D and obesity.

      Thank you for your comments on the positive aspects of our work.

      Weaknesses:

      The validation they provide is not sufficiently strong: no DLK1 immunohistochemistry is shown of obese patient-derived sections.

      We have acquired additional FFPE pancreas samples from the Integrated Islet Distribution Program (IIDP) from lean, overweight, and obese humans with and without T2D. We have now stained for CDKN1C and DLK1 in these samples and have integrated the data into Fig 7 and Fig S5.

      Because the data with CDKN1C was more striking and consistent with the DEGAS predictions, we have chosen to highlight CDKN1C in the main figure and text. The DLK1 data is still quite interesting, although there is substantial variability between T2D donors when it comes to altered staining intensity. DLK1 presents an interesting challenge, given multiple isoforms and cleavage products, and will require further investigation as the focus of a different manuscript.

      Additional presumptive relevant candidates from this transcriptomic analysis should be screened for, at the protein level.

      Thank you for this suggestion. We also identified CDKN1C as a promising candidate enriched in the β<sup>T2D-DEGAS</sup> and β<sup>obese-DEGAS</sup> subpopulations of β-cells. We found that CDKN1C is heterogeneously expressed at the protein level in β-cells and that it is increased in T2D in agreement with the DEGAS predictions. We have amended the manuscript to highlight CDKN1C more prominently while still discussing DLK1. DLK1 is very interesting but exhibits greater donor to donor variability in its alterations in T2D.

      Reviewer #1 (Recommendations For The Authors):

      Please explain and provide the detailed information on what percentage of the DE list of bulk RNA-seq analysis in Figures 1D and 1E overlap with the DE list of pseudo-bulk RNA-seq analysis of all cells in Figure S2C.

      Addressed in response to R1 Comment 1.

      Please provide the definition of each cluster of UMAP of the merged human islet scRNA-seq data.

      In figure panels 2A-B,D-G and 3A, the clusters are now labeled according to the marker genes described in Fig 2C.

      The integrative UMAP needs to be included in the main figure.

      We have now moved previous Fig S2A and S2B into the main figures as new Fig 2A-B.

      All figures and supplemental figures need to be cited following sequence.

      Addressed in response to R1 Comment 3.

      In Figure 7, high-resolution images are needed to determine the colocalization of INS and DLK1.

      Addressed in response to R1 Comment 4.

      Reviewer #2 (Recommendations For The Authors):

      Results: 124-128: Fig 1H_The error bars seem high, please include whether the boxplots are SEM or SD. Also, more detail on statistics is missing.

      Thank you for pointing out the need for clarification here. The whiskers on the box and whiskers plots are not error bars. By default, in geom_boxplot() and stat_boxplot(), each whisker extends from the edge of the box to the most extreme data point that lies within 1.5 times the interquartile range of that edge. The box itself represents the middle 50% of the data: the bottom of the box is the first quartile, the middle horizontal line is the median, and the top line of the box is the third quartile. We have now added a clearer description of this to the figure legend and in the methods section.
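
      The whisker rule described above can be sketched in a few lines. This is a minimal illustration in Python rather than the authors' R/ggplot2 code; the function name `boxplot_stats`, the `coef` parameter name, and the example values are assumptions made for this sketch, and the quartiles use linear interpolation (the "inclusive" method), which is assumed here to match the plotting default.

```python
from statistics import quantiles


def boxplot_stats(values, coef=1.5):
    """Return box edges, whisker ends, and outliers for one group.

    Hypothetical helper for illustration; not part of any plotting library.
    """
    data = sorted(values)
    # Quartiles by linear interpolation between order statistics.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - coef * iqr
    high_fence = q3 + coef * iqr
    # Whiskers stop at the most extreme observed points inside the fences,
    # so they are data values, not standard deviations or standard errors.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Made-up values: one extreme point falls beyond the upper fence.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(stats)
```

      In this made-up example the upper whisker ends at 8, the largest value inside the fence, and 30 is drawn as a separate point, which is why a long whisker does not indicate a large SEM or SD.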

      The genes shown in Fig 1H were selected because they are found in the T2D Knowledge Portal, illustrating a clear link to T2D. At the T2DKP (https://t2d.hugeamp.org/research.html?pageid=mccarthy_t2d_247), PAX4 and APOE are listed as causal, SLC2A2 has strong evidence, and CYTIP has a linked SNP. This is now discussed in the results section before the Fig 1H callout. These genes are significantly differentially expressed using edgeR in panel 1D with FDR<0.05. The individual data points for each human are shown.

      Figure 6: In general, the representation of the data is quite misleading. It would be nice to have an alternative way of presenting the data, especially when comparing beta-obese differentially expressed genes and pathways and T2D beta obese. Maybe an additional Venn diagram can help. Also, it would be nice to compare data from T2D beta nonobese to ND beta obese, especially given how the story is presented in the paper.

      Thank you for pointing out this clarity issue. We agree that additional alternate ways to present the data would be helpful. When we performed DEGAS using BMI as the disease feature we noted two major and one minor clusters of high-scoring cells in Fig 6A.

      Author response image 1.

      Author response image 2.

      This contrasted with the score map when we ran DEGAS with T2D as the disease feature.

      The main difference seems to be the low scoring β<sup>T2D-DEGAS</sup> cluster is different from the low β<sup>obese-DEGAS</sup> cluster.

      Therefore, we could not easily apply thresholding to the β<sup>obese-DEGAS</sup> scores, so instead we subsetted them for comparison. It was also apparent from the metadata that single cells from the left-hand side of the β-cell cluster came from donors that had T2D.

      To clarify these points and address the reviewer’s concerns, we have added a comparison of the DEGs identified for β<sup>T2D-DEGAS</sup> high vs. low and T2D-β<sup>obese-DEGAS</sup> vs ND-β<sup>obese-DEGAS</sup> in Fig S4J, also shown below. DLK1 and CDKN1C fall within the intersection, in addition to being two of the most enriched candidates in each DEGAS run (Fig 4C and Fig 6D).

      220-222: Figure 7C_ Is one of the nondiabetic beta samples obese? If so, please clearly label it; if not, that info is missing. One would expect that the DLK1 expression in ND obese beta cells resembles the T2D beta cell and not ND non-obese beta cells. That's a big point of this entire work, and experimentally missing. Additional candidate proteins should be checked.

      We have amended the entire Fig 7 to include more data for DLK1 staining as well as adding staining for CDKN1C. We also used CellProfiler to quantify the intensity distribution of DLK1 staining in β-cells and overall found that our initial conclusions were not supported when considering an increased sample size. DLK1 expression is heterogeneous both within and between donors. While we have data from T2D donors that shows DLK1 is lost, other T2D samples indicate that DLK1 is not always lost. At least in the current sample set we have analyzed, we cannot conclude that there is a clear correlation between DLK1 and either diabetes or BMI. Why DLK1 labels some β-cells and not others, and what the role of this subpopulation is, remain open questions.

      Separately, we greatly appreciate the reviewer’s suggestion to validate additional candidates, as this led us to CDKN1C. In new Fig 7E-H we now show that CDKN1C is increased in T2D β-cells, in agreement with the DEGAS predictions.

      This work shows that machine learning approaches are powerful for identifying potential candidates, but it also highlights the need for these predictions to be validated at the protein level in human samples.

      Discussion: Based on lack of supporting IHC data, this is an overstatement:

      “DLK1 expression highly overlapped with high scoring βT2D DEGAS cells (Figure 7A) and with T2D βobese-DEGAS cells (Figure 7B). DLK1 immunostaining primarily colocalized with β-cells in non-diabetic human pancreas (Figure 7C). DLK1 showed heterogeneous expression within islets and between islets within the same pancreas section, wherein some islets had DLK1/INS co-staining in most β-cells and other islets had only a few DLK1+ β-cells. In the T2D pancreas, DLK1 staining was much less intense and in fewer β-cells, yet DLK1+/INS+ cells were observed (Figure 7C). This contrasts with the relatively higher DLK1 gene expression seen in the β-cells from the βT2D-DEGAS and T2D-βobese-DEGAS subpopulations (Figure 4D & 6C) as highlighted in Figure 7A,B. which were up- or down-regulated in subpopulations of β-cells identified by DEGAS, and to validate our findings at the protein level using immunohistochemistry of pancreas tissue from non-diabetic and T2D organ donors.”

      This part was at the very end of the last results subsection. This section has been largely rewritten to better describe the new figure and the language has been tempered to not overinterpret the data shown.

      “Our current findings applying DEGAS to islet data have implications for β-cell heterogeneity in T2D and obesity. The abundance of T2D-related factors and functional β-cell genes in our analysis validates applying DEGAS to islet data to identify disease-associated phenotypes and increase confidence in the novel candidate.”

      This part was found at the end of the Background section. We have removed the second sentence to temper the language.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      The study by Klug et al. investigated the pathway specificity of corticostriatal projections, focusing on two cortical regions. Using a G-deleted rabies system in D1-Cre and A2a-Cre mice to retrogradely deliver channelrhodopsin to cortical inputs, the authors found that M1 and MCC inputs to direct and indirect pathway spiny projection neurons (SPNs) are both partially segregated and asymmetrically overlapping. In general, corticostriatal inputs that target indirect pathway SPNs are likely to also target direct pathway SPNs, while inputs targeting direct pathway SPNs are less likely to also target indirect pathway SPNs. Such asymmetric overlap of corticostriatal inputs has important implications for how the cortex itself may determine striatal output. Indeed, the authors provide behavioral evidence that optogenetic activation of M1 or MCC cortical neurons that send axons to either direct or indirect pathway SPNs can have opposite effects on locomotion and different effects on action sequence execution. The conclusions of this study add to our understanding of how cortical activity may influence striatal output and offer important new clues about basal ganglia function. 

      The conceptual conclusions of the manuscript are supported by the data, but the details of the magnitude of afferent overlap and causal role of asymmetric corticostriatal inputs on behavioral outcomes were not yet fully resolved. 

      We appreciate the reviewer’s thoughtful understanding and acknowledgment that the conceptual conclusion of asymmetric projections from the cortex to the striatum is well supported by our data. We also recognize the importance of further elucidating the extent of afferent overlap and the causal contributions of asymmetric corticostriatal inputs to behavioral outcomes. However, we respectfully note that current technical limitations pose significant challenges to addressing these questions with high precision.

      In response to the reviewer’s comments, we have now clarified the sample size, added proper analysis and elaborated on the experimental design to ensure that our conclusions are presented more transparently and are more accessible to the reader.

      After virally labeling either direct pathway (D1) or indirect pathway (D2) SPNs to optogenetically tag pathway-specific cortical inputs, the authors report that a much larger number of "non-starter" D2-SPNs from D2-SPN labeled mice responded to optogenetic stimulation in slices than "non-starter" D1 SPNs from D1-SPN labeled mice did. Without knowing the relative number of D1 or D2 SPN starters used to label cortical inputs, it is difficult to interpret the exact meaning of the lower number of responsive D2-SPNs in D1 labeled mice (where only ~63% of D1-SPNs themselves respond) compared to the relatively higher number of responsive D1-SPNs (and D2-SPNs) in D2 labeled mice. While relative differences in connectivity certainly suggest that some amount of asymmetric overlap of inputs exists, differences in infection efficiency and ensuing differences in detection sensitivity in slice experiments make determining the degree of asymmetry problematic. 

      Thank you for highlighting this point. As it lies at the core of our manuscript, we agree that it is essential to present it clearly and convincingly. As shown by the statistics (Fig. 2B-F), non-starter D1- and D2-SPNs appear to receive fewer projections from D1-projecting cortical neurons (Input D1-record D1, 0.63; Input D1-record D2, 0.40) compared to D2-projecting cortical neurons (Input D2 - record D1, 0.73; Input D2 -record D2, 0.79).

      While it is not technically feasible to quantify the number of infected cells in brain slices following electrophysiological recordings, we addressed this limitation by collecting data from multiple animals and restricting recordings to cells located within the injection sites. In Figure 2D, we used 7 mice in the D1-projecting to D1 EGFP(+) group, 8 mice in the D1-projecting to D2 EGFP(-) group, 10 mice in the D2-projecting to D2 EGFP(+) group, and 8 mice in the D2-projecting to D1 EGFP(-) group. In Figure 2G, the group sizes were as follows: 8 mice in the D1-projecting to D2 EGFP(+) group, 7 mice in the D1-projecting to D1 EGFP(-) group, 8 mice in the D2-projecting to D1 EGFP(+) group, and 10 mice in the D2-projecting to D2 EGFP(-) group. In both panels, connection ratios were compared using Fisher’s exact test. Comparisons were then made across experimental groups. Furthermore, as detailed in our Methods section (page 20, lines 399-401), we assessed cortical expression levels prior to performing whole-cell recordings. Taken together, these precautions help ensure that the calculated connection ratios are unlikely to be confounded by differences in infection efficiency.
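
      For readers unfamiliar with how two connection ratios are compared with Fisher’s exact test, a minimal sketch follows. It is written in plain Python so that it is self-contained (in practice `scipy.stats.fisher_exact` would be the usual choice). The function name and the cell counts are assumptions for illustration only: the response reports ratios and animal numbers, not the per-group cell counts, so the numbers below are not the authors’ data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are connected / not connected.
    Hypothetical helper written for this illustration.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x connected cells in group 1
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts (connected, not connected), NOT from the manuscript.
group_a = (19, 11)
group_b = (12, 18)
p_value = fisher_exact_two_sided(*group_a, *group_b)
print(f"two-sided P = {p_value:.4f}")
```

      Because the test conditions on the table margins, it remains valid for the small cell counts typical of slice recordings, where a chi-square approximation would be unreliable.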

      It is also unclear if retrograde labeling of D1-SPN- vs D2-SPN- targeting afferents labels the same densities of cortical neurons. This gets to the point of specificity in the behavioral experiments. If the target-based labeling strategies used to introduce channelrhodopsin into specific SPN afferents label significantly different numbers of cortical neurons, might the difference in the relative numbers of optogenetically activated cortical neurons itself lead to behavioral differences? 

      Thank you for bringing this concern to our attention. While optogenetic manipulation has become a widely adopted tool in functional studies of neural circuits, it remains subject to several technical limitations due to the nature of its implementation. Factors such as opsin expression efficiency, optic fiber placement, light intensity, stimulation spread, and other variables can all influence the specificity and extent of neuronal activation or inhibition. As such, rigorous experimental controls are essential when interpreting the outcomes of optogenetic experiments.

      In our study, we verified both the expression of channelrhodopsin in D1- or D2-projecting cortical neurons and the placement of the optic fiber following the completion of behavioral testing. To account for variability, we compared the behavioral effects of optogenetic stimulation within the same animals, stimulated versus non-stimulated conditions, as shown in Figures 3 and 4. Moreover, Figure S3 includes important controls that rule out the possibility that the behavioral effects observed were due to direct activation of D1- or D2-SPNs in striatum or to light alone in the cortex.

      An additional point worth emphasizing is that the behavioral effects observed in the open field and ICSS tests cannot be attributed to differences in the number of neurons activated. Specifically, activation of D1-projecting cortical neurons promoted locomotion in the open field, whereas activation of D2-projecting cortical neurons did not. However, in the ICSS test, activation of both D1- and D2-projecting cortical neurons reinforced lever pressing. Given that only D1-SPN activation, but not D2-SPN activation, supports ICSS behavior, these effects are unlikely to result merely from differences in the number of neurons recruited.

      This rationale underlies our use of multiple behavioral paradigms to examine the functions of D1- and D2-projecting cortical neurons. By assessing behavior across distinct tasks, we aimed to approach the question from multiple angles and reduce the likelihood of spurious or confounding effects influencing our interpretation.

      In general, the manuscript would also benefit from more clarity about the statistical comparisons that were made and sample sizes used to reach their conclusions.

      We thank the reviewer for the valuable suggestion to improve the manuscript. In response, we have made the following changes and provided additional clarification:

      (1) In Figure 2D, we used 7 mice in the D1-projecting to D1 EGFP(+) group, 8 mice in the D1-projecting to D2 EGFP(-) group, 10 mice in the D2-projecting to D2 EGFP(+) group, and 8 mice in the D2-projecting to D1 EGFP(-) group. In Figure 2G, the group sizes were as follows: 8 mice in the D1-projecting to D2 EGFP(+) group, 7 mice in the D1-projecting to D1 EGFP(-) group, 8 mice in the D2-projecting to D1 EGFP(+) group, and 10 mice in the D2-projecting to D2 EGFP(-) group. In both panels, connection ratios were compared using Fisher’s exact test.

      (2) In Figure 3, we reanalyzed the data in panels O, P, R, and S using permutation tests to assess whether each individual group exhibited a significant ICSS learning effect. The figure legend has been revised accordingly as follows:

      (O-P) D1-SPN (red) but not D2-SPN stimulation (black) drives ICSS behavior in both the DMS (O: D1, n = 6, permutation test, slope = 1.5060, P = 0.0378; D2, n = 5, permutation test, slope = -0.2214, P = 0.1021; one-tailed Mann Whitney test, Day 7 D1 vs. D2, P = 0.0130) and the DLS (P: D1, n = 6, permutation test, slope = 28.1429, P = 0.0082; D2, n = 5, permutation test, slope = -0.3429, P = 0.0463; one-tailed Mann Whitney test, Day 7 D1 vs. D2, P = 0.0390). *, P < 0.05. (Q) Timeline of helper virus injections, rabies-ChR2 injections and optogenetic stimulation for ICSS behavior. (R-S) Optogenetic stimulation of the cortical neurons projecting to either D1- or D2-SPNs induces ICSS behavior in both the MCC (R: MCC-D1, n = 5, permutation test, Day1-Day7, slope = 2.5857, P = 0.0034; MCC-D2, n = 5, Day2-Day7, permutation test, slope = 1.4229, P = 0.0344; no significant effect on Day7, MCC-D1 vs. MCC-D2,  two-tailed Mann Whitney test, P = 0.9999) and the M1 (S: M1-D1, n = 5, permutation test, Day1-Day7, slope = 1.8214, P = 0.0259; M1-D2, n = 5, Day1-Day7, permutation test, slope = 1.8214, P = 0.0025; no significant effect on Day7, M1-D1 vs. M1-D2, two-tailed Mann Whitney test, P = 0.3810). n.s., not statistically significant.

      (3) In Figure 4, we have added a comparison against a theoretical percentage change of zero to better evaluate the net effect of each manipulation. The results showed that in Figure 4D, optogenetic stimulation of D1-projecting MCC neurons significantly increased the pressing rate, whereas stimulation of D2-projecting MCC neurons did not (MCC-D1: n = 8, one-sample two-tailed t-test, t = 2.814, P = 0.0131; MCC-D2: n = 7, t = 0.8481, P = 0.4117). In contrast, in Figure 4H, optogenetic stimulation of both D1- and D2-projecting M1 neurons significantly increased the sequence press rate (M1-D1: n = 6, one-sample two-tailed Wilcoxon signed-rank test, P = 0.0046; M1-D2: n = 7, P = 0.0479).
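
      The permutation tests on learning slopes mentioned in item (2) can be illustrated as follows. The response does not describe the exact permutation scheme, so this sketch makes its own assumptions: it takes one value per day (for example a group mean), fits a least-squares slope across days, and enumerates every reordering of the values to obtain an exact one-sided P value. The function names and the example values are hypothetical.

```python
from itertools import permutations


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def slope_permutation_test(days, responses):
    """Exact one-sided permutation test for a positive slope.

    Under the null hypothesis of no learning, the order of the responses
    across days is exchangeable, so every reordering is equally likely.
    """
    observed = ols_slope(days, responses)
    as_extreme = 0
    total = 0
    for shuffled in permutations(responses):
        total += 1
        # Small tolerance so the observed ordering always counts itself.
        if ols_slope(days, shuffled) >= observed - 1e-12:
            as_extreme += 1
    return observed, as_extreme / total


# Hypothetical daily values over 7 sessions; not data from the study.
days = [1, 2, 3, 4, 5, 6, 7]
presses = [2, 3, 5, 4, 8, 9, 12]
slope, p_value = slope_permutation_test(days, presses)
print(f"slope = {slope:.4f}, one-sided P = {p_value:.4f}")
```

      With 7 sessions there are only 5,040 orderings, so full enumeration is cheap; for longer series a random subset of permutations would normally be sampled instead.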

      Reviewer #2 (Public Review):

      Summary: 

      Klug et al. use monosynaptic rabies tracing of inputs to D1- vs D2-SPNs in the striatum to study how separate populations of cortical neurons project to D1- and D2-SPNs. They use rabies to express ChR2, then patch D1- or D2-SPNs to measure synaptic input. They report that cortical neurons labeled as D1-SPN-projecting preferentially project to D1-SPNs over D2-SPNs. In contrast, cortical neurons labeled as D2-SPN-projecting project equally to D1- and D2-SPNs. They go on to conduct pathway-specific behavioral stimulation experiments. They compare direct optogenetic stimulation of D1- or D2-SPNs to stimulation of MCC inputs to DMS and M1 inputs to DLS. In three different behavioral assays (open field, intra-cranial self-stimulation, and a fixed ratio 8 task), they show that stimulating MCC or M1 cortical inputs to D1-SPNs is similar to D1-SPN stimulation, but that stimulating MCC or M1 cortical inputs to D2-SPNs does not recapitulate the effects of D2-SPN stimulation (presumably because both D1- and D2-SPNs are being activated by these cortical inputs). 

      Strengths: 

      Showing these same effects in three distinct behaviors is strong. Overall, the functional verification of the consequences of the anatomy is very nice to see. It is a good choice to patch only from mCherry-negative non-starter cells in the striatum.

      Thank you for your profound understanding and appreciation of our manuscript’s design and the methodologies employed. In the realm of neuroscience, quantifying synaptic connections is a formidable challenge. While the roles of the direct and indirect pathways in motor control have long been explored, the mechanism by which upstream cortical inputs govern these pathways remains shrouded in mystery at the circuitry level.

      In the ‘Go/No-Go’ model, the direct and indirect pathways operate antagonistically; in contrast, the ‘Co-activation’ model suggests that they work cooperatively to orchestrate movement. These distinct theories raise a compelling question: Do these two pathways receive inputs from the same upstream cortical neurons, or are they modulated by distinct subpopulations? Answering this question could provide vital clues as to whether these pathways collaborate or operate independently.

      Previous studies have revealed both differences and similarities in the cortical inputs to the direct and indirect pathways at the population level. Our investigation goes further, asking whether a single cortical input drives both pathways simultaneously or whether each pathway is regulated by a distinct cortical subpopulation. To address this, we employed rabies virus–mediated retrograde tracing from D1- or D2-SPNs and recorded non-starter SPNs to determine if they receive the same inputs as the starter SPNs. This approach allowed us to calculate the connection ratio and estimate the probable connection properties.

      Weaknesses: 

      One limitation is that all inputs to SPNs are expressing ChR2, so they cannot distinguish between different cortical subregions during patching experiments. Their results could arise because the same innervation patterns are repeated in many cortical subregions or because some subregions have preferential D1-SPN input while others do not.

      Thank you for raising this thoughtful concern. It is indeed not feasible to restrict ChR2 expression to a specific cortical region using the first-generation rabies-ChR2 system alone. A more refined approach would involve injecting Cre-dependent TVA and RG into the striatum of D1- or A2A-Cre mice, followed by rabies-Flp infection. Subsequently, a Flp-dependent ChR2 virus could be injected into the MCC or M1 to selectively label D1- or D2-projecting cortical neurons. This strategy would allow for more precise targeting and address many of the current limitations.

      However, a significant challenge lies in the cytotoxicity associated with rabies virus infection. Neuronal health begins to deteriorate substantially around 10 days post-infection, which provides an insufficient window for robust Flp-dependent ChR2 expression. We have tested several new rabies virus variants with extended survival times (Chatterjee et al., 2018; Jin et al., 2024), but unfortunately, they did not perform effectively or suitably in the corticostriatal systems we examined.

      In our experimental design, the aim is to delineate the connectivity probabilities from cortical neurons to D1- or D2-SPNs. The hypotheses we considered include the possibility that similar innervation patterns occur across multiple cortical subregions, that some subregions show preferential input to D1-SPNs while others do not, or a combination of both scenarios. This led us to perform a series of behavioral tests using optogenetic activation of the D1- or D2-projecting cortical populations to see which could be the case.

      In the cortical areas we examined, MCC and M1, the behavioral results are consistent with our electrophysiological results. Specifically, when we stimulated the D1-projecting cortical neurons in either MCC or M1, mice exhibited facilitated local motion in the open field test, the same effect as activation of D1-SPNs in the striatum alone (MCC: Fig 3C & D vs. I; M1: Fig 3F & G vs. L). Conversely, stimulation of D2-projecting MCC or M1 cortical neurons resulted in behavioral effects that appeared to combine characteristics of both D1- and D2-SPN activation in the striatum (MCC: Fig 3C & D vs. J; M1: Fig 3F & G vs. M). Similar results were observed in the ICSS test. Our interpretation of these results is that the activation of D1-projecting neurons in the cortex induces behavioral changes akin to D1 neuron activation, while activation of D2-projecting neurons in the cortex leads to a combined effect of both D1 and D2 neuron activation. This suggests that at least some cortical regions, namely the ones we tested, follow the hypothesis we proposed.

      There are also some caveats with respect to the efficacy of rabies tracing. Although they only patch non-starter cells in the striatum, only 63% of D1-SPNs receive input from D1-SPN-projecting cortical neurons. It's hard to say whether this is "high" or "low," but one question is how far from the starter cell region they are patching. Without this spatial indication of where the cells that are being patched are relative to the starter population, it is difficult to interpret if the cells being patched are receiving cortical inputs from the same neurons that are projecting to the starter population. Convergence of cortical inputs onto SPNs may vary with distance from the starter cell region quite dramatically, as other mapping studies of corticostriatal inputs have shown specialized local input regions can be defined based on cortical input patterns (Hintiryan et al., Nat Neurosci, 2016, Hunnicutt et al., eLife 2016, Peters et al., Nature, 2021).

      This is a valid concern regarding anatomical studies. Investigating cortico-striatal connectivity at the single-cell level remains technically challenging due to current methodological limitations. At present, we rely on rabies virus-mediated trans-synaptic retrograde tracing to identify D1- or D2-projecting cortical populations. This anatomical approach is coupled with ex vivo slice electrophysiology to assess the functional connectivity between these projection-defined cortical neurons and striatal SPNs. This enables us to quantify connection ratios, for example, the proportion of D1-projecting cortical neurons that functionally synapse onto non-starter D1-SPNs.

      To ensure the robustness of our conclusions, it is essential that both the starter cells and the recorded non-starter SPNs receive comparable topographical input from the cortex and other brain regions. Therefore, we carefully designed our experiments so that all recorded cells were located within the injection site, were mCherry-negative (i.e., non-starter cells), and were surrounded by ChR2-mCherry-positive neurons. This configuration ensured that the distance between recorded and starter cells did not exceed 100 µm, maintaining close anatomical proximity and thereby preserving the likelihood of shared cortical innervation within the examined circuitry.

      These methodological details are also described in the section on ex vivo brain slice electrophysiology, specifically in the Methods section, lines 396–399:

      “D1-SPNs (eGFP-positive in D1-eGFP mice, or eGFP-negative in D2-eGFP mice) or D2-SPNs (eGFP-positive in D2-eGFP mice, or eGFP-negative in D1-eGFP mice) that were ChR2-mCherry-negative, but in the injection site and surrounded by cells expressing ChR2-mCherry were targeted for recording.”

      This experimental strategy was implemented to control for potential spatial biases and to enhance the interpretability of our connectivity measurements.

      A caveat for the optogenetic behavioral experiments is that these optogenetic experiments did not include fluorophore-only controls.

      Thank you for bringing this to our attention. A fluorophore-only control is indeed a valuable negative control, commonly used to rule out effects caused by light exposure independent of optogenetic manipulation. In this study, however, comparisons were made between light-on and light-off conditions within the same animal. This within-subject design, as employed in recent studies (Geddes et al., 2018; Zhu et al., 2025), is considered sufficient to isolate the effects of optogenetic manipulation.

      Furthermore, as shown in Figure S3, we conducted an additional control experiment in which optogenetic stimulation was applied to M1, while ensuring that ChR2 expression was restricted to the striatum via targeted viral infection. This approach serves as a functional equivalent to the control you suggested. Importantly, we observed no effects that could be attributed solely to light exposure, further supporting the conclusion that the observed outcomes in our main experiments are due to targeted optogenetic manipulation, rather than confounding effects of illumination.

      Lastly, by employing an in-animal comparison, measuring changes between stimulated and non-stimulated trials, we account for subject-specific variability and strengthen the interpretability of our findings.

      Another point of confusion is that other studies (Cui et al, J Neurosci, 2021) have reported that stimulation of D1-SPNs in DLS inhibits rather than promotes movement.

      Thank you for bringing the study by Cui and colleagues to our attention. While that study has generated some controversy, other independent investigations have demonstrated that activation of D1-SPNs in DLS facilitates local motion and lever-press behaviors (Dong et al., 2025; Geddes et al., 2018; Kravitz et al., 2010).

      The discrepancy is nevertheless worth clarifying. The differences in behavioral outcomes observed between our study and that of Cui et al. may be attributable to several methodological factors, including differences in both the stereotaxic targeting coordinates and the optical fiber specifications used for stimulation.

      Specifically, in our experiments, the dorsomedial striatum (DMS) was targeted at coordinates AP +0.5 mm, ML ±1.5 mm, DV –2.2 mm, and the DLS at AP +0.5 mm, ML ±2.5 mm, DV –2.2 mm. In contrast, Cui et al. targeted the DMS at AP +0.9 mm, ML ±1.4 mm, DV –3.0 mm and the DLS at AP +0.7 mm, ML ±2.3 mm, DV –3.0 mm. These coordinates correspond to sites that are slightly more rostral and ventral compared to our own. Even subtle differences in anatomical targeting can result in activation of distinct neuronal subpopulations, which may account for the differing behavioral effects observed during optogenetic stimulation.
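
      To make the size of these targeting differences concrete, the short sketch below computes the per-axis offset and the straight-line separation between the two sets of coordinates quoted above. It assumes the usual stereotaxic convention (AP positive = rostral to bregma, DV more negative = more ventral); the dictionary keys and function names are illustrative only and do not come from either study.

```python
import math

# Coordinates (AP, ML, DV) in mm, copied from the paragraph above.
# ML is given as an unsigned magnitude because both hemispheres were targeted.
SITES = {
    "DMS": {"this_study": (0.5, 1.5, -2.2), "cui_et_al": (0.9, 1.4, -3.0)},
    "DLS": {"this_study": (0.5, 2.5, -2.2), "cui_et_al": (0.7, 2.3, -3.0)},
}


def offset_mm(ours, theirs):
    """Per-axis difference (theirs minus ours), rounded to 0.01 mm."""
    return tuple(round(t - o, 2) for o, t in zip(ours, theirs))


def separation_mm(ours, theirs):
    """Straight-line distance between the two target sites in mm."""
    return math.dist(ours, theirs)


for region, coords in SITES.items():
    d_ap, d_ml, d_dv = offset_mm(coords["this_study"], coords["cui_et_al"])
    dist = separation_mm(coords["this_study"], coords["cui_et_al"])
    print(f"{region}: dAP={d_ap:+.1f} dML={d_ml:+.1f} dDV={d_dv:+.1f} mm, "
          f"separation={dist:.2f} mm")
```

      Under these assumptions the two DMS targets are about 0.9 mm apart and the two DLS targets about 0.85 mm apart, with most of the separation along the dorsoventral axis.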

      In addition, the optical fibers used in the two studies varied considerably. We employed fibers with a 200 µm core diameter and a numerical aperture (NA) of 0.37, whereas Cui et al. used fibers with a 250 µm core diameter and a higher NA of 0.66. The combination of a larger core and higher NA in their setup implies a broader spatial spread and deeper tissue penetration of light, likely resulting in activation of a larger neural volume. This expanded volume of stimulation may have engaged additional neural circuits not recruited in our experiments, further contributing to the divergent behavioral outcomes. Taken together, these differences in targeting and photostimulation parameters are likely key contributors to the distinct effects reported between the two studies.
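
      As a rough illustration of how the two fiber specifications differ, the sketch below estimates the geometric divergence half-angle of the emitted light cone from the relation NA = n·sin(θ), and the resulting beam radius at a given depth. This is a purely geometric approximation that ignores scattering and absorption, which strongly shape light spread in brain tissue; the tissue refractive index of 1.36 and the example depth of 500 µm are assumptions chosen for illustration, not values from either study.

```python
import math

TISSUE_INDEX = 1.36  # assumed refractive index of brain tissue (illustrative)


def half_angle_deg(na, n=TISSUE_INDEX):
    """Divergence half-angle in tissue, from NA = n * sin(theta)."""
    return math.degrees(math.asin(na / n))


def beam_radius_um(core_um, na, depth_um, n=TISSUE_INDEX):
    """Geometric beam radius at a given depth below the fiber tip.

    Ignores scattering and absorption, so it only describes the cone geometry.
    """
    theta = math.asin(na / n)
    return core_um / 2 + depth_um * math.tan(theta)


# Fiber specifications quoted in the paragraph above.
FIBERS = {
    "this_study": {"core_um": 200, "na": 0.37},
    "cui_et_al": {"core_um": 250, "na": 0.66},
}

DEPTH_UM = 500  # hypothetical depth below the fiber tip

for name, spec in FIBERS.items():
    angle = half_angle_deg(spec["na"])
    radius = beam_radius_um(spec["core_um"], spec["na"], DEPTH_UM)
    print(f"{name}: half-angle ~{angle:.1f} deg, "
          f"geometric radius at {DEPTH_UM} um ~{radius:.0f} um")
```

      With these assumptions the half-angle is roughly 16° for the 0.37 NA fiber and roughly 29° for the 0.66 NA fiber, so the geometric cone of the latter is considerably wider at any given depth.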

      Reviewer #3 (Public Review): 

      In the manuscript by Klug and colleagues, the investigators use a rabies virus-based methodology to explore potential differences in connectivity from cortical inputs to the dorsal striatum. They report that the connectivity from cortical inputs onto D1 and D2 MSNs differs in terms of their projections onto the opposing cell type, and use these data to infer that there are differences in cross-talk between cortical cells that project to D1 vs. D2 MSNs. Overall, this manuscript adds to the overall body of work indicating that there are differential functions of different striatal pathways which likely arise at least in part by differences in connectivity that have been difficult to resolve due to difficulty in isolating pathways within striatal connectivity and several interesting and provocative observations were reported. Several different methodologies are used, with partially convergent results, to support their main points.

      However, I have significant technical concerns about the manuscript as presented that make it difficult for me to interpret the results of the experiments. My comments are below.

      Major:

      There is generally a large caveat to the rabies studies performed here, which is that both TVA and the ChR2-expressing rabies virus have the same fluorophore. It is thus essentially impossible to determine how many starter cells there are, what the efficiency of tracing is, and which part of the striatum is being sampled in any given experiment. This is a major caveat given the spatial topography of the cortico-striatal projections. Furthermore, the authors make a point in the introduction about previous studies not having explored absolute numbers of inputs, yet this is not at all controlled in this study. It could be that their rabies virus simply replicates better in D1-MSNs than D2-MSNs. No quantifications are done, and these possibilities do not appear to have been considered. Without a greater standardization of the rabies experiments across conditions, it is difficult to interpret the results.

      We thank the reviewer for raising these questions, which merit further discussion.

      Firstly, the primary aim of our study is to investigate the connectivity of the corticostriatal pathway. Given the current technical limitations, it is not feasible to trace all the striatal SPNs connected to a single cortical neuron. Therefore, we approached this from the opposite direction, starting from D1- or D2-SPNs to retrogradely label upstream cortical neurons, and then identifying their connected SPNs via functional synaptic recordings. To achieve this, we employed the only available transsynaptic retrograde method: rabies virus-mediated tracing. Because we crossed D1- or D2-GFP mice with D1- or A2A-Cre mice to identify SPN subtypes during electrophysiological recordings, the conventional rabies-GFP system could not be used to distinguish starter cells without conflicting with the GFP labeling of SPNs. To overcome this, we tagged ChR2 expression with mCherry. In this setup, we recorded from mCherry-negative D1- or D2-SPNs that were within the injection site and surrounded by mCherry-positive neurons. This ensures that the recorded neurons are topographically matched to the starter cell population and receive input from the same cortical regions. We acknowledge that TVA-only and ChR2-expressing cells are both mCherry-positive and therefore indistinguishable in our system. As such, mCherry-positive cells likely comprise a mixture of starter cells and TVA-only cells, representing a somewhat broader population than starter cells alone. Nevertheless, by restricting recordings to mCherry-negative SPNs within the injection site, we ensure that our conclusions about functional connectivity remain valid and aligned with the primary objective of this study.

      Secondly, if rabies virus replication were significantly more efficient in D1-SPNs than in D2-SPNs, this would likely result in a higher observed connection probability in the D1-projecting group. However, we used consistent genetic strategies across all groups: D1-SPNs were defined as GFP-positive in D1-GFP mice and GFP-negative in D2-GFP mice, with D2-SPNs defined analogously. Recordings from both D1- and D2-SPNs were performed using the same methodology and under the same injection conditions within the same animals. This internal control helps mitigate the possibility that differential rabies infection efficiency biased our results.

      With these experimental safeguards in place, we found that 40% of D2-SPNs received input from D1-SPN-projecting cortical neurons, while 73% of D1-SPNs received input from D2-SPN-projecting cortical neurons. Although the ideal scenario would involve an even larger sample size to refine these estimates, the technical demands of post-rabies-infection electrophysiological recordings inherently limit throughput. Nonetheless, our approach represents the most feasible and accurate method currently available, and provides a significant advance in characterizing the functional connectivity within corticostriatal circuits.
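
      To illustrate how sample size bears on estimates such as these connection ratios, the sketch below computes a 95% Wilson score interval for a proportion. The cell counts used here are hypothetical and chosen only to show how the interval narrows as more cells are recorded; they are not the numbers of cells recorded in this study.

```python
import math


def wilson_interval(connected, recorded, z=1.96):
    """Wilson score interval for a connection ratio (z=1.96 gives ~95%)."""
    if recorded <= 0:
        raise ValueError("recorded must be positive")
    p = connected / recorded
    denom = 1 + z**2 / recorded
    center = (p + z**2 / (2 * recorded)) / denom
    half = z * math.sqrt(p * (1 - p) / recorded
                         + z**2 / (4 * recorded**2)) / denom
    return center - half, center + half


# Hypothetical sample sizes, for illustration only (NOT the study's counts).
for recorded in (10, 20, 40, 80):
    connected = round(0.40 * recorded)  # a 40% connection ratio
    low, high = wilson_interval(connected, recorded)
    print(f"n={recorded:3d}: {connected}/{recorded} connected, "
          f"95% CI ~ [{low:.2f}, {high:.2f}]")
```

      The same function can be applied to any observed count of connected versus recorded cells; the interval width shrinks roughly with the square root of the number of recorded cells.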

      The authors claim using a few current clamp optical stimulation experiments that the cortical cells are healthy, but this result was far from comprehensive. For example, membrane resistance, capacitance, general excitability curves, etc are not reported. In Figure S2, some of the conditions look quite different (e.g., S2B, input D2-record D2, the method used yields quite different results that the authors write off as not different). Furthermore, these experiments do not consider the likely sickness and death that occurs in starter cells, as has been reported elsewhere. The health of cells in the circuit is overall a substantial concern that alone could invalidate a large portion, if not all, of the behavioral results. This is a major confound given those neurons are thought to play critical roles in the behaviors being studied. This is a major reason why first-generation rabies viruses have not been used in combination with behavior, but this significant caveat does not appear to have been considered, and controls e.g., uninfected animals, infected with AAV helpers, etc, were not included.

      We understand and appreciate the reviewer’s concern regarding the potential cytotoxicity of rabies virus infection. Indeed, this is a critical consideration when interpreting functional connectivity data. We have tested several newer rabies virus variants reported to support extended survival times (Chatterjee et al., 2018; Jin et al., 2024), but unfortunately, these variants did not perform reliably in the corticostriatal circuits we examined.

      Given these limitations, we relied on the rabies virus approach originally developed by Osakada et al. (2011), which demonstrated that neurons infected with rabies virus expressing ChR2 remain both viable and functional up to at least 10 days post-infection (Fig. 3, cited below). In our own experiments, we further validated the health and viability of cortical neurons, the presynaptic partners of SPNs, particularly around day 7 post-infection.

      To minimize the risk of viral toxicity, we performed ex vivo slice recordings within a conservative time window, between 4 and 8 days after infection, when the health of labeled neurons is well maintained. Moreover, the recorded SPNs were consistently mCherry-negative, indicating they were not directly infected by rabies virus, thus further reducing the likelihood of recording from compromised cells.

      Taken together, these steps help ensure that our synaptic recordings reflect genuine functional connectivity, rather than artifacts of viral toxicity. We hope this clarifies the rationale behind our experimental design.

      For the behavioral tests, including a naïve uninfected group and an AAV helper virus-only group as negative controls could help isolate the specific impact of rabies virus infection. However, our primary focus is on the optogenetic activation of selected presynaptic inputs to D1- or D2-SPNs. Therefore, comparing stimulated versus non-stimulated trials within the same animal offers more direct and relevant results for our study objectives.

      It is also important to note that the ICSS test is particularly susceptible to the potential cytotoxic effects of rabies virus, as it spans a relatively extended period, from Day 4 to Day 12 post-infection. To mitigate this issue, we focused our analysis on the first 7 days of ICSS testing, thereby keeping the behavioral observations within 10 days post-rabies injection. This approach minimizes potential confounds from rabies-induced neurotoxicity while still capturing the relevant behavioral dynamics. Accordingly, we have revised Figure 3 and updated the statistical analyses to reflect this adjustment.

      The overall purity (e.g., EnvA-pseudotyping efficiency) of the RABV prep is not shown. If there was a virus that was not well EnvA-pseudotyped and thus could directly infect cortical (or other) inputs, it would degrade specificity.

      We agree that anatomical specificity is crucial for accurately labeling inputs to defined SPN populations in our study. The rabies virus strain employed here has been rigorously validated for its specificity in numerous previous studies from our group and others (Aoki et al., 2019; Klug et al., 2018; Osakada et al., 2011; Smith et al., 2016; Wall et al., 2013; Wickersham et al., 2007). For example, in a recent study (Aoki et al., 2019), we tested the same rabies virus strain by co-injecting the glycoprotein-deleted rabies virus and the TVA-expressing helper virus, without the glycoprotein-expressing AAV, into the SNr. As shown in Figure S1 of that study (related to Figure 2), GFP expression was restricted to starter cells within the SNr, with no evidence of transsynaptic labeling in upstream regions such as the striatum, EPN, GPe, or STN (see panels F–H). These findings provide strong evidence that the rabies virus used in our experiments is properly pseudotyped and exhibits high specificity for starter cell labeling without off-target spread.

      We appreciate the reviewer’s emphasis on specificity, and we hope this clarification further supports the reliability of our anatomical tracing approach.

      While most of the study focuses on the cortical inputs, in slice recordings, inputs from the thalamus are not considered, yet likely contribute to the observed results. Related to this, in in vivo optogenetic experiments, technically, if the thalamic or other inputs to the dorsal striatum project to the cortex, their method will not only target cortical neurons but also terminals of other excitatory inputs. If this cannot be ruled out, stating that the authors are able to selectively activate the cortical inputs to one or the other population should be toned down.

      We agree with the reviewer that the thalamus is also a significant source of excitatory input to the striatum. However, current techniques do not allow for precise and exclusive labeling of upstream neurons in a given brain region, such as the cortex or thalamus. This technical limitation indeed makes it difficult to definitively determine whether inputs from these regions follow the same projection rules. Despite this, our findings show that stimulation of defined cortical populations, specifically, D1- or D2-projecting neurons in MCC and M1, elicits behavioral outcomes that closely mirror those observed in our ex vivo slice recordings, providing strong support for the cortical origin of the effects we observed.

      In our in vivo optogenetic experiments, we acknowledge that stimulating a specific cortical region may also activate axonal terminals from rabies-infected cortical or thalamic neurons. While somatic stimulation is generally more effective than terminal stimulation, we recognize the possibility that terminals on non-rabies-traced cortical neurons could be activated through presynaptic connections. To address this, we considered the findings of a previous study (Cruikshank et al., 2010), which demonstrated that while brief optogenetic stimulation (0.05 ms) of thalamo-cortical terminals can elicit a few action potentials in postsynaptic cortical neurons, sustained terminal stimulation (500 ms) also results in only transient postsynaptic firing rather than prolonged activation (Fig. 3C, cited below). This suggests that cortical neurons exhibit only short-lived responses to continuous presynaptic stimulation of thalamic origin.

      In comparison, our behavioral paradigms employed prolonged optogenetic stimulation protocols (20 Hz, 10 ms pulses for 15 s in the open-field test, 1 s in ICSS, and 8 s in FR4/8), which more closely resemble sustained stimulation conditions. Given these parameters and the robust behavioral responses observed, we infer that the effects are primarily mediated by activation of rabies-labeled, ChR2-expressing D1- or D2-projecting cortical neurons rather than by indirect activation through thalamic input.
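
      For reference, the arithmetic implied by these stimulation parameters can be worked through as in the sketch below, which derives the number of pulses, the duty cycle, and the cumulative light-on time for each assay. The class and field names are illustrative; only the frequency, pulse width, and train durations are taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PulseTrain:
    """A fixed-frequency optogenetic pulse train."""
    freq_hz: float
    pulse_ms: float
    duration_s: float

    @property
    def n_pulses(self) -> int:
        """Number of pulses delivered over the train."""
        return round(self.freq_hz * self.duration_s)

    @property
    def duty_cycle(self) -> float:
        """Fraction of each stimulation period during which light is on."""
        return self.freq_hz * self.pulse_ms / 1000.0

    @property
    def light_on_s(self) -> float:
        """Cumulative light-on time over the whole train, in seconds."""
        return self.n_pulses * self.pulse_ms / 1000.0


# Train durations per assay, as stated in the text (20 Hz, 10 ms pulses).
PROTOCOLS = {
    "open_field": PulseTrain(20, 10, 15),
    "icss": PulseTrain(20, 10, 1),
    "fr4_8": PulseTrain(20, 10, 8),
}

for assay, train in PROTOCOLS.items():
    print(f"{assay}: {train.n_pulses} pulses, duty cycle {train.duty_cycle:.0%}, "
          f"light on for {train.light_on_s:.1f} s of {train.duration_s} s")
```

      At 20 Hz with 10 ms pulses the light is on for 20% of each train, which corresponds to 300, 20, and 160 pulses for the 15 s, 1 s, and 8 s trains respectively.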

      We appreciate the reviewer’s valuable comment, and we have now incorporated this point into the revised manuscript (page 13, lines 265–275) to more clearly address the potential contribution of thalamic inputs in our experimental design.

      The statements about specificity of connectivity are not well-founded. It may be that in the specific case where they are assessing outside of the area of injections, their conclusions may hold (e.g., excitatory inputs onto D2s have more inputs onto D1s than vice versa). However, how this relates to the actual site of injection is not clear. At face value, if such a connectivity exists, it would suggest that D1-MSNs receive substantially more overall excitatory inputs than D2s. It is thus possible that this observation would not hold over other spatial intervals. This was not explored and thus the conclusions are over-generalized. e.g., the distance from the area of red cells in the striatum to recordings was not quantified, what constituted a high level of cortical labeling was not quantified, etc. Without more rigorous quantification of what was being done, it is difficult to interpret the results. 

      We sincerely thank the reviewer for the thoughtful comments and critical insights into our interpretation of connectivity data. These concerns are valid and provide an important opportunity to clarify and reinforce our experimental design and conclusions.

      Firstly, as described in our previous response, all patched neurons were carefully selected to be within the injection site and in close proximity to ChR2-mCherry-positive cells. Specifically, the estimated distance from each recorded neuron to the nearest starter cells did not exceed 100 µm. This design choice was made to minimize variability associated with spatial distance or heterogeneity in viral expression, thereby allowing for a more consistent sampling of putatively connected neurons.

      Secondly, quantifying both the number of starter and input neurons would, in principle, provide a more comprehensive picture of connectivity. However, given the technical limitations of the current approach, particularly when combining rabies tracing with functional recordings, it is not feasible to obtain such precise cell counts. Instead, we focused on connection ratios derived from targeted electrophysiological recordings, which offer a reliable and practical means of estimating connectivity within these defined circuits.

      Thirdly, regarding the potential influence of rabies-labeled neurons beyond the immediate recording site: while we acknowledge that rabies tracing labels a broad set of upstream neurons, our analysis was confined to a well-defined and localized area. The analogy we find helpful here is that of a spotlight: our recordings were restricted to the illuminated region directly under the beam, where the projection pattern is fixed and interpretable, regardless of what lies outside that area. Although we cannot fully account for all possible upstream connections, our methodology was designed to minimize variability and maintain consistency in the region of interest, which we believe supports the robustness of our conclusions in the ex vivo slice recording experiment.

      We hope this additional explanation addresses the reviewer’s concerns and helps clarify the rationale of our experimental strategy.

      The results in figure 3 are not well controlled. The authors show contrasting effects of optogenetic stimulation of D1-MSNs and D2-MSNs in the DMS and DLS, results which are largely consistent with the canon of basal ganglia function. However, when stimulating cortical inputs, stimulating the inputs from D1-MSNs gives the expected results (increased locomotion) while stimulating putative inputs to D2-MSNs had no effect. This is not the same as showing a decrease in locomotion - showing no effect here is not possible to interpret.

      We apologize for any confusion and appreciate the opportunity to clarify this point. Our electrophysiological recordings demonstrated that D1-projecting cortical neurons preferentially innervate D1-SPNs in the striatum, whereas D2-projecting cortical neurons provide input to both D1- and D2-SPNs, without a clear preference. These synaptic connectivity patterns are further supported by our behavioral experiments: optogenetic stimulation of D1-projecting neurons in cortical areas such as MCC and M1 led to behavioral effects consistent with direct D1-SPN activation. In contrast, stimulation of D2-projecting cortical neurons produced behavioral outcomes that appeared to reflect a mixture of both D1- and D2-SPN activation.

      We acknowledge that interpreting negative behavioral findings poses inherent challenges, as it is difficult to distinguish between a true lack of effect and insufficient experimental manipulation. To mitigate this, we ensured that all animals included in the analysis exhibited appropriate viral expression and correctly placed optic fibers in the targeted regions. These controls help to confirm that the observed behavioral effects, or lack thereof, are indeed due to the activation of the intended neuronal populations rather than technical artifacts such as weak expression or fiber misplacement.

      As shown in Author response image 1 below, our verification of virus expression and fiber positioning confirms effective targeting in MCC and M1 of A2A-Cre mice. Therefore, we interpret the negative behavioral outcomes as meaningful consequences of specific neural circuit activation.

      Author response image 1.

      Confocal image from A2A-Cre mouse showing targeted optogenetic stimulation of D2-projecting cortical neurons in MCC or M1. ChR2-mCherry expression highlights D2-projecting neurons, selectively labeled via rabies-mediated tracing. Optic fiber placement is confirmed above the cortical region of interest. Image illustrates robust expression and anatomical specificity necessary for pathway-selective stimulation in behavioral assays.

      In light of their circuit model, the result showing that inputs to D2-MSNs drive ICSS is confusing. How can the authors account for the fact that these cells are not locomotor-activating, stimulation of their putative downstream cells (D2-MSNs) does not drive ICSS, yet the cortical inputs drive ICSS? Is the idea that these inputs somehow also drive D1s? If this is the case, how do D2s get activated, if all of the cortical inputs tested net activate D1s and not D2s? Same with the results in figure 4 - the inputs and putative downstream cells do not have the same effects. Given the potential caveats of differences in viral efficiency, spatial location of injections, and cellular toxicity, I cannot interpret these experiments.

      We apologize for any confusion in our previous explanation. In our behavioral experiments, the primary objective was to determine whether activation of D1- or D2-projecting cortical neurons would produce behavioral outcomes distinct from those observed with pure D1 or D2 activation.

      Our findings show that stimulation of D1-projecting cortical neurons produced behavioral effects closely resembling those of selective D1 activation in both open field and ICSS tests. This is consistent with our slice recording data, which revealed that D1-projecting cortical neurons exhibit a higher connection probability with D1-SPNs than with D2-SPNs.

      In contrast, interpreting the effects of D2-projecting cortical neuron stimulation is inherently more nuanced. In the open field test, activation of these neurons did not significantly modulate local motion. This could reflect a balanced influence of D1 activation, which facilitates movement, and D2 activation, which suppresses it, resulting in a net neutral behavioral outcome. In the ICSS test, the absence of a strong reinforcement effect typically associated with D2 activation, combined with partial reinforcement likely due to concurrent D1 activation, suggests that stimulation of D2-projecting neurons produces a mixed behavioral signal. This outcome supports the interpretation that these neurons synapse onto both D1- and D2-SPNs, leading to a blended behavioral response that differs from selective D1 or D2 activation alone.

      Together, these two behavioral assays offer complementary perspectives, providing a more complete view of how projection-specific cortical inputs influence striatal output and behavior.

      In Figure 4 of the current manuscript (as cited below), we show that optogenetic activation of MCC neurons projecting to D1-SPNs facilitates sequence lever pressing, whereas activation of MCC neurons projecting to D2-SPNs does not induce significant behavioral changes. Conversely, activation of M1 neurons projecting to either D1- or D2-SPNs enhances lever pressing sequences. These observations align with our prior findings (Geddes et al., 2018; Jin et al., 2014), where we demonstrated that in the striatum, D1-SPN activation facilitates ongoing lever pressing, whereas D2-SPN activation is more involved in suppressing ongoing actions and promoting transitions between sub-sequences, shown in Fig. 4 from (Geddes et al., 2018; Jin et al., 2014) and Fig. 5K from (Jin et al., 2014) . Taken together, the facilitation of lever pressing by D1-projecting MCC and M1 neurons is consistent with their preferential connectivity to D1-SPNs and their established behavioral role.

      What is particularly intriguing, though admittedly more complex, is the behavioral divergence observed upon activation of D2-SPN-projecting cortical neurons. Activation of D2-projecting MCC neurons does not alter lever pressing, possibly reflecting a counterbalancing effect from concurrent D1- and D2-SPN activation. In contrast, stimulation of D2-projecting M1 neurons facilitates lever pressing, albeit less robustly than their D1-projecting counterparts. This discrepancy may reflect regional differences in striatal targets, DMS for MCC versus DLS for M1, as also supported by our open field test results. Furthermore, our recent findings (Zhang et al., 2025) show that synaptic strength from Cg to D2-SPNs is stronger than to D1-SPNs, whereas the M1 pathway exhibits the opposite pattern. These data suggest that beyond projection ratios, synaptic strength also shapes cortico-striatal functional output. Thus, stronger D2-SPN synapses in the DMS may offset D1-SPN activation during MCC-D2 stimulation, dampening the increase in lever pressing. Conversely, weaker D2 synapses in the DLS may permit M1-D2 projections to facilitate behavior more readily.

      In summary, the behavioral outcomes of our optogenetic manipulations support the proposed asymmetric cortico-striatal connectivity model. While the effects of D2-projecting neurons are not uniform, they reflect varying balances of D1 and D2-SPN influence, which further underscores the asymmetrical connections of cortical inputs to the striatum.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors): 

      (1) What are the sample sizes for Fig S2? Some trends that are listed as nonsignificant look like they may just be underpowered. Related to this point, S2C indicates that PPR is statistically similar in all conditions. The traces shown in Figure 2 suggest that PPR is quite different in "Input D1"- vs "Input D2" projections. If there is indeed no difference, the exemplar traces should be replaced with more representative ones to avoid confusion. 

      Thank you for your suggestion. The sample size reported in Figure S2 corresponds to the neurons identified as connected in Figure 2. The representative traces shown in Figure 2 were selected based on their close alignment with the amplitude statistics and are intended to reflect typical responses. Given this, it is appropriate to retain the current examples as they accurately illustrate the underlying data.

      (2) Previous studies have described that SPN-SPN collateral inhibition is also asymmetric, with D2->D1 SPN connectivity stronger than the other direction. While cortical inputs to D2-SPNs may also strongly innervate D1-SPNs, it would be helpful to speculate on how collateral inhibition may further shape the biases (or lack thereof) reported here. 

      This would indeed be an interesting topic to explore. SPN-SPN mutual inhibition and/or interneuron inhibition may also play a role in the functional organization and output of the striatum. In the present study, we focused on the primary layer of cortico-striatal connectivity to examine how cortical neurons selectively connect to the striatal direct and indirect pathways, as these pathways have been shown to have distinct yet cooperative functions. To achieve this, we applied a GABAA receptor inhibitor to isolate only excitatory synaptic currents in SPNs, yielding the relevant results.

      To investigate additional circuit organization involving SPN-SPN mutual inhibition, the currently available technique would be single-cell-initiated rabies tracing. This approach would help identify the starter SPN and the upstream SPNs that provide input to the starter cell, thereby offering a clearer understanding of the local circuit.

      (3) In Fig 3N-S there are no stats confirming that optogenetic stimulation does indeed increase lever pressing in each group (though it obviously looks like it does). It would be helpful to add statistics for this comparison, in addition to the between-group comparisons that are shown. 

      We thank the reviewer for this thoughtful suggestion. To assess whether optogenetic stimulation increases lever pressing in each group shown in Figures 3O, 3P, 3R, and 3S, we employed a permutation test (10,000 permutations). This non-parametric statistical method does not rely on assumptions about the underlying data distribution and is particularly appropriate for our analysis given the relatively small sample sizes.
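
      As an illustration of the kind of test described above, the following is a minimal sketch of a one-sided permutation test on a regression slope across days. The exact procedure used for the figure legend is not specified here, so the shuffling scheme, the one-sided comparison, the function names, and the example data are all assumptions made for illustration only.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def permutation_slope_test(days, presses, n_perm=10_000, seed=0):
    """One-sided permutation test: is the slope larger than expected by chance?

    Assumption: values are shuffled across days to build the null
    distribution; the source does not state which scheme was used.
    """
    rng = random.Random(seed)
    observed = ols_slope(days, presses)
    shuffled = list(presses)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if ols_slope(days, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical group-mean lever presses on Days 1-7 (not data from the study).
days = [1, 2, 3, 4, 5, 6, 7]
presses = [2.0, 3.5, 3.0, 6.0, 7.5, 9.0, 11.0]
slope, p_value = permutation_slope_test(days, presses)
print(f"slope = {slope:.4f}, permutation P = {p_value:.4f}")
```

      Because the null distribution is built from the data themselves, no distributional assumption is needed, which is the property that makes the approach attractive for small groups.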

      Additionally, in response to Reviewer 3’s concern regarding the potential cytotoxicity of rabies virus affecting behavioral outcomes during in vivo optogenetic stimulation experiments, we focused our analysis on Days 1 through 7 of the ICSS test. This time window remains within 10 days post-rabies infection, a period during which previous studies have reported minimal cytopathic effects (Osakada et al., 2011).

      Accordingly, we have updated Figure 3N-S and revised the associated statistical analyses in the figure legend as follows:

      (O-P) D1-SPN (red) but not D2-SPN stimulation (black) drives ICSS behavior in both the DMS (O: D1, n = 6, permutation test, slope = 1.5060, P = 0.0378; D2, n = 5, permutation test, slope = -0.2214, P = 0.1021; one-tailed Mann Whitney test, Day 7 D1 vs. D2, P = 0.0130) and the DLS (P: D1, n = 6, permutation test, slope = 28.1429, P = 0.0082; D2, n = 5, permutation test, slope = -0.3429, P = 0.0463; one-tailed Mann Whitney test, Day 7 D1 vs. D2, P = 0.0390). *, P < 0.05. (Q) Timeline of helper virus injections, rabies-ChR2 injections and optogenetic stimulation for ICSS behavior. (R-S) Optogenetic stimulation of the cortical neurons projecting to either D1- or D2-SPNs induces ICSS behavior in both the MCC (R: MCC-D1, n = 5, permutation test, Day1-Day7, slope = 2.5857, P = 0.0034; MCC-D2, n = 5, Day2-Day7, permutation test, slope = 1.4229, P = 0.0344; no significant effect on Day7, MCC-D1 vs. MCC-D2,  two-tailed Mann Whitney test, P = 0.9999) and the M1 (S: M1-D1, n = 5, permutation test, Day1-Day7, slope = 1.8214, P = 0.0259; M1-D2, n = 5, Day1-Day7, permutation test, slope = 1.8214, P = 0.0025; no significant effect on Day7, M1-D1 vs. M1-D2, two-tailed Mann Whitney test, P = 0.3810). n.s., not statistically significant.

      We believe this updated analysis and additional context further strengthen the validity of our conclusions regarding the reinforcement effects.

      (4) Line 206: mice were trained for "a few more days" is not a very rigorous description. It would be helpful to state the range of additional days of training. 

      We thank the reviewer for the suggestion. In accordance with the Methods section, we have now specified the number of days, which is 4 days, in the main text (line 207).

      (5) In Fig 4D,H, the statistical comparison is relative modulation (% change) by stimulation of D1- vs D2- projecting inputs. Please show statistics comparing the effect of stimulation on lever presses for each individual condition. For example, is the effect of MCC-D2 stimulation in panel D negative or not significant? 

      Thank you for your suggestion. Below are the statistical results, which we have also incorporated into the figure legend for clarity. To assess the net effects of each manipulation, we compared the observed percentage changes with a theoretical value of zero.

      In Figure 4D, optogenetic stimulation of D1-projecting MCC neurons significantly increased the pressing rate (MCC-D1, n = 8, one-sample two-tailed t-test, t = 2.814, P = 0.0131), whereas stimulation of D2-projecting MCC neurons did not produce a significant effect (MCC-D2, n = 7, one-sample two-tailed t-test, t = 0.8481, P = 0.4117).

      In contrast, Figure 4H shows that optogenetic stimulation of both D1- and D2-projecting M1 neurons significantly increased the sequence press rate (M1-D1, n = 6, one-sample two-tailed Wilcoxon signed-rank test, P = 0.0046; M1-D2, n = 7, one-sample two-tailed Wilcoxon signed-rank test, P = 0.0479).

      These analyses help clarify the distinct behavioral effects of manipulating different corticostriatal projections.
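
      The comparison against a theoretical value of zero can be sketched as follows. This is a minimal illustration that computes only the one-sample t statistic and its degrees of freedom using the standard library; the function name and the example values are hypothetical, and obtaining a P value would additionally require a t-distribution routine such as `scipy.stats.ttest_1samp`.

```python
import math
import statistics


def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    t_stat = (mean - mu0) / (sd / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical percentage changes in press rate (not data from the study).
pct_change = [12.0, 5.5, 20.1, -3.2, 9.8, 15.4, 7.7, 11.3]
t_stat, dof = one_sample_t(pct_change)
print(f"t = {t_stat:.3f}, df = {dof}")
```

      For groups where normality is doubtful, a rank-based alternative such as the Wilcoxon signed-rank test plays the same role, as in the M1 comparisons above.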

      (6) Are data in Fig 1G-H from a D1- or A2a- cre mouse? 

      The data in Fig 1G-H are from a D1-Cre mouse.

      (7) In Fig S3 it looks like there may actually be an effect of 20 Hz stimulation of D2-SPNs, though it probably doesn't affect the interpretation. 

      As indicated by the statistics, there is a slight, but not statistically significant, decrease in local motion when 20 Hz stimulation is delivered to the motor cortex with ChR2 expression in D2-SPNs in the striatum.

      Reviewer #2 (Recommendations For The Authors): 

      The rabies tracing is referred to on several occasions as "new" but the reference papers are from 2011, 2013, and 2018. It is unclear what is new about the system used in the paper and what new feature is relevant to the experiments that were performed. Either clarify or remove "new" terminology. 

      Thank you for bringing this to our attention. We have revised the relevant text accordingly at line 20 in the Abstract, line 31 in the In Brief, line 69 in the Introduction, line 83 in the Results, and line 226 in the Discussion to improve clarity and accuracy.

      In Figure 2 D and G, D1 eGFP (+) and D2 eGFP(-) are plotted separately. These are the same cell type; therefore it may work best to combine that data. This could also be done for 'input to D2- Record D2' in panel D as well as 'input D1-Record D2' and 'input D2-Record D1' in panel G. Combining the information in panel D and G and comparing all 4 conditions to each other would give a better understanding of the comparison of functional connectivity between cortical neurons and D1 and D2 SPNs. 

      We thank the reviewer for the thoughtful suggestion. While presenting single bars for each condition (e.g., ‘input D1 - record D1’) might improve visual simplicity, it would obscure an important aspect of our experimental design. Specifically, we aimed to highlight that the comparisons between D1- and D2-projecting neurons to D1 and D2 SPNs were counterbalanced within the same animals - not just across different groups. By showing both D1-eGFP(+) and D2-eGFP(-), or vice versa, within each group and at similar proportions, we provide a more complete picture of the internal control built into our design. This format helps assure the audience that our conclusions are not biased by group-level differences, but are supported by within-subject comparisons. Therefore, we believe that the current presentation better communicates the rigor and balance of our experimental approach.

      The findings in Figure 2 are stated as D1 projecting excitatory inputs have a higher probability of targeting D1 SPNs while D2 projecting excitatory inputs target both D1 SPNs and D2 SPNs. It may be more clear to say that some cortical neurons project specifically to D1 SPNs while other cortical neurons project to both D1 and D2 SPNs equally. A better summary diagram could also help with clarity. 

      Thank you for bringing this up. The data we present reflect the connection probabilities of D1- or D2-projecting cortical neurons to D1 or D2 SPNs. One possible interpretation, as the reviewer suggests, is that a subset of cortical neurons preferentially targets D1 SPNs, while others exhibit more balanced projections to both D1 and D2 SPNs. However, we cannot rule out alternative explanations - for example, that some D2-projecting neurons preferentially target D2 SPNs, or that the observed differences arise from the overall proportions of D1- and D2-projecting cortical neurons connecting to each striatal subtype.

      There are multiple possible patterns of connectivity that could give rise to the observed differences in connection ratios. Based on our current data, we can confidently conclude the existence of asymmetric cortico-striatal projections to the direct and indirect pathways, but the precise nature of this asymmetry will require further investigation.

      Figure 4 introduces the FR8 task, but there are similar takeaways to the findings from Figure 3. Is there another justification for the FR8 task or interesting way of interpreting that data that could add richness to the manuscript?

      The FR8 task is a self-initiated operant sequence task that relies on motor learning mechanisms, whereas the open field test solely assesses spontaneous locomotion. Furthermore, the sequence task enables us to dissect the functional role of specific neuronal populations in the initiation, maintenance, and termination of sequential movements through closed-loop optogenetic manipulations integrated into the task design. These methodological advantages underscore the rationale for including Figure 4 in the manuscript, as it highlights the unique insights afforded by this experimental paradigm.

      I am somewhat surprised to see that D1-SPN stimulation in DLS gave the results in Figure 3 F and P, as mentioned in the public review. These contrast with some previous results (Cui et al, J Neurosci, 2021). Any explanation? Would be useful to speculate or compare parameters as this could have important implications for DLS function.

      Thank you for raising this point. While Cui’s study has generated some debate, several independent investigations have consistently demonstrated that stimulation of D1-SPNs in the dorsolateral striatum (DLS) facilitates local motion and lever-press behaviors (Dong et al., 2025; Geddes et al., 2018; Kravitz et al., 2010). These findings support the functional role of D1-SPNs in promoting movement and motivated actions.

      The differences in behavioral outcomes observed between our study and that of Cui et al. may stem from several methodological factors, particularly related to anatomical targeting and optical stimulation parameters.

      Specifically, our experiments targeted the DMS at AP +0.5 mm, ML ±1.5 mm, DV –2.2 mm, and the DLS at AP +0.5 mm, ML ±2.5 mm, DV –2.2 mm. In contrast, Cui’s study targeted the DMS at AP +0.9 mm, ML ±1.4 mm, DV –3.0 mm, and the DLS at AP +0.7 mm, ML ±2.3 mm, DV –3.0 mm. These differences indicate that their targeting was slightly more rostral and more ventral than ours, which could have led to stimulation of distinct neuronal populations within the striatum, potentially accounting for variations in behavioral effects observed during optogenetic activation.

      In addition, the optical fibers used in the two studies differed markedly. We employed optical fibers with a 200 µm core diameter and a numerical aperture (NA) of 0.37. Cui’s study used fibers with a larger core diameter (250 µm) and a higher NA (0.66), which would produce a broader spread and deeper penetration of light. This increased photostimulation volume may have recruited a more extensive network of neurons, possibly including off-target circuits, thus influencing the behavioral outcomes in a manner not seen in our more spatially constrained stimulation paradigm.
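
      To give a rough sense of how core diameter and NA affect the geometric spread of light, the following sketch compares the two fiber configurations using a simple divergence-cone model. It rests on several assumptions that are not part of either study: a tissue refractive index of 1.36, purely geometric divergence with no scattering or absorption, and an arbitrary depth of 500 µm below the fiber tip. The function names are hypothetical.

```python
import math

N_TISSUE = 1.36  # assumed refractive index of brain tissue


def half_angle_deg(na, n_medium=N_TISSUE):
    """Divergence half-angle (degrees) of the light cone in the medium."""
    return math.degrees(math.asin(na / n_medium))


def beam_radius_um(core_diameter_um, na, depth_um, n_medium=N_TISSUE):
    """Geometric beam radius at a given depth below the fiber tip.

    Ignores scattering and absorption, so it describes only the
    purely geometric cone, not the true light distribution in tissue.
    """
    theta = math.asin(na / n_medium)
    return core_diameter_um / 2 + depth_um * math.tan(theta)


depth = 500.0  # arbitrary illustrative depth in micrometres
for label, core, na in [("200 um, NA 0.37", 200.0, 0.37),
                        ("250 um, NA 0.66", 250.0, 0.66)]:
    angle = half_angle_deg(na)
    radius = beam_radius_um(core, na, depth)
    print(f"{label}: half-angle {angle:.1f} deg, radius at {depth:.0f} um = {radius:.0f} um")
```

      Under these assumptions the higher-NA, larger-core fiber illuminates a wider cross-section at any given depth, which is the qualitative point made above; a quantitative comparison of irradiance would require a scattering model.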

      Taken together, these methodological differences, both in anatomical targeting and optical stimulation parameters, likely contribute to the discrepancies in behavioral results observed between the two studies. Our findings, consistent with other independent reports, support the role of D1-SPNs in facilitating movement and reinforcement behaviors under more controlled and localized stimulation conditions.

      Reviewer #3 (Recommendations For The Authors): 

      Minor: 

      The authors repeatedly state that they are using a new rabies virus system, but the system has been in widespread use for 16 years, including in the exact circuits the authors are studying, for over a decade. I would not consider this new. 

      Thank you for bringing this to our attention. We have revised the relevant text accordingly at line 20 in the Abstract, line 31 in the In Brief, line 69 in the Introduction, line 83 in the Results, and line 226 in the Discussion to improve clarity and accuracy.

      Figure 2G, how many mice were used for recordings?

      In Fig. 2G, we used 8 mice in the D1-projecting to D2 EGFP(+) group, 7 mice in the D1-projecting to D1 EGFP(-) group, 8 mice in the D2-projecting to D1 EGFP(+) group, and 10 mice in the D2-projecting to D2 EGFP(-) group.

      The amplitude of inputs was not reported in figure 2. This is important, as the strength of the connection matters. This is reported in Figure S2, but how exactly this relates to the presence or absence of connections should be made clearer.

      The amplitude data presented in Figure S2 summarize all recorded currents from confirmed connections, as detailed in the Methods section. A connection is defined by the presence of a detectable and reliable postsynaptic current with an onset latency of less than 10 ms following laser stimulation.

      References cited in the response to reviewers:

      Aoki, S., Smith, J.B., Li, H., Yen, X.Y., Igarashi, M., Coulon, P., Wickens, J.R., Ruigrok, T.J.H., and Jin, X. (2019). An open cortico-basal ganglia loop allows limbic control over motor output via the nigrothalamic pathway. Elife 8, e49995.

      Chatterjee, S., Sullivan, H.A., MacLennan, B.J., Xu, R., Hou, Y.Y., Lavin, T.K., Lea, N.E., Michalski, J.E., Babcock, K.R., Dietrich, S., et al. (2018). Nontoxic, double-deletion-mutant rabies viral vectors for retrograde targeting of projection neurons. Nat Neurosci 21, 638-646.

      Cruikshank, S.J., Urabe, H., Nurmikko, A.V., and Connors, B.W. (2010). Pathway-Specific Feedforward Circuits between Thalamus and Neocortex Revealed by Selective Optical Stimulation of Axons. Neuron 65, 230-245.

      Dong, J., Wang, L.P., Sullivan, B.T., Sun, L.X., Smith, V.M.M., Chang, L.S., Ding, J.H., Le, W.D., Gerfen, C.R., and Cai, H.B. (2025). Molecularly distinct striatonigral neuron subtypes differentially regulate locomotion. Nat Commun 16, 2710.

      Geddes, C.E., Li, H., and Jin, X. (2018). Optogenetic Editing Reveals the Hierarchical Organization of Learned Action Sequences. Cell 174, 32-43.

      Jin, L., Sullivan, H.A., Zhu, M., Lavin, T.K., Matsuyama, M., Fu, X., Lea, N.E., Xu, R., Hou, Y.Y., Rutigliani, L., et al. (2024). Long-term labeling and imaging of synaptically connected neuronal networks in vivo using double-deletion-mutant rabies viruses. Nat Neurosci 27, 373-383.

      Jin, X., Tecuapetla, F., and Costa, R.M. (2014). Basal ganglia subcircuits distinctively encode the parsing and concatenation of action sequences. Nat Neurosci 17, 423-430.

      Klug, J.R., Engelhardt, M.D., Cadman, C.N., Li, H., Smith, J.B., Ayala, S., Williams, E.W., Hoffman, H., and Jin, X. (2018). Differential inputs to striatal cholinergic and parvalbumin interneurons imply functional distinctions. Elife 7, e35657.

      Kravitz, A.V., Freeze, B.S., Parker, P.R.L., Kay, K., Thwin, M.T., Deisseroth, K., and Kreitzer, A.C. (2010). Regulation of parkinsonian motor behaviours by optogenetic control of basal ganglia circuitry. Nature 466, 622-626.

      Osakada, F., Mori, T., Cetin, A.H., Marshel, J.H., Virgen, B., and Callaway, E.M. (2011). New Rabies Virus Variants for Monitoring and Manipulating Activity and Gene Expression in Defined Neural Circuits. Neuron 71, 617-631.

      Smith, J.B., Klug, J.R., Ross, D.L., Howard, C.D., Hollon, N.G., Ko, V.I., Hoffman, H., Callaway, E.M., Gerfen, C.R., and Jin, X. (2016). Genetic-Based Dissection Unveils the Inputs and Outputs of Striatal Patch and Matrix Compartments. Neuron 91, 1069-1084.

      Wall, N.R., De La Parra, M., Callaway, E.M., and Kreitzer, A.C. (2013). Differential Innervation of Direct- and Indirect-Pathway Striatal Projection Neurons. Neuron 79, 347-360.

      Wickersham, I.R., Lyon, D.C., Barnard, R.J.O., Mori, T., Finke, S., Conzelmann, K.K., Young, J.A.T., and Callaway, E.M. (2007). Monosynaptic restriction of transsynaptic tracing from single, genetically targeted neurons. Neuron 53, 639-647.

      Zhang, B.B., Geddes, C.E., and Jin, X. (2025). Complementary corticostriatal circuits orchestrate action repetition and switching. Sci Adv, in press.

      Zhu, Z.G., Gong, R., Rodriguez, V., Quach, K.T., Chen, X.Y., and Sternson, S.M. (2025). Hedonic eating is controlled by dopamine neurons that oppose GLP-1R satiety. Science 387, eadt0773.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Response to eLife Assessment:

      We sincerely appreciate your recognition of the novelty and potential significance of our study, and we are grateful for your constructive and valuable comments.

      With regard to your concern that cast immobilization (CI) may itself act as a stressor—potentially influencing skeletal muscle, brown adipose tissue (BAT), and locomotor energy expenditure—we fully recognize this as a highly important issue. In our study, we sought to interpret the findings in light of oxygen consumption and activity data; however, it is inherently difficult to disentangle systemic stress responses and the increased energetic costs associated with CI. We have therefore revised the manuscript to explicitly acknowledge this point as a limitation, and to identify it as a subject for future investigation.

      We also greatly value your suggestion concerning the potential involvement of branched-chain amino acids (BCAAs) derived from adipose tissue in BAT thermogenesis. While our present work primarily focused on muscle-derived amino acids, previous studies have reported that impaired BCAA catabolism in white adipose tissue (WAT) is associated with elevated circulating BCAA levels and metabolic dysfunction [1]. Thus, the possibility that adipose tissue contributes to the BCAA pool used by BAT cannot be disregarded. We fully agree that directly addressing this possibility would be highly valuable, and in future work we plan to locally administer isotope-labeled BCAAs into skeletal muscle or adipose tissue and analyze their contribution to circulating BCAA levels and BAT utilization. Although such experiments could not be performed within the timeframe of this resubmission, we have explicitly stated this limitation in the revised manuscript.

      In summary, we have revised the text to acknowledge the limitations highlighted in your comments and to better clarify future research directions. We believe these revisions more accurately position our current study within the broader context. Once again, we are deeply grateful for your recognition of the originality of our work and for your constructive guidance in refining it.

      Response to Reviewers:

      We sincerely appreciate the reviewers’ thoughtful evaluations and constructive comments, and we are grateful for their recognition of the novelty and significance of our study.

      Response to Reviewer 1:

      We thank the reviewer for the detailed and thoughtful comments regarding the potential systemic effects of CI, including stress responses, energy balance, and tissue wasting. These factors are indeed critical when interpreting our findings, and we agree that CI is not merely a passive loss-of-function model but also introduces stress-related influences.

      The principal aim of our study was to investigate the “physiological compensatory mechanisms” that are triggered by loss of muscle function induced by CI. Although CI inevitably elicits systemic metabolic alterations—including stress-related responses—our study is, to our knowledge, the first to demonstrate that a compensatory thermogenic pathway, mediated by the supply of amino acids from skeletal muscle to BAT, is activated under such conditions. We regard this as the central novelty of our work, and it is consistent with the reviewer’s observation that CI results in a “gain of function.”

      Our intention is not to exclude stress as a contributing factor. Rather, we emphasize that under physiological stress conditions requiring BAT thermogenesis—such as reduced energy stores or decreased heat production from skeletal muscle—amino acid supply from muscle to BAT is induced. Importantly, this mechanism is not unique to CI, as we have confirmed similar metabolic crosstalk under acute cold exposure.

      At the same time, we acknowledge that our current data do not allow us to conclude that “stress is not a primary driver” of BAT thermogenesis induced by CI. Chronic stress induced by CI appeared to be limited in our study (Fig. 2_figure supplement 2), but we cannot fully exclude stress-related effects. Accordingly, we now describe the potential triggers of BAT thermogenesis in the manuscript as either decreased body temperature due to muscle functional loss or stress, explicitly noting in the Discussion that stress and reductions in energy reserves may both contribute, as the reviewer suggested. We also modified the original overstatement that “suppression of muscle thermogenesis induces hypothermia,” and now limit the description to the observed phenomenon that “CI-induced restriction of muscle activity leads to reduced cold tolerance,” while recognizing that multiple factors—including stress, substrate availability, and BAT functional capacity—may underlie this effect.

      We further appreciate the reviewer’s comment regarding the energetic burden imposed by CI. The cast weighed less than 2 g (5–10% of body weight), and thus increased locomotor costs cannot be excluded. However, locomotor activity during the dark phase was reduced by approximately 50%, making the net energetic effect difficult to quantify. In the manuscript, we now present oxygen consumption data and restrict our description to “an increase in oxygen consumption per body weight.” Moreover, as food intake remained almost unchanged compared with controls, the animals appear to have compensated for additional energetic demands, supporting the interpretation that the observed effects were not solely attributable to starvation.

      We also find the reviewer’s suggestion—that CI induces BAT overactivation but impairs its functional capacity—extremely important. Indeed, although CI increased thermogenic gene expression in BAT, body temperature maintenance was impaired. We interpret this reduction in thermoregulation as reflecting decreased heat production from skeletal muscle; however, as the reviewer noted, under prolonged CI, depletion of energy stores could further prevent BAT from fully exerting its thermogenic function.

      We have clarified in the revised Discussion that BAT activation under CI is transient and that long-term outcomes may be influenced by contributions from other thermogenic organs. We also recognize the impact of energy depletion as an important issue to be addressed in future studies, and we agree that detailed analyses of metabolic changes and BCAA dynamics following prolonged CI will be an important next step.

      Regarding the reviewer’s concern about potential anesthesia effects on acute cold exposure experiments, we confirmed that body temperature had returned to baseline one hour before testing, and that mice displayed spontaneous feeding and grooming behaviors, which suggested adequate recovery. Moreover, the differences observed compared with sham-anesthetized controls support our interpretation that the results reflect CI-specific effects. Nonetheless, we acknowledge this potential confounding factor as an additional limitation.

      Response to Reviewer 2:

      We thank the reviewer for the constructive comments and clear summary of our findings. We fully agree that the impact of immobilization on skeletal muscle and BAT function under cold exposure represents a key future direction. In the present study, we performed acute cold exposure following short-term immobilization and assessed UCP1 expression and metabolic changes in BAT. However, we acknowledge that we did not fully examine coordinated functional adaptations between skeletal muscle and BAT under cold stress. In particular, how skeletal muscle–derived amino acid supply and IL-6–dependent mechanisms operate during cold exposure remains unresolved. We have therefore noted this explicitly as a limitation and highlighted it as a focus for future work. Going forward, we plan to investigate muscle–BAT metabolic crosstalk and IL-6 signaling in detail under cold conditions to clarify whether the observed responses are specific to CI or represent more general physiological adaptations.

      (1) Herman MA, She P, Peroni OD, Lynch CJ, Kahn BB. Adipose tissue branched chain amino acid (BCAA) metabolism modulates circulating BCAA levels. J Biol Chem. 2010;285(15):11348-56. doi:10.1074/jbc.M109.075184.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Heat production mechanisms are flexible, depending on a wide variety of genetic, dietary, and environmental factors. The physiology associated with each mechanism is important to understand since loss of flexibility is associated with metabolic decline and disease. The phenomenon of compensatory heat production has been described in some detail in publications and reviews, notably by modifying BAT-dependent thermogenesis (for example by deleting UCP1 or impairing lipolysis, cited in this paper). These authors chose to eliminate exercise as an alternative means of maintaining body temperature. To do this, they cast either one or both mouse hindlimbs. This paper is set up as an evaluation of a loss of function of muscle on the functionality of BAT.

      Strengths:

      The study is supported by a variety of modern techniques and procedures.

      Weaknesses:

      The authors show that cast immobilization (CI) does not work as a (passive) loss of function; instead, this procedure produces a dramatic gain of function, putting the animal under considerable stress, inducing β-adrenergic effectors, increased oxygen consumption, and IL6 expression in a variety of tissues, together with commensurate cachectic effects on muscle and fat. The BAT is put under considerable stress: it is super-induced but functions relatively poorly. Thus within hours and days of CI, there is massive muscle loss (leading to high circulating BCAAs), and loss of lipid reserves in adipose and liver. The lipid cycle that maintains BAT thermogenesis is depleted and the mouse is unable to maintain body temperature.

      I cannot agree with these statements in the Discussion:  

      "We have here shown that cast immobilization suppressed skeletal muscle thermogenesis, resulting in failure to maintain core body temperature in a cold environment."

      This result could also be attributed to high stress and decreased calorie reserves. Note also: CI suppresses 50% of locomotor activity, but the actual work done by the mouse carrying bilateral casts is not taken into account.

      We thank the reviewer for raising this issue. As the reviewer suggests, we also consider that cold intolerance resulting from cast immobilization may be attributed to high stress levels, decreased calorie reserves, or reduced systemic locomotor activity. Indeed, reductions in visceral adipose tissue weight and increases in lipid utilization were observed in the early phase of cast immobilization (Fig. 2F and 2G). This suggests that the depletion of calorie reserves induced by stress may affect cold intolerance in cast-immobilized mice (Fig. 1A-1B). On the other hand, the experiment shown in Fig. 1C involved acute cold exposure of mice 2 h after cast immobilization. This result suggests that, even before the depletion of energy stores by immobilization of skeletal muscle, cast immobilization may cause cold intolerance in mice. In addition, as the reviewer suggests, cast immobilization may result in BAT thermogenesis and cachectic effects on muscle and fat. However, circulating corticosterone concentrations and hypothalamic CRH gene expression are not significantly altered after cast immobilization (Figure 2_figure supplement 2D-F). This raises questions about the contribution of stress to the changes in systemic energy metabolism in this model. We have therefore responded to the reviewer's comments by revising this statement at the beginning of the ‘Discussion’ section and adding a discussion on page 16, in addition to the existing discussion on pages 17–18.

      Furthermore, to respond as best we could to the reviewer's comments, we performed additional experiments using the restraint stress model (Figure 7). We found that short-term restraint stress may recruit substrate supply from skeletal muscle for BAT thermogenesis via Il6 gene expression. Based on these data, we speculate that the interaction between BAT and skeletal muscle amino acid metabolism may operate under various physiological stress conditions, including infection and exercise, as well as skeletal muscle immobilization, stress, and cold exposure. This interaction may play a significant role in regulating body temperature and energy metabolism. We are currently investigating the effects of sympathetic activation on skeletal muscle amino acid metabolism and systemic thermoregulation via IL-6 secretion from skeletal muscle using a new model. These data will be reported in a subsequent report.

      "Thermoregulatory system in endotherms cannot be explained by thermogenesis based on muscle contraction alone, with nonshivering thermogenesis being required as a component of the ability to tolerate cold temperatures in the long term."

      This statement is correct, and it clearly showcases how difficult it is to interpret results using this CI strategy. The question to the authors is: which components of muscle thermogenesis are actually inhibited by CI, and what is the relative heat contribution?

      We appreciate the reviewer raising this important issue. Addressing it would require measurements of skeletal muscle temperature and electromyography in mice with cast immobilization, but we were unable to perform these measurements. We have therefore described this point, as the reviewer suggests, as a limitation of this study on page 18.

      In our additional experiments, we found that several genes that are usually activated in skeletal muscle during cold exposure are repressed in mice with cast immobilization (Figure 1_figure supplement 1G-1K). Skeletal muscle is an important thermogenic organ. Although the role of the sarcolipin gene in non-shivering thermogenesis is well understood, the primary regulator of thermogenesis in metabolism and shivering remains unclear. In future, we would like to use models in which key signals for energy metabolism are inhibited, such as muscle-specific PGC-1α-deficient mice and muscle-specific AMPK-deficient mice, to clarify the important factors in skeletal muscle thermogenesis. We expect this approach to enable us to analyze the relative thermal contributions of each component of the heat production process in skeletal muscle, which has proven difficult in immobilized muscle models.

      This conclusion is overinterpreted:

      "In conclusion, we have shown that cast immobilization induced thermogenesis in BAT that was dependent on the utilization of free amino acids derived from skeletal muscle, and that muscle-derived IL-6 stimulated BCAA metabolism in skeletal muscle. Our findings may provide new insights into the significance of skeletal muscle as a large reservoir of amino acids in the regulation of body temperature".

      In terms of the production of the article - the data shown in the heat maps has oddly obscure log10 dimensions. The changes are minimal, approx. 1.5x increase/decrease and therefore significance would be key to reporting these data. Fig.3C heatmap is not suitable. What are the 6 lanes to each condition? Overall, this has little/no information.

      Rather than cherry-picking for a few genes, the results could be made more rigorous using RNA-seq profiling of BAT and muscle tissues.

      We agree that this is an important point. Indeed, our model of skeletal muscle immobilization produces only modest changes in the metabolomics and gene expression analyses. We consider this to be a weakness of the study. However, the interactive thermogenic system that we discovered between skeletal muscle and BAT may also function under other conditions, such as acute stress and cold exposure. We should investigate this further in future using models that involve more dramatic metabolic changes. In fact, it has been shown that the levels of several metabolites are significantly altered in BAT after acute cold exposure.[1] Therefore, we have revised the conclusion on page 18 and added this point. We also performed an enrichment analysis on the metabolomics data in BAT following cast immobilization and included the results in Figure 2_figure supplement 1A. In addition, we removed the heatmap in Fig. 3C of the pre-revision manuscript, as advised by the reviewer. Although we removed the results in Figure 3C, we consider Figure 3_figure supplement 1 to be sufficient to support the text.

      In addition, we agree with the reviewer's remarks on our gene expression analysis. In this study, we were unable to perform RNA-seq profiling of BAT and muscle tissue. Therefore, we have described this as a limitation of the study on page 20. However, we are interested in investigating the effect of IL-6 derived from skeletal muscle on the RNA-seq profiles of skeletal muscle and BAT. We will conduct future RNA-seq analyses of BAT and skeletal muscle, using models of skeletal muscle immobilization, acute cold exposure and restraint stress.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors identified a previously unrecognized organ interaction where limb immobilization induces thermogenesis in BAT. They showed that limb immobilization by cast fixation enhances the expression of UCP1 as well as amino acid transporters in BAT, and amino acids are supplied from skeletal muscle to BAT during this process, likely contributing to increased thermogenesis in BAT. Furthermore, the experiments with IL-6 knockout mice and IL-6 administration to these mice suggest that this cytokine is likely involved in the supply of amino acids from skeletal muscle to BAT during limb immobilization.

      Strengths:

      The function of BAT plays a crucial role in the regulation of an individual's energy and body weight. Therefore, identifying new interventions that can control BAT function is not only scientifically significant but also holds substantial promise for medical applications. The authors have thoroughly and comprehensively examined the changes in skeletal muscle and BAT under these conditions, convincingly demonstrating the significance of this organ interaction.

      Weaknesses:

      Through considerable effort, the authors have demonstrated that limb-immobilized mice exhibit changes in thermogenesis and energy metabolism dynamics at their steady state. However, the impact of immobilization on the function of skeletal muscle and BAT during cold exposure has not been thoroughly analyzed.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors show that impairment of hind limb muscle contraction by cast immobilization suppresses skeletal muscle thermogenesis and activates thermogenesis in brown fat. They also propose that free BCAAs derived from skeletal muscle are used for BAT thermogenesis, and identify IL-6 as a potential regulator.

      Strengths:

      The data support the conclusions for the most part.

      Weaknesses:

      The data provided in this manuscript are largely descriptive. It is therefore difficult to assess the potential significance of the work. Moreover, many of the described effects are modest in magnitude, questioning the overall functional relevance of this pathway. There are no experiments that directly test whether BCAAs derived from adipose tissue are used for thermogenesis, which would require more robust tracing experiments. In addition, the rigor of the work should be improved. It is also recommended to put the current work in the context of the literature.

      We appreciate the reviewer's valuable feedback. As the reviewer pointed out, many of the effects described in this study are modest in magnitude. This reflects a limitation of our study, which used skeletal muscle immobilization as a model. To clarify the overall functional relevance of this pathway, we therefore plan to use alternative models in which BAT thermogenesis and systemic cachectic effects are more strongly induced. We have added this point to the 'Conclusion' section on page 18.

      In addition, previous findings reported that mitochondrial BCAA catabolism in brown adipocytes promotes systemic BCAA clearance, suggesting that BCAAs may be supplied to BAT from other organs during BAT thermogenesis.[5] However, as the reviewer rightly pointed out, the current study did not directly investigate whether BCAAs derived from adipose tissue contribute to thermogenic processes. In light of this, we have revised the manuscript to include a statement in the limitations section on page 20 that addresses this point. 

      Metabolomic analysis of white adipose tissue (WAT) following skeletal muscle immobilization revealed alterations in amino acid concentrations in WAT in response to cast immobilization (Author response image 1A). Notably, levels of BCAAs in WAT remained largely unchanged at 24 hours after cast immobilization, but increased significantly by day 7 (Author response image 1B). At the 24-hour time point, when BAT thermogenesis is known to be activated, WAT weight was found to be reduced (Fig. 2H). Gene expression analysis of amino acid metabolism-related genes in WAT at this time point revealed a modest upregulation of several genes (Author response image 1C). Furthermore, a slight increase in the uptake of [<sup>3</sup>H] leucine into WAT was observed following immobilization (Fig. 3C). Collectively, these findings suggest that BCAAs within WAT may be primarily metabolized locally rather than being mobilized and supplied to BAT. In addition, given the relatively low levels of BCAAs per tissue mass and the limited capacity for BCAA uptake in WAT compared to other tissues, we consider it unlikely that WAT serves as a major reservoir of BCAAs.

      Author response image 1.

      (A) Amino acids in epididymal white adipose tissue (eWAT) of IL-6 KO (–/–) and WT (+/+) mice without (control) or with bilateral cast immobilization for the indicated times. Results are presented as heat maps of the log10 value of the fold change relative to control WT mice and are means of four mice in each group. (B) BCAA concentrations in eWAT of IL-6 KO and WT mice without (control) or with bilateral cast immobilization for 1 or 7 days. (n = 4 per group) (C) RT and real-time PCR analysis of the expression of SLC1A5, SLC7A1, SLC38A2, SLC43A1, BCAT2 and BCKDHA genes in eWAT of mice without (control) or with bilateral cast immobilization for 24 h. (n = 6 per group). All data other than in (A) are means ± SEM. *p < 0.05, **p < 0.01, ***p < 0.001 as determined by Dunnett's test (B) or by the unpaired t test (C).

      Reviewer #1 (Recommendations for the authors): 

      • Gypsum is an irrelevant label. Label consistently, with a procedure acronym, like CI or Imm.

      We apologize for any confusion that our notation may have caused. We corrected all labels relating to the skeletal muscle immobilization model in mice to 'Imm'.

      There are many grammatical errors and typos. Search for an example on Fudure1. The sense of some sentences is enough to obscure their meaning.

      We appreciate the reviewer's points. We have checked the article for grammatical and typographical errors, correcting them where necessary.

      • Figures 6E and F need to be re-annotated in the legend and on figures.

      Following the reviewer's advice, we have re-annotated the figure legends for these results.

      Reviewer #2 (Recommendations for the authors): 

      (1) It is difficult to understand how the data presented in Supplemental Table 1 were obtained. This appears to be data showing that the skeletal muscle weight of the hind limbs in mice accounts for 40 to 50% of the total skeletal muscle weight. How did the authors calculate the muscle weight? Specifically, how did they measure the weight of muscles that are neither in the hind limbs nor in the forelimbs ("Other")? Was this estimated from whole-body CT or MRI data?

      In the legend, it mentions "the posterior cervical region," but what exactly was measured in the posterior cervical region? The methods for this data should be clearly described.

      We appreciate the reviewer's comments and apologize for any confusion caused by our inadequate explanation of these data. The data were obtained by removing skeletal muscle from the posterior cervical region and measuring the weight of the wet tissue. We took care to remove most of the skeletal muscle, but some will have remained. However, we do not believe that these errors are large enough to alter the interpretation of the results. This has now been added to the 'Methods' section on page 21.

      (2) Through considerable effort, the authors have demonstrated that limb-immobilized mice exhibit changes in thermogenesis and energy metabolism dynamics at their steady state. However, it remains unclear why limb-immobilized mice have reduced tolerance to cold exposure. Was there any change in the abundance of energy metabolism-related genes during cold exposure between the immobilized and control mice? For example, if the gene expression of UCP1 and UCP2, which are typically upregulated in brown adipose tissue (BAT) and skeletal muscle during cold exposure, was suppressed in the immobilized mice, it might explain their reduced cold tolerance. Thus, the changes in the response of skeletal muscle and BAT to cold exposure between immobilized and control mice should also be analyzed.

      We thank the reviewer for the constructive comments. We consider the main weakness of this study to be the fact that we were unable to measure the temperature and electromyography (EMG) of the skeletal muscles of the cast-immobilized mice. Following the reviewer's advice, we analyzed the expression levels of several genes related to heat production or energy metabolism (Ucp1, Ucp2, Ucp3, Sln and Ppargc1a) in BAT and skeletal muscle of cast-immobilized mice after acute cold exposure (Figure 1_figure supplement 1G-1K). The results showed that the expression of several genes that are usually increased in BAT and skeletal muscle during cold exposure was repressed in cast-immobilized mice. Notably, cast immobilization did not induce the UCP2 and PGC-1α genes at room temperature, and their upregulation during cold exposure was also suppressed in cast-immobilized mice. UCP2 is known to alter its expression in relation to energy metabolism, but it is unclear whether it regulates energy metabolism.[2] Additionally, UCP2 is not thought to play a role in thermogenesis, and the function of UCP2 in skeletal muscle remains unclear.[3] On the other hand, PGC-1α is widely recognized as a transcriptional coactivator that regulates various metabolic processes, including thermogenesis.[4] In our study, we found that the amounts of metabolites in the TCA cycle and the expression of the PGC-1α gene decreased rapidly in immobilized skeletal muscle. This suggests that the metabolic rate is reduced in immobilized skeletal muscle (Figure 1_figure supplement 2A and 2F). In endothermic animals, energy expenditure in skeletal muscle plays a significant role in maintaining body temperature during both activity and rest. Hence, it is assumed that the reduced metabolic rate in skeletal muscle significantly impacts the maintenance of body temperature in cold conditions.

      Further investigation is required into the function of these genes in skeletal muscle thermogenesis, but the additional data suggest that the loss of muscle function due to immobilization affects the maintenance of body temperature under cold conditions. These results are discussed further on page 15.

      Reviewer #3 (Recommendations for the authors): 

      There are also more specific concerns related to the data supporting the claims.

      (1) The relevance of increasing thermogenesis in BAT after cast immobilization is unclear, as adult humans have very little BAT. Thermogenesis gene and protein expression should be measured in white adipose tissue.

      We would like to thank the reviewer for highlighting this important issue, and we agree with the comments. We did not observe significant changes in UCP1 expression in the subcutaneous adipose tissue of the inguinal region following skeletal muscle immobilization. We suspect that this is because skeletal muscle immobilization in mice did not exert a strong enough effect to induce browning of white adipose tissue. Whether skeletal muscle immobilization can activate thermogenesis in brown or beige adipocytes in adults remains unclear. We have therefore noted this limitation in our study in line 6.

      Additionally, in this study, we aimed to clarify the role of skeletal muscle as an amino acid reservoir under metabolic stress conditions that increase BAT thermogenesis. To this end, we employed models of skeletal muscle immobilization, acute cold exposure, and restraint stress. We also intend to analyze the metabolic interactions between beige adipose tissue and skeletal muscle in more detail using models that induce browning, such as exercise or cold acclimation.

      (2) In Figures 1E-G, there is no significant difference in UCP1 levels relative to the control, but body temperature is lowered from day 2 to day 7. How do the authors explain this?

      This is an important point. We consider the decrease in body temperature of mice following cast immobilization at room temperature to be the result of a reduction in systemic locomotor activity.

      (3) The small induction of PGC1a seen at 10 hours goes away after day 3. Why is this?

      This is an important point. Our investigation showed that the norepinephrine concentration in BAT and blood of cast-immobilized mice tends to increase, peaking at 24 hours of immobilization (Fig. 1H and Figure 2_figure supplement 2D), and then gradually returns to baseline. We speculate that this transient activation of the sympathetic nervous system may affect the expression of PGC1α in BAT. Additionally, although thermogenesis in BAT temporarily increases after skeletal muscle immobilization, studies from other research groups suggest that long-term skeletal muscle immobilization (two weeks) may increase non-shivering thermogenesis in skeletal muscle via high expression of SLN.[6] Therefore, we hypothesize that other thermogenic mechanisms besides BAT might be involved during prolonged cast immobilization. We have added a discussion of these topics on page 16.

      (4) The metabolic cage data are marked in multiple places as significant, but the effect size is extremely small. Please describe how significance was calculated (Figure 5 supplement 1B, E, F).

      This is a valid point. This data was statistically analyzed using daily averages, with the results then being compiled. However, the figure was amended because it was not appropriate to use the original to demonstrate significant differences.

      (5) How does IL-6 increase BCAA levels in muscle?

      This is an important point. We are also investigating this issue with great interest. In future, we will use RNA-seq profiling to investigate the mechanism by which IL-6 regulates amino acid metabolism in skeletal muscle. This point was added as a limitation of the study on page 19.

      (6) What is the mechanism behind the elevated il6 levels after cast immobilization?

      We appreciate the reviewer's points. Since IL-6 gene expression in skeletal muscle increases in response to acute cold exposure and acute stress, we hypothesize that IL-6 is regulated by β-adrenergic effectors. Our preliminary experiments suggest that stimulation with norepinephrine or with clenbuterol, a β2-adrenergic receptor agonist, increases IL-6 gene expression and the intracellular free BCAA concentration in cultured mouse muscle cells (Author response image 2A-2D). Going forward, we plan to conduct further studies using a mouse model in which the sympathetic nervous system is activated by administering LPS intracerebroventricularly, as well as muscle-specific β2-adrenergic receptor knockout mice.

      References:

      (1) Okamatsu-Ogura Y, et al. UCP1-dependent and UCP1-independent metabolic changes induced by acute cold exposure in brown adipose tissue of mice. Metabolism. 2020;113:154396. doi: 10.1016/j.metabol.2020.154396.

      (2) Schrauwen P, Hesselink M. UCP2 and UCP3 in muscle controlling body metabolism. J Exp Biol. 2002;205(Pt 15):2275-85. doi: 10.1242/jeb.205.15.2275.

      (3) Zhang CY, et al. Uncoupling protein-2 negatively regulates insulin secretion and is a major link between obesity, beta cell dysfunction, and type 2 diabetes. Cell. 2001;105(6):745-55. doi: 10.1016/s0092-8674(01)00378-6.

      (4) Handschin C, Spiegelman BM. Peroxisome proliferator-activated receptor gamma coactivator 1 coactivators, energy homeostasis, and metabolism. Endocr Rev. 2006;27(7):728-35. doi: 10.1210/er.2006-0037.

      (5) Yoneshiro T, et al. BCAA catabolism in brown fat controls energy homeostasis through SLC25A44. Nature. 2019;572(7771):614-619. doi: 10.1038/s41586-019-1503-x.

      (6) Tomiya S, et al. Cast immobilization of hindlimb upregulates sarcolipin expression in atrophied skeletal muscles and increases thermogenesis in C57BL/6J mice. Am J Physiol Regul Integr Comp Physiol. 2019;317(5):R649-R661. doi: 10.1152/ajpregu.00118.2019.

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review): 

      Overall, the manuscript reveals the role of actin polymerization to drive the fusion of myoblasts during adult muscle regeneration. This pathway regulates fusion in many contexts, but whether it was conserved in adult muscle regeneration remained unknown. Robust genetic tools and histological analyses were used to support the claims convincingly. 

      We very much appreciate the positive comments from this Reviewer.

      There are a few interpretations that could be adjusted. 

      The beginning of the results about macrophages traversing ghost fibers after regeneration was a surprise given the context in the abstract and introduction. These results also lead to new questions about this biology that would need to be answered to substantiate the claims in this section. Also, it is unclear the precise new information learned here because it seems obvious that macrophages would need to extravasate the basement membrane to enter ghost fibers and macrophages are known to have this ability. Moreover, the model in Figure 4D has macrophages and BM but there is not even mention of this in the legend. The authors may wish to consider removing this topic from the manuscript. 

      We appreciate this comment and acknowledge that the precise behavior of macrophages when they infiltrate and/or exit the ghost fibers during muscle regeneration is not the major focus of this study. However, we think that visualizing macrophages squeezing through tiny openings on the basement membrane to infiltrate and/or exit from the ghost fibers is valuable. Thus, we have moved the data from the original main Figure 2 to the new Figure S1. 

      Regarding the model in Figure 4D, we have removed the macrophages because the depicted model represents a stage after the macrophages’ exit from the ghost fiber. 

      Which Pax7CreER line was used? In the methods, the Jax number provided is the Gaka line but in the results, Lepper et al 2009 are cited, which is not the citation for the Gaka line. 

      The Pax7<sup>CreER</sup> line used in this study is the one generated in Lepper et al. 2009. We corrected this information in “Material and Methods” of the revised manuscript. 

      Did the authors assess regeneration in the floxed mice that do not contain Cre as a control? Or is it known these alleles do not perturb the function of the targeted gene? 

      We examined muscle regeneration in the floxed mice without Cre. As shown in Author response image 1 below, none of the homozygous ArpC2<sup>fl/fl</sup>, N-WASP<sup>fl/fl</sup>, CYFIP1<sup>fl/fl</sup> or N-WASP<sup>fl/fl</sup>;CYFIP1<sup>fl/fl</sup> alleles affected muscle regeneration, indicating that these alleles do not perturb the function of the targeted gene.

      Author response image 1.

      Muscle regeneration was normal in mice carrying only the floxed target gene(s). Cross sections of TA muscles were stained with anti-Dystrophin and DAPI at dpi 14. n = 3 mice of each genotype, and > 80 ghost fibers in each mouse were examined. Mean ± s.d. values are shown in the dot-bar plot, and significance was determined by two-tailed Student's t-test. ns: not significant. Scale bar: 100 μm.

      The authors comment: 'Interestingly, expression of the fusogenic proteins, MymK and MymX, was up-regulated in the TA muscle of these mice (Figure S4F), suggesting that fusogen overexpression is not able to rescue the SCM fusion defect resulted from defective branched actin polymerization.' It is unclear if fusogens are truly overexpressed because the analysis is performed at dpi 4 when the expression of fusogens may be decreased in control mice because they have already fused. Also, only two animals were analyzed and it is unclear if MymX is definitively increased. The authors should consider adjusting the interpretation to SCM fusion defect resulting from defective branched actin polymerization is unlikely to be caused by a lack of fusogen expression. 

      We agree with the Reviewer that fusogen expression may simply persist till later time points in fusion mutants without being up-regulated. We have modified our interpretation according to the Reviewer’s suggestion. 

      Regarding the western blots in the original Figure S4F, we now show one experiment from each genotype, and include the quantification of MymK and MymX protein levels from 3 animals in the revised manuscript (new Figure S5F-S5H). 

      Reviewer #1 (Recommendations for the authors): 

      (1) The ArpC2 cKO data could be presented in a clearer fashion. In the text, ArpC2 is discussed but in the figure, there are many other KOs presented and ArpC2 is the fourth one shown in the figure. The other KOs are discussed later. It may be worthwhile for the authors to rearrange the figures to make it easier for readers. 

      Thank you for this suggestion. We have rearranged the genotypes in the figures accordingly and placed ArpC2 cKO first. 

      The authors comment: 'Since SCM fusion is mostly completed at dpi 4.5 (Figure 1B) (Collins et al. 2024)'. This is not an accurate statement of the cited paper. While myofibers are formed by dpi 4.5 with centralized nuclei, there are additional fusion events through at least 21dpi. The authors should adjust their statement to better reflect the data in Collins et al 2024, which could include mentioning that primary fusions could be completed at dpi 4.5 and this is the process they are studying. 

      We have adjusted our statement accordingly in the revised manuscript.

      The authors comment: 'Consistent with this, the frequency distribution of SCM number per ghost fiber displayed a dramatic shift toward higher numbers in the ArpC2<sup>cKO</sup> mice (Figure S5C). These results indicate that the actin cytoskeleton plays an essential role in SCM fusion as the fusogenic proteins.' Should it read 'These results indicate that the actin cytoskeleton plays AS an essential role in SCM fusion as the fusogenic proteins'?

      Yes, and we adjusted this statement accordingly in the revised manuscript. 

      Minor comments 

      (1) In the results the authors state 'To induce genetic deletion of ArpC2 in satellites....'; 'satellites' is a term not typically used for satellite cells. 

      Thanks for catching this. We changed “satellites” to satellite cells.

      (2) In the next sentence, the satellite should be capitalized. 

      Done.

      (3) The cross-section area should be a 'cross-sectional area'. 

      Changed.

      Reviewer #2 (Public review):

      To fuse, differentiated muscle cells must rearrange their cytoskeleton and assemble actin-enriched cytoskeletal structures. These actin foci are proposed to generate mechanical forces necessary to drive close membrane apposition and fusion pore formation.

      While the study of these actin-rich structures has been conducted mainly in Drosophila, the present manuscript presents clear evidence this mechanism is necessary for the fusion of adult muscle stem cells in vivo, in mice.

      We thank this Reviewer for the positive comment.

      However, the authors need to tone down their interpretation of their findings and remember that genetic proof for cytoskeletal actin remodeling to allow muscle fusion in mice has already been provided by different labs (Vasyutina E, et al. 2009 PMID: 19443691; Gruenbaum-Cohen Y, et al., 2012 PMID: 22736793; Hamoud et al., 2014 PMID: 24567399). In the same line of thought, the authors write they "demonstrated a critical function of branched actin-propelled invasive protrusions in skeletal muscle regeneration". I believe this is not a premiere, since Randrianarison-Huetz V, et al., previously reported the existence of finger-like actin-based protrusions at fusion sites in mice myoblasts (PMID: 2926942) and Eigler T, et al., live-recorded said "fusogenic synapse" in mice myoblasts (PMID: 34932950). Hence, while the data presented here clearly demonstrate that ARP2/3 and SCAR/WAVE complexes are required for differentiating satellite cell fusion into multinucleated myotubes, this is an incremental story, and the authors should put their results in the context of previous literature. 

      In this study, we focused on elucidating the mechanisms of myoblast fusion during skeletal muscle regeneration, which remained largely unknown. Thus, we respectfully disagree with this Reviewer that “this is an incremental story” for the following reasons – 

      First, while we agree with this Reviewer that “genetic proof for cytoskeletal actin remodeling to allow muscle fusion in mice has already been provided by different labs”, most of the previous genetic studies, including ours (Lu et al. 2024), characterizing the roles of actin regulators (Elmo, Dock180, Rac, Cdc42, WASP, WIP, WAVE, Arp2/3) in mouse myoblast fusion were conducted during embryogenesis (Laurin et al. 2008; Vasyutina et al. 2009; Gruenbaum-Cohen et al. 2012; Tran et al. 2022; Lu et al. 2024), instead of during adult muscle regeneration, the latter of which is the focus of this study. 

      Second, prior to this study, several groups tested the roles of SRF, CaMKII theta and gamma, Myo10, and Elmo, which affect actin cytoskeletal dynamics, in muscle regeneration. These studies have shown that knocking out SRF, CaMKII, Myo10, or Elmo caused defects in mouse muscle regeneration, based on measuring the cross-sectional diameters of regenerated myofibers only (Randrianarison-Huetz et al. 2018; Eigler et al. 2021; Hammers et al. 2021; Tran et al. 2022). However, none of these studies visualized myoblast fusion at the cellular and subcellular levels during muscle regeneration in vivo. For this reason, it remained unclear whether the muscle regeneration defects in these mutants were indeed due to defects in myoblast fusion, in particular, defects in the formation of invasive protrusions at the fusogenic synapse. Thus, the previous studies did not demonstrate a direct role for the actin cytoskeleton, as well as the underlying mechanisms, in myoblast fusion during muscle regeneration in vivo.

      Third, regarding actin-propelled invasive protrusions at the fusogenic synapse, our previous study (Lu et al. 2024) revealed these structures by fluorescent live cell imaging and electron microscopy (EM) in cultured muscle cells, as well as EM studies in mouse embryonic limb muscle, firmly establishing a direct role for invasive protrusions in mouse myoblast fusion in cultured muscle cells and during embryonic development. Randrianarison-Huetz et al. (2018) reported the existence of finger-like actin-based protrusions at cell contact sites of cultured mouse myoblasts. It was unclear from their study, however, if these protrusions were at the actual fusion sites and if they were invasive (Randrianarison-Huetz et al. 2018). Eigler et al. (2021) reported protrusions at fusogenic synapse in cultured mouse myoblasts. It was unclear from their study, however, if the protrusions were actin-based and if they were invasive (Eigler et al. 2021). Neither Randrianarison-Huetz et al. (2018) nor Eigler et al. (2021) characterized protrusions in developing mouse embryos or regenerating adult muscle. 

      Taken together, to our knowledge, this is the first study to characterize myoblast fusion at the cellular and subcellular level during mouse muscle regeneration. We demonstrate that branched actin polymerization promotes invasive protrusion formation and myoblast fusion during the regeneration process. We believe that this work has laid the foundation for additional mechanistic studies of myoblast fusion during skeletal muscle regeneration.

      The citations in the original manuscript were primarily focused on previous in vivo studies of Arp2/3 and the actin nucleation-promoting factors (NPFs), N-WASP and WAVE (Richardson et al. 2007; Gruenbaum-Cohen et al. 2012), and of invasive protrusions mediating myoblast fusion in intact animals (Drosophila, zebrafish and mice) (Sens et al. 2010; Luo et al. 2022; Lu et al. 2024). We agree with this reviewer, however, that it would be beneficial to the readers if we provide a more comprehensive summary of previous literature, including studies of both intact animals and cultured cells, as well as studies of additional actin regulators upstream of the NPFs, such as small GTPases and their GEFs. Thus, we have significantly expanded our Introduction to include these studies and cited the corresponding literature in the revised manuscript.

      Reviewer #2 (Recommendations for the authors): 

      (1) I am concerned that the authors did not evaluate the target allele deletion efficiency following Pax7-CreER activation. The majority, if not all, of the published work using this genetic strategy presents the knock-down efficiency using genotyping PCR, immunolocalization, western blot, etc.

      (2) Can the authors provide evidence that the N-WASP, CYFIP1, and ARPC2 proteins are depleted in TAM-treated tissue? Alternatively, can the authors perform RT-qPCR on freshly isolated MuSCs to validate the absence of N-WASP, CYFIP1, and ARPC2 mRNA expression?

      Thank you for these comments. We have assessed the target allele deletion efficiency with isolated satellite cells from TAM-injected mice in which Pax7-CreER is activated. Western blot analyses showed that the protein levels of N-WASP, CYFIP1, and ArpC2 significantly decreased in the satellite cells of knockout mice. Please see the new Figure S2.

      Reviewer #3 (Public review): 

      The manuscript by Lu et al. explores the role of the Arp2/3 complex and the actin nucleators NWASP and WAVE in myoblast fusion during muscle regeneration. The results are clear and compelling, effectively supporting the main claims of the study. However, the manuscript could benefit from a more detailed molecular and cellular analysis of the fusion synapse. Additionally, while the description of macrophage extravasation from ghost fibers is intriguing, it seems somewhat disconnected from the primary focus of the work. 

      Despite this, the data are robust, and the major conclusions are well supported. Understanding muscle fusion mechanism is still a widely unexplored topic in the field and the authors make important progress in this domain. 

      We appreciate the positive comments from this Reviewer.

      We agree with this Reviewer and Reviewer #1 that the macrophage study is not the primary focus of the work. However, we think that visualizing macrophages squeezing through tiny openings on the basement membrane to infiltrate and/or exit from the ghost fibers is valuable. Thus, we have moved the data from the original main Figure 2 to the new Figure S1. 

      I have a few suggestions that might strengthen the manuscript as outlined below.  

      (1) Could the authors provide more detail on how they defined cells with "invasive protrusions" in Figure 4C? Membrane blebs are commonly observed in contacting cells, so it would be important to clarify the criteria used for counting this specific event. 

      Thanks for this suggestion. We define invasive protrusions as finger-like protrusions projected by a cell into its fusion partner. Based on our previous studies (Sens et al. 2010; Luo et al. 2022; Lu et al. 2024), these invasive protrusions are narrow (with 100-250 nm diameters) and propelled by mechanically stiff actin bundles. In contrast, membrane blebs are spherical protrusions formed by the detachment of the plasma membrane from the underlying actin cytoskeleton. In general, the blebs are not as mechanically stiff as invasive protrusions and would not be able to project into neighboring cells. Thus, we do not think that the protrusions in Figure 4B are membrane blebs. We clarified the criteria in the text and figure legends of the revised manuscript.

      (2) Along the same line, please clarify what each individual dot represents in Figure 4C. The authors mention quantifying approximately 83 SCMs from 20 fibers. I assume each dot corresponds to data from individual fibers, but if that's the case, does this imply that only around four SCMs were quantified per fiber? A more detailed explanation would be helpful. 

      To quantitatively assess invasive protrusions in Ctrl and mutant mice, we analyzed 20 randomly selected ghost fibers per genotype. Within each ghost fiber, we examined randomly selected SCMs in a single cross section (a total of 83, 147 and 93 SCMs in Ctrl, ArpC2<sup>cKO</sup> and MymX<sup>cKO</sup> mice were examined, respectively). 

      In Figure 4C, each dot was intended to represent the percentage of SCMs with invasive protrusions in a single cross section of a ghost fiber. However, we mistakenly inserted a wrong graph in the original Figure 4C. We sincerely apologize for this error and have replaced it with the correct graph in the new Figure 4C.

      (3) Localizing ArpC2 at the invasive protrusions would be a strong addition to this study. Furthermore, have the authors examined the localization of Myomaker and Myomixer in ArpC2 mutant cells? This could provide insights into potential disruptions in the fusion machinery.

      We have examined the localization of the Arp2/3 complex on the invasive protrusions in cultured SCMs and included the data in Figure 4A of the original manuscript. Specifically, we showed enrichment of mNeongreen-tagged Arp2, a subunit of the Arp2/3 complex, on the invasive protrusions at the fusogenic synapse of cultured SCMs (see the enlarged panels on the right; also see supplemental video 4). The small size of the invasive protrusions on SCMs prevented a detailed analysis of the precise Arp2 localization along the protrusions.  Please see our recently published paper (Lu et al. 2024) for the detailed localization and function of the Arp2/3 complex during invasive protrusion formation in cultured C2C12 cells. 

      We have also attempted to localize the Arp2/3 complex in the regenerating muscle in vivo using an anti-ArpC2 antibody (Millipore, 07-227-I), which was used in many studies to visualize the Arp2/3 complex in cultured cells. Unfortunately, the antibody detected non-specific signals in the regenerating TA muscle of the ArpC2<sup>cKO</sup> animals. Thus, it cannot be used to detect specific ArpC2 signals in muscle tissues. Besides the specificity issue of the antibody, it is technically challenging to visualize invasive protrusions with an F-actin probe at the fusogenic synapses of regenerating muscle by light microscopy, due to the high background of F-actin signal within the muscle cells.

      Regarding the fusogens, we show that both are present in the TA muscle of the ArpC2<sup>cKO</sup> animals by western blot (Figure S5F-S5H). Thus, the fusion defect in these animals is not due to the lack of fusogen expression. Since the focus of this study is on the role of the actin cytoskeleton in muscle regeneration, the subcellular localization of the fusogens was not investigated in the current study. 

      (4) As a minor curiosity, can ArpC2 WT and mutant cells fuse with each other?

      Our previous work in Drosophila embryos showed that Arp2/3-mediated branched actin polymerization is required in both the invading and receiving fusion partners (Sens et al. 2010). To address this question in mouse muscle cells, we co-cultured GFP<sup>+</sup> WT cells with mScarlet-I<sup>+</sup> WT cells (or mScarlet-I<sup>+</sup> ArpC2<sup>cKO</sup> cells) in vitro and assessed their ability to fuse with one another. We found that ArpC2<sup>cKO</sup> cells could barely fuse with WT cells (new Figure 3F and 3G), indicating that the Arp2/3-mediated branched actin polymerization is required in both fusion partners. This result is consistent with our findings in Drosophila embryos.

      (5) The authors report a strong reduction in CSA at 14 dpi and 28 dpi, attributing this defect primarily to failed myoblast fusion. Although this claim is supported by observations at early time points, I wonder whether the Arp2/3 complex might also play roles in myofibers after fusion. For instance, Arp2/3 could be required for the growth or maintenance of healthy myofibers, which could also contribute to the reduced CSA observed, since regenerated myofibers inherit the ArpC2 knockout from the stem cells. Could the authors address or exclude this possibility? This is rather a broader criticism of how things are being interpreted in general beyond this paper. 

      This is an interesting question. It is possible that Arp2/3 may play a role in the growth or maintenance of healthy myofibers. However, the muscle injury and regeneration process may not be the best system to address this question because of the indispensable early step of myoblast fusion. Ideally, one may want to knock out Arp2/3 in myofibers of young healthy mice, observe fiber growth in the absence of muscle injury, and compare that to the wild-type littermates. Since these experiments are out of the scope of this study, we revised our conclusion to state that the fusion defect in ArpC2<sup>cKO</sup> mice should account, at least in part, for the strong reduction in CSA at 14 dpi and 28 dpi, without excluding additional possibilities such as Arp2/3’s potential role in the growth or maintenance of healthy myofibers.

      References:

      Eigler T, Zarfati G, Amzallag E, Sinha S, Segev N, Zabary Y, Zaritsky A, Shakked A, Umansky KB, Schejter ED et al. 2021. ERK1/2 inhibition promotes robust myotube growth via CaMKII activation resulting in myoblast-to-myotube fusion. Dev Cell 56: 3349-3363 e3346.

      Gruenbaum-Cohen Y, Harel I, Umansky KB, Tzahor E, Snapper SB, Shilo BZ, Schejter ED. 2012. The actin regulator N-WASp is required for muscle-cell fusion in mice. Proc Natl Acad Sci U S A 109: 11211-11216.

      Hammers DW, Hart CC, Matheny MK, Heimsath EG, Lee YI, Hammer JA, 3rd, Cheney RE, Sweeney HL. 2021. Filopodia powered by class x myosin promote fusion of mammalian myoblasts. Elife 10.

      Laurin M, Fradet N, Blangy A, Hall A, Vuori K, Cote JF. 2008. The atypical Rac activator Dock180 (Dock1) regulates myoblast fusion in vivo. Proc Natl Acad Sci U S A 105: 15446-15451.

      Lu Y, Walji T, Ravaux B, Pandey P, Yang C, Li B, Luvsanjav D, Lam KH, Zhang R, Luo Z et al. 2024. Spatiotemporal coordination of actin regulators generates invasive protrusions in cell-cell fusion. Nat Cell Biol 26: 1860-1877.

      Luo Z, Shi J, Pandey P, Ruan ZR, Sevdali M, Bu Y, Lu Y, Du S, Chen EH. 2022. The cellular architecture and molecular determinants of the zebrafish fusogenic synapse. Dev Cell 57: 1582-1597 e1586.

      Randrianarison-Huetz V, Papaefthymiou A, Herledan G, Noviello C, Faradova U, Collard L, Pincini A, Schol E, Decaux JF, Maire P et al. 2018. Srf controls satellite cell fusion through the maintenance of actin architecture. J Cell Biol 217: 685-700.

      Richardson BE, Beckett K, Nowak SJ, Baylies MK. 2007. SCAR/WAVE and Arp2/3 are crucial for cytoskeletal remodeling at the site of myoblast fusion. Development 134: 4357-4367.

      Sens KL, Zhang S, Jin P, Duan R, Zhang G, Luo F, Parachini L, Chen EH. 2010. An invasive podosome-like structure promotes fusion pore formation during myoblast fusion. J Cell Biol 191: 1013-1027.

      Tran V, Nahle S, Robert A, Desanlis I, Killoran R, Ehresmann S, Thibault MP, Barford D, Ravichandran KS, Sauvageau M et al. 2022. Biasing the conformation of ELMO2 reveals that myoblast fusion can be exploited to improve muscle regeneration. Nat Commun 13: 7077.

      Vasyutina E, Martarelli B, Brakebusch C, Wende H, Birchmeier C. 2009. The small G-proteins Rac1 and Cdc42 are essential for myoblast fusion in the mouse. Proc Natl Acad Sci U S A 106: 8935-8940.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      EnvA-pseudotyped glycoprotein-deleted rabies virus has emerged as an essential tool for tracing monosynaptic inputs to genetically defined neuron populations in the mammalian brain. Recently, in addition to the SAD B19 rabies virus strain first described by Callaway and colleagues in 2007, the CVS N2c rabies virus strain has become popular due to its low toxicity and high trans-synaptic transfer efficiency. However, despite its widespread use in the mammalian brain, particularly in mice, the application of this cell-type-specific monosynaptic rabies tracing system in zebrafish has been limited by low labeling efficiency and high toxicity. In this manuscript, the authors aimed to develop an efficient retrograde monosynaptic rabies-mediated circuit mapping tool for larval zebrafish. Given the translucent nature of larval zebrafish, whole-brain neuronal activities can be monitored, perturbed, and recorded over time. Introducing a robust circuit mapping tool for larval zebrafish would enable researchers to simultaneously investigate the structure and function of neural circuits, which would be of significant interest to the neural circuit research community. Furthermore, the ability to track rabies-labeled cells over time in the transparent brain could enhance our understanding of the trans-synaptic retrograde tracing mechanism of the rabies virus. 

      To establish an efficient rabies virus tracing system in the larval zebrafish brain, the authors conducted meticulous side-by-side experiments to determine the optimal combination of trans-expressed rabies G proteins, TVA receptors, and recombinant rabies virus strains. Consistent with observations in the mouse brain, the CVS N2c strain trans-complemented with N2cG was found to be superior to the SAD B19 combination, offering lower toxicity and higher efficiency in labeling presynaptic neurons. Additionally, the authors tested various temperatures for the larvae post-virus injection and identified 36℃ as the optimal temperature for improved virus labeling. They then validated the system in the cerebellar circuits, noting evolutionary conservation in the cerebellar structure between zebrafish and mammals. The monosynaptic inputs to Purkinje cells from granule cells were neatly confirmed through ablation experiments.

      However, there are a couple of issues that this study should address. Additionally, conducting some extra experiments could provide valuable information to the broader research field utilizing recombinant rabies viruses as retrograde tracers.

      (1) It was observed that many radial glia were labeled, which casts doubt on the specificity of trans-synaptic spread between neurons. The issues of transneuronal labeling of glial cells should be addressed and discussed in more detail. In this manuscript, the authors used a transgenic zebrafish line carrying a neuron-specific Cre-dependent reporter and EnvA-CVS N2c(dG)-Cre virus to avoid the visualization of virally infected glial cells. However, this does not solve the real issue of glial cell labeling and the possibility of a nonsynaptic spread mechanism.

      In agreement with the reviewer’s suggestion, we have incorporated a standalone section in the revised Discussion (page 9) to address the issue of transneuronal glial labeling, including its spatial distribution, temporal dynamics, potential mechanisms, and possible strategies for its resolution.

      Regarding the specificity of trans-synaptic spread between neurons, we have demonstrated that our transsynaptic tracing system reliably and specifically labels input neurons. Structurally, we only observed labeling of inferior olivary cells (IOCs) outside the cerebellum, which are the only known extracerebellar inputs to Purkinje cells (PCs), while all other traced neurons remained confined within the cerebellum throughout the observation period (see Figure 2G–I). Functionally, we verified that the traced neurons formed synaptic connections with the starter PCs (see Figure 2J–M). Together, these findings support the conclusion that our system enables robust and specific retrograde monosynaptic tracing of neurons in larval zebrafish.

      Regarding the transneuronal labeling of radial glial cells, we observed that their distribution closely correlates with the location of neuronal somata and dendrites (see Author response image 2). In zebrafish, radial glial cells are considered functional analogs of astrocytes and are often referred to as radial astroglia. The adjacent labeled astroglia may participate in tripartite synapses with the starter neurons and express viral receptors that enable RV particle entry at postsynaptic sites. This suggests that rabies-based tracing in zebrafish may serve as a valuable tool for identifying synaptically associated and functionally connected glia. Leveraging this approach to investigate glia–neuron interactions represents a promising direction for future research.

      In our system, the glial labeling diminishes at later larval stages, likely due to abortive infection (see Author response image 3 and relevant response). However, the eventual clearance of infection does not preclude the initial infection of glial cells, which may compete with neuronal labeling and reduce overall tracing efficiency. Notably, transneuronal infection of glial cells by RV has also been observed in mammals (Marshel et al., 2010). To minimize such off-target labeling, future work should focus on elucidating the mechanisms underlying glial susceptibility—such as receptor-mediated viral entry— and developing strategies to suppress receptor expression specifically in glia, thereby improving the specificity and efficiency of neuronal circuit tracing.

      In addition, wrong citations in Line 307 were made when referring to previous studies discovering the same issue of RVdG-based transneuronal labeling radial glial cells. "The RVdG-based transneuronal labeling of radial glial cells was commonly observed in larval zebrafish29,30".

      The cited work was conducted using vesicular stomatitis virus (VSV). A more thorough analysis and/or discussion on this topic should be included.

      We thank the reviewer for pointing out the citation inaccuracy. The referenced study employed vesicular stomatitis virus (VSV), which, like RV, is a member of the Rhabdoviridae family. We have revised the text accordingly, from "RVdG-based transneuronal labeling of radial glial cells…" to "Transneuronal labeling of radial glial cells mediated by VSV, a member of the Rhabdoviridae family like RV, has been commonly observed in larval zebrafish" (page 9, line 347).

      Several key questions should be addressed:

      Does the number of labeled glial cells increase over time? 

      Yes, as shown in Figure 2—figure supplement 1C and G, the number of labeled radial glial cells significantly increased from 2 to 6 days post-injection (dpi). This phenomenon has been addressed in the revised Discussion section (page 9, line 357).

      Do they increase at the same rate over time as labeled neurons?

      Although glial cell labeling continued to increase over time, we observed a slowdown in labeling rate between 6 and 10 dpi, as shown in Figure 2—figure supplement 1C and G. Therefore, we divided the timeline into two intervals (2–6 and 6–10 dpi) to compare the rate of increase in labeling between neurons and glia. The rate (R) was defined as the daily change in convergence index. To quantify the difference between neuronal and glial labeling rates, we calculated a labeling rate index, R<sub>g</sub>−R<sub>n</sub>, where R<sub>g</sub> and R<sub>n</sub> denote the rates for glia and neurons, respectively (Author response image 1). Our analysis revealed that, between 2 and 6 dpi, glial cells exhibited a higher labeling rate than neurons. However, this trend reversed between 6 and 10 dpi, with neurons surpassing glial cells in labeling rate. These findings have been included in the revised Discussion section (page 9).

      Author response image 1.

      Labeling rate index of glia and neurons across two time intervals. Data points represent the mean labeling rate index for each tracing strategy within each time interval. *P < 0.05 (nonparametric two-tailed Mann-Whitney test).  
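
      As an illustration only, the labeling rate index described above can be sketched in a few lines of Python. The function names and the convergence-index values used in the usage note are hypothetical placeholders, not the authors' code or data; the sketch simply assumes that R is the change in convergence index divided by the number of days in the interval, and that the index is R<sub>g</sub>−R<sub>n</sub>.

```python
def labeling_rate(ci_start, ci_end, day_start, day_end):
    """Daily change in convergence index over one interval (R)."""
    if day_end <= day_start:
        raise ValueError("day_end must be later than day_start")
    return (ci_end - ci_start) / (day_end - day_start)


def labeling_rate_index(glia_ci, neuron_ci, days):
    """Labeling rate index R_g - R_n for one interval.

    glia_ci, neuron_ci: (start, end) convergence indices for glia and neurons.
    days: (start, end) days post-injection bounding the interval.
    A positive value means glia were labeled faster than neurons.
    """
    r_g = labeling_rate(glia_ci[0], glia_ci[1], days[0], days[1])
    r_n = labeling_rate(neuron_ci[0], neuron_ci[1], days[0], days[1])
    return r_g - r_n
```

      With made-up convergence indices, `labeling_rate_index((1.0, 3.0), (2.0, 3.0), (2, 6))` is positive (glia faster), whereas an interval in which the neuronal convergence index rises more steeply gives a negative value.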

      Are the labeled glial cells only present around the injection site?

      We believe the reviewer is inquiring whether labeled glial cells are spatially restricted to the vicinity of starter neurons. The initial infection is determined by the expression of TVA rather than the injection site. For example, injecting a high volume of virus into the anterior hindbrain resulted in the infection of TVA-expressing cells in distant regions, including the tectum and posterior hindbrain (Author response image 2).

      Regarding glial labeling, PC starter experiments showed that labeled glial cells (i.e. Bergmann glia) were predominantly localized within the cerebellum, likely due to the confinement of PC dendrites to this region. When using vglut2a to define starter neurons, glial labeling was frequently observed near the soma and dendrites of starter cells (14 out of 17 cases; Author response image 2). These observations suggest that transneuronally labeled glial cells may be synaptically associated with the starter neurons. We have included this point in the revised Discussion section (page 9).

      Author response image 2.

      Location of transneuronally labeled glial cells. (a and b) Confocal images showing the right tectum (a) and posterior hindbrain (b) of different WT larvae expressing EGFP and TVA using UGNT in randomly sparse neurons (vglut2a<sup>+</sup>) and infected with CVSdG-tdTomato[EnvA] (magenta) injected into the anterior hindbrain. Dashed yellow circles, starter neurons (EGFP<sup>+</sup>/tdTomato<sup>+</sup>); gray arrows, transneuronally labeled radial glia (tdTomato<sup>+</sup>/EGFP<sup>−</sup>); dashed white lines, tectum or hindbrain boundaries. C, caudal; R, rostral. Scale bars, 20 μm.

      Can the phenomenon of transneuronal labeling of radial glial cells be mitigated if the tracing is done in slightly older larvae?

      Yes, we agree. As elaborated in the following response, we hypothesize that the loss of fluorescence in radial glial cells at later developmental stages is due to abortive infection (see Author response image 3 and associated response). This supports the notion that abortive infection becomes increasingly pronounced as larvae mature, potentially explaining the negligible glial labeling observed in adult zebrafish (Dohaku et al., 2019; Satou et al., 2022). However, as noted in our response to the first comment, the disappearance of fluorescence does not indicate the absence of viral entry. Viral receptors may be expressed on glial cells, allowing initial infection despite a failure in subsequent replication. Consequently, glial infection, though abortive, may still compete with neuronal infection and reduce tracing efficiency.

      What is the survival rate of the infected glial cells over time?

      We observed the disappearance of glial fluorescence after transneuronal labeling, but did not observe the punctate fluorescent debris typically indicative of apoptotic cell death. Therefore, we favor the hypothesis that the loss of glial fluorescence results from abortive infection rather than cell death. Abortive infection refers to a scenario in which viral replication is actively suppressed by host antiviral responses, preventing the production of infectious viral particles. For example, recent studies have shown that lab-attenuated rabies virus (RV) induces the accumulation of aberrant double-stranded RNA in astrocytes, which activates mitochondrial antiviral-signaling protein (MAVS) and subsequent interferon expression (Tian et al., 2018). This antiviral response inhibits RV replication, ultimately resulting in abortive infection.

      In addition, we quantified the proportion of glial cells labeled at 2 dpi and 4 dpi that retained fluorescence over time. By 6 dpi (approximately 11 dpf), glial labeling had largely diminished in both groups (Author response image 3). These results suggest that the decline in glial fluorescence is more closely linked to larval age than to the duration of glial infection, supporting the notion of abortive infection. This also addresses the reviewer’s earlier concern and indicates that glial labeling is mitigated in older larvae.

      Author response image 3.

      Fraction of glial cells with fluorescence retention. (a and b) Proportion of glial cells labeled at 2 dpi (a) and 4 dpi (b) that retained fluorescence over time. Data are from the CVS|N2cG|36°C group. In boxplots: center, median; bounds of box, first and third quartiles; whiskers, minimum and maximum values. n.s., not-significant; *P < 0.05, **P < 0.01 (nonparametric two-tailed Mann-Whitney test).

      If an infected glial cell dies due to infection or gets ablated, does the rabies virus spread from the dead glial cells?

      In our system, glial cells do not express the rabies glycoprotein (G). Therefore, even if glial cells are transneuronally infected, they cannot support viral budding or assembly of infectious particles due to the absence of G (Mebatsion et al., 1996), preventing further viral propagation to neighboring cells.

      If TVA and rabies G are delivered to glial cells, followed by rabies virus injection, will it lead to the infection of other glial cells or neurons?

      We have conducted experiments in which TVA and rabies G were specifically expressed in astroglia using the gfap promoter, followed by RVdG-mCherry[EnvA] injection. This resulted in initial infection of TVA-positive astroglia and occasional subsequent labeling of nearby TVA-negative astroglia (Author response image 4), suggesting astroglia-to-astroglia transmission. Notably, no neuronal labeling was observed. This glial-to-glial spread is consistent with previous rabies tracing studies reporting similar phenomena involving the interaction of astrocytes with astrocytes and microglia (Clark et al., 2021). However, the underlying mechanism remains unclear, and we have discussed this in response to the first comment.

      Author response image 4.

      Viral tracing initiated from astroglia. (a) Confocal images of the tectum of a larva expressing EGFP and TVA using UGBT in randomly sparse astroglia (gfap<sup>+</sup>) and infected by SADdG-mCherry[EnvA] (magenta) injected into the anterior hindbrain. (b) Confocal images of the posterior hindbrain of a larva expressing EGFP and TVA using UGNT in randomly sparse astroglia (gfap<sup>+</sup>) and infected by CVSdG-tdTomato[EnvA] (magenta) injected into the anterior hindbrain. Dashed yellow circles, starter astroglia (EGFP<sup>+</sup>/mCherry<sup>+</sup> or EGFP<sup>+</sup>/tdTomato<sup>+</sup>); gray arrows, transneuronally labeled astroglia (tdTomato<sup>+</sup>/EGFP<sup>−</sup>); dashed white lines, tectum or hindbrain boundaries. C, caudal; R, rostral. Scale bars, 20 μm.

      Answers to any of these questions could greatly benefit the broader research community.

      (2) The optimal virus tracing effect has to be achieved by raising the injected larvae at 36°C. Since the routine temperature of zebrafish culture is around 28°C, a more thorough characterization of the effect on the health of zebrafish should be conducted.

      Yes, 36°C is required to achieve optimal labeling efficiency. Although this is above the standard zebrafish culture temperature (28°C), previous work (Satou et al., 2022) and our observations indicate that this transient elevation does not adversely affect larval health within the experimental time window. 

      In the previous study, Satou et al. reported no temperature-dependent effects on swimming behavior, social interaction, or odor discrimination in adult fish maintained at 28°C and 36°C. In larvae, both non-injected and virus-injected fish showed a decrease in survival at later time points (7 dpi), with slightly increased mortality observed at elevated temperatures.

      In our study, we raised the same batch of non-virus-injected larvae at 28°C and 36°C, and found no mortality over a 10-day period. For CVS-N2c-injected larvae, electrode insertion caused injury, but survival rates remained around 80% at both temperatures (see Figure 3A). Moreover, we successfully maintained CVS-N2c-injected larvae at 36°C for over a month, indicating that elevated temperature does not adversely affect fish health. Notably, higher temperatures were associated with an accelerated developmental rate. 

      This point was briefly addressed in the previous version and has now been further elaborated in the revised Discussion section (page 8).

      (3) Given the ability of time-lapse imaging of the infected larval zebrafish brain, the system can be taken advantage of to tackle important issues of rabies virus tracing tools.

      a) Toxicity. 

      The toxicity of rabies viruses is an important issue that limits their application and affects the interpretation of traced circuits. For example, if a significant proportion of starter cells die before analysis, the traced presynaptic networks cannot be reliably assigned to a "defined" population of starter cells. In this manuscript, the authors did an excellent job of characterizing the effects of different rabies strains, G proteins derived from various strains, and levels of G protein expression on starter cell survival. However, an additional parameter that should be tested is the dose of rabies virus injection. The current method section states that all rabies virus preparations were diluted to 2x10^8 infection units per ml, and 2-5 nl of virus suspension was injected near the target cells. It would be interesting to know the impact of the dose/volume of virus injection on retrograde tracing efficiency and toxicity. Would higher titers of the virus lead to more efficient labeling but stronger toxicities? What would be the optimal dose/volume to balance efficiency and toxicity? Addressing these questions would provide valuable insights and help optimize the use of rabies viruses for circuit tracing.

      This is an important concern. Viral cytotoxicity is primarily driven by the level of viral transcription and replication, which inhibits host protein synthesis (Komarova et al., 2007). The RVdG-EnvA typically infects cells at a rate of one viral particle per cell (Zhang et al., 2024), suggesting that increasing viral concentration does not proportionally increase per-cell infection. Accordingly, viral titer and injection volume are unlikely to influence cytotoxicity at the single-cell level. In our experiments, injection volumes up to 20 nl (i.e., 4 to 10 times the standard injection volume) did not affect starter cell survival. However, higher titers or volumes may increase the number of initially infected starter cells, potentially leading to greater overall mortality in larval zebrafish.
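
      For orientation, the dose figures in this exchange can be converted into infectious units per injection with simple arithmetic. The sketch below is illustrative only: the helper name is hypothetical, and it assumes the titer quoted by the reviewer from the Methods (2x10^8 infectious units per ml) applies unchanged to every injection volume mentioned here.

```python
TITER_IU_PER_ML = 2e8  # titer quoted from the Methods by the reviewer
NL_PER_ML = 1e6        # 1 ml = 1,000,000 nl


def infectious_units(volume_nl, titer_iu_per_ml=TITER_IU_PER_ML):
    """Infectious units delivered by one injection of `volume_nl` nanoliters."""
    return titer_iu_per_ml * volume_nl / NL_PER_ML


# Standard injections (2-5 nl) and the largest volume tested (20 nl)
for volume in (2, 5, 20):
    print(f"{volume} nl -> {infectious_units(volume):.0f} infectious units")
```

      Under that assumption, the standard 2–5 nl injection delivers roughly 400–1,000 infectious units and a 20 nl injection about 4,000, which matches the "4 to 10 times" range stated above.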

      Similarly, given that rabies virus typically infects cells at one particle per cell, increasing viral titer alone is unlikely to enhance tracing efficiency once the virus type is fixed. In contrast, the level of G protein expression significantly influences tracing efficiency (see Figure 2D). However, excessive G protein expression reduces the survival of starter cells (see Figure 3D). Therefore, careful control of G protein levels is essential to balance tracing efficiency and cytotoxicity.

      Notably, regardless of whether infected cells undergo apoptosis or necrosis due to cytotoxicity, the resulting disruption of the plasma membrane severely impairs viral budding. As a result, the formation of intact, G protein-enveloped viral particles is prevented, limiting further infection of neighboring neurons.

      The latest second-generation ΔGL RV vectors (Jin et al., 2024), which lack both the G and L (viral polymerase) genes, have been shown to markedly reduce cytotoxicity. These improved tracing strategies may be explored in future zebrafish studies to further optimize labeling efficiency and cell viability.

      The issue of viral titer and volume has been addressed in the revised Discussion section (page 10).

      b) Primary starters and secondary starters: 

      Given that the trans-expression of TVA and G is widespread, there is the possibility of coexistence of starter cells from the initial infection (primary starters) and starter cells generated by rabies virus spreading from the primary starters to presynaptic neurons expressing G. This means that the labeled input cells could be a mixed population connected with either the primary or secondary starter cells.

      It would be immensely interesting if time-lapse imaging could be utilized to observe the appearance of such primary and secondary starter cells. Assuming there is a time difference between the initial appearance of these two populations, it may be possible to differentiate the input cells wired to these populations based on a similar temporal difference in their initial appearance. This approach could provide valuable insights into the dynamics of rabies virus spread and the connectivity of neural circuits.

      The reviewer's suggestion is valuable. Regarding the use of Purkinje cells (PCs) as starter cells, we consider the occurrence of secondary PCs to be extremely rare. Although previous evidence suggests that PCs can form synaptic connections with one another (Chang et al., 2020), our sparse labeling strategy, which typically involves fewer than 10 labeled cells, significantly reduces the likelihood of viral transmission between PC starter cells. In addition, if secondary starter PCs were frequently generated, we would expect increased tracing efficiency at 10 dpi compared to 6 dpi. However, our results show no significant difference (see Figure 2—figure supplement 1C and G).

      Given the restricted expression of TVA and G in PCs, even if a limited number of secondary starters were generated, the labeled inputs would predominantly be granule cells (GCs), thereby preserving the cell-type identity of upstream inputs. This does raise a potential concern regarding overestimation of the convergence index (CI). Notably, however, within the GC-PC circuit, individual GCs often project to multiple PCs. Consequently, a GC labeled via a secondary PC may also be a bona fide presynaptic partner of the primary starter population. This overlap could mitigate the overestimation of CI. Taken together, we believe that the CI values reported in this study provide a reasonable approximation of monosynaptic connectivity.

      In scenarios where TVA and G are broadly expressed—for example, under the control of vglut2a promoter—secondary starter cells may arise frequently. In such cases, long-term time-lapse imaging in the zebrafish whole brain presents a promising strategy to distinguish primary and secondary starter cells, along with their respective input populations, based on the timing of their appearance. This approach potentially enables multi-step circuit tracing within individual animals. An alternative strategy is to use an EnvA-pseudotyped, G-competent rabies virus, which allows targeted initial infection while supporting multisynaptic propagation. When combined with temporally resolved imaging, this strategy could facilitate direct labeling of higher-order circuits and allow clear differentiation between multi-order inputs and the original starter population over time.

      In conclusion, we find this suggestion compelling and will explore these strategies in future studies to optimize and broaden the application of rabies virus-based circuit tracing.

      Reviewer #2 (Public Review):

      The study by Chen, Deng et al. aims to develop an efficient viral transneuronal tracing method that allows efficient retrograde tracing in the larval zebrafish. The authors utilize pseudotyped rabies virus that can be targeted to specific cell types using the EnvA-TVA system. Pseudotyped rabies virus has been used extensively in rodent models and, in recent years, has begun to be developed for use in adult zebrafish. However, compared to rodents, the efficiency of the spread in adult zebrafish is very low (~one upstream neuron labeled per starter cell). Additionally, there is limited evidence of retrograde tracing with pseudotyped rabies in the larval stage, which is the stage when most functional neural imaging studies are done in the field. In this study, the authors systematically optimized several parameters of rabies tracing, including different rabies virus strains, glycoprotein types, temperatures, expression construct designs, and elimination of glial labeling. The optimal configurations developed by the authors yield tracing efficiencies up to 5-10 fold higher than those of more typically used configurations.

      The results are solid and support the conclusions. However, the methods should be described in more detail to allow other zebrafish researchers to apply this method in their own work.

      Additionally, some findings are presented anecdotally, i.e., without quantification or sufficient detail to allow close examinations. Lastly, there is concern that the reagents created by the authors will not be easily accessible to the zebrafish community.

      (1) The titer used in each experiment was not stated. In the methods section, it is stated that aliquots are stored at 2x10^8. Is it diluted for injection? Were all of the experiments in the manuscript performed with the same titer?

      We injected all three viral vectors as undiluted stock aliquots. The titers for SADdG-mCherry[EnvA], CVSdG-tdTomato[EnvA], and CVSdG-mCherry-2A-Cre[EnvA] were 2 × 10^8, 2 × 10^8, and 3 × 10^8 infectious units/mL, respectively. This has been clarified in the updated Methods section (page 12).

      (2) The age for injection is quite broad (3-5 dpf in Fig 1 and 4-6 dpf in Fig 2). Given that viral spread efficiency is usually more robust in younger animals, describing the exact injection age for each experiment is critical.

      We appreciate the reviewer’s suggestions. For the initial experiments in Figure 1, which traced from randomly labeled neurons, the injection age was primarily 3–4 dpf, with a one-day difference. Because PCs develop more slowly, the injection age for the experiments related to Figures 2, 3, and 4 was mainly 5 dpf. To clarify the developmental stages at the time of injection for each experiment, we have added new tables (see Figure 1,2—table supplement 2) listing the number of fish used at each injection age for all experimental groups shown in Figures 1 and 2.

      (3) More details should be provided for the paired electrical stimulation-calcium imaging study. How many GC cells were tested? How many had corresponding PC cell responses? What is the response latency? For example, images of stimulated and recorded GCs and PCs should be shown.

      Yes, these are important details for the paired electrical stimulation-calcium imaging study. We stimulated 33 GCs from 32 animals and detected calcium responses in putative postsynaptic PCs in 15 cases. Among these, we successfully ablated the single GC in 11 pairs and observed a weakened calcium response in PCs following ablation (see Figure 2M). The response latency was determined as the first calcium imaging frame where ΔF/F exceeded the baseline (pre-stimulus average) by 3 times the standard deviation. Imaging was performed at 5 Hz, and as shown in Figure 2L, the calculated average response latency was 152 ± 35 ms (mean ± SEM), indicating an immediate response with calcium intensity from the first post-stimulus imaging frame consistently exceeding the threshold.
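
      The latency criterion described above (first frame where ΔF/F exceeds the pre-stimulus mean by 3 standard deviations) can be sketched as follows. This is only an illustrative sketch, not the authors' analysis code: the function name, the use of the sample standard deviation, and the convention that the stimulus coincides with a frame boundary are all assumptions. Because latency here is quantized to whole frames (200 ms at 5 Hz), sub-frame stimulus timing such as that implied by the reported 152 ms average is not modeled.

```python
from statistics import mean, stdev


def response_latency_ms(dff, stim_frame, frame_rate_hz=5.0, n_sd=3.0):
    """Latency (ms) of the first post-stimulus frame above threshold.

    dff        : sequence of dF/F values, one per imaging frame
    stim_frame : index of the frame at which the stimulus is delivered
                 (assumption: stimulus is aligned to a frame boundary)
    Returns None if no post-stimulus frame crosses the threshold.
    """
    baseline = dff[:stim_frame]  # pre-stimulus frames only
    if len(baseline) < 2:
        raise ValueError("need at least two baseline frames")
    # Threshold: baseline mean plus n_sd baseline standard deviations
    # (sample SD is an assumption; the source does not specify).
    threshold = mean(baseline) + n_sd * stdev(baseline)
    for i in range(stim_frame + 1, len(dff)):
        if dff[i] > threshold:
            return (i - stim_frame) * 1000.0 / frame_rate_hz
    return None


# Hypothetical trace: flat baseline, response in the first post-stimulus frame.
trace = [0.0, 0.02, -0.02, 0.01, -0.01, 0.5, 0.8]
print(response_latency_ms(trace, stim_frame=4))
```

      With a response in the first post-stimulus frame, as the authors report, this convention returns one frame period.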

      We have added additional details to the Results (page 5), Discussion (page 9), and Methods (page 15) sections. A representative image showing both the stimulated GC and the recorded PC has been added to Figure 2 in the revised manuscript (see Figure 2K).

      (4) It is unclear how connectivity between specific PC and GC is determined for single neuron connectivity. In other images (Figure 4C), there are usually multiple starter cells and many GCs. It was not shown that the image resolution can establish clear axon dendritic contacts between cell pairs.

      In our experiments, sparse labeling typically results in 1–10 starter cells per fish. Regarding the case shown in Figure 4C (right column), only two PC starters were labeled, which simplifies the assignment of presynaptic inputs to individual PCs. Connectivity is determined based on clear axon-dendritic or axon-cell body apposition between GCs and PCs. We have accordingly added more details to the Methods (page 16) section regarding how we determined connectivity between specific PCs and GCs.

      Reviewer #2 (Recommendations For The Authors):

      To enable broader use of this technique, I would encourage the authors to submit their zebrafish lines, plasmids, and plasmid sequences to public repositories such as ZIRC and Addgene. Additionally, there is no mention of how viral vectors will be shared.

      We have deposited the related zebrafish lines at CZRC (China Zebrafish Resource Center) and uploaded plasmid maps and sequences to Addgene. The viral vectors are available through BrainCase (Shenzhen, China). We have included the information in the revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      The authors establish reagents and define experimental parameters useful for defining neurons retrograde to a neuron of interest.

      Strengths:

      A clever approach, careful optimization, novel reagents, and convincing data together lead to convincing conclusions.

      Weaknesses: 

      In the current version of the manuscript, the tracing results could be better centered with respect to past work, certain methods could be presented more clearly, and other approaches are worth considering.

      Appraisal/Discussion:

      Trans-neuronal tracing in the larval zebrafish preparation has lagged behind rodent models, limiting "circuit-cracking" experiments. Previous work has demonstrated that pseudotyped rabies virus-mediated tracing could work, but published data suggested that there was considerable room for optimization. The authors take a major step forward here, identifying a number of key parameters to achieve success and establishing new transgenic reagents that incorporate modern intersectional approaches. As a proof of concept, the manuscript concludes with a rough characterization of inputs to cerebellar Purkinje cells. The work will be of considerable interest to neuroscientists who use the zebrafish model.

      Reviewer #3 (Recommendations For The Authors):

      The main limitations of the work are as follows:

      (1) The optimizations might differ for different neurons. Purkinje cells are noteworthy because they develop considerably during the time window detailed here, almost doubling in number between 7-14dpf. Presumably, connectivity follows. This sort of neurogenesis is much less common elsewhere. It would be useful to show similar results in, say, tectal neurons, which would have spatially-restricted retinal ganglion cells labelled.

      We acknowledge that Purkinje cells (PCs) undergo significant development between 7–14 dpf, which may influence synaptic connectivity and result in differences in tracing efficiency. However, all experimental conditions were standardized across groups, and the selection of starter PCs was unbiased, typically focusing on PCs in the lateral region of the CCe (corpus cerebelli) subregion, ensuring that the relative comparisons remain valid. 

      We agree that testing other neuronal populations would be valuable, as tracing efficiency is influenced by multiple factors, such as the number of endogenous inputs, synaptic maturation, and developmentally regulated synaptic strength. Tectal neurons, which receive spatially restricted retinal ganglion cell inputs, would be a suitable choice for further investigation. However, due to the various tectal cell types and the opacity of the eyeball, such studies present additional technical challenges and are beyond the scope of this paper.

      (2) The virus is delivered by means of microinjection near the cell. This is invasive and challenging for labs that don't routinely perform electrophysiology. It would be useful to know if coarser methods of viral delivery (e.g. intraventricular injection) would be successful.

      Our protocol does not require the level of precision needed for electrophysiology. The procedure can be performed using a standard high-magnification upright (135× magnification, Nikon SMZ18) or inverted fluorescence microscope (200× magnification, Olympus IX51). The virus suspension was loaded into a glass micropipette with a ~10 µm tip diameter and directly microinjected into the target region using a micromanipulator. The procedure was comparable to embryonic microinjection in terms of precision and operational control. Notably, direct contact with the target cells is not necessary, as the injected virus solution can diffuse and effectively infect nearby cells.  

      We had attempted intraventricular injection as an alternative, but it failed to produce robust labeling, reinforcing the necessity for direct tissue injection. 

      We have now included additional methodological details in the Methods section (page 13). 

      (3) Because of the combination of transgenic lines, plasmid injection, and viral type, it is often confusing to follow exactly what is being done for a particular experiment. It would be useful to specify the transgenic background used for each experiment using standard nomenclature e.g. "Plasmids were injected into Tg(elavl3:GAL4) fish." This is particularly important for the experiments in Figure 4: it isn't clear what the background used for the sparse labels was.

      We thank the reviewer for bringing this issue to our attention. To improve clarity, we have revised the figure legends to explicitly state the transgenic background, injected plasmids, and viral type used in each experiment, particularly for Figure 4.

      (4) Plasmids should be deposited with Addgene along with maps specifying the particular "codon-optimized Tetoff" per 388. 

      We confirm that all plasmids, including those containing codon-optimized Tetoff constructs, have been uploaded to Addgene along with detailed maps.

      (5) It would be useful to know if there were more apoptotic cells after transfection -- an acridine orange or comparable assay is recommended, rather than loss of fluorescence. 

      We appreciate the reviewer’s suggestion to assess apoptosis using acridine orange staining or comparable assays. We agree that such methods can provide more direct detection of apoptotic events. However, we believe that the difference in cytotoxicity is already evident in our current data: SAD-infected cells exhibit greater loss than CVS-infected cells (see Figure 3D). This is consistent with previous observations in mice, where greater toxicity of SAD compared to CVS was demonstrated using propidium iodide (PI) staining in cultured cells (Reardon et al., 2016).

      (6) Line 219-228 Hibi's lab has described the subtypes of granule cells in detail already; the work should discuss the tracings with respect to previous characterizations instead of limiting that work to a citation.

      Thank you for raising this point. We have expanded the Results section (page 6) to discuss the subtypes of GCs and PCs in relation to previously reported characterizations.

      (7) "Activities" is often used when "activity" is correct. The use of English in the manuscript is, by and large, excellent, but it's worth running the text through software like Grammarly to catch the occasional error.

      We have carefully edited the manuscript using professional language editing tools to correct any grammatical issues.

      (8) The experiments in 2J-2L would be more convincing if they were performed on inferior olive inputs as well -- especially given the small size of the granule cells. 

      We acknowledge the reviewer's observation that granule cells (GCs) are relatively small, which may underlie the finding that, out of 33 stimulated GCs, only 15 were capable of eliciting calcium responses in putative postsynaptic PCs. However, in all 11 pairs where a single GC was successfully ablated, we observed a weakened calcium response in PCs after the ablation (see Figure 2M), suggesting our tracing approach specifically identifies synaptically coupled neurons. We have clarified this point in the revised manuscript (page 5).

      We agree that verifying the IO inputs to PCs would strengthen the validity of our findings. However, in our experiments, the probability of tracing upstream IO cells was relatively low. This may be due to the developmental immaturity of the synapse and the fact that each PC typically receives input from a single IO cell. Additionally, the deep and distant anatomical location of the IO presents technical challenges for paired electrical stimulation-calcium imaging. To address these limitations, we are currently exploring the integration of viral tracing and optogenetics to further investigate IO-PC connectivity in future studies.

      (9) It would be useful if the manuscript discussed the efficacy of trans-synaptic labelling. What fraction of granule cell / olivary inputs to a particular Purkinje cell do the authors think their method captures?

      This is an important point for assessing the efficacy of our trans-synaptic labeling. Ideally, electron microscopy (EM) data would provide the most precise evaluation. In the absence of EM data, we estimated the number of GCs, IOs and PCs using light microscopy-based cell counting. 

      At approximately 7 dpf, we manually counted 327 ± 14 PCs and 2318 ± 70 GCs in the Tg(2×en.cpce-E1B:tdTomato-CAAX) and Tg(cbln12:GAL4FF);Tg(5×UAS:EGFP) zebrafish cerebellum, across all subregions (Va, CCe, EG, and LCa). Given the developmental increase in the number of GCs, the fact that some GCs have exclusively ipsilateral projections, and the fact that a single PC would not receive input from all parallel fibers, we estimate that by 10–14 dpf, a single PC receives approximately 1000–2000 GC inputs. Under optimal tracing conditions, we observed an average of 20 labeled GC inputs per PC, yielding a capture fraction of ~1–2%. Although this represents only a subset of total inputs, it is consistent with mammalian studies (Wall et al., 2010; Callaway et al., 2015), suggesting inherent limitations of this viral labeling approach.
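
      The ~1–2% capture fraction quoted above follows directly from dividing the average number of labeled GC inputs by the estimated range of total GC inputs per PC. The short sketch below reproduces that arithmetic using only the figures stated in the response; the function name is hypothetical and chosen for illustration.

```python
def capture_fraction(labeled_inputs, total_inputs):
    """Fraction of a cell's presynaptic inputs that the tracer labels."""
    if total_inputs <= 0:
        raise ValueError("total_inputs must be positive")
    return labeled_inputs / total_inputs


# Figures quoted in the response: ~20 labeled GCs per PC,
# and an estimated 1000-2000 GC inputs per PC at 10-14 dpf.
low = capture_fraction(20, 2000)   # upper bound on total inputs
high = capture_fraction(20, 1000)  # lower bound on total inputs
print(f"capture fraction: {low:.0%} to {high:.0%}")
```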

      For IO inputs, we counted 325 ± 26 inferior olivary neurons in Tg(elavl3:H2B-GCaMP6s) fish. A single PC likely receives input from one IO neuron, though an IO neuron may innervate multiple PCs. Accordingly, the observed capture rate for IO inputs was lower (7 out of 248 starters). 

      Further optimization is required to enhance the tracing efficiency. We have now incorporated a Discussion on this point in the revised manuscript (page 8).

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public review): 

      The authors present their new bioinformatic tool called TEKRABber, and use it to correlate expression between KRAB ZNFs and TEs across different brain tissues, and across species. While the aims of the authors are clear and there would be significant interest from other researchers in the field for a program that can do such correlative gene expression analysis across individual genomes and species, the presented approach and work display significant shortcomings. In the current state of the analysis pipeline, the biases and shortcomings mentioned below, for which I have seen no proof that they are accounted for by the authors, are severely impacting the presented results and conclusions. It is therefore essential that the points below are addressed, involving significant changes in the TEKRABber program as well as the analysis pipeline, to prevent the identification of false positive and negative signals, that would severely affect the conclusions one can raise about the analysis.

      Thank you very much for the insightful review of our manuscript. Since most of the comments on our revised version are not different from the comments on our first version, we repeated our previous answer, but wrote a new reply to the new concerns (please see the last two paragraphs). 

      We would also like to reiterate here that most of the critique of the reviewer concerns the performance of other tools and not TEKRABber presented in our manuscript. We consider it out of scope for this manuscript to improve other tools.

      My main concerns are provided below: 

      One important shortcoming of the biocomputational approach is that most TEs are not actually expressed, and others (Alus) are not a proxy of the activity of the TE class at all. I will explain: While specific TE classes can act as (species-specific) promoters for genes (such as LTRs) or are expressed as TE derived transcripts (LINEs, SVAs), the majority of other older TE classes do not have such behavior and are either neutral to the genome or may have some enhancer activity (as mapped in the program they refer to, 'TEffectR'). A big focus is on Alus, but Alus contribute to a transcriptome in a different way too: They often become part of transcripts due to alternative splicing. As such, the presence of Alu derived transcripts is not a proxy for the expression/activity of the Alu class, but rather a result of some Alus being part of gene transcripts (see also next point). The bottom line is that the TEKRABber software/approach is heavily prone to picking up both false positives (TEs being part of transcribed loci) and false negatives (TEs not producing any transcripts at all), which has a big implication for how reads from TEs as done in this study should be interpreted: The TE expression used to correlate the KRAB ZNF expression is simply not representing the species-specific influences of TEs that the authors are after.

      With the strategy as described, a lot of TE expression is misinterpreted: TEs can be part of gene-derived transcripts due to alternative splicing (often happens for Alus) or as a result of the TE being present in an inefficiently spliced out intron (happens a lot) which leads to TE-derived reads as a result of that TE being part of that intron, rather than that TE being actively expressed. As a result, the data as analysed is not reliably indicating the expression of TEs (as the authors intend to) and should be filtered for any reads that are coming from the above scenarios: These reads have nothing to do with KRAB ZNF control, and are not representing actively expressed TEs and therefore should be removed. Given that from my lab's experience in brain (and other) tissues, the proportion of RNA sequencing reads that are actually derived from active TEs is a stark minority compared to reads derived from TEs that happen to be in any of the many transcribed loci, applying this filtering is expected to have a huge impact on the results and conclusions of this study.

      We sincerely thank the reviewer for highlighting the potential issues of false positives and negatives in TE quantification. The reviewer provided valuable examples of how different TE classes, such as Alus, LTRs, LINEs, and SVAs, exhibit distinct behaviors in the genome. To our knowledge, specific tools like ERVmap (Tokuyama et al., 2018), which annotates ERVs, and LtrDetector (Joseph et al., 2019), which uses k-mer distributions to quantify LTRs, could indeed enhance precision by treating specific TE classes individually. We acknowledge that such approaches may yield more accurate results and appreciate the suggestion. 

      In our study, we used TEtranscripts (Jin et al., 2015) prior to TEKRABber. TEtranscripts applies the Expectation Maximization (EM) algorithm to assign ambiguous reads in the following steps. Uniquely mapped reads are first assigned to genes, and reads overlapping genes and TEs are assigned to TEs only if they do not uniquely match an annotated gene. The remaining ambiguous reads are distributed based on EM iterations. While this approach may not be as specialized as the latest tools for specific TE classes, it provides a general overview of TE activity. TEtranscripts outputs subfamily-level TE expression data, which we used as input for TEKRABber to perform downstream analyses such as differential expression and correlation studies.
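
      To illustrate the last step described above, the distribution of ambiguous reads by EM iterations, here is a deliberately simplified toy sketch. It is not TEtranscripts' implementation: the function name and data layout are hypothetical, and real tools also account for factors such as feature length that are ignored here. Each ambiguous read is split among its candidate features in proportion to the current abundance estimates (E-step), and the estimates are then recomputed from unique counts plus these fractional assignments (M-step).

```python
def em_assign(unique_counts, ambiguous_reads, n_iter=50):
    """Toy EM allocation of multi-mapped reads (illustrative only).

    unique_counts   : dict mapping feature -> number of uniquely assigned reads
    ambiguous_reads : list of tuples, each listing a read's candidate features
    Returns a dict mapping feature -> estimated read count.
    """
    features = set(unique_counts)
    for candidates in ambiguous_reads:
        features.update(candidates)

    # Initialise with unique counts plus an even split of each ambiguous read.
    est = {f: float(unique_counts.get(f, 0)) for f in features}
    for candidates in ambiguous_reads:
        for f in candidates:
            est[f] += 1.0 / len(candidates)

    for _ in range(n_iter):
        # E-step: split each ambiguous read in proportion to current estimates.
        frac = {f: 0.0 for f in features}
        for candidates in ambiguous_reads:
            total = sum(est[f] for f in candidates)
            for f in candidates:
                frac[f] += est[f] / total
        # M-step: new estimate = unique reads + fractional ambiguous reads.
        est = {f: unique_counts.get(f, 0) + frac[f] for f in features}
    return est


# Hypothetical example: two TE subfamilies sharing 10 ambiguous reads.
counts = em_assign({"AluY": 90, "L1HS": 10}, [("AluY", "L1HS")] * 10)
print({name: round(value, 2) for name, value in sorted(counts.items())})
```

      In this example the shared reads end up split roughly 9:1, mirroring the ratio of uniquely assigned reads, while the total read count is conserved.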

      We understand the importance of adapting tools to specific research objectives, including focusing on particular TE classes. TEKRABber is designed not to refine TE quantification at the mapping stage but to flexibly handle outputs from various TE quantification tools. It accepts raw TE counts as input in the form of dataframes, enabling diverse analytical pipelines. We would also like to clarify that, since the input data is transcriptomic, our primary focus is on expressed TEs, rather than the effects of non-expressed TEs in the genome. In the revised version of our manuscript, we emphasize this distinction in the discussion and provide examples of how TEKRABber can integrate with other tools to enhance specificity and accuracy.

      Another potential problem that I don't see addressed is that due to the high level of similarity of the many hundreds of KRAB ZNF genes in primates and the reads derived from them, and the inaccurate annotations of many KZNFs in non-human genomes, the expression data derived from RNA-seq datasets cannot be simply used to plot KZNF expression values, without significant work and manual curation to safeguard proper cross species ortholog-annotation: The work of Thomas and Schneider (2011) has studied this in great detail but genome-assemblies of non-human primates tend to be highly inaccurate in appointing the right ortholog of human ZNF genes. The problem becomes even bigger when RNA-sequencing reads are analyzed: RNA-sequencing reads from a human ZNF that emerged in great apes by duplication from an older parental gene (we have a decent number of those in the human genome) may be mapped to that older parental gene in Macaque genome: So, the expression of human-specific ZNF-B, that derived from the parental ZNF-A, is likely to be compared in their DESeq to the expression of ZNF-A in Macaque RNA-seq data. In other words, without a significant amount of manual curation, the DE-seq analysis is prone to lead to false comparisons which make the strategy and KRABber software approach described highly biased and unreliable.

      There is no doubt that there are differences in expression and activity of KRAB-ZNFs and TEs respectively that may have had important evolutionary consequences. However, because all of the network analyses in this paper rely on the analyses of RNA-seq data and the processing through the TE-KRABber software with the shortcomings and potential biases that I mentioned above, I need to emphasize that the results and conclusions are likely to be significantly different if the appropriate measures are taken to get more accurate and curated TE and KRAB ZNF expression data.

      We thank the reviewer for raising the important issue of accurately annotating the expanded repertoire of KRAB-ZNFs in primates, particularly the challenges of cross-species orthology and potential biases in RNA-seq data analysis. Indeed, we have also addressed this challenge in some of our previous papers (Nowick et al., 2010, Nowick et al., 2011 and Jovanovic et al., 2021).

      In the revised manuscript, we include more details about our two-step strategy to ensure accurate KRAB-ZNF ortholog assignments. First, we employed the Gene Order Conservation (GOC) score from Ensembl BioMart as a primary filter, selecting only one-to-one orthologs with a GOC score above 75% across primates. This threshold, recommended in Ensembl’s ortholog quality control guidelines, ensures high-confidence orthology relationships (http://www.ensembl.org/info/genome/compara/Ortholog_qc_manual.html#goc).

      Second, we incorporated data from Jovanovic et al. (2021), which independently validated KRAB-ZNF orthologs across 27 primate genomes. This additional layer of validation allowed us to refine our dataset, resulting in the identification of 337 orthologous KRAB-ZNFs for differential expression analysis (Figure S2).
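
      The two-step filter described above might be sketched as follows. This is a hedged illustration, not the authors' pipeline: the record layout and field names (`gene_id`, `orthology_type`, `goc_score`) are hypothetical, the gene identifiers are placeholders, and treating the second step as an intersection with an independently validated gene set is an assumption, since the response does not state exactly how the two sources were combined.

```python
def select_orthologs(ortholog_rows, validated_ids, min_goc=75):
    """Keep one-to-one orthologs with GOC above min_goc that are also validated.

    ortholog_rows : list of dicts with keys 'gene_id', 'orthology_type',
                    'goc_score' (hypothetical layout of an exported table)
    validated_ids : set of gene IDs confirmed by an independent ortholog study
    """
    kept = []
    for row in ortholog_rows:
        # Step 1a: require a one-to-one orthology relationship.
        if row["orthology_type"] != "ortholog_one2one":
            continue
        # Step 1b: require a GOC score strictly above the threshold
        # (strict inequality follows the "above 75%" wording).
        score = row["goc_score"]
        if score is None or score <= min_goc:
            continue
        # Step 2 (assumption): require independent validation.
        if row["gene_id"] not in validated_ids:
            continue
        kept.append(row["gene_id"])
    return kept


# Placeholder records, for illustration only.
rows = [
    {"gene_id": "GENE_A", "orthology_type": "ortholog_one2one", "goc_score": 100},
    {"gene_id": "GENE_B", "orthology_type": "ortholog_one2many", "goc_score": 100},
    {"gene_id": "GENE_C", "orthology_type": "ortholog_one2one", "goc_score": 50},
    {"gene_id": "GENE_D", "orthology_type": "ortholog_one2one", "goc_score": 100},
]
print(select_orthologs(rows, validated_ids={"GENE_A", "GENE_B", "GENE_C"}))
```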

      We acknowledge that different annotation methods or criteria may for some genes yield variations in the identified orthologs. However, we believe that this combination provides a robust starting point for addressing the challenges raised, while we remain open to additional refinements in future analyses.

      Finally, there are some minor but important notes I want to share:

      The association of certain variations in ZNF genes with neurological disorders such as AD, as reported in the introduction, is not entirely convincing without further functional support. Such associations could merely happen by chance, given the high number of ZNF genes in the human genome and the high chance that variations in these loci happen to associate with certain disease-associated traits. So these associations should be used with much more caution as an argument that changes in TEs and KRAB ZNF networks are important for diseases like AD.

      We fully acknowledge the concern that, given the large number of KRAB-ZNFs and their inherent variability, some associations with AD or other neurological disorders could occur by chance. This highlights the importance of additional functional studies to validate the causal role of KRAB-ZNF and TE interactions in disease contexts. While previous studies have indeed analyzed KRAB-ZNF and TE expression in human brain tissues, our study seeks to expand on this foundation by incorporating interspecies comparisons across primates. This approach enabled us to identify TE:KRAB-ZNF pairs that are uniquely present in healthy human brains, which may provide insights into their potential evolutionary significance and relevance to diseases like AD.

      In addition to analyzing RNA-seq data (GSE127898 and syn5550404), we have cross-validated our findings using ChIP-exo data for 159 KRAB-ZNF proteins and their TE binding regions in humans (Imbeault et al., 2017). This allowed us to identify specific binding events between KRAB-ZNF and TE pairs, providing further support for the observed associations. We agree with the reviewer that additional experimental validations, such as functional studies, are critical to further establish the role of KRAB-ZNF and TE networks in AD. We hope that future research can build upon our findings to explore these associations in greater detail.

      There are a number of papers where KRAB ZNF and TE expression are analysed in parallel in human brain tissues. So the novelty of that aspect of the presented study may be limited.

      We agree with the reviewer that many studies have examined the expression levels of KRAB-ZNFs and TEs in developing human brain tissues (Farmiloe et al., 2020; Turelli et al., 2020; Playfoot et al., 2021, among others). However, the novelty of our study lies in comparing KRAB-ZNF and TE expression across primate species, as well as in adult human brain tissues from both control individuals and those with Alzheimer’s disease. To our knowledge, no previous study has analyzed these data in this context. We therefore believe that our results will be of interest to evolutionary biologists and neurobiologists focusing on Alzheimer’s disease.

      Additional note after reviewing the revised version of the manuscript: 

      After reviewing the revised version of the manuscript, my criticism and concerns with this study are still evenly high and unchanged. To clarify, the revised version did not differ in essence from the original version; it seems that unfortunately, no efforts were taken to address the concerns raised on the original version of the manuscript, the results section as well as the discussion section are virtually unchanged.

      We regret that this reviewer was not satisfied with our changes. In fact, many of the points raised by this reviewer are important, but concern weaknesses of other tools. In our opinion, validating other tools would be out of scope for this paper. We want to emphasize that TEKRABber is not a quantification tool for sequencing data, but software for comparative analysis across species. We provided a detailed answer to the reviewer, and readers can refer to that answer in the public review above for further information.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      The authors present their new bioinformatic tool called TEKRABber, and use it to correlate expression between KRAB ZNFs and TEs across different brain tissues, and across species. While the aims of the authors are clear and there would be significant interest from other researchers in the field for a program that can do such correlative gene expression analysis across individual genomes and species, the presented approach and work display significant shortcomings. In the current state of the analysis pipeline, the biases and shortcomings mentioned below, for which I have seen no proof that they are accounted for by the authors, are severely impacting the presented results and conclusions. It is therefore essential that the points below are addressed, involving significant changes in the TEKRABber program as well as the analysis pipeline, to prevent the identification of false positive and negative signals, that would severely affect the conclusions one can raise about the analysis.

      Thank you very much for the insightful review of our manuscript.

      My main concerns are provided below:

      (1) One important shortcoming of the biocomputational approach is that most TEs are not actually expressed, and others (Alus) are not a proxy of the activity of the TE class at all. I will explain: While specific TE classes can act as (species-specific) promoters for genes (such as LTRs) or are expressed as TE-derived transcripts (LINEs, SVAs), the majority of other older TE classes do not have such behavior and are either neutral to the genome or may have some enhancer activity (as mapped in the program they refer to, 'TEffectR'). A big focus is on Alus, but Alus contribute to a transcriptome in a different way too: They often become part of transcripts due to alternative splicing. As such, the presence of Alu-derived transcripts is not a proxy for the expression/activity of the Alu class, but rather a result of some Alus being part of gene transcripts (see also next point). The bottom line is that the TEKRABber software/approach is heavily prone to picking up both false positives (TEs being part of transcribed loci) and false negatives (TEs not producing any transcripts at all), which has a big implication for how reads from TEs as done in this study should be interpreted: The TE expression used to correlate the KRAB ZNF expression is simply not representing the species-specific influences of TEs that the authors are after.

      With the strategy as described, a lot of TE expression is misinterpreted: TEs can be part of gene-derived transcripts due to alternative splicing (often happens for Alus) or as a result of the TE being present in an inefficiently spliced out intron (happens a lot) which leads to TE-derived reads as a result of that TE being part of that intron, rather than that TE being actively expressed. As a result, the data as analysed is not reliably indicating the expression of TEs (as the authors intend to) and should be filtered for any reads that are coming from the above scenarios: These reads have nothing to do with KRAB ZNF control, and are not representing actively expressed TEs and therefore should be removed. Given that from my lab's experience in the brain (and other) tissues, the proportion of RNA sequencing reads that are actually derived from active TEs is a stark minority compared to reads derived from TEs that happen to be in any of the many transcribed loci, applying this filtering is expected to have a huge impact on the results and conclusions of this study.

      We sincerely thank the reviewer for highlighting the potential issues of false positives and negatives in TE quantification. The reviewer provided valuable examples of how different TE classes, such as Alus, LTRs, LINEs, and SVAs, exhibit distinct behaviors in the genome. To our knowledge, specific tools like ERVmap (Tokuyama et al., 2018), which annotates ERVs, and LtrDetector (Joseph et al., 2019), which uses k-mer distributions to quantify LTRs, could indeed enhance precision by treating specific TE classes individually. We acknowledge that such approaches may yield more accurate results and appreciate the suggestion. 

      In our study, we used TEtranscripts (Jin et al., 2015) prior to TEKRABber. TEtranscripts applies the Expectation Maximization (EM) algorithm to assign ambiguous reads in the following steps. Uniquely mapped reads are first assigned to genes, and reads overlapping genes and TEs are assigned to TEs only if they do not uniquely match an annotated gene. The remaining ambiguous reads are distributed based on EM iterations. While this approach may not be as specialized as the latest tools for specific TE classes, it provides a general overview of TE activity. TEtranscripts outputs subfamily-level TE expression data, which we used as input for TEKRABber to perform downstream analyses such as differential expression and correlation studies.
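
      To make the last step concrete, the following is a minimal sketch of how an EM loop can redistribute ambiguous reads. It is not the TEtranscripts implementation: the function name `em_assign`, the input layout, the pseudocount, and the fixed iteration count are all assumptions chosen for illustration, and the subfamily labels are placeholders.

```python
from collections import defaultdict


def em_assign(unique_counts, ambiguous_reads, n_iter=50):
    """Distribute ambiguous reads across features with a simple EM loop.

    unique_counts:   dict mapping feature -> number of uniquely assigned reads
    ambiguous_reads: list of tuples; each tuple lists the features one read
                     is compatible with (hypothetical input layout)
    """
    features = set(unique_counts)
    for candidates in ambiguous_reads:
        features.update(candidates)

    # Start from the unique counts plus a tiny pseudocount so that no
    # candidate feature begins with zero weight.
    abundance = {f: unique_counts.get(f, 0) + 1e-6 for f in features}

    for _ in range(n_iter):
        expected = defaultdict(float)
        # E-step: split each ambiguous read in proportion to current abundance.
        for candidates in ambiguous_reads:
            total = sum(abundance[f] for f in candidates)
            for f in candidates:
                expected[f] += abundance[f] / total
        # M-step: abundance = unique reads + expected share of ambiguous reads.
        abundance = {f: unique_counts.get(f, 0) + expected[f] for f in features}

    return abundance


# Placeholder subfamilies: 8 and 2 unique reads, plus 10 reads matching both.
estimates = em_assign({"TE_A": 8, "TE_B": 2}, [("TE_A", "TE_B")] * 10)
```

      In this toy input the ten shared reads end up split roughly 4:1 in favour of the subfamily with more unique support, which is the behaviour the EM step is meant to provide.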

      We understand the importance of adapting tools to specific research objectives, including focusing on particular TE classes. TEKRABber is designed not to refine TE quantification at the mapping stage but to flexibly handle outputs from various TE quantification tools. It accepts raw TE counts as input in the form of dataframes, enabling diverse analytical pipelines. We would also like to clarify that, since the input data is transcriptomic, our primary focus is on expressed TEs, rather than the effects of non-expressed TEs in the genome. In the revised version of our manuscript, we emphasize this distinction in the discussion and provide examples of how TEKRABber can integrate with other tools to enhance specificity and accuracy.

      (2) Another potential problem that I don't see addressed is that due to the high level of similarity of the many hundreds of KRAB ZNF genes in primates and the reads derived from them, and the inaccurate annotations of many KZNFs in non-human genomes, the expression data derived from RNA-seq datasets cannot be simply used to plot KZNF expression values, without significant work and manual curation to safeguard proper cross-species ortholog annotation: The work of Thomas and Schneider (2011) has studied this in great detail, but genome assemblies of non-human primates tend to be highly inaccurate in assigning the right ortholog of human ZNF genes. The problem becomes even bigger when RNA-sequencing reads are analyzed: RNA-sequencing reads from a human ZNF that emerged in great apes by duplication from an older parental gene (we have a decent number of those in the human genome) may be mapped to that older parental gene in the macaque genome: So, the expression of human-specific ZNF-B, which derived from the parental ZNF-A, is likely to be compared in their DESeq analysis to the expression of ZNF-A in macaque RNA-seq data. In other words, without a significant amount of manual curation, the DESeq analysis is prone to lead to false comparisons which make the strategy and TEKRABber software approach described highly biased and unreliable.

      There is no doubt that there are differences in expression and activity of KRAB-ZNFs and TEs respectively that may have had important evolutionary consequences. However, because all of the network analyses in this paper rely on the analyses of RNA-seq data and the processing through the TEKRABber software with the shortcomings and potential biases that I mentioned above, I need to emphasize that the results and conclusions are likely to be significantly different if the appropriate measures are taken to get more accurate and curated TE and KRAB ZNF expression data.

      We thank the reviewer for raising the important issue of accurately annotating the expanded repertoire of KRAB-ZNFs in primates, particularly the challenges of cross-species orthology and potential biases in RNA-seq data analysis. Indeed, we have also addressed this challenge in some of our previous papers (Nowick et al., 2010, Nowick et al., 2011 and Jovanovic et al., 2021).

      In the revised manuscript, we include more details about our two-step strategy to ensure accurate KRAB-ZNF ortholog assignments. First, we employed the Gene Order Conservation (GOC) score from Ensembl BioMart as a primary filter, selecting only one-to-one orthologs with a GOC score above 75% across primates. This threshold, recommended in Ensembl’s ortholog quality control guidelines, ensures high-confidence orthology relationships (http://www.ensembl.org/info/genome/compara/Ortholog_qc_manual.html#goc).
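
      The first filtering step can be sketched as below. This is an illustration only, not the code used in the manuscript: the record layout, the field names `orthology_type` and `goc_score`, the label `ortholog_one2one`, and the gene names are assumptions about how an exported ortholog table might look.

```python
def filter_orthologs(records, min_goc=75):
    """Keep one-to-one orthologs whose GOC score is strictly above min_goc.

    records: list of dicts with the hypothetical keys
             'gene', 'orthology_type', and 'goc_score' (0-100).
    """
    return [
        r for r in records
        if r["orthology_type"] == "ortholog_one2one" and r["goc_score"] > min_goc
    ]


# Placeholder records; the gene names are invented for the example.
example_records = [
    {"gene": "ZNF_example_1", "orthology_type": "ortholog_one2one", "goc_score": 100},
    {"gene": "ZNF_example_2", "orthology_type": "ortholog_one2one", "goc_score": 50},
    {"gene": "ZNF_example_3", "orthology_type": "ortholog_one2many", "goc_score": 100},
]
kept = filter_orthologs(example_records)
```

      The comparison is strict because the text specifies a score above 75%; a pair scoring exactly 75 would be dropped under this reading.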

      Second, we incorporated data from Jovanovic et al. (2021), which independently validated KRAB-ZNF orthologs across 27 primate genomes. This additional layer of validation allowed us to refine our dataset, resulting in the identification of 337 orthologous KRAB-ZNFs for differential expression analysis (Figure S2).

      We acknowledge that different annotation methods or criteria may for some genes yield variations in the identified orthologs. However, we believe that this combination provides a robust starting point for addressing the challenges raised, while we remain open to additional refinements in future analyses.

      (3) The association with certain variations in ZNF genes with neurological disorders such as AD, as reported in the introduction is not entirely convincing without further functional support. Such associations could merely happen by chance, given the high number of ZNF genes in the human genome and the high chance that variations in these loci happen to associate with certain disease-associated traits. So using these associations as an argument that changes in TEs and KRAB ZNF networks are important for diseases like AD should be used with much more caution.

      There are a number of papers where KRAB ZNF and TE expression are analysed in parallel in human brain tissues. So the novelty of that aspect of the presented study may be limited.

      We fully acknowledge the concern that, given the large number of KRAB-ZNFs and their inherent variability, some associations with AD or other neurological disorders could occur by chance. This highlights the importance of additional functional studies to validate the causal role of KRAB-ZNF and TE interactions in disease contexts. While previous studies have indeed analyzed KRAB-ZNF and TE expression in human brain tissues, our study seeks to expand on this foundation by incorporating interspecies comparisons across primates. This approach enabled us to identify TE:KRAB-ZNF pairs that are uniquely present in healthy human brains, which may provide insights into their potential evolutionary significance and relevance to diseases like AD.

      In addition to analyzing RNA-seq data (GSE127898 and syn5550404), we have cross-validated our findings using ChIP-exo data for 159 KRAB-ZNF proteins and their TE binding regions in humans (Imbeault et al., 2017). This allowed us to identify specific binding events between KRAB-ZNF and TE pairs, providing further support for the observed associations. We agree with the reviewer that additional experimental validations, such as functional studies, are critical to further establish the role of KRAB-ZNF and TE networks in AD. We hope that future research can build upon our findings to explore these associations in greater detail.

      Reviewer #1 (Recommendations for the authors):

      It is essential before this work can be considered for publication, that the points above are addressed, involving significant changes in the TEKRABber program as well as the analysis pipeline, to prevent the identification of false positive and negative signals, that would severely affect the conclusions one can raise about the analysis.

      We sincerely appreciate the reviewer’s insightful recommendations and constructive feedback. Each specific point has been carefully addressed in detail in the public reviews section above.

      Reviewer #2 (Public review)

      Summary:

      The aim was to decipher the regulatory networks of KRAB-ZNFs and TEs that have changed during human brain evolution and in Alzheimer's disease.

      Strengths:

      This solid study presents a valuable analysis and successfully confirms previous assumptions, but also goes beyond the current state of the art.

      Weaknesses:

      The design of the analysis needs to be slightly modified and a more in-depth analysis of the positive correlation cases would be beneficial. Some of the conclusions need to be reinterpreted.

      We sincerely thank the reviewer for the thoughtful summary, positive evaluation of our study, and constructive feedback. We appreciate the recognition of the strengths in our analysis and the valuable suggestions for improving its design and interpretation. 

      We would like to briefly comment on the suggested modifications to the design here and will provide a detailed point-by-point review later with our revised manuscript. 

      The reviewer recommended considering a more recent timepoint, such as less than 25 million years ago (mya), to define the "evolutionary young group" of KRAB-ZNF genes and TEs when discussing the arms-race theory. This is indeed a valuable perspective, as the TE-repressing functions of KRAB-ZNF proteins may have evolved more recently than the split between Old World Monkeys (OWM) and New World Monkeys (NWM) at 44.2 mya we used.

      Our rationale for selecting 44.2 mya is based on certain primate-specific TEs such as the Alu subfamilies, which emerged after the rise of Simiiformes and have been used in phylogenetic studies (Xing et al., 2007 and Williams et al., 2010). This timeframe allowed us to investigate the potential co-evolution of KRAB-ZNFs and TEs in species that emerged after the OWM-NWM split (e.g., humans, chimpanzees, bonobos, and macaques used for this study). However, focusing only on KRAB-ZNFs and TEs younger than 25 million years would limit the analysis to just 9 KRAB-ZNFs and 92 TEs expressed in our datasets. While we will not conduct a reanalysis using this more recent timepoint, we will integrate the recommendation into the discussion section of the revised manuscript. 

      Furthermore, we greatly appreciate the reviewer's detailed insights and suggestions for refining specific descriptions and interpretations in our manuscript. We will address these points in the revised version to ensure the content is presented with greater precision and clarity.

      Once again, we thank both reviewers for their valuable feedback, which provides significant input for strengthening our study.

      Reviewer #2 (Recommendations for the authors):

      We thank the reviewer for the very insightful comments, which helped a lot in our interpretation and discussion of our results and in improving some of our statements.

      The present study seeks to uncover how the repression of transposable elements (TEs) by rapidly evolving KRAB-ZNF genes, which are known for their role in TE suppression, may influence human brain evolution and contribute to Alzheimer's disease (AD). Utilizing their previously developed tool, TEKRABber, the researchers analyze transcriptome datasets from the brains of four species of Old World Monkeys (OWM) alongside samples from healthy human individuals and AD patients.

      Through bipartite network analysis, they identify KRAB-ZNF/Alu-TE interactions as the most negatively correlated in the network, highlighting the repression of Alu elements by KRAB-ZNF proteins. In AD patient samples, they observe a reduction in a subnetwork comprising 21 interactions within an Alu TE module. These findings are consistent with earlier evidence that: (1) KRAB-ZNFs are involved in suppressing evolutionarily young Alu TEs; and (2) specific Alu elements have been reported to be deregulated in AD. The study also validates previous experimental ChIP-exo data on KRAB-ZNF proteins obtained in a different cell type (Imbeault et al., 2017).

      As a novelty, the study identifies a human-specific amino acid variation in ZNF528, which directly contacts DNA nucleotides, showing signs of positive selection in humans and several human-specific TE interactions.

      Interestingly, in addition to the negative links, the researchers observed predominantly positive connections with other TEs, suggesting that while their approach is consistent with some previous observations, the authors conclude that it provides limited support for the 'genetic arms race' hypothesis.

      The reviewer is a specialist in TE and evolutionary research.

      Major issues:

      The study demonstrates the usefulness of the TEKRABber tool, which can support and successfully validate previous observations. However, there are several misconceptions and problems with the interpretation of the results.

      KRAB-ZNF proteins in repressing TEs in vertebrates

      In the Abstract: "In vertebrates, some KRAB-ZNF proteins repress TEs, offering genomic protection."

      Although some KRAB-ZNF proteins exist in vertebrates, their TE-suppression role is not as prominent or specialized as it is in mammals, where it serves as a key defense mechanism against the mobilization of TEs.

      We appreciate the reviewer’s clarification regarding the role of KRAB-ZNF proteins in vertebrates. To improve accuracy and precision, we have revised the wording to specify that this mechanism is primarily observed in mammals rather than vertebrates.

      The definition of young and old

      The study considers the evolutionary age of young (≤ 44.2 mya) and old (> 44.2 mya). This is the time of the Old World Monkey (OWM) and New World Monkey (NWM) split. Importantly, however, the KRAB-ZNF / KAP1 suppression system primarily suppresses evolutionarily younger TEs (< 25 MY old). These TEs are relatively new additions to the genome, i.e. they are specific to certain lineages (such as primates or hominins) and are more likely to be actively transcribed (and recognized as foreign by innate immunity) or have residual activity upon transposition. Examples include certain subfamilies of LINE-1, Alu (Y, S, less effective for J), SVA and younger human endogenous retroviruses (HERVs) such as HERV-K. The KRAB-ZNF / KAP1 system therefore focuses primarily on TEs that have evolved more recently in primates, in the last few million years (within the last 25 million years). Older TEs are controlled by broader epigenetic mechanisms such as DNA methylation, histone modifications, etc. Therefore, the age (≤ 44.2 mya) is not suitable to define it as young.

      In this context, the specific TEs of the Simiiformes cannot be considered as 'recently evolved' (in the Abstract). The Simiiformes contain both OWM and NWM. Notably, the study includes four species, all of which belong to the OWMs.

      The 'genetic arms race' theory

      Unfortunately, the problematic definition of young and old could also explain why the authors conclude that their data only weakly support the 'genetic arms race' hypothesis.

      The KRAB-ZNF proteins evolve rapidly, similar to TEs, which raises the 'genetic arms race' hypothesis. This hypothesis refers to the constant evolutionary struggle between organisms and TEs. TEs constantly evolve to overcome host defences, while host genomes develop mechanisms to suppress these potentially harmful elements. Indeed, in mammals, an important example is the KRAB-ZNF/TE interaction. The KRAB-ZNF proteins rapidly evolve to target specific TEs, creating a 'genetic arms race' in which each side - TEs and the KRAB-ZNF/KAP1 (alias TRIM28) repressor complex - drives the evolution of the other in response to adaptive pressure. Importantly, the 'genetic arms race' hypothesis describes the evolutionary process that occurs between TE and host when the TE is deleterious. Again, this includes the young TEs (< 25 MY old) with residual transposition activity or those that are actively transcribed and exacerbate cellular stress and inflammatory responses. Approximately 25 million years ago, the superfamilies Hominoidea (apes) and Cercopithecoidea (Old World monkeys, i.e., macaque) split.

      Just to clarify, our initial study aim was to examine whether TEs exhibit any evolutionary relationships with KRAB-ZNFs across the four studied species (human, chimpanzee, bonobo, and macaque). For investigating the arms-race hypothesis, we really appreciate the reviewer suggesting a more recent time point, such as less than 25 million years ago (mya), to define the "evolutionary young group" of TEs and KRAB-ZNF genes. This is indeed a valuable recommendation, as 25 mya marks the emergence of Hominoidea (Figure 2C in the manuscript), making it a meaningful reference point for studying recently evolved KRAB-ZNFs and TEs. However, restricting the analysis to elements younger than 25 mya would reduce the dataset to only 9 KRAB-ZNFs and 92 TEs. Nevertheless, we provide here our results for those elements in Table S7:

      We observed that among the correlations in the < 25 mya subset, negative correlations (7) outnumbered positive ones (2). However, these correlations were derived from only 3 out of 9 KRAB-ZNFs and 9 out of 92 TE subfamilies. Therefore, based on our data, while the < 25 mya group shows a higher proportion of negative correlations, the sample size is too limited to derive networks or draw robust conclusions in our analysis, especially when compared to our original evolutionary age threshold of 44.2 mya. For this reason, we chose not to reanalyze the data but rather to acknowledge that our current definition of “young” may not be optimal for testing the arms-race model in humans. While previous studies (Jacobs et al., 2014; Bruno et al., 2019; Zuo et al., 2023) have explored relevant KRAB-ZNF and TE interactions, our review of the KRAB-ZNFs and TEs highlighted in those works suggests that a specific focus on elements <25 mya has not been a primary emphasis. 
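
      The kind of tally reported above can be sketched as follows. The function name, the input layout, and all values in the example are hypothetical and do not reproduce Table S7; the sketch only shows how restricting both partners of a pair to an age cutoff shrinks the set of correlations that can be counted.

```python
def tally_by_sign(pairs, ages, max_age):
    """Count negative and positive correlations among young pairs.

    pairs:   list of (krab_znf, te, correlation_coefficient) tuples
    ages:    dict mapping each element name to its estimated age in mya
    max_age: cutoff in mya; both partners must be strictly younger
    Returns a (negative, positive) tuple.
    """
    negative = positive = 0
    for znf, te, r in pairs:
        if ages[znf] < max_age and ages[te] < max_age:
            if r < 0:
                negative += 1
            elif r > 0:
                positive += 1
    return negative, positive


# Invented placeholder data, not taken from the manuscript.
example_ages = {"znf1": 20.0, "znf2": 40.0, "te1": 10.0, "te2": 24.0}
example_pairs = [("znf1", "te1", -0.6), ("znf1", "te2", 0.4), ("znf2", "te1", -0.3)]
young_25 = tally_by_sign(example_pairs, example_ages, 25)
young_44 = tally_by_sign(example_pairs, example_ages, 44.2)
```

      Note that the sketch uses a strict inequality for both cutoffs, whereas the manuscript's original definition of young includes elements of exactly 44.2 mya.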

      "our findings only weakly support the arms-race hypothesis. Firstly, we noted that young TEs exhibit lower expression levels than old TEs (Figure 2D and 5B), which might not be expected if they had recently escaped repression". - This is a misinterpretation. These old TEs are no longer harmful. This is not the case of the 'genetic arms race'.

      We sincerely appreciate the reviewer’s comments, which have helped us refine our interpretation to prevent potential misunderstandings. Our initial expectation, based on the arms-race hypothesis, was that young TEs would exhibit higher expression levels due to a recent escape from repression, while young KRAB-ZNFs would show increased expression as a counter-adaptive response. However, our findings indicate that both young TEs and young KRAB-ZNFs exhibit lower expression levels. This observation does not align with the classical arms-race model, which typically predicts an ongoing cycle of adaptive upregulation. We have rephrased the sentences in our discussion to make our idea clearer. In addition, we added the notion that older TEs might no longer be harmful, which we agree with.

      "Additionally, some young TEs were also negatively correlated with old KRAB-ZNF genes, leading to weak assortativity regarding age inference, which would also not be in line with the arms-race idea."

      This is not a contradiction, as an old KRAB-ZNF gene could be 'reactivated' to protect against young TEs. (It might be cheaper for the host than developing a brand new KRAB-ZNF gene.)

      We agree with the reviewer's point that older KRAB-ZNFs may be reactivated to suppress young TEs, potentially as a more cost-effective evolutionary strategy than the emergence of entirely new KRAB-ZNFs. We have incorporated this perspective into the revised manuscript to provide a more detailed discussion of our findings.

      TEs remain active

      In the abstract: "Notably, KRAB-ZNF genes evolve rapidly and exhibit diverse expression patterns in primate brains, where TEs remain active."

      This is not precise. TEs do not generally remain active in the brain. It is only the autonomous LINE-1 (young) and non-autonomous Alu (young) and SVA (young) elements that can be mobilized by LINE-1. In addition, the evolutionarily young HERV-K is recognized as foreign and alerts the innate immune system (DOI: 10.1172/jci.insight.131093) and is a target of the KRAB-ZNF/KAP1 suppression system.

      In the abstract: "Evidence indicates that transposable elements (TEs) can contribute to the evolution of new traits, despite often being considered deleterious."

      Oversimplification: The harmful and repurposed TEs are washed together.

      We appreciate the reviewer’s detailed suggestions for improving the precision of our abstract. While we previously mentioned LINE-1 and Alu elements in the introduction, we now explicitly specify in the abstract that only certain TE subfamilies, such as autonomous LINE-1 and non-autonomous Alu and SVA elements, remain active in the primate brain. Additionally, we have refined the phrasing regarding the role of TEs in evolution to clearly distinguish between their deleterious effects and their potential for functional repurposing. These clarifications have been incorporated into the revised abstract to ensure greater accuracy and nuance.

      Positive links

      "The high number of positive correlations might be surprising, given that KRAB-ZNFs are considered to repress TEs."

      Based on the above, it is not surprising that negative associations are only found with young (< 25 my) TEs. In fact, the relationship between old KRAB-ZNF proteins and old (non-damaging) TEs could be neutral/positive. The case of ZNF528 could be a valuable example of this.

      We thank the reviewer for providing this plausible interpretation and added it to the manuscript.

      "276 TE:KRAB-ZNF with positive correlations in humans were negatively correlated in bonobos"  It would be important to characterise the positive correlations in more detail. Could it be that the old KRAB-ZNF proteins lost their ability to recruit KAP1/TRIM28? Demonstrate it.

      The strategy of developing sequence-specific DNA recognition domains that can specifically recognise TEs is expensive for the host. Recent studies suggest that when the TE is no longer harmful, these proteins/connections can be occasionally repurposed. The repurposed function would probably differ from the original suppressive function.

      In my opinion, the TEKRABber tool could be useful in identifying co-option events:

      We appreciate the reviewer’s suggestion regarding the characterization of positive correlations. While it is possible that some old KRAB-ZNF proteins have lost their ability to recruit KAP1/TRIM28, we cannot conclude this definitively for all cases. To address this, we examined ChIP-exo data from Imbeault et al. (2017) (Accession: GSE78099) and analyzed the overlap of binding sites between KRAB-ZNFs, KAP1/TRIM28, and RepeatMasker-annotated TEs. Our results indicate that some old KRAB-ZNFs still exhibit binding overlap with KAP1 at TE regions, suggesting that their repressive function may be at least partially retained (Author response image 1).

      Author response image 1. Overlap of KAP1, zinc finger proteins, and RepeatMasker annotation. Here we detect the overlap of ChIP-exo binding events using KAP1/TRIM28, with KRAB-ZNF genes (one at a time) and RepeatMasker annotation. (115 old and 58 young KRAB-ZNFs, Mann-Whitney, p<0.01).
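
      The overlap test behind this comparison can be illustrated with a small sketch. It assumes half-open `(chrom, start, end)` intervals and uses a naive pairwise scan; the function names and coordinates are hypothetical, and this is not the pipeline used for the image. A real analysis over genome-wide peak sets would normally rely on a dedicated interval tool such as bedtools.

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def triple_overlap(znf_peaks, kap1_peaks, te_annotations):
    """Return KRAB-ZNF peaks that overlap both a KAP1 peak and an annotated TE."""
    return [
        peak for peak in znf_peaks
        if any(overlaps(peak, k) for k in kap1_peaks)
        and any(overlaps(peak, t) for t in te_annotations)
    ]


# Invented coordinates for illustration only.
znf_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
kap1_peaks = [("chr1", 150, 250), ("chr2", 150, 180)]
te_annotations = [("chr1", 190, 300), ("chr1", 550, 650)]
shared = triple_overlap(znf_peaks, kap1_peaks, te_annotations)
```

      Counting such shared sites per KRAB-ZNF, one factor at a time, would give the per-gene values that can then be compared between old and young groups.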

      Minor

      "Lead poisoning causes lead ions to compete with zinc ions in zinc finger proteins, affecting proteins such as DNMT1, which are related to the progression of AD (Ordemann and Austin 2016)."

      Not precise: While DNMT1 does contain zinc-binding domains, it is not categorized as a zinc finger protein.

      We appreciate the reviewer’s insight regarding the classification of DNMT1. After careful consideration, we have removed this sentence from the introduction to maintain focus on KRAB zinc finger proteins.

      Definition of TEs

      "There were 324 KRAB-ZNFs and 895 TEs expressed in Primate Brain Data." Define it more precisely. It is not clear, what the authors mean by TEs: Are these TE families, subfamilies? Provide information on copy numbers of each in the analysed four species.

      We appreciate the reviewer’s suggestion to clarify our definition of TEs. To improve precision, we have specified that the analysis was conducted at the subfamily level. Additionally, we have provided the copy numbers of TEs for the four analyzed species in Table S4.

      Occupancy of TEs in the genome

      "TEs comprise (i) one third to one half of the mammalian genome and are (ii) not randomly distributed..."

      (i) The most accepted number is 45%. However, some more recent reports estimate over 50%, thus the one third is an underestimation.

      (ii) Not randomly distributed among the mammalian species?

      (i) We thank the reviewer for pointing out that our statement about the abundance of TEs was outdated. We have updated the estimate to reflect that TEs can occupy more than half of the genome, based on recent publications.

      (ii) We acknowledge the reviewer’s concern regarding the distribution of TEs. Although TEs are interspersed throughout the genome, their insertion sites are not entirely random, as they tend to exhibit preferences for certain genomic regions. To clarify this, we have revised the wording in the paragraph accordingly.

      We would like to express our sincere gratitude to both reviewers for their insightful feedback, which has been instrumental in enhancing the quality of our study.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      In this study, Ana Lapao et al. investigated the roles of Rab27 effector SYTL5 in cellular membrane trafficking pathways. The authors found that SYTL5 localizes to mitochondria in a Rab27A-dependent manner. They demonstrated that SYTL5-Rab27A positive vesicles containing mitochondrial material are formed under hypoxic conditions, thus they speculate that SYTL5 and Rab27A play roles in mitophagy. They also found that both SYTL5 and Rab27A are important for normal mitochondrial respiration. Cells lacking SYTL5 undergo a shift from mitochondrial oxygen consumption to glycolysis, a common process in cancer cells known as the Warburg effect. Based on the cancer patient database, the authors noticed that low SYTL5 expression is related to reduced survival for adrenocortical carcinoma patients, indicating SYTL5 could be a negative regulator of the Warburg effect and potentially tumorigenesis.

      Strengths:

      The authors take advantage of multiple techniques and novel methods to perform the experiments.

      (1) Live-cell imaging revealed that stably inducible expression of SYTL5 co-localized with filamentous structures positive for mitochondria. This result was further confirmed by using correlative light and EM (CLEM) analysis and western blotting from purified mitochondrial fraction.

      (2) In order to investigate whether SYTL5 and Rab27A are required for mitophagy in hypoxic conditions, two established mitophagy reporter U2OS cell lines were used to analyze the autophagic flux.

      Weaknesses:

      This study revealed a potential function of SYTL5 in mitophagy and mitochondrial metabolism. However, the mechanistic evidence that establishes the relationship between SYTL5/Rab27A and mitophagy is insufficient. The involvement of SYTL5 in ACC needs more investigation. Furthermore, images and results supporting the major conclusions need to be improved.

      We thank the reviewer for their constructive comments. We agree that a complete understanding of the mechanism by which SYTL5 and Rab27A are recruited to the mitochondria and subsequently involved in mitophagy requires further investigation. Here, we have shown that SYTL5 recruitment to the mitochondria requires both its lipid-binding C2 domains and the Rab27A-binding SHD domain (Figure 1G-H). This implies a coincidence detection mechanism for mitochondrial localisation of SYTL5.  Additionally, we find that mitochondrial recruitment of SYTL5 is dependent on the GTPase activity and mitochondrial localisation of Rab27A (Figure 2D-E). We also identified proteins linked to the cellular response to oxidative stress, reactive oxygen species metabolic process, regulation of mitochondrion organisation and protein insertion into mitochondrial membrane to be enriched in the SYTL5 interactome (Figure 3A and C).

      However, fewer details regarding the mitochondrial localisation of Rab27A are understood. To investigate this, we have now performed a mass spectrometry analysis to identify the interactome of Rab27A (see Author response table 1 below). U2OS cells with stable expression of mScarlet-Rab27A or mScarlet only were subjected to immunoprecipitation, followed by MS analysis. Of the 32 significant Rab27A-interacting hits (compared to control), two are located in the inner mitochondrial membrane (IMM): ATP synthase F(1) complex subunit alpha (P25705) and mitochondrial very long-chain specific acyl-CoA dehydrogenase (VLCAD) (P49748). However, as these IMM proteins are not likely involved in the mitochondrial recruitment of Rab27A observed under basal conditions, we chose not to include these data in the manuscript.

      It is known that other RAB proteins are recruited to the mitochondria. During parkin-mediated mitophagy, RABGEF1 (a guanine nucleotide exchange factor) is recruited through its ubiquitin-binding domain and directs mitochondrial localisation of RAB5, which subsequently leads to recruitment of RAB7 by the MON1/CCZ1 complex[1]. As already mentioned in the discussion (p. 12), ubiquitination of the Rab27A GTPase activating protein alpha (TBC1D10A) is reduced in the brain of Parkin KO mouse compared to controls[35], suggesting a possible connection of Rab27A with regulatory mechanisms that are linked with mitochondrial damage and dysfunction. While this is an interesting avenue to explore, in this paper we will not follow up further on the mechanism of mitochondrial recruitment of Rab27A.

      Author response table 1.

      Rab27A interactome. Proteins co-immunoprecipitated with mScarlet-Rab27A vs mScarlet expressing control. The data show average of three replicates. 

      To investigate the role of SYTL5 in the context of ACC, we acquired the NCI-H295R cell line isolated from the adrenal gland of an adrenal cancer patient. The cells were cultured as recommended by ATCC using DMEM/F-12 supplemented with NuSerum and ITS+ premix. It is important to note that the H295R cells were adapted to grow as an adherent monolayer from the H295 cell line, which grows in suspension. However, there can still be many viable H295R cells in the media.

      We attempted to conduct OCR and ECAR measurements using the Seahorse XF upon knockdown of SYTL5 and/or Rab27A in H295R cells. For these assays, it is essential that the cells be seeded in a monolayer at 70-90% confluency with no cell clusters[4]. Poor adhesion of the cells can cause inaccurate measurements by the analyser. Unfortunately, the results between the five replicates we carried out were highly inconsistent: the same knockdown produced trends in opposite directions in different replicates. This is likely due to problems with seeding the cells. Despite our best efforts to optimise seeding number and pre-coat the plate with poly-D-lysine[5], we observed poor attachment of cells and inability to form a monolayer.

      To study the localisation of SYTL5 and Rab27A in an ACC model, we transduced the H295R cells with lentiviral particles to overexpress pLVX-SV40-mScarlet-I-Rab27A and pLVX-CMV-SYTL5-EGFP-3xFLAG. Again, this proved unsuccessful after numerous attempts at optimising transduction. 

      These issues limited our investigation into the role of SYTL5 in ACC to the cortisol assay (Supplementary Figure 6). For this the H295R cells were an appropriate model as they are able to produce an array of adrenal cortex steroids[6] including cortisol[7]. In this assay, measurements are taken from cell culture supernatants, so the confluency of the cells does not prevent consistent results as the cortisol concentration was normalised to total protein per sample. With this assay we were able to rule out a role for SYTL5 and Rab27A in the secretion of cortisol.  

      Another consideration when investigating the involvement of SYTL5 in ACC, is that in general ACC cells should have a low expression of SYTL5 as is seen from the patient expression data (Figure 6B).

      The reviewer also writes “Furthermore, images and results supporting the major conclusions need to be improved.” We have tried several times, without success, to generate U2OS cells with CRISPR/Cas9-mediated C-terminal tagging of endogenous SYTL5 with mNeonGreen, using an approach that has been successfully implemented in the lab for other genes. This is likely due to a lack of suitable sgRNAs targeting the C-terminal region of SYTL5: the available sgRNAs have a low predicted efficiency score and a large number of predicted off-target sites in the human genome, including several other gene exons and introns (see Author response image 2).

      We have also included new data (Supplementary Figure 4B) showing that some of the hypoxia-induced SYTL5-Rab27A-positive vesicles stain positive for the autophagy markers p62 and LC3B when inhibiting lysosomal degradation, further strengthening our data that SYTL5 and Rab27A function as positive regulators of mitophagy.  

      Reviewer #2 (Public review): 

      Summary:

      The authors provide convincing evidence that Rab27 and SYTL5 work together to regulate mitochondrial activity and homeostasis.

      Strengths:

      The development of models that allow the function to be dissected, and the rigorous approach and testing of mitochondrial activity.

      Weaknesses:

      There may be unknown redundancies in both pathways in which Rab27 and SYTL5 are working which could confound the interpretation of the results.

      Suggestions for revision:

      Given that Rab27A and SYTL5 are members of protein families it would be important to exclude any possible functional redundancies coming from Rab27B expression or one of the other SYTL family members. For Rab27 this would be straightforward to test in the assays shown in Figure 4 and Supplementary Figure 5. For SYTL5 it might be sufficient to include some discussion about this possibility.

      We thank the reviewer for pointing out the potential redundancy issue for Rab27A and SYTL5. There are multiple studies demonstrating the redundancy between Rab27A and Rab27B. For example, in a study of the disease Griscelli syndrome, caused by Rab27A loss of function, expression of either Rab27A or Rab27B rescues the healthy phenotype, indicating redundancy[8]. This redundancy, however, applies only to certain functions and cell types. In fact, in a study regarding hair growth, knockdown of Rab27B had the opposite effect to knockdown of Rab27A[9].

      In this paper, we conducted all assays in U2OS cells, in which the expression of Rab27B is very low. Human Protein Atlas reports expression of 0.5 nTPM for Rab27B, compared to 18.4 nTPM for Rab27A. We also observed this low level of expression of Rab27B compared to Rab27A by qPCR in U2OS cells. Therefore, there would be very little endogenous Rab27B expression in cells depleted of Rab27A (with siRNA or KO). In line with this, Rab27B peptides were not detected in our SYTL5 interactome MS data (Table 1 in paper). Moreover, as Rab27A depletion inhibits mitochondrial recruitment of SYTL5 and mitophagy, it is not likely that Rab27B provides a functional redundancy. It is possible that Rab27B overexpression could rescue mitochondrial localisation of SYTL5 in Rab27A KO cells, but this was not tested as we do not have any evidence for a role of Rab27B in these cells. Taken together, we believe our data imply that Rab27B is very unlikely to provide any functional redundancy to Rab27A in our experiments.

      For the SYTL family, all five members are Rab27 effectors, binding to Rab27 through their SHD domain. Together with Rab27, all SYTLs have been implicated in exocytosis in different cell types. For example, SYTL1 in exocytosis of azurophilic granules from neutrophils[10], SYTL2 in secretion of glucagon granules from pancreatic α cells[11], SYTL3 in secretion of lytic granules from cytotoxic T lymphocytes[12], SYTL4 in exocytosis of dense hormone-containing granules from endocrine cells[13] and SYTL5 in secretion of the RANKL cytokine from osteoblasts[14]. This indicates a potential for redundancy through their binding to Rab27 and function in vesicle secretion/trafficking. However, one study found that different Rab27 effectors have distinct functions at different stages of exocytosis[15].

      Very little is known about redundancy or hierarchy between these proteins. Differences in function may be due to the variation in gene expression profile across tissues for the different SYTLs (see Author response image 1 below). SYTL5 is enriched in the brain unlike the others, suggesting possible tissue-specific functions. There are also differences in the binding affinities and calcium sensitivities of the C2A and C2B domains between the SYTL proteins[16].

      Author response image 1.

      GTEx Multi Gene Query for SYTL1-5

      All five SYTLs are expressed in the U2OS cell line, with nTPMs according to Human Protein Atlas of SYTL1: 7.5, SYTL2: 13.4, SYTL3: 14.2, SYTL4: 8.7, SYTL5: 4.8. In line with this, in the Rab27A interactome, when comparing cells overexpressing mScarlet-Rab27A with control cells, we detected all five SYTLs as specific Rab27A-interacting proteins (see Author response table 1 above). In contrast, in the SYTL5 interactome we did not detect any other SYTL protein (Table 1 in paper), confirming that they do not form a complex with SYTL5.

      We have included the following text in the discussion (p. 12): “SYTL5 and Rab27A are both members of protein families, suggesting possible functional redundancies from Rab27B or one of the other SYTL isoforms. While Rab27B has a very low expression in U2OS cells, all five SYTL’s are expressed. However, when knocking out or knocking down SYTL5 and Rab27A we observe significant effects that we presume would be negated if their isoforms were providing functional redundancies. Moreover, we did not detect any other SYTL protein or Rab27B in the SYTL5 interactome, confirming that they do not form a complex with SYTL5.”

      Suggestions for Discussion: 

      Both Rab27A and SYTL5 localize to other membranes, including the endolysosomal compartments. How do the authors envisage the mechanism or cellular modifications that allow these proteins, either individually or in complex, to function also to regulate mitochondrial function? It would be interesting to have some views.

      We agree that it would be interesting to better understand the mechanism involved in modulation of the localisation and function of SYTL5 and Rab27A at different cellular compartments, including the mitochondria. Here, we have shown that SYTL5 recruitment to the mitochondria involves coincidence detection, as both its lipid-binding C2 domains and the Rab27A-binding SHD domain are required (Figure 1G-H). Both these domains also seem required for localisation of SYTL5 to vesicles, and we can only speculate that binding to different lipids (Figure 1F) may regulate SYTL5 localisation. Additionally, we find that mitochondrial recruitment of SYTL5 is dependent on the GTPase activity and mitochondrial localisation of Rab27A (Figure 2D-E). However, this seems also the case for vesicular recruitment of SYTL5, although a few SYTL5-Rab27A (T23N) positive vesicles were seen (Figure 2E). 

      To characterise the mechanisms involved in mitochondrial localisation of Rab27A, we have performed mass spectrometry analysis to identify the interactome of Rab27A (see Author response table 1 above). U2OS cells with stable expression of mScarlet-Rab27A or mScarlet only were subjected to immunoprecipitation, followed by MS analysis. Of the 32 significant Rab27A-interacting hits (compared to control), two localise in the inner mitochondrial membrane (IMM): ATP synthase F(1) complex subunit alpha (P25705) and mitochondrial very long-chain specific acyl-CoA dehydrogenase (VLCAD) (P49748). However, as these IMM proteins are not likely involved in the mitochondrial recruitment of Rab27A observed under basal conditions, we chose not to include these data in the manuscript.

      It is known that other RAB proteins are recruited to the mitochondria by regulation of their GTPase activity. During parkin-mediated mitophagy, RABGEF1 (a guanine nucleotide exchange factor) is recruited through its ubiquitin-binding domain and directs mitochondrial localisation of RAB5, which subsequently leads to recruitment of RAB7 by the MON1/CCZ1 GEF complex[1]. As already mentioned in the discussion (p. 12), ubiquitination of the Rab27A GTPase activating protein alpha (TBC1D10A) is reduced in the brain of Parkin KO mouse compared to controls[35], suggesting a possible connection of Rab27A with regulatory mechanisms that are linked with mitochondrial damage and dysfunction. While this is an interesting avenue to explore, it is beyond the scope of this paper.

      Our data suggest that SYTL5 functions as a negative regulator of the Warburg effect, the switch from OXPHOS to glycolysis. While both SYTL5 and Rab27A seem required for mitophagy of selective mitochondrial components, and their depletion leads to reduced mitochondrial respiration and ATP production, only depletion of SYTL5 caused a switch to glycolysis. The mechanisms involved are unclear, but we found several proteins linked to the cellular response to oxidative stress, reactive oxygen species metabolic process, regulation of mitochondrion organisation and protein insertion into mitochondrial membrane to be enriched in the SYTL5 interactome (Figure 3A and C).

      We have addressed this comment in the discussion on p. 12.

      Reviewer #3 (Public review):

      Summary:

      In the manuscript by Lapao et al., the authors uncover a role for the Rab27A effector protein SYTL5 in regulating mitochondrial function and turnover. The authors find that SYTL5 localizes to mitochondria in a Rab27A-dependent way and that loss of SYTL5 (or Rab27A) impairs lysosomal turnover of an inner mitochondrial membrane mitophagy reporter but not a matrix-based one. As the authors see no co-localization of GFP/mScarlet tagged versions of SYTL5 or Rab27A with LC3 or p62, they propose that lysosomal turnover is independent of the conventional autophagy machinery. Finally, the authors go on to show that loss of SYTL5 impacts mitochondrial respiration and ECAR and as such may influence the Warburg effect and tumorigenesis. Of relevance here, the authors go on to show that SYTL5 expression is reduced in adrenocortical carcinomas and this correlates with reduced survival rates.

      Strengths:

      There are clearly interesting and new findings here that will be relevant to those following mitochondrial function, the endocytic pathway, and cancer metabolism.

      Weaknesses:

      The data feel somewhat preliminary in that the conclusions rely on exogenously expressed proteins and reporters, which do not always align.

      As the authors note there are no commercially available antibodies that recognize endogenous SYTL5, hence they have had to stably express GFP-tagged versions. However, it appears that the level of expression dictates co-localization from the examples the authors give (though it is hard to tell as there is a lack of any kind of quantitation for all the fluorescent figures). Therefore, the authors may wish to generate an antibody themselves or tag the endogenous protein using CRISPR.

      We agree that the level of SYTL5 expression is likely to affect its localisation. As suggested by the reviewer, we have tried hard, without success, to generate U2OS cells with CRISPR knock-in of a mNeonGreen tag at the C-terminus of endogenous SYTL5, using an approach that has been successfully implemented in the lab for other genes. This is likely due to a lack of suitable sgRNAs targeting the C-terminal region of SYTL5: the available sgRNAs have a low predicted efficiency score and a large number of predicted off-target sites in the human genome, including several other gene exons and introns (see Author response image 2).

      Author response image 2.

      Overview of sgRNAs targeting the C-terminal region of SYTL5 

      Although the SYTL5 expression level might affect its cellular localization, we also found the mitochondrial localisation of SYTL5-EGFP to be strongly increased in cells co-expressing mScarlet-Rab27A, supporting our findings of Rab27A-mediated mitochondrial recruitment of SYTL5. We have also included new data (Supplementary Figure 4B) showing that some of the hypoxia-induced SYTL5-Rab27A-positive vesicles stain positive for the autophagy markers p62 and LC3B when inhibiting lysosomal degradation, further strengthening our data that SYTL5 and Rab27A function as positive regulators of mitophagy.

      In relation to quantitation, the authors found that SYTL5 localizes to multiple compartments or potentially a few compartments that are positive for multiple markers. Some quantitation here would be very useful as it might inform on function. 

      We find that SYTL5-EGFP localizes to mitochondria, lysosomes and the plasma membrane in U2OS cells with stable expression of SYTL5-EGFP and in SYTL5/Rab27A double knock-out cells rescued with SYTL5-EGFP and mScarlet-Rab27A. We also see colocalization of SYTL5-EGFP with endogenous p62, LC3 and LAMP1 upon induction of mitophagy. However, as these cell lines comprise a heterogeneous pool with high variability, we do not believe that quantification of the overexpressing cell lines would provide beneficial information in this scenario. As described above, we have tried several times to generate SYTL5 knock-in cells without success.

      The authors find that upon hypoxia/hypoxia-like conditions that punctate structures of SYTL5 and Rab27A form that are positive for Mitotracker, and that a very specific mitophagy assay based on pSu9-Halo system is impaired by siRNA of SYTL5/Rab27A, but another, distinct mitophagy assay (Matrix EGFP-mCherry) shows no change. I think this work would strongly benefit from some measurements with endogenous mitochondrial proteins, both via immunofluorescence and western blot-based flux assays. 

      In addition to the western blotting for different endogenous ETC proteins showing significantly increased levels of MTCO1 in cells depleted of SYTL5 and/or Rab27A (Figure 5E-F), we have now blotted for the endogenous mitochondrial proteins, COXIV and BNIP3L, in DFP and DMOG conditions upon knockdown of SYTL5 and/or Rab27A (Figure 5G and Supplementary Figure 5A). Although there was a trend towards increased levels, we did not see any significant changes in total COXIV or BNIP3L levels when SYTL5, Rab27A or both are knocked down compared to siControl. Blotting for endogenous mitochondrial proteins is however not the optimum readout for mitophagy. A change in mitochondrial protein level does not necessarily result from mitophagy, as other factors such as mitochondrial biogenesis and changes in translation can also have an effect. Mitophagy is a dynamic process, which is why we utilise assays such as the HaloTag and mCherry-EGFP double tag as these indicate flux in the pathway. Additionally, as mitochondrial proteins have different half-lives, with many long-lived mitochondrial proteins[17], differences in turnover rates of endogenous proteins make the results more difficult to interpret. 

      A really interesting aspect is the apparent independence of this mitophagy pathway on the conventional autophagy machinery. However, this is only based on a lack of co-localization between p62 or LC3 with LAMP1 and GFP/mScarlet tagged SYTL5/Rab27A. However, I would not expect them to greatly colocalize in lysosomes as both the p62 and LC3 will become rapidly degraded, while the eGFP and mScarlet tags are relatively resistant to lysosomal hydrolysis. -/+ a lysosome inhibitor might help here and ideally, the functional mitophagy assays should be repeated in autophagy KOs.

      We thank the reviewer for this suggestion. We have now repeated the colocalisation studies in cells treated with DFP with the addition of bafilomycin A1 (BafA1) to inhibit the lysosomal V-ATPase. Indeed, we find that a few of the SYTL5/Rab27A/MitoTracker positive structures also stain positive for p62 and LC3 (Supplementary Figure 4B). As expected, the occurrence of these structures was rare, as BafA1 was only added for the last 4 hrs of the 24 hr DFP treatment. However, we cannot exclude the possibility that there are two different populations of these vesicles.

      The link to tumorigenesis and cancer survival is very interesting but it is not clear if this is due to the mitochondrially-related aspects of SYTL5 and Rab27A. For example, increased ECAR is seen in the SYTL5 KO cells but not in the Rab27A KO cells (Fig.5D), implying that mitochondrial localization of SYTL5 is not required for the ECAR effect. More work to strengthen the link between the two sections in the paper would help with future directions and impact with respect to future cancer treatment avenues to explore.

      We agree that the role of SYTL5 in ACC requires future investigation. While we observe reduced OXPHOS levels in both SYTL5 and Rab27A KO cells (Figure 5B), glycolysis was only increased in SYTL5 KO cells (Figure 5D). We believe this indicates that Rab27A is being negatively regulated by SYTL5, as ECAR was unchanged in both the Rab27A KO and Rab27A/SYTL5 dKO cells. This suggests that Rab27A is required for the increase in ECAR when SYTL5 is depleted, therefore SYTL5 negatively regulates Rab27A. The mechanism involved is unclear, but we found several proteins linked to the cellular response to oxidative stress, reactive oxygen species metabolic process, regulation of mitochondrion organisation and protein insertion into mitochondrial membrane to be enriched in the SYTL5 interactome (Figure 3A and C).

      To investigate the link to cancer further, we tested the effect of knockdown of SYTL5 and/or Rab27A on the levels of mitochondrial ROS. ROS levels were measured by flow cytometry using the MitoSOX Red dye, together with the MitoTracker Green dye to normalise ROS levels to the total mitochondria. Cells were treated with the antioxidant N-acetylcysteine (NAC)[18] as a negative control and menadione as a positive control, as menadione induces ROS production via redox cycling[19]. We must consider that there is also a lot of autofluorescence from cells that makes it impossible to get a level of ‘zero ROS’ in this experiment. We did not see a change in ROS with knockdown of SYTL5 and/or Rab27A compared to the NAC treated or siControl samples (see Author response image 3 below). The menadione samples confirm the success of the experiment as ROS accumulated in these cells. Thus, based on this, we do not believe that low SYTL5 expression would affect ROS levels in ACC tumours.

      Author response image 3.

      Mitochondrial ROS production normalised to total mitochondria

      As discussed in our response to Reviewer #1, we tried hard to characterise the role of SYTL5 in the context of ACC using the NCI-H295R cell line isolated from the adrenal gland of an adrenal cancer patient. We attempted to conduct OCR and ECAR measurements using the Seahorse XF upon knockdown of SYTL5 and/or Rab27A in H295R cells without success, due to poor attachment of the cells and inability to form a monolayer. We also transduced the H295R cells with lentiviral particles to overexpress pLVX-SV40-mScarlet-I-Rab27A and pLVX-CMV-SYTL5-EGFP-3xFLAG to study the localisation of SYTL5 and Rab27A in an ACC model. Again, this proved unsuccessful after numerous attempts at optimising the transduction. These issues limited our investigation into the role of SYTL5 in ACC to the cortisol assay (Supplementary Figure 6). For this the H295R cells were an appropriate model as they are able to produce an array of adrenal cortex steroids[6] including cortisol[7]. In this assay, measurements are taken from cell culture supernatants, so the confluency of the cells does not prevent consistent results as the cortisol concentration was normalised to total protein per sample. With this assay we were able to rule out a role for SYTL5 and Rab27A in the secretion of cortisol.

      Another consideration when investigating the involvement of SYTL5 in ACC, is that in general ACC cells should have a low expression of SYTL5 as is seen from the patient expression data (Figure 6B).

      Further studies into the link between SYTL5/Rab27A and cancer are beyond the scope of this paper as we are limited to the tools and expertise available in the lab.

      References

      (1) Yamano, K. et al. Endosomal Rab cycles regulate Parkin-mediated mitophagy. eLife 7 (2018). https://doi.org:10.7554/eLife.31326

      (2) Carré, M. et al. Tubulin is an inherent component of mitochondrial membranes that interacts with the voltage-dependent anion channel. The Journal of biological chemistry 277, 33664-33669 (2002). https://doi.org:10.1074/jbc.M203834200

      (3) Hoogerheide, D. P. et al. Structural features and lipid binding domain of tubulin on biomimetic mitochondrial membranes. Proceedings of the National Academy of Sciences 114, E3622-E3631 (2017). https://doi.org:10.1073/pnas.1619806114

      (4) Plitzko, B. & Loesgen, S. Measurement of Oxygen Consumption Rate (OCR) and Extracellular Acidification Rate (ECAR) in Culture Cells for Assessment of the Energy Metabolism. Bio Protoc 8, e2850 (2018). https://doi.org:10.21769/BioProtoc2850

      (5) Yavin, E. & Yavin, Z. Attachment and culture of dissociated cells from rat embryo cerebral hemispheres on polylysine-coated surface. The Journal of cell biology 62, 540-546 (1974). https://doi.org:10.1083/jcb.62.2.540

      (6) Wang, T. & Rainey, W. E. Human adrenocortical carcinoma cell lines. Mol Cell Endocrinol 351, 58-65 (2012). https://doi.org:10.1016/j.mce.2011.08.041

      (7) Rainey, W. E. et al. Regulation of human adrenal carcinoma cell (NCI-H295) production of C19 steroids. J Clin Endocrinol Metab 77, 731-737 (1993). https://doi.org:10.1210/jcem.77.3.8396576

      (8) Barral, D. C. et al. Functional redundancy of Rab27 proteins and the pathogenesis of Griscelli syndrome. J. Clin. Invest. 110, 247-257 (2002). https://doi.org:10.1172/jci15058

      (9) Ku, K. E., Choi, N. & Sung, J. H. Inhibition of Rab27a and Rab27b Has Opposite Effects on the Regulation of Hair Cycle and Hair Growth. Int. J. Mol. Sci. 21 (2020). https://doi.org:10.3390/ijms21165672

      (10) Johnson, J. L., Monfregola, J., Napolitano, G., Kiosses, W. B. & Catz, S. D. Vesicular trafficking through cortical actin during exocytosis is regulated by the Rab27a effector JFC1/Slp1 and the RhoA-GTPase–activating protein Gem-interacting protein. Mol. Biol. Cell 23, 1902-1916 (2012). https://doi.org:10.1091/mbc.e11-12-1001

      (11) Yu, M. et al. Exophilin4/Slp2-a targets glucagon granules to the plasma membrane through unique Ca2+-inhibitory phospholipid-binding activity of the C2A domain. Mol. Biol. Cell 18, 688-696 (2007). https://doi.org:10.1091/mbc.e06-10-0914

      (12) Kurowska, M. et al. Terminal transport of lytic granules to the immune synapse is mediated by the kinesin-1/Slp3/Rab27a complex. Blood 119, 3879-3889 (2012). https://doi.org:10.1182/blood-2011-09-382556

      (13) Zhao, S., Torii, S., Yokota-Hashimoto, H., Takeuchi, T. & Izumi, T. Involvement of Rab27b in the regulated secretion of pituitary hormones. Endocrinology 143, 1817-1824 (2002). https://doi.org:10.1210/endo.143.5.8823

      (14) Kariya, Y. et al. Rab27a and Rab27b are involved in stimulation-dependent RANKL release from secretory lysosomes in osteoblastic cells. J Bone Miner Res 26, 689-703 (2011). https://doi.org:10.1002/jbmr.268

      (15) Zhao, K. et al. Functional hierarchy among different Rab27 effectors involved in secretory granule exocytosis. Elife 12 (2023). https://doi.org:10.7554/eLife.82821

      (16) Izumi, T. Physiological roles of Rab27 effectors in regulated exocytosis. Endocr J 54, 649-657 (2007). https://doi.org:10.1507/endocrj.kr-78

      (17) Bomba-Warczak, E. & Savas, J. N. Long-lived mitochondrial proteins and why they exist. Trends in cell biology 32, 646-654 (2022). https://doi.org:10.1016/j.tcb.2022.02.001

      (18) Curtin, J. F., Donovan, M. & Cotter, T. G. Regulation and measurement of oxidative stress in apoptosis. Journal of Immunological Methods 265, 49-72 (2002). https://doi.org:10.1016/S0022-1759(02)00070-4

      (19) Criddle, D. N. et al. Menadione-induced Reactive Oxygen Species Generation via Redox Cycling Promotes Apoptosis of Murine Pancreatic Acinar Cells. Journal of Biological Chemistry 281, 40485-40492 (2006). https://doi.org:10.1074/jbc.M607704200

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Turner et al. present an original approach to investigate the role of Type-1 nNOS interneurons in driving neuronal network activity and in controlling vascular network dynamics in awake head-fixed mice. Selective activation or suppression of Type-1 nNOS interneurons has previously been achieved using chemogenetics, optogenetics, or local pharmacology. Here, the authors took advantage of the fact that Type-1 nNOS interneurons are the only cortical cells that express the tachykinin receptor 1 to ablate them with a local injection of saporin conjugated to substance P (SP-SAP). SP-SAP causes cell death in 90% of Type-1 nNOS interneurons without affecting microglia, astrocytes, and neurons. The authors report that the ablation has no major effects on sleep or behavior. Refining the analysis by scoring neural and hemodynamic signals with electrode recordings, calcium signal imaging, and wide-field optical imaging, the authors observe that Type-1 nNOS interneuron ablation does not change the various phases of the sleep/wake cycle. However, it does reduce low-frequency neural activity, irrespective of the classification of arousal state. Analyzing neurovascular coupling using multiple approaches, they report small changes in resting-state neural-hemodynamic correlations across arousal states, primarily mediated by changes in neural activity. Finally, they show that Type-1 nNOS interneurons play a role in controlling interhemispheric coherence and vasomotion.

      In conclusion, these results are interesting, use state-of-the-art methods, and are well supported by the data and their analysis. I have only a few comments on the stimulus-evoked haemodynamic responses, and these can be easily addressed.

      We thank the reviewer for their positive comments on our work.

      Reviewer #2 (Public review):

      Summary:

      This important study by Turner et al. examines the functional role of a sparse but unique population of neurons in the cortex that express Nitric oxide synthase (Nos1). To do this, they pharmacologically ablate these neurons in the focal region of whisker-related primary somatosensory (S1) cortex using a saponin-substance P conjugate. Using widefield and two-photon microscopy, as well as field recordings, they examine the impact of this cell-specific lesion on blood flow dynamics and neuronal population activity. Locally within the S1 cortex, they find changes in neural activity patterns, decreased delta band power, and reduced sensory-evoked changes in blood flow (specifically eliminating the sustained blood flow change after stimulation). Surprisingly, given the tiny fraction of cortical neurons removed by the lesion, they also find far-reaching effects on neural activity patterns and blood volume oscillations between the cerebral hemispheres.

      Strengths:

      This was a technically challenging study and the experiments were executed in an expert manner. The manuscript was well written and I appreciated the cartoon summary diagrams included in each figure. The analysis was rigorous and appropriate. Their discovery that Nos1 neurons can have far-reaching effects on blood flow dynamics and neural activity is quite novel and surprising (to me at least) and should seed many follow-up, mechanistic experiments to explain this phenomenon. The conclusions were justified by the convincing data presented.

      Weaknesses:

      I did not find any major flaws in the study. I have noted some potential issues with the authors' characterization of the lesion and its extent. The authors may want to re-analyse some of their data to further strengthen their conclusions. Lastly, some methodological information was missing, which should be addressed.

      We thank the reviewer for their enthusiasm for our work.

      Reviewer #3 (Public review):

      The role of type-I nNOS neurons is not fully understood. The data presented in this paper addresses this gap through optical and electrophysiological recordings in adult mice (awake and asleep).

      This manuscript reports on a study on type-I nNOS neurons in the somatosensory cortex of adult mice, from 3 to 9 months of age. Most data were acquired using a combination of IOS and electrophysiological recordings in awake and asleep mice. Pharmacological ablation of the type-I nNOS populations of cells led to decreased coherence in gamma band coupling between left and right hemispheres; decreased ultra-low frequency coupling between blood volume in each hemisphere; decreased (superficial) vascular responses to sustained sensory stimulus and abolishment of the post-stimulus CBV undershoot. While the findings shed new light on the role of type-I nNOS neurons, the etiology of the discrepancies between current observations and literature observations is not clear and many potential explanations are put forth in the discussion.

      We thank the reviewer for their comments.

      Reviewer #1 (Recommendations for the authors):  

      (1) Figure 3, Type-1 nNOS interneuron ablation has complex effects on neural and vascular responses to brief (0.1s) and prolonged (5s) whisker stimulation. During 0.1s stimulation, ablation of type 1 nNOS cells does not affect the early HbT response but only reduces the undershoot. What is the pan-neuronal calcium response? Is the peak enhanced, as might be expected from the removal of inhibition? The authors need to show the GCaMP7 trace obtained during this short stimulation.

      Unfortunately, we did not perform brief stimulation experiments in GCaMP-expressing mice. As we did not see a clear difference in the amplitude of the stimulus-evoked response with our initial electrophysiology recordings (Fig. 3a), we suspected that an effect might be visible with longer duration stimuli and thus pivoted to a pulsed stimulation over the course of 5 seconds for the remaining cohorts. It would have been beneficial to interweave short-stimulus trials for a direct comparison between the complementary experiments, but we did not do this.

      During 5s stimulation, both the early and delayed calcium/vascular responses are reduced. Could the authors elaborate on this? Does this mean that increasing the duration of stimulation triggers one or more additional phenomena that are sensitive to the ablation of type 1 nNOS cells and mask what is triggered by the short stimulation? Are astrocytes involved? How do they interpret the early decrease in neuronal calcium?

      As our findings show that ablation reduces the calcium/vascular response more prominently during prolonged stimulation, we do suspect that this is due to additional NO-dependent mechanisms or downstream responses. NO is a modulator of neural activity, generally increasing excitability (Kara and Friedlander 1999, Smith and Otis 2003), so any manipulation that changes NO levels will change (likely decrease) the excitability of the network, potentially resulting in a smaller hemodynamic response to sensory stimulation secondary to this decrease. While short stimuli engage rapid neurovascular coupling mechanisms, longer duration (>1s) stimulation could introduce additional regulatory elements, such as astrocytes, that operate on a slower time scale. On the right, we show a comparison of the control groups plotted together from Fig. 3a and 3b with vertical bars aligned to the peak. During the 5s stimulation, the time-to-peak is roughly 830 milliseconds later than the 0.1s stimulation, meaning it’s plausible that the signals don’t separate until later. Our interpretation is that the NVC mechanisms responsible for brief stimulus-evoked change are either NO-independent or are compensated for in the SSP-SAP group by other means due to the chronic nature of the ablation.

      We have added the following text to the Discussion (Line 368): “Loss of type-I nNOS neurons drove minimal changes in the vasodilation elicited by brief stimulation, but led to decreased vascular responses to sustained stimulation, suggesting that the early phase of neurovascular coupling is not mediated by these cells, consistent with the multiple known mechanisms for neurovascular coupling (Attwell et al 2010, Drew 2019, Hosford & Gourine 2019) acting through both neurons and astrocytes with multiple timescales (Le Gac et al 2025, Renden et al 2024, Schulz et al 2012, Tran et al 2018).”

      Author response image 1.

      (2) In Figures 4d and e, it is unclear to me why the authors use brief stimulation to analyze the relationship between HbT and neuronal activity (gamma power) and prolonged stimulation for the relationship between HbT and GCaMP7 signal. Could they compare the curves with both types of stimulation?

      As discussed previously, we did not use the same stimulation parameters across cohorts. The mice with implanted electrodes received only brief stimulation, while those undergoing calcium imaging received longer duration stimulus. 

      Reviewer #2 (Recommendations for the authors):

      (1) Results, how far-reaching is the cell-specific ablation? Would it be possible to estimate the volume of the cortex where Nos1 cells are depleted based on histology? Were there signs of neuronal injury more remotely, for example, beading of dendrites?

      We regularly see cell ablation spanning 1-2 mm in diameter within the somatosensory cortex of each animal, which is consistent with the spread of small molecules. Ribosome inactivating proteins like SAP are smaller than AAVs (~5 nm compared to ~25 nm in diameter) and thus diffuse slightly further. We observed no obvious indication of neuronal injury more remotely or in other brain regions, but we did not image or characterize dendritic beading, as this would require a sparse labeling of neurons to clearly see dendrites (NeuN only stains the cell body). Our histology shows no change in cell numbers.

      We have added the following text to the Results (Line 124): “Immunofluorescent labeling in mice injected with Blank-SAP showed labeling of nNOS-positive neurons near the injection site. In contrast, mice injected with SP-SAP showed a clear loss in nNOS-labeling, with a typical spread of 1-2 mm from the injection site, though nNOS-positive neurons both subcortically and in the entirety of the contralateral hemisphere remained intact.”

      (2) For histological analysis of cell counts after the lesion, more information is needed. How was the region of interest for counting cells determined (e.g., 500 µm radius from needle/pipette tract?) and what volume was analysed?

      The region of interest for both SSP-SAP and Blank SAP injections was a 1 mm diameter circle centered around the injection site and averaged across sections (typically 3-5 when available). In most animals, the SSP-SAP had a lateral spread greater than 500 microns and encompassed the entire depth of cortex (1-1.5 mm in SI, decreasing in the rostral to caudal direction). The counts within the 1 mm diameter ROI were averaged across sections and then converted into the cells per mm area as presented. Note the consistent decrease in type I nNOS cells seen across mice in Fig 1d, Fig S1b.

      We have added the following text in the Materials & Methods (Line 507): “The region of interest for analysis of cell counts was determined based on the injection site for both SP-SAP and Blank SAP injections, with a 1 mm diameter circle centered around the injection site and averaged across 3-5 sections where available. In most animals, the SP-SAP had a lateral spread greater than 500 microns and encompassed the entire depth of cortex (1-1.5 mm in SI).”
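
      The count-to-density conversion described above can be sketched as follows. This is a minimal illustration, assuming that per-section counts inside the 1 mm diameter circular ROI are averaged and then divided by the ROI area; the function name and the example counts are hypothetical and are not taken from the authors' analysis code.

```python
import math


def cells_per_mm2(counts_per_section, roi_diameter_mm=1.0):
    """Average per-section counts in a circular ROI and convert to cells/mm^2.

    counts_per_section: cell counts from each histological section (typically 3-5).
    roi_diameter_mm: diameter of the circular ROI centered on the injection site.
    """
    if not counts_per_section:
        raise ValueError("need at least one section")
    roi_area_mm2 = math.pi * (roi_diameter_mm / 2.0) ** 2
    mean_count = sum(counts_per_section) / len(counts_per_section)
    return mean_count / roi_area_mm2


# Hypothetical counts from four sections of one animal (illustrative only).
example_counts = [6, 8, 7, 7]
density = cells_per_mm2(example_counts)
```

      A 1 mm diameter circle has an area of about 0.785 mm², so the density in cells per mm² is roughly 1.27 times the mean count per section.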

      (3) Based on Supplementary Figure 1, it appears that the Saponin conjugate not only depletes Nos neurons but also may affect vascular (endothelial perhaps) Nos expression. Some quantification of this effect and its extent may be insightful in terms of ascribing the effects of the lesion directly on neurons vs indirectly and perhaps more far-reaching via vascular/endothelial NOS.

      Thank you for this comment. While this is a possibility, we have found that although the high nNOS expression of type-I nNOS neurons makes NADPH diaphorase a good stain for detecting them, it is less useful for cell types that express NOS at lower levels. We have found that the absolute intensity of NADPH diaphorase staining is somewhat variable from section to section. This variability is likely due to several factors, such as duration of staining, thickness of the section, and differences in PFA concentration within the tissue and between animals. As NADPH diaphorase staining is highly sensitive to the amount of PFA exposure, any small differences in perfusion quality and processing could affect the intensity. A second, perhaps larger issue could be differences in the number of arteries (which will express NOS at much higher levels than veins, and thus will appear darker) in the section. We did not stain for smooth muscle and so cannot differentiate arteries and veins. Any difference in vessel intensity could be due to random variations in the numbers of arteries/veins in the section. While we believe that this is a potentially interesting question, our histological experiments were not able to address it.

      (4) The assessment for inflammation took place 1 month after the lesion, but the imaging presumably occurred ~ 2 weeks after the lesion. Note that it seemed somewhat ambiguous as to when approximately, the imaging, and electrophysiology experiments took place relative to the induction of the lesion. Presumably, some aspects of inflammation and disruption could have been missed, at the time when experiments were conducted, based on this disparity in assessment. The authors may want to raise this as a possible limitation.

      We apologize for our unclear description of the timeline. We began imaging experiments at least 4 weeks after ablation, the same time frame as when we performed our histological assays.

      We have added the following text to the Discussion (Line 379): “With imaging beginning four weeks after ablation, there could be compensatory rewiring of local and/or network activity following type-I nNOS ablation, where other signaling pathways from the neurons to the vasculature become strengthened to compensate for the loss of vasodilatory signaling from the type-I nNOS neurons.”

      (5) Results Figure 2, please define "P or delta P/P". Also, for Figure 2c-f, what do the black vertical ticks represent?

      ∆P/P is the change in the gamma-band power relative to the resting-state baseline, and black tick marks indicate binarized periods of vibrissae motion (‘whisking’). We have clarified this in Figure caption 2 (Line 174).
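
      As a minimal sketch of the two quantities defined above, assuming ∆P/P is computed as (P − P_rest)/P_rest and that whisking is binarized with a simple threshold; the function names, the threshold rule, and the example values are illustrative assumptions rather than the authors' pipeline.

```python
def fractional_power_change(power, baseline_power):
    """Return ∆P/P: change in band power relative to a resting-state baseline."""
    if baseline_power <= 0:
        raise ValueError("baseline power must be positive")
    return [(p - baseline_power) / baseline_power for p in power]


def binarize_whisking(motion, threshold):
    """Return 1 where a whisker-motion signal exceeds the threshold, else 0.

    The threshold rule is an assumption made for illustration.
    """
    return [1 if m > threshold else 0 for m in motion]


# Hypothetical gamma-band power samples with a resting baseline of 1.0 (arbitrary units).
delta_p_over_p = fractional_power_change([2.0, 1.0, 0.5], baseline_power=1.0)
whisk_ticks = binarize_whisking([0.1, 0.9, 0.4], threshold=0.5)
```

      With this definition a value of 1.0 means the power doubled relative to rest, and a value of −0.5 means it halved.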

      (6) Figure 3b-e, is there not an undershoot (eventually) after 5s of stimulation that could be assessed?

      Previous work has shown that there is no undershoot in response to whisker stimulations of a few seconds (Drew, Shih, Kleinfeld, PNAS, 2011). The undershoot for brief stimuli happens within ~2.5 s of the onset/cessation of the brief stimulation; this is clearly lacking in the response to the 5s stim (Fig 3). The neurovascular coupling mechanisms recruited during the short stimulation are different than those recruited during the long stimulus, making a comparison of the undershoot between the two stimulation durations problematic.

      For Figures 3e and 6 how was surface arteriole diameter or vessel tone measured? 2P imaging of fluorescent dextran in plasma? Please add the experimental details of 2P imaging to the methods. Including some 2P images in the figures couldn't hurt to help the reader understand how these data were generated.

      We have added details about our 2-photon imaging (FITC-dextran, full-width at half-maximum calculation for vessel diameter) as well as a trace and vessel image to Figure 2.

      We have added the following text to the Materials & Methods (Line 477): “In two-photon experiments, mice were briefly anesthetized and retro-orbitally injected with 100 µL of 5% (weight/volume) fluorescein isothiocyanate–dextran (FITC) (FD150S, Sigma-Aldrich, St. Louis, MO) dissolved in sterile saline.”

      We have added the following text to the Materials & Methods (Line 532): “A rectangular box was drawn around a straight, evenly-illuminated vessel segment and the pixel intensity was averaged along the long axis to calculate the vessel’s diameter from the full-width at half-maximum (https://github.com/DrewLab/Surface-Vessel-FWHM-Diameter; (Drew, Shih et al. 2011)).”
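
      The full-width at half-maximum idea quoted above can be sketched as follows. This is a simplified illustration and not the code in the linked repository: it assumes the box is a list of rows that each cross the vessel, that the half-maximum level is taken midway between the profile minimum and maximum, and that edges are located by linear interpolation. All function names are hypothetical.

```python
def average_along_vessel(box):
    """Average a 2-D box (rows = positions along the vessel) into one cross-vessel profile."""
    n_rows = len(box)
    n_cols = len(box[0])
    return [sum(row[c] for row in box) / n_rows for c in range(n_cols)]


def fwhm_diameter(profile, pixel_size_um=1.0):
    """Full-width at half-maximum of a 1-D intensity profile, in micrometers."""
    baseline, peak = min(profile), max(profile)
    if peak == baseline:
        raise ValueError("flat profile has no defined width")
    half = baseline + (peak - baseline) / 2.0
    above = [i for i, v in enumerate(profile) if v >= half]
    left, right = above[0], above[-1]
    # Linearly interpolate each edge between the last sample below the
    # half-maximum level and the first sample at or above it.
    if left > 0:
        left_edge = left - (profile[left] - half) / (profile[left] - profile[left - 1])
    else:
        left_edge = float(left)
    if right < len(profile) - 1:
        right_edge = right + (profile[right] - half) / (profile[right] - profile[right + 1])
    else:
        right_edge = float(right)
    return (right_edge - left_edge) * pixel_size_um


# Hypothetical 2 x 9 pixel box around a bright (dextran-filled) vessel.
box = [[0, 1, 2, 3, 4, 3, 2, 1, 0],
       [0, 1, 2, 3, 4, 3, 2, 1, 0]]
diameter_px = fwhm_diameter(average_along_vessel(box))
```

      Averaging along the long axis before measuring the width reduces pixel noise, which is presumably why a straight, evenly illuminated segment is chosen.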

      (7) Did the authors try stimulating other body parts (eg. limb) to estimate how specific the effects were, regionally? This is more of a curiosity question that the authors could comment on, I am not recommending new experiments.

      We did measure changes in [HbT] in the FL/HL representation of SI during locomotion (Line 205), which is known to increase neural activity in the somatosensory cortex (Huo, Smith and Drew, Journal of Neuroscience, 2014; Zhang et al., Nature Communications 2019). We observed a similar but not statistically significant trend of decreased [HbT] in SP-SAP compared to control. This may have been due to the sphere of influence of the ablation being centered on the vibrissae representation and not having fully encompassed the limb representation. We agree with the referee that it would be interesting to characterize these effects on other sensory regions as well as brain regions associated with tasks such as learning and behavior.

      (8) Regarding vasomotion experiments, are there no other components of this waveform that could be quantified beyond just variance? Amplitude, frequency? Maybe these don't add much but would be nice to see actual traces of the diameter fluctuations. Further, where exactly were widefield-based measures of vasomotion derived from? From some seed pixel or ~1mm ROI in the center of the whisker barrel cortex? Please clarify.

      The reviewer’s point is well taken. We have added power spectra of the resting-state data which provides amplitude and frequency information. The integrated area under the curve of the power spectra is equal to the variance. Widefield-based measures of vasomotion were taken from the 1 mm ROI in the center of the whisker barrel cortex.

      We have added the following text to the Materials & Methods (Line 560): “Variance during the resting-state for both ∆[HbT] and diameter signals (Fig. 7) was taken from resting-state events lasting ≥10 seconds in duration. Average ∆[HbT] from within the 1 mm ROI over the vibrissae representation of SI during each arousal state was taken with respect to awake resting baseline events ≥10 seconds in duration.” 
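
      The statement that the integrated power spectrum equals the variance is Parseval's theorem. The sketch below checks it numerically on a synthetic signal, using a plain discrete Fourier transform and the population variance (mean removed, divided by N). The signal and the normalization are illustrative assumptions, not the authors' spectral estimator.

```python
import cmath
import math


def population_variance(x):
    """Variance with the mean removed and division by N (not N - 1)."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def power_spectrum(x):
    """Power per DFT bin, normalized so that the bins sum to the variance."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    spectrum = []
    for k in range(n):
        coeff = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2 / n ** 2)
    return spectrum


# Synthetic "resting-state" trace: two oscillations with amplitudes 1 and 0.5.
n_samples = 64
trace = [math.sin(2 * math.pi * 3 * t / n_samples) + 0.5 * math.cos(2 * math.pi * 7 * t / n_samples)
         for t in range(n_samples)]
total_power = sum(power_spectrum(trace))
variance = population_variance(trace)
```

      Here the two components contribute 0.5 and 0.125 to the variance, and the same values appear as the summed power at their respective frequencies.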

      (9) On page 13, the title seems like a bit strong. The data show a change in variance but that does not necessarily mean a change in absolute amplitude. Also, I did not see any reports of absolute vessel widths between groups from 2P experiments so any difference in the sampling of larger vs smaller arterioles could have affected the variance (ie. % changes could be much larger in smaller arterioles).

      We have updated the title of Figure 7 to specifically state power (which is equivalent to the variance) rather than amplitude (Line 331). We have also added absolute vessel widths to the Results (Line 340): “There was no difference in resting-state (baseline) diameter between the groups, with Blank-SAP having a diameter of 24.4 ± 7.5 μm and SP-SAP having a diameter of 23.0 ± 9.4 μm (t-test, p = 0.61).”

      (10) Big picture question. How could a manipulation that affects so few cells in 1 hemisphere (below 0.5% of total neurons in a region comprising 1-2% of the volume of one hemisphere) have such profound effects in both hemispheres? The authors suggest that some may have long-range interhemispheric projections, but that is presumably a fraction of the already small fraction of Nos1 neurons. Perhaps these neurons have specialized projections to subcortical brain nuclei (Nucleus Basalis, Raphe, Locus Coeruleus, reticular thalamus, etc) that then project widely to exert this outsized effect? Has there not been a detailed anatomical characterization of their efferent projections to cortical and sub-cortical areas? This point could be raised in the discussion.

      We apologize for the lack of clarity of our work on this point. We would like to clarify that the only analysis showing a change in the unablated hemisphere is the coherence/correlation analysis between the two hemispheres. Other metrics (LFP power and CBV power spectra) do not change in the hemisphere contralateral to the injection site, as we show in data added in two supplementary figures (Fig. S4 and S7). The coherence/correlation is a measure of the correlated dynamics in the two hemispheres. For this metric to change, there only needs to be a change in the dynamics of one hemisphere relative to another. If some aspects of the synchronization of neural and vascular dynamics across hemispheres are mediated by concurrent activation of type I nNOS neurons in both hemispheres, ablating them in one hemisphere will decrease synchrony. It is possible that type I nNOS neurons make some subcortical projections that were not reported in previous work (Tomioka 2005, Ruff 2024), but if these exist they are likely to be very small in number as they were not noted.

      We have added the text in the Results (Line 228): “In contrast to the observed reductions in LFP in the ablated hemisphere, we noted no gross changes in the power spectra of neural LFP in the unablated hemisphere (Fig. S7) or power of the cerebral blood volume fluctuations in either hemisphere (Fig. S4).”

      (Line 335): “The variance in ∆[HbT] during rest, a measure of vasomotion amplitude, was significantly reduced following type-I nNOS ablation (Fig. 7a), dropping from 40.9 ± 3.4 μM² in the Blank-SAP group (N = 24, 12M/12F) to 23.3 ± 2.3 μM² in the SP-SAP group (N = 24, 11M/13F) (GLME p = 6.9×10⁻⁵) with no significant difference in the unablated hemisphere (Fig. S7).”
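
      The argument that a one-sided manipulation can lower interhemispheric correlation without altering the other hemisphere can be illustrated with a toy model. The sketch below is an assumption-laden illustration and not the authors' analysis: each hemisphere is modeled as a shared signal plus independent noise, and "ablation" is modeled as a reduced gain on the shared signal in the left hemisphere only. All names and parameter values are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate(n=5000, ablated_gain=0.3, noise_sd=0.5, seed=0):
    """Return (right, left_intact, left_ablated) toy signals."""
    rng = random.Random(seed)
    shared = [rng.gauss(0, 1) for _ in range(n)]
    noise_left = [rng.gauss(0, noise_sd) for _ in range(n)]
    noise_right = [rng.gauss(0, noise_sd) for _ in range(n)]
    right = [s + e for s, e in zip(shared, noise_right)]  # untouched hemisphere
    left_intact = [s + e for s, e in zip(shared, noise_left)]
    left_ablated = [ablated_gain * s + e for s, e in zip(shared, noise_left)]
    return right, left_intact, left_ablated


right, left_intact, left_ablated = simulate()
r_intact = pearson(right, left_intact)
r_ablated = pearson(right, left_ablated)
```

      In this toy model the right-hemisphere signal is identical in both conditions, so its power is unchanged by construction, yet its correlation with the left hemisphere falls once the shared component on the left is weakened.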

      Reviewer #3 (Recommendations for the authors):

      (1)  The reporting would be greatly strengthened by following ARRIVE guidelines 2.0: https://arriveguidelines.org/: attrition rates and source of attrition, justification for the use of 119 (beyond just consistent with previous studies), etc.

      We performed a power analysis prior to our study aiming to detect a physiologically-relevant effect size of Cohen’s d = 1.3, or 1.3 standard deviations from the mean. Alpha and Power were set to the standard 0.05 and 0.80 respectively, requiring around 8 mice per group (SP-SAP, Blank, and for histology, naïve animals) for multiple independent groups (ephys, GCaMP, histology). To potentially account for any attrition due to failures in Type-I nNOS neuron ablation or other problems (such as electrode failure or window issues) we conservatively targeted a dozen mice for each group. Of mice that were imaged (1P/2P), two SP-SAP mice were removed from the dataset (24 SP-SAP remaining) post-histological analysis due to not showing ablation of nNOS neurons, an attrition rate of approximately 8%.

      We have added the following text to the Materials & Methods (Line 441): “Sample sizes are consistent with previous studies (Echagarruga et al 2020, Turner et al 2023, Turner et al 2020, Zhang et al 2021) and based on a power analysis requiring 8-10 mice per group (Cohen’s d = 1.3, α = 0.05, (1 - β) = 0.800). Experimenters were not blind to experimental conditions or data analysis except for histological experiments. Two SP-SAP mice were removed from the imaging datasets (24 SP-SAP remaining) due to not showing ablation of nNOS neurons during post-histological analysis, an attrition rate of approximately 8%.”
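
      For readers who want to reproduce the order of magnitude of these numbers, here is a minimal sketch using the standard normal-approximation formula for a two-group comparison, n = 2((z_α + z_β)/d)². The response does not state which test or software was used for the power analysis, so the formula, the one- versus two-sided option, and the function names are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80, two_sided=True):
    """Approximate animals per group for comparing two group means.

    Uses the normal approximation n = 2 * ((z_alpha + z_beta) / d) ** 2,
    which slightly underestimates the exact t-test requirement for small n.
    """
    z = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = z.inv_cdf(1 - tail)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size_d) ** 2)


def attrition_rate(removed, remaining):
    """Fraction of animals removed out of all animals that entered the dataset."""
    return removed / (removed + remaining)


n_two_sided = n_per_group(1.3)                   # two-sided alpha = 0.05
n_one_sided = n_per_group(1.3, two_sided=False)  # one-sided alpha = 0.05
sp_sap_attrition = attrition_rate(removed=2, remaining=24)
```

      Under these assumptions the approximation gives 10 animals per group for a two-sided test and 8 for a one-sided test, which brackets the 8-10 range quoted above, and 2 of 26 animals removed is about 7.7%, consistent with the reported "approximately 8%".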

      (2) Intro, line 38: Description of the importance of neurovascular coupling needs improvement. Coordinated haemodynamic activity is vital for maintaining neuronal health and the energy levels needed.

      We have added a sentence to the introduction (Line 41): “Neurovascular coupling plays a critical role in supporting neuronal function, as tightly coordinated hemodynamic activity is essential for meeting energy metabolism and maintaining brain health (Iadecola et al 2023, Schaeffer & Iadecola 2021).”

      (3) Given the wide range of mice ages, how was the age accounted for/its effects examined?

      Previous work from our lab has shown that there is no change in hemodynamic responses in awake mice over a wide range of ages (2-18 months), so the age range we used (3 to 9 months of age) should not impact this.

      We have added the following text in the Results (Line 437): “Previous work from our lab has shown that the vasodilation elicited by whisker stimulation is the same in 2–4-month-old mice as in 18-month-old mice (Bennett, Zhang et al. 2024). As the age range used here is spanned by this time interval, we would not expect any age-related differences.”

      (4) How was the susceptibility of low-frequency neuronal coupling signals to noise managed? How were the low-frequency bands results validated?

      We are not sure what the referee is asking here. Our electrophysiology recordings were made differentially using stereotrodes with tips separated by ~100 µm, which provides excellent common-mode rejection to noise and a localized LFP signal. Previous publications from our lab (Winder et al., Nature Neuroscience 2017; Turner et al., eLife 2020) and others (Tu, Cramer, Zhang, eLife 2024) have repeatedly shown that there is a very weak correlation between the power in the low frequency bands and hemodynamic signals, so our results are consistent with this previous work.

      (5) It would be helpful to demonstrate the selectivity of cell *death* (as opposed to survival) induced by SP-SAP injections via assessments using markers of cell death.

      We agree that this would be a helpful complement to our histological studies, which show loss of type-I nNOS neurons but no loss of other cells and minimal inflammation with SP-saporin injections. However, we did not perform histology looking at cell death, only at surviving cells, given that we see no obvious inflammation or cell loss, which would be triggered by nonspecific cell death. Previous work has established that saporin is cytotoxic and specific only to cells that internalize it. Internalization of saporin causes cell death via apoptosis (Bergamaschi, Perfetti et al. 1996), and the substance P receptor is internalized when the receptor is bound (Mantyh, Allen et al. 1995). Treatment with saporin generates cellular debris that is phagocytosed by microglia, consistent with cell death (Seeger, Hartig et al. 1997). While it is possible that treatment with SP-saporin causes type-I nNOS neurons to stop expressing nitric oxide synthase (which would make them disappear from our IHC staining), we think that this is unlikely given that the literature shows internalized saporin is clearly cytotoxic.

      We have added the following text to the Results (Line 131): “It is unlikely that the disappearance of type-I nNOS neurons is because they stopped expressing nNOS, as internalized saporin is cytotoxic. Exposure to SP-conjugated saporin causes rapid internalization of the SP receptor-ligand complex (Mantyh, Allen et al. 1995), and internalized saporin causes cell death via apoptosis (Bergamaschi, Perfetti et al. 1996). In the brain, the resulting cellular debris from saporin administration is then cleared by microglia phagocytosis (Seeger, Hartig et al. 1997).”

      (6) Was the decrease in inter-hemispheric correlation associated with any changes to the corpus callosum?

      We noted no gross changes to the structure of the corpus callosum in any of our histological reconstructions following SSP-SAP administration; however, we did not specifically test for this. Again, as we note in our reply to reviewer 2, the decrease in interhemispheric synchronization does not imply that there are changes in the corpus callosum and could be mediated by the changes in neural activity in the hemisphere in which the Type-I nNOS neurons were ablated.

      (7) How were automated cell counts validated?

      Criteria used for automated cell counts were validated with comparisons of manual counting as described in previous literature. We have added additional text describing the process in the Materials & Methods (Line 510): “For total cell counts, a region of interest (ROI) was delineated, and cells were automatically quantified under matched criteria for size, circularity and intensity. Image threshold was adjusted until absolute value percentages were between 1-10% of the histogram density. The function Analyze Particles was then used to estimate the number of particles with a size of 100-99999 pixels^2 and a circularity between 0.3 and 1.0 (Dao, Suresh Nair et al. 2020, Smith, Anderson et al. 2020, Sicher, Starnes et al. 2023). Immunoreactivity was quantified as mean fluorescence intensity of the ROI (Pleil, Rinker et al. 2015).”
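
      The size and circularity criteria quoted above can be expressed as a simple filter. The sketch below uses the usual circularity definition, 4π·area/perimeter², and the thresholds stated in the added Methods text; the function names and the example particles are hypothetical, and clamping circularity to 1.0 is a choice made here to absorb floating-point and digitization error.

```python
import math


def circularity(area_px2, perimeter_px):
    """Circularity = 4*pi*area / perimeter^2 (1.0 for a perfect circle)."""
    return 4.0 * math.pi * area_px2 / perimeter_px ** 2


def keep_particle(area_px2, perimeter_px,
                  min_area=100, max_area=99999,
                  min_circ=0.3, max_circ=1.0):
    """Apply the size (pixels^2) and circularity criteria to one particle."""
    circ = min(circularity(area_px2, perimeter_px), 1.0)
    return min_area <= area_px2 <= max_area and min_circ <= circ <= max_circ


# Hypothetical particles given as (area in pixels^2, perimeter in pixels).
round_cell = (400.0, 2.0 * math.sqrt(math.pi * 400.0))  # perfect circle
elongated_debris = (400.0, 200.0)                        # thin, low circularity
small_speck = (50.0, 2.0 * math.sqrt(math.pi * 50.0))   # round but too small

particles = [round_cell, elongated_debris, small_speck]
n_counted = sum(keep_particle(area, perim) for area, perim in particles)
```

      Only the round particle above the size floor is counted, which shows how the two criteria exclude debris by shape and specks by size.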

      (8) Given the weighting of the vascular IOS readout to the superficial tissue, it is important to qualify the extent of the hemodynamic contrast, ie the limitations of this readout.

      We have added the following text to the Discussion (Line 385): “Intrinsic optical signal readout is primarily weighted toward superficial tissue given the absorption and scattering characteristics of the wavelengths used. While surface vessels are tightly coupled with neural activity, it is still a matter of debate whether surface or intracortical vessels are a more reliable indicator of ongoing activity (Goense et al 2012; Huber et al 2015; Poplawsky & Kim 2014).”

      (9) Partial decreases observed through type-I iNOS neuronal ablation suggest other factors also play a role in regulating neural and vascular dynamics: data presented thus do *not* "indicate disruption of these neurons in diseases ranging from neurodegeneration to sleep disturbances," as currently stated. Please revise.

      We agree with the reviewer. We have changed the abstract sentence to read (Line 30): “This demonstrates that a small population of nNOS-positive neurons are indispensable for regulating both neural and vascular dynamics in the whole brain, raising the possibility that loss of these neurons could contribute to the development of neurodegenerative diseases and sleep disturbances.”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      The authors conducted a spatial analysis of dysplastic colon tissue using the Slide-seq method. Their main objective is to build a detailed spatial atlas that identifies distinct cellular programs and microenvironments within dysplastic lesions. Next, they correlated this observation with clinical outcomes in human colorectal cancer.

      Strengths:

      The work is a good example of utilising spatial methods to study different tumour models. The authors identified a unique stem cell program to understand tumours gently and improve patient stratification strategies.

      Weaknesses:

      However, the study's predominantly descriptive nature is a significant limitation. Although the spatial maps and correlations between cell states are interesting observations, the lack of functional validation-primarily through experiments in mouse models-weakens the causal inferences regarding the roles these cellular programs play in tumour progression and therapy resistance.

      We thank the reviewer for this comment. Indeed, functional validation to pin down causal dependencies and a more thorough investigation of tumor progression and therapy resistance both in mouse model as well as human patients and/or patient derived samples would broaden the insights to be gained from this work. Unfortunately, this is beyond the scope of this study.

      The authors also missed an opportunity to link the mutational status of malignant cells with the cellular neighbourhoods.

      The data reported in this study only contains spatial data for one mouse model (AV). As spatial data for the other model (AKPV) is missing, it is not possible to link the mutational type of the model with the cellular neighborhoods. We did investigate whether there is extra somatic mutational heterogeneity in the AV data, both regarding single nucleotide variations (SNVs) and copy number variations (CNVs). But at the time when the mice were sacrificed (after 3 weeks) there was no significant mutational heterogeneity discoverable.

      Overall, the study contributes to profiling the dysplastic colon landscape. The methodologies and data will benefit the research community, but further functional validation is crucial to validate the biological and clinical implications of the described cellular interactions.

      Reviewer #2 (Public review):

      In their study, Avraham-Davidi et al. combined scRNA-seq and spatial mapping studies to profile two preclinical mouse models of colorectal cancer: Apcfl/fl VilincreERT2 (AV) and Apcfl/fl LSL-KrasG12D Trp53fl/fl Rosa26LSL-tdTomato/+ VillinCreERT2 (AKPV). In the first part of the manuscript, the authors describe the analysis of the normal colon and dysplastic lesions induced in these models following tamoxifen injection. They highlight broad variations in immune and stromal cell composition within dysplastic lesions, emphasizing the infiltration of monocytes and granulocytes, the accumulation of IL-17+gdT cells, and the presence of a distinct group of endothelial cells. A major focus of the study is the remodeling of the epithelial compartment, where the most significant changes are observed. Using non-negative matrix factorization, the authors identify molecular programs of epithelial cell functions, emphasizing stemness, Wnt signaling, angiogenesis, and inflammation as major features associated with dysplastic cells. They conclude that findings from scRNA-seq analyses in mouse models are transposable to human CRC. In the second part of the manuscript, the authors aim to provide the spatial context for their scRNA-seq findings using Slide-seq and TACCO. They demonstrate that dysplastic lesions are disorganized and contain tumor-specific regions, which contextualize the spatial proximity between specific cell states and gene programs. Finally, they claim that these spatial organizations are conserved in human tumors and associate region-based gene signatures with patient outcomes in public datasets. Overall, the data were collected and analyzed using solid and validated methodology to offer a useful resource to the community.

      Main comments:

      (1) Clarity

      The manuscript would benefit from a substantial reorganization to improve clarity and accessibility for a broad readership. The text could be shortened and the number of figure panels reduced to emphasize the novel contributions of this work while minimizing extensive discussions on general and expected findings, such as tissue disorganization in dysplastic lesions. Additionally, figure panels are not consistently introduced in the correct order, and some are not discussed at all (e.g., Figure S1D; Figure 3C is introduced before Figure 3A; several panels in Figure 4 are not discussed). The annotation of scRNA-seq cell states is insufficiently explained, with no corresponding information about associated genes provided in the figures or tables. Multiple annotations are used to describe cell groups (e.g., TKN01 = γδ T and CD8 T, TKN05 = γδT_IL17+), but these are not jointly accessible in the figures, making the manuscript challenging to follow. It is also not clear what is the respective value of the two mouse models and time points of tissue collection in the analysis.

      We thank the reviewer for this suggestion. We clarified and simplified the revised manuscript; however, we believe that the current discussions are an important part of the manuscript and would be useful to readers. We reordered panels in Figures S1 and 3 to align with their appearance in the manuscript. We kept the order of the other panels as it is, to keep both the context and the coherence of those figures intact. We changed the way we reference cell clusters in the manuscript to better align with the naming scheme introduced in Figure 1B. The respective value of the two mouse models as well as the time points of tissue collection are described in lines 108-120 of the manuscript.

      (2) Novelty

      While the study is of interest, it does not present major findings that significantly advance the field or motivate new directions and hypotheses. Many conclusions related to tissue composition and patient outcomes, such as the epithelial programs of Wnt signaling, angiogenesis, and stem cells, are well-established and not particularly novel. Greater exploration of the scRNA-seq data beyond cell type composition could enhance the novelty of the findings. For instance, several tumor microenvironment clusters uniquely detected in dysplastic lesions (e.g., Mono2, Mono3, Gran01, Gran02) are identified, but no further investigation is conducted to understand their biological programs, such as applying nNMF as was done for epithelial cells. Additional efforts to explore precise tissue localization and cellular interactions within tissue niches would provide deeper insights and go beyond the limited analyses currently displayed in the manuscript.

      We thank the reviewer for this comment. Our study aimed to spatially characterize the tumor microenvironment, with scRNA-seq analysis serving to support this spatial characterization.

      Due to technical limitations—such as the number of samples and the limited capture efficiency of Slide-seq—the resolution of immune cell identification in our spatial analysis is constrained. Additionally, while immune and stromal cells formed distinct clusters, epithelial cells exhibited a continuum that was better captured using nNMF.

      Lastly, our manuscript provides a general characterization of monocyte and granulocyte populations in scRNA-seq (line 144) and their spatial microenvironments (line 400). We believe that additional analyses of these populations would be beyond the scope of this study and could place an unnecessary burden on the reader. Instead, we suggest that such analyses be explored in future studies.

      We remark that we analyzed tissue localization for two entirely different spatial transcriptomics assays (Slide-seq and Cartana) at the resolution of cell types and programs, which was feasible within the constraints of the sparsity, gene panel and sample size in the experiments. A future potential path to further increase the resolution of investigation in this dataset is to include other datasets, e.g. by the emerging transformer-based spatial transcriptomics integration methods.

      We also remark that the manuscript already includes an investigation of cellular interactions within tissue niches based on COMMOT (Fig 4k, Fig S8i, Supp Item 4).

      (3) Validation

      Several statements made by the authors are insufficiently supported by the data presented in the manuscript and should be nuanced in the absence of proper validation. For example:

      (a) RNA velocity analyses: The conclusions drawn from these analyses are speculative and need further support.

      We thank the reviewer for this comment. We clarified that our conclusions from the RNA velocity analysis need further support by experimental validation (lines 223-225), which is outside the scope of the current study.

      (b) Annotations of epithelial clusters as dysplastic: These annotations could have been validated through morphological analyses and staining on FFPE slides.

      We thank the reviewer for this comment. While this could have been a possible approach, our study primarily relies on scRNA-seq, which does not preserve tissue morphology, and Slide-seq of fresh tissue, where such an analysis is particularly challenging.

      (c) Conservation of mouse epithelial programs in human tumors: The data in Figure S5B does not convincingly demonstrate the enrichment of stem cell program 16 in human samples. This should be more explicitly stated in the text, given the emphasis placed on this program by the authors.

      We thank the reviewer for pointing this out. We clarified the section about the stem cell program 16 and references to Figures S5A and S5B (lines 269-274): while we do see correlation in the definition of human programs with the mouse stem cell program (Figure S5A), we do not see a correlated expression of the stem cell program across human and mouse (Figure S5B).

      (d) Figure S6E: Cluster Epi06 is significantly overrepresented in spatial data compared to scRNA-seq, yet the authors claim that cell type composition is largely recapitulated without further discussion, which reduces confidence in other conclusions drawn.

      We thank the reviewer for this remark. Indeed, Epi06 was a cluster that drew our attention during early analyses because of its mixed expression profile, with contributions from vastly different cell types. We concluded that this is best explained by doublets, but we cannot rule out (partial) non-doublet explanations (e.g. undifferentiated cells). As doublet detection with Scrublet did not flag those cells as doublets, we kept these cells in the workflow but excluded them from further interpretation. While the previous version of the manuscript only briefly hinted at this in the legend of Figure 2A ("Cluster Epi06: doublets (not called by Scrublet)"), we expanded on this in the methods section of the revised manuscript (lines 863-869). Given the doublet interpretation, the observation that this cluster is significantly overrepresented in the annotation of the spatial data is not surprising: this annotation comes from the decomposition of compositional data that contain contributions of multiple cells per Slide-seq bead, which are structurally very similar to doublets. While Epi06 appears enriched in Figure S6E when comparing Slide-seq to scRNA-seq, there are multiple technical cross-platform differences, including different per-gene sensitivities or capture biases for certain cell types (e.g. stromal cells suffering more from dissociation in scRNA-seq compared to Slide-seq). We believe that comparisons between disease states within a single platform are more biologically meaningful, like the comparison between normal and premalignant tissue presented in Figure S6G. To increase confidence in the analysis and to assess whether intra-platform biological conclusions are affected by the inclusion or exclusion of Epi06, we recreated Figure S6G for a Slide-seq cell type annotation without Epi06 in the reference (see Author response image 1). Even though Epi06 is missing in that annotation, the strong enrichments are consistently preserved between the two analysis variants, while, as expected, some less significant enrichments with larger FDR values are not preserved.

      Author response image 1.

      Significance (FDR, color bar, two-sided Welch’s t test on CLR-transformed compositions) of enrichment (red) or depletion (blue) of cell clusters (rows) in normal (N) or AV (AV) tissues based on Slide-seq (“spatial”) data or scRNA-seq (“sc”) including (A) or excluding (B) Epi06 in the reference for annotating the Slide-seq data (A is identical to Figure S6G in the manuscript).
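
      The statistic named in this legend can be illustrated with a minimal sketch. It is not the authors' pipeline: the proportions below are hypothetical, the pseudocount is an assumption made to avoid taking the logarithm of zero, and the p-value and FDR correction steps are left out. It only shows a centered log-ratio (CLR) transform of per-sample cluster proportions followed by Welch's t statistic for each cluster.

```python
import math
from statistics import mean, variance


def clr(composition, pseudocount=1e-6):
    """Centered log-ratio transform of one sample's cluster proportions.

    The pseudocount is an assumption of this sketch; it guards against log(0).
    """
    logs = [math.log(p + pseudocount) for p in composition]
    center = mean(logs)
    return [v - center for v in logs]


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical proportions of three clusters in three samples per condition.
normal = [[0.50, 0.30, 0.20], [0.55, 0.25, 0.20], [0.45, 0.35, 0.20]]
av = [[0.20, 0.30, 0.50], [0.25, 0.25, 0.50], [0.15, 0.35, 0.50]]

normal_clr = [clr(sample) for sample in normal]
av_clr = [clr(sample) for sample in av]


def cluster_stats(k):
    """Welch statistic for cluster k, AV versus normal (positive = enriched in AV)."""
    return welch_t([s[k] for s in av_clr], [s[k] for s in normal_clr])


for k in range(3):
    t, df = cluster_stats(k)
    print(f"cluster {k}: t = {t:.2f}, df = {df:.2f}")
```

      A positive statistic corresponds to enrichment in AV and a negative one to depletion; turning the statistics into FDR values would additionally require a two-sided p-value from the t distribution and a multiple-testing correction.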

      Furthermore, stronger validation of key dysplastic regions (regions 6, 8, and 11) in mouse and human tissues using antibody-based imaging with markers identified in the analyses would have considerably strengthened the study. Such validation would better contextualize the distribution, composition, and relative abundance of these regions within human tumors, increasing the significance of the findings and aiding the generation of new pathophysiological hypotheses.

      We agree with the reviewer's assessment that validation by antibody-based imaging (or other spatial proteomics data) would have been a useful follow-up experiment, yet this is beyond the scope of the current study.

      Reviewer #1 (Recommendations for the authors):

      AV and AKPV have different oncogenic mutations, and their impact on spatial neighbourhoods is unclear. Can authors perform an analysis to understand the contribution of oncogenic mutations on the spatial landscape of CRC?

      The data reported in this study only contains spatial data for one mouse model (AV). As spatial data for the other model (AKPV) is missing, it is not possible to comparatively link the mutational type of the model with the spatial landscape.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review)

      (1) The authors postulate a synergistic role for Itgb1 and Itgb3 in the intravasation phenotype, because the single KOs did not replicate the phenotype of the DKO. However, this is not a correct interpretation in the opinion of this reviewer. The roles appear rather to be redundant. Synergistic roles would rather demonstrate a modest effect in the single KO with potentiation in the DKO.

      We agree that the interaction between Itgb1 and Itgb3 appears redundant and we have corrected this point in the revised manuscript (page 10).

      (2) The experiment does not explain how these integrins influence the interaction of the MK with their microenvironment. It is not surprising that attachment will be impacted by the presence or absence of integrins. However, it is unclear how activation of integrins allows the MK to become "architects for their ECM microenvironment" as the authors posit. A transcriptomic analysis of control and DKO MKs may help elucidate these effects.

      We do not yet understand how the activation of α5β1 or αvβ3 integrins affects ECM remodeling by megakaryocytes. Integrins are key regulators of ECM remodeling (see https://doi.org/10.1016/j.ceb.2006.08.009) and can transmit traction forces that induce these changes (see https://doi.org/10.1016/j.bpj.2008.10.009). Our previous study also found reduced RhoA activation in double knockout (DKO) megakaryocytes (MKs) (Guinard et al., 2023, PMID: 37171626), which likely affects ECM organization. These findings are discussed in the Discussion section of the paper (page 14).

      As suggested, conducting a transcriptomic analysis of control and DKO MKs may help to elucidate these effects. However, isolating rare native MKs from DKO mice is technically challenging and requires too many animals. To overcome this issue, we instead isolated mouse platelets and used targeted RT-PCR arrays to profile key ECM remodelling genes (ECM proteins, proteases…) and adhesion molecules (Zifkos et al., Circ. Res. 2024, PMID: 38563147). Quality controls confirmed that integrin RNA was undetectable in the DKO samples, ruling out contamination. Nevertheless, we found no significant expression differences exceeding the 3-fold change threshold between the control and DKO groups. The high Ct (threshold cycle) values indicate low transcript abundance, which may mask subtle changes (see the scatter plot below). As an example, we present a typical result for the reviewer.

      Author response image 1.

      Relative expression comparison of ECM-related genes between control and integrin DKO washed platelets. The figure shows a log-transformed plot of the relative expression level of each gene in control (x-axis) versus integrin DKO (y-axis) samples. The lines indicate the threefold change threshold for gene expression. These are representative results from two independent experiments.
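
      To make the threefold threshold concrete, the following sketch shows one common way to derive a fold change from Ct values and compare it with that threshold. It rests on assumptions the response does not state: a 2^-ΔCt normalization against a reference gene, 100% amplification efficiency, and purely hypothetical Ct values.

```python
import math

FOLD_THRESHOLD = 3.0  # threefold change threshold mentioned in the legend


def relative_expression(ct_gene, ct_reference):
    """Relative expression as 2^-dCt (assumes 100% amplification efficiency)."""
    return 2.0 ** -(ct_gene - ct_reference)


def fold_change(ct_gene_dko, ct_ref_dko, ct_gene_ctrl, ct_ref_ctrl):
    """Ratio of DKO to control relative expression for one gene."""
    dko = relative_expression(ct_gene_dko, ct_ref_dko)
    ctrl = relative_expression(ct_gene_ctrl, ct_ref_ctrl)
    return dko / ctrl


def exceeds_threshold(fc, threshold=FOLD_THRESHOLD):
    """True if the change is larger than `threshold`-fold in either direction."""
    return abs(math.log2(fc)) > math.log2(threshold)


# Hypothetical Ct values: (gene in DKO, reference in DKO, gene in control, reference in control).
examples = {
    "gene_a": (30.0, 20.0, 28.0, 20.0),  # two cycles later in DKO: fourfold lower
    "gene_b": (29.0, 20.0, 28.0, 20.0),  # one cycle later in DKO: twofold lower
}

for name, cts in examples.items():
    fc = fold_change(*cts)
    print(name, fc, exceeds_threshold(fc))
```

      Working on the log2 scale treats up- and down-regulation symmetrically, which matches the pair of threshold lines on either side of the diagonal in a log-log scatter plot.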

      (3) Integrin DKO mice have a 50% reduction in platelet counts, as reported previously; however, laminin α4 deficiency only leads to a 20% reduction in counts. This suggests a more nuanced and subtle role of the ECM in platelet growth. To this end, functional assays of the platelets in the KO and wildtype mice may provide more information.

      The exact contribution of the extracellular matrix (ECM) cage to platelet growth remains incompletely understood. In the Lamα4⁻/⁻ model, a collagen-rich ECM cage persists alongside normal fibronectin deposition. By contrast, the integrin DKO model exhibits a markedly more severe phenotype, characterized by the loss of both the laminin cage and collagen and the absence of fibrillar fibronectin. Also, the preserved collagen and fibronectin in Lamα4⁻/⁻ mice may permit residual activation of signaling pathways, potentially via integrins or alternative mechanisms, compared to the DKO model. We appreciate the reviewer’s feedback on this point, which has been incorporated into the discussion (page 15).

      As suggested by the reviewer, we performed functional assays that demonstrated normal platelet function in Lamα4⁻/⁻ mice and impaired integrin-mediated aggregation in Itgb1⁻/⁻/Itgb3⁻/⁻ mice, as shown by the new data presented in the publication (see pages 7 and 9). Platelet function remained preserved following treatment with MMP inhibitors. This supports the idea that differences in ECM composition can influence the signaling environment and megakaryocyte maturation, but do not fully abrogate platelet function (page 15).

      (4) There is insufficient information in the Methods Section to understand the BM isolation approach. Did the authors flush the bone marrow and then image residual bone, or the extruded bone marrow itself as described in PMID: 29104956?

      Additional methodological information has been provided to clarify that only the extruded bone marrow, and not the bone itself, is isolated (page 17).

      (5) The references in the Methods section were very frustrating. The authors reference Eckly et al 2020 (PMID : 32702204) which provides no more detail but references a previous publication (PMID: 24152908), which also offers no information and references a further paper (PMID: 22008103), which, as far as this reviewer can tell, did not describe the methodology of in situ bone marrow imaging.

      To address this confusion, we have added the reference "In Situ Exploration of the Major Steps of Megakaryopoiesis Using Transmission Electron Microscopy" by C. Scandola et al. (PMID: 34570102) in the "Isolation and preservation of murine bone marrow" section (page 20), which provides a standardized protocol for bone marrow isolation and in situ bone marrow imaging.

      Therefore, this reviewer cannot tell how the preparation was performed and, importantly, how can we be sure that the microarchitecture of the tissue did not get distorted in the process?

      Thank you for pointing this out. While we cannot completely rule out the possibility of distortion, we have clarified the precautions taken to minimize it. We used a double fixation procedure immediately after bone marrow extrusion, followed by embedding it in agarose to preserve its integrity as much as possible. We have elaborated on this point in greater detail in the Methods section of the revised version (page 18).

      Reviewer #2 (Public review):

      (1) ECM cage imaging

      (a) The value or additional information provided by the staining on nano-sections (A) is not clear, especially considering that the thick vibratome sections already display the entirety of the laminin γ1 cage structure effectively. Further clarification on the unique insights gained from each approach would help justify its inclusion.

      Ultrathin cryosectioning enables high-resolution imaging with a threefold increase in Z-resolution, facilitating precise analysis of signal superposition. This approach was particularly valuable for clearly visualizing activated integrin in contact with laminin and collagen IV fibers (see Fig. 3 in revised manuscript, pages 6, 8 and 18). Additionally, 3D reconstructions and z-stack data reveal complex interactions between the basement membrane and the cellular ECM cage that are not evident in 2D projections (see page 6). These complementary methods help elucidate the detailed molecular and three-dimensional organization of the ECM cage surrounding megakaryocytes. These points have been clarified in the method and result sections.

      (b) The sMK shown in Supplementary Figure 1C appears to be linked to two sinusoids, releasing proplatelets to the more distant vessels. Is this observation representative, and if so, can further discussion be provided?

      This observation is not representative; MKs can also be associated with just one sinusoid.

      (c) Freshly isolated BM-derived MKs are reported to maintain their laminin γ1 cage. Are the proportions of MKs with/without cages consistent with those observed in microscopy?   

      After mechanical dissociation and size exclusion, about half of the MKs retained their cages (53.4% ± 5.6%, based on 329 MKs from three experiments; see page 7 of the manuscript for new data). This highlights the strong physical connection between MKs and their cage.

      (2) ECM cage formation

      (a) The statement "the full assembly of the 3D ECM cage required megakaryocyte interaction with the sinusoidal basement membrane" on page 7 is too strong given the data presented at this stage of the study. Supplemental Figure 1C shows that approximately 10% of pMKs form cages without direct vessel contact, indicating that other factors may also play a role in cage formation.

      The reviewer is correct. We have adjusted the text to reflect a more cautious interpretation of our results: "Although we cannot exclude that the ECM cage can form on its own, our data suggest that ECM cage assembly may require interactions between megakaryocytes and the sinusoidal basement membrane" (page 7).

      (b) The data supporting the statement that "pMK represent a small fraction of the total MK population" (cell number or density) could be shown to help contextualize the 10% of them with a cage.

      Following the reviewer's recommendation, a new bar graph has been added to illustrate the 18 ± 1.3 % of MK in the parenchyma relative to the total MK in the bone marrow (page 7 and Suppl. Figure 1H).

      (c) How "the full assembly of the 3D ECM cage" is defined at this stage of the study should be clarified, specifically regarding the ECM components and structural features that characterize its completion.

      We recognize that the term "full assembly" of the 3D ECM cage can be misleading, as it might suggest different stages of cage formation, such as a completed cage, one in the process of formation, or an incomplete cage. Since we have not yet studied this concept, we have eliminated the term "full assembly" from the manuscript to avoid confusion. Instead, we mention the presence of a cage.

      (3) Data on MK Circulation and Cage Integrity: Does the cage require full component integrity to prevent MK release in circulation? Are circulating MKs found in Lama4-/- mice? Is the intravasation affected in these mice? Are the ~50% sinusoid associated MK functional?  

      In lamα4-deficient (Lamα4-/-) mice, which possess an intact collagen IV cage but a structurally compromised laminin cage, electron microscopy and whole-mount imaging revealed an absence of intact megakaryocytes within the sinusoidal lumen. This observation indicates that the structural integrity of all components of the ECM cage is critical for preventing megakaryocyte entry into the circulation. Despite the laminin deficiency, mature Lamα4-/- megakaryocytes exhibited normal ultrastructure and maintained typical intravasation behavior. Furthermore, analysis of bone marrow explants from Lamα4-/- mice demonstrated that megakaryocytes retained their capacity to extend proplatelets. These findings are presented on page 7 and further discussed on page 14.

      (4) Methodology

      (a) Details on fixation time are not provided, which is critical as it can impact antibody binding and staining. Including this information would improve reproducibility and feasibility for other researchers.

      We have included this information in the methods section.

      (b) The description of 'random length measuring' is unclear, and the rationale behind choosing random quantification should be explained. Additionally, in the shown image, it appears that only the branching ends were measured, which makes it difficult to discern the randomness in the measurements.

      The random length measurement method uses random sampling to provide unbiased data on laminin/collagen fibers in a 3D cage. Contrary to what the initial image might have suggested, measurements go beyond just the branching ends; they include intervals between various branching points throughout the cage. This is now explained on page 19.

      To clarify this process, we will outline these steps on page 19 as: 1) acquire 3D images, 2) project onto 2D planar sections, 3) select random intersection points for measurement, 4) measure intervals using ImageJ software, and 5) repeat the process for a representative dataset. This will better illustrate the randomness of our measurements.
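
      Steps 3 and 4 above can be sketched in a few lines. This is an illustrative assumption, not the ImageJ workflow used in the study: the intervals are hypothetical pairs of branching-point coordinates in a 2D projection, the pixel calibration is a made-up parameter, and a seeded random generator stands in for the random selection.

```python
import math
import random


def interval_length(p, q, microns_per_pixel=1.0):
    """Euclidean length of the interval between two branching points."""
    return math.dist(p, q) * microns_per_pixel


def sample_interval_lengths(intervals, n, seed=0, microns_per_pixel=1.0):
    """Pick n intervals at random (without replacement) and measure them."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    chosen = rng.sample(intervals, n)
    return [interval_length(p, q, microns_per_pixel) for p, q in chosen]


# Hypothetical intervals between branching points, in pixel coordinates.
intervals = [
    ((0, 0), (3, 4)),
    ((0, 0), (6, 8)),
    ((1, 1), (1, 2)),
    ((2, 2), (5, 6)),
]

lengths = sample_interval_lengths(intervals, n=3, seed=42, microns_per_pixel=0.5)
print(lengths)
```

      Sampling without replacement keeps each interval from being counted twice within one image; repeating the call over several images (step 5) would build up the representative dataset.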

      (5) Figures

      (a) Overall, the figures and their corresponding legends would benefit from greater clarity if some panels were split, such as separating images from graph quantifications.

      Following the reviewer’s suggestion, we will fully update all the Figures and separate images from graph quantifications.

      Reviewer #3 (Public review):

      (1) The data linking ECM cage formation to MK maturation raises several interesting questions. As the authors mention, MKs have been suggested to mature rapidly at the sinusoids, and both integrin KO and laminin KO MKs appear mislocalized away from the sinusoids. Additionally, average MK distances from the sinusoid may also help separate whether the maturation defects could be in part due to impaired migration towards CXCL12 at the sinusoid. Presumably, MKs could appear mislocalized away from the sinusoid given the data presented suggesting they are leaving the BM and entering circulation. Additional data or commentary on intrinsic (ex-vivo) MK maturation phenotypes may help strengthen the authors' conclusions and shed light on whether an essential function of the ECM cage is integrin activation at the sinusoid.

      The idea that megakaryocytes move toward CXCL12 is still debated. Some studies suggest mature MKs are mainly sessile (PMID: 28743899), while others propose that CXCL12 may guide MK progenitors rather than mature MKs (PMID: 38987596, this reference has been added). To address the reviewer’s concerns regarding CXCL12-mediated migration, we conducted additional investigations.

      For DKO integrins, Guinard et al. (2023, PMID: 37171626) reported no significant change in the distance between MKs and sinusoids, indicating that integrin deficiency does not impair MK migration toward sinusoidal vessels.

      In our own study involving Lamα4-/- mice, we utilized whole-mount bone marrow preparations, labeling MKs with GPIbβ antibodies and sinusoids with FABP4 antibodies. We observed a 1.6-fold increase in the proximity of MKs to sinusoids in Lamα4-/- mice compared to controls (see figure below). However, the absolute distances measured were less than 3 µm in both groups, much smaller than the average diameter of a mature MK (20 - 25 µm), raising questions about the biological significance of these findings in active MK migration. What happens with MK progenitors - a population not detectable in our experiments using morphological criteria or GPIb staining - remains an open question.

      These results are provided for the reviewer’s information and will be available to eLife readers, along with the authors’ responses, in the revised manuscript.

      Author response image 2.

      (2) The data demonstrating intact MKs in the circulation is intriguing - can the authors comment or provide evidence as to whether MKs are detectable in blood? A quantitative metric may strengthen these observations.

      To investigate this, we conducted flow cytometry experiments and prepared blood smears to determine the presence of intact Itgb1-/-/Itgb3-/- megakaryocytes in the blood. Unfortunately, we could not detect any intact megakaryocytes in the blood samples using FACS (see new Supplementary Figure 4E) nor any on the blood smears (data not shown). However, we observed that large, denuded megakaryocyte nuclei were retained in the downstream pulmonary capillaries of these mice. Intravital imaging of the lung has previously provided direct evidence for the phenomenon of microvascular trapping (Lefrançois et al., 2017; PMID: 28329764), demonstrating that megakaryocytes can be physically entrapped within the pulmonary circulation due to size exclusion while releasing platelets. This has been clarified in the revised paper (Results section, page 10).

      (3) Supplementary Figure 6 shows no effect on in vitro MK maturation, proplatelet formation, or MK area, but Figures 6B/6C demonstrate an increase in total MK number in MMP-inhibitor treated mice compared to control. Some additional clarification in the text may substantiate the authors' conclusions as to either the source of the MMPs or the in vitro environment not fully reflecting the complex and dynamic niche of the BM ECM in vivo.

      This is a valid point. We have revised the text to be more cautious and to provide further clarification on these points (page 12).

      (4) Similarly, one function of the ECM discussed relates to MK maturation but in the B1/3 integrin KO mice, the presence of the ECM cage is reduced but there appears to be no significant impact upon maturation (Supplementary Figure 4). By contrast, MMP inhibition in vivo (but not in vitro) reduces MK maturation. These data could be better clarified in the text, or by the addition of experiments addressing whether the composition and quantity of ECM cage components directly inhibit maturation versus whether effects of MMP-inhibitors perhaps lead to over-activation of the integrins (as with the B4galt KO in the discussion) are responsible for the differences in maturation.

      We thank the reviewer for pointing this out.

      In our study of DKO integrin mice with a reduced extracellular matrix (ECM) cage, we observed normal proportions of MK maturation stages. However, these mutant MKs had a disorganized membrane system and smaller cytoplasmic areas compared to wild-type cells, indicating issues in their maturation. This is detailed further in the manuscript (see page 9).

      In the context of MMP inhibition in vivo, which also leads to reduced MK maturation, our immunofluorescence analysis revealed an increased presence of activated β1 integrin in bone marrow sections (see Supplementary Figure 6E). As suggested by the reviewer, this increase may explain the maturation defect.

      In summary, while it's challenging to definitively determine how ECM cage composition and quantity affect MK maturation in vivo, our results show that changes to the ECM cage - whether through genetic modification (DKO) or MMP inhibition - are consistently linked to defects in MK maturation.

      Reviewer #1 (Recommendations for the authors):

      (1) Movies 1-3 are referenced in the Results section, but this reviewer was not able to find a movie file.

      They have now been added to the downloaded revised manuscript.

      (2) Figure 2D is referenced in the Results Section but this panel is not present in the Figure itself. Instead, this seems to be what is referred to as the right panel of 2C. 

      Thank you. Following the suggestion of reviewer 2, we have now split the panels and separated the images from the graph quantifications. This change has modified all the panel annotations, which we have carefully checked both in the legend and in the manuscript.

      (3) Supplemental Fig 3C has Fibrinogen quantification which seems to belong in Supplemental 3 F instead.  

      Supplementary Figure 3C serves as a control for immunofluorescence, indicating that no fibrinogen-positive granules are detectable in the DKO mice. This supports the conclusion that the αIIbβ3 integrin-mediated fibrinogen internalization pathway is non-functional in this model, affirming the bar graph's placement. We appreciate the reviewer’s insight that similar results may arise from the IEM experiments in Figure 3H, which is valuable for strengthening our findings.

      (4) The x-axis labels in Supplemental 5B are not uniform.  

      This has been done. Thank you.

      Reviewer #2 (Recommendations for the authors):

      (1) Figure 1 Panel C: The sinusoidal basement membrane staining is missing, making it difficult to conclude that the collagen IV organization extends radially from the sinusoidal basement membrane.

      As recommended by the reviewer, we have updated Figure 1C with a new image illustrating the basement membrane (FABP4 staining) and the collagen IV cage. This new image confirms that the cage extends radially from the basement membrane.

      (2) Arrows in 1B: Based on the arrow's localisation, the description of "basement membrane-cage connection" is not evident from the images as it looks like the signal colocalization (right lower panel) occurs below the highlighted areas. Clarification or additional evidence of co-localization is required. 

      The apparent localization of the signal "below" the highlighted areas in the maximal projection image is due to the nature of 2D projections, which compress overlapping signals from multiple depths within the bone marrow into a single plane. This can obscure the spatial relationship between the basement membrane and extracellular matrix (ECM) components. However, when the complete z-stack series is examined, the direct connection between the basement membrane and the ECM cage becomes evident in three dimensions. Therefore, we have now added a comprehensive analysis of the entire z-stack dataset, allowing us to accurately interpret the spatial relationships between the basement membrane and ECM in the native bone marrow microenvironments (movies 1 and 2, and Suppl. Figure 1D-E).

      (3) In Figure 4C, GPIX is used to identify MKs by IVM while GP1bβ is used throughout the rest of the manuscript. It would be helpful for readers who are less familiar with MKs to understand whether GPIX and GP1bβ identify the same population of MKs and the rationale for choosing one marker over the other.  

      GPIX and GPIbβ are both components of the GPIb-IX complex, which identifies mature megakaryocytes (Lepage et al., 2000, PMID: 11110688). The choice of one over the other in different experiments is primarily based on technical considerations. The intravital experiments have been standardized using an AF488-conjugated anti-GPIX antibody to identify mature megakaryocytes consistently. GPIbβ (GP1bβ) is used in the rest of the manuscript because of its strong, specific and bright staining. We have clarified this point in the Results (page 10) and in the Materials and Methods section (page 17).

      (4) The term "total number of MKs" is used (p8), but the associated data presented in the figure reflect MK density per surface area. Descriptions in the text should align with the data format in the figures.

      This has been corrected in the revised manuscript (page 8). Thank you.

      (5) Supplemental Figure 1(B): Collagen I is written as Collagen III in the legend.

      This has been corrected in the legend of Figure 1B.

      (6) Figure 2D is described in the text but is missing from the figure.

      This has been corrected.

      (7) Supplemental Figure 3: Plot E overlaps with the images, making it unclear.

      To minimise overlap with the images, we've moved the graph with the bars down. Thank you.

      (8) Supplemental Figure 7: The image quality is too low, and spelling underlining issues are present. A better-quality version with clear labelling is essential.

      We have improved the quality of Figure 7 and fixed the underlining problems.

      (9) The movies were not found in the downloads provided.

      They have now been added to the downloaded revised manuscript.

      (10) Some bar graphs are missing the individual data points.

      All figures have been standardized and now include the individual data points.

      Reviewer #3 (Recommendations for the authors):

      Some minor comments:

      (1) If there is specific importance to some of the analyses of the cage structure, such as fiber length, and pore size, (eg. if they may have biological significance to the MK) it may help readers to give additional context to what differences in the pore size might imply. For example, do pores constrain MKs at sites where actin-driven proplatelet formation could be initiated?

      The effects of extracellular matrix (ECM) features - like fiber length and pore size - on megakaryocyte (MK) biology are not fully understood. Longer ECM fibers may help MKs adhere better and sense their environment. Larger pores could make it easier for MKs to grow, communicate, and extend proplatelets through blood vessel walls. The role of matrix metalloproteinases (MMPs), which degrade the ECM, adds to the complexity, and how this occurs in vivo is not yet well understood.

      As suggested, some of these points have been addressed in the revised manuscript (Discussion, page 16).

      (2) "Although fibronectin and fibrinogen were readily detected around megakaryocytes, a reticular network around megakaryocytes was not observed. Furthermore, no connection was identified between fibronectin and fibrinogen deposition with the sinusoid basement membrane, in contrast to the findings for laminin and collagen IV (Supp. Figures 1E)." - Clarification of how these data are interpreted might be helpful as to what the authors are intending to demonstrate with these data as at least in Figure 1E, fibronectin, and fibrinogen do appear expressed along the MK surface and at the sinusoidal-MK interface.

      While fibronectin and fibrinogen are present around megakaryocytes and at the vessel-cell interface, they do not form a reticular ECM cage. The functional implications of this finding remain unclear. One can imagine that the specific spatial arrangement of various ECM components may lead to different functional roles. Laminin and collagen IV may provide structural support by forming a 3D cage that is essential for the proper positioning and maturation of megakaryocytes. In contrast, fibronectin and fibrinogen may have different functions, potentially related to megakaryocyte expansion in bone marrow fibrosis (Malara et al., 2019, PMID : 30733282) and (Matsuura et al., 2020, PMID : 32294178).  

      This topic has been addressed in the Results (page 7) and in the Discussion (page 13).

      (3) Given the effects of dual B1/B3 integrin inhibition on MK intravasation, can the authors comment on the use of integrin RGD-based inhibitors? Are these compounds and drugs likely to interfere with MK retention?

      Our study shows that MK retention depends on the integrity of both components of the cage, collagen IV and laminin (see also point 3 of reviewer 2). Collagen IV contains RGD sequences, making it susceptible to RGD-based inhibition, whereas laminin does not utilize the RGD motif, raising questions about the overall efficacy of these inhibitors.

      In addition, the in vivo efficacy and potential off-target effects of these inhibitors in the complex bone marrow microenvironment remain to be fully elucidated. This intriguing issue warrants further investigation.

      (4) Beyond protein components, other non-protein ECM molecules including glycosaminoglycans (HA, HS) have essential roles in supporting MK function, including maturation (PMIDs: 31436532, 36066492, 27398974) and may merit some brief discussion if the authors feel this is helpful.

      We followed the reviewer’s suggestion and now mention the contribution of glycosaminoglycans to MK maturation. We also added the three references (page 13).

      (5) In several locations, the text refers to figure panels that are either not present or not annotated correctly (some examples include Figure 2D, Supplementary Figure 3E vs 3D).

      Following the suggestion of reviewer 2, we have now split the panels and separated the images from the graph quantifications. This has changed all the panel annotations, which we have carefully checked in both the legends and the manuscript.

      (6) In some cases, the figure legends seem to incorrectly refer to text, colors, or elements in the panels (e.g. Supplementary Figure 3, fibrinogen is referred to as yellow in the legend but is green in the figure). In Supplemental Figure 1, an image is annotated as pryenocyte in the figure, but splenocyte in the text.

      This has been corrected in the figures and in the revised manuscript. Please also see point (7) below.  Thank you very much.

      (7) Images demonstrating GPIX and GPIBb positive cells in the calvarial and lung microcirculation are convincing, but in Figure C these cells are referred to as MKs, whereas in Figure D they are referred to as pyrenocytes (as well as in the discussion). It is not clear if this is intentional and refers to bare nuclei from erythrocytes or indeed refers to MKs or MK nuclei. Clarification would help guide readers.

      We agree with the reviewer and fully acknowledge the need for clarification. We confirm that these circulating cells are megakaryocytes. To avoid confusion, we have ensured that all references to "pyrenocytes" have been replaced with "megakaryocytes."

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This work starts with the observation that embryo polarization is asynchronous starting at the early 8-cell stage, with early polarizing cells being biased towards producing the trophectoderm (TE) lineage. The authors further found that reduced CARM1 activity and upregulation of its substrate BAF155 promote early polarization and TE specification. This evidence connects to the previous finding that Carm1 heterogeneity at the 4-cell stage guides later cell lineages: the higher Carm1-expressing blastomeres are biased towards the ICM lineage. Thus, this work provides a link between asymmetries at the 4-cell stage and polarization at the 8-cell stage, providing a cohesive explanation regarding the first lineage allocation in mouse embryos.

      Strengths:

      In addition to what has been put in the summary, the advanced 3D image-based analysis has found that early polarization is associated with a change in cell geometry in blastomeres, regarding the ratio of the long axis to the short axis. This is considered a new observation that has not been identified.

      Weaknesses:

      Although the microinjection-based method for overexpression/deletion of proteins has been shown to be effective in early embryo settings and has been widely used, it may not fully represent the in vivo situation in some cases, compared to other strategies such as the use of knock-in mice. This is a minor weakness; it would be good to include some sentences in the discussion on the potential caveats.

      We thank the reviewer for their insightful summary of our work, and their adjudication on the novelty of our research. We agree with the reviewer that microinjection-based methods, whilst being the standard and widely used in the field, have their weaknesses. In this study, we have primarily used microinjection of previously tested and known constructs which may help mitigate these concerns, and have referenced numerous studies in which these constructs have been used and tested. Nevertheless, the authors are aware of this drawback and have tried to address this previously in other research using novel artificial intelligence techniques (Shen and Lamba et al., 2022 – cited in the manuscript) and this continues to be an active area of investigation for us.

      Reviewer #2 (Public review):

      Summary:

      In this study, Lamba and colleagues suggest a molecular mechanism to explain cell heterogeneity in cell specification during pre-implantation development. They show that embryo polarization is asynchronous. They propose that reduced CARM1 activity and upregulation of its substrate BAF155 promote early polarization and trophectoderm specification.

      Strengths:

      The authors use appropriate and validated methodology to address their scientific questions. They also report excellent live imaging. Most of the data are accompanied by careful quantifications.

      Weaknesses:

      I think this manuscript requires some more quantification, increased number of embryos in their evaluations and clearly stating the number of embryos evaluated per experiments.

      We thank the reviewer for these thoughtful comments on our work, their kind assessment of the strength of our research, and their notes on the weaknesses. We have replied to their points raised below.

      Here are some points:

      (1) It should be clearly stated in all figure legends and in the text how many cells from how many embryos were analyzed.

      We appreciate this comment. We provide detailed quantification for every experiment in the paper and have stated the numbers of embryos (for whole-embryo-level experiments) or blastomeres used for statistical tests and displayed in the graphs.

      (2) I think that the number of embryos is sometimes too low. Mouse embryos are easily accessible and the methods used are well established in this lab, so the authors should make an effort to have at least 10/15 embryos per experiment. For example, "In agreement with this, hybridization chain reaction (HCR) RNA fluorescence in situ hybridization of early 8-cell stage embryos revealed that the number of CDX2 mRNA puncta was higher in polarized blastomeres with a PARD6-positive apical domain than in unpolarized blastomeres, for 5 out of 6 embryos with EP cells (Figure 3A, B)", or the data for Figure 4, where we know how many cells but not how many embryos.

      We appreciate the reviewer’s comment regarding the number of embryos used in the hybridization chain reaction (HCR) experiment. We agree that increasing the number of embryos could, in principle, add further statistical power. However, both first authors have since left the lab to begin postdoctoral training or to join a company, and it is not feasible for us to generate additional embryos at this stage.

      Importantly, we believe the number of embryos included in the current manuscript is sufficient to support our conclusions, especially when considered in the context of the broader experimental design, the timing of the study, and our ethical commitment to minimizing animal use.

      Notably, the initial HCR experiment targeting Cdx2 mRNA served as a key indication that prompted further investigation of CDX2 at the protein level. These follow-up experiments were conducted with increased numbers of embryos and/or cells and are presented in Figure 3 and the associated supplementary figures (we now have 124 cells (including 23 EP cells) from 16 embryos), thereby strengthening and confirming the conclusion suggested by the HCR data.

      (3) It would be useful to see in Figure 4 an example of asymmetric cell division as done for symmetric cell division in panel 4B. This could really help the reader to understand how the authors assessed this.

      We used live imaging to track cell division patterns. Cells expressing RFP-tagged polarity proteins were observed during division to identify the resulting daughter cells. Immediately after cytokinesis, we assessed the polarity status of each daughter cell. If both daughter cells were polarized, the division was classified as symmetric; if only one was polarized, it was classified as asymmetric.

      Author response image 1.

      8-cell stage embryos expressing Ezrin-RFP (fire colour) were imaged during the 8- to 16-cell stage division. Top panel: arrows indicate a symmetric cell division in which the polarity domain was partitioned into both daughter cells. Bottom panel: an asymmetric division in which the polarity domain was inherited by only one of the two daughter cells.
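
      The scoring rule described in the response above can be written as a small sketch. The function name and the boolean inputs are hypothetical, and the "unclassified" outcome for a division with no polarized daughter is an assumption added for completeness; the response itself only defines the symmetric and asymmetric cases.

```python
def classify_division(daughter_a_polarized, daughter_b_polarized):
    """Classify a division from the polarity status of its two daughters,
    scored immediately after cytokinesis."""
    if daughter_a_polarized and daughter_b_polarized:
        return "symmetric"       # both daughters inherit the polarity domain
    if daughter_a_polarized or daughter_b_polarized:
        return "asymmetric"      # only one daughter inherits it
    return "unclassified"        # assumption: case not defined in the response


# Hypothetical example scores for three tracked divisions.
examples = [
    classify_division(True, True),
    classify_division(True, False),
    classify_division(False, False),
]
```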

      (4) Figure 5C there is a big disproportion of the number of EP and LP identified. Could the authors increase the number of embryos quantified and see if they can increase EP numbers?

      We thank the reviewer for this comment and want to clarify an important detail: EP cells occur at an average cellular frequency of less than 10%, compared with LP cells (the other 90%). Therefore, when investigating natural embryo development without bias or exclusion, there will likely be an imbalance in the number of EP and LP cells, as is the case for Figure 5C. In this case, morphological differences and clear statistical significance were seen between the shapes of EP and LP cells within the cells quantified, and we therefore decided not to expend further mice for this particular experiment. We nevertheless agree with the comment that in most cases additional embryos would help strengthen our conclusions further.

      (5) Could the authors give more details about how they mount the embryos for live imaging? With agarose or another technique? In which dishes? Overlaid with how much medium and oil? This could help other labs that want to replicate the live imaging in their labs. Also, was it a z-stack analysis? If yes, how many um per stack? Ideally, if they also know the laser power used (at least a range) it would be extremely useful.

      We thank the reviewer for this comment and have provided additional detail here and in the Methods section. For live imaging of our embryos, we used glass-bottom 35 mm dishes. We fixed a small cut square of nylon mesh (5 mm to 1 cm in width and height) onto the centre of the dish using silicon; the mesh served as a grid (diameter of approximately 150 micrometres) for deposition of embryos. After the silicon had dried (overnight) and the dish had been washed with water, the grid was overlaid with a 100 microlitre drop of KSOM and then covered with mineral oil until the KSOM drop was submerged. After incubation under live-imaging conditions, single embryos were deposited in each ‘well’ of the grid before the dish was placed in the microscope, which was equilibrated at the correct temperature and CO2.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Reviews):

      Weaknesses: 

      Overall I find the data presented compelling, but I feel that the number of observations is quite low (typically n=3-7 neurons, typically one per animal). While I understand that only a few slices can be obtained for the IPN from each animal, the strength of the novel findings would be more convincing with more frequent observations (larger n, more than one per animal). The findings here suggest that the authors have identified a novel mechanism for the normal function of neurotransmission in the IPN, so it would be expected to be observable in almost any animal. Thus,  it is not clear to me why the authors investigated so few neurons per slice and chose to combine different treatments into one group (e.g. Figure 2f), even if the treatments have the same expected effect.  

      This is a well-taken suggestion. However, we must point out that we do perform statistical analyses on the original datasets and we believe that our conclusions are justified, as acknowledged by the Reviewer. As the Reviewer is aware, the IPN is a small nucleus and, with the slicing protocol used, we typically obtain 1-2 slices per mouse that are suitable for recordings. Since most of the experiments in the manuscript deal with some form of pharmacological interrogation, we were reluctant to use slices that were not naïve and therefore, in general, did not perform more than one cell recording per slice. Having said this, to comply with the Reviewer’s suggestion we have now performed additional experiments to increase the n number for certain experiments. We have amended all figures and legends to incorporate the additional data. We must point out that during the replotting of the data in the summary Figure 8i (previously Figure 7i) we noticed an error in the data representation of the TAC IPL data and have now corrected this oversight.

      Figure 2b,c. 

      500nM DAMGO effect on TAC IPL AMPAR EPSC – n increased from 5 to 9

      Figure 3g. 

      500nM DAMGO effect on CHAT IPR AMPAR EPSC – n increased from 8 to 16

      Effect of CTAP on DAMGO on CHAT IPR AMPAR EPSC – n increased from 4 to 7

      Figure 3i. 

      500nM DAMGO or Met-enk effect in “silent” CHAT IPR AMPAR EPSC – n increased from 7 to 9

      Figure 4e. 

      500nM DAMGO effect on ES coupling – Note: in the original version the n number was 5 and not 7 as written in the figure legend. We have now increased the n from 5 to 9.

      Figure 5e,f. 

      500nM DAMGO effect on TAC IPR AMPAR EPSC – n increased from 5 to 9

      Figure 7f.

      Effect of DHE on EPSC amplitude after application of DNQX/APV/4-AP or DTX-α – n increased from 7-9.

      Figure 7g.

      Emergence of nAChR EPSC after DTX – n increased from 4 to 7

      Figure 7i. 

      Effect of ambenonium on nAChR amplitude and charge – n increased from 4 to 7

      Supplementary Figure 3c and h

      Effect of DAMGO after DNQX – n increased from 4 to 7

      Effect of DNQX after DAMGO mediated potentiation – n increased from 3 to 5.

      Throughout the study (Figs. 3i, 7f and 8h in the revised manuscript) we do indeed pool datasets that were collected under different conditions, since we were not directly investigating any difference in the extent of the response between these treatments. For example, and as pointed out by the Reviewer, in Fig. 2F (now Fig. 3i) DAMGO and met-ENK were employed merely to ascertain whether light-evoked synaptic transmission (ChATCre:ai32 mice) in cells that had no measurable EPSC could be pharmacologically “unsilenced” by mOR activation. Thus, the means by which the mOR was activated was not relevant to this specific question. Note: 2 more recordings have now been added to this dataset (Fig. 3i); they were taken from ChATChR2/SSTCre:ai9 mice in response to the comment by this Reviewer below (“Are there baseline differences in the electrophysiological or morphological properties of these "silent" neurons compared to the responsive neurons?”). Similarly, in the revised Fig. 7f we pooled data investigating the pharmacological block of the EPSC that emerged following application of either DNQX/APV/4-AP or DNQX/APV/DTX. Low concentrations of 4-AP or DTX were employed interchangeably to reveal the DNQX-insensitive EPSC, which we go on to show is indeed the nAChR response. Finally, in Fig. 8h, we pooled data demonstrating a lack of effect of DAMGO in potentiating both the glutamatergic and cholinergic arms of synaptic transmission in the OPRM1 KO mice. Again, here we were only interested in determining whether removal of mOR expression prevented potentiation of transmission mediated by mHB ChAT neurons, irrespective of neurotransmitter modality. Thus, overall, we were careful to pool data only in those instances where it would not change the interpretation and hence the conclusions reached.

      There are also significant sex differences in nAChR expression in the IPN that might not be functionally apparent using the low n presented here. It would be helpful to know which of the recorded neurons came from each sex, rather than presenting only the pooled data.  

      As the reviewer correctly states, there is a body of literature describing sex-based divergence not only in nicotinic receptor expression but also in behaviors associated with nicotine addiction. We have therefore reanalyzed our datasets, comparing the extent of the mOR potentiation of glutamatergic and cholinergic transmission mediated by mHB ChAT neurons in IPR between male and female mice. Although there is a possible trend towards a higher potentiation of nAChR in female mice, this was not statistically significant (see Author response image 1 below). We therefore chose not to split our data in the manuscript based on sex.

      Author response image 1.

      Comparison of the mOR (500nM DAMGO) mediated potentiation on evoked (a) AMPAR and (b) nAChR  EPSCs in IPR between male and female mice.  

      There are also some particularly novel observations that are presented but not followed up on, and this creates a somewhat disjointed story. For example, in Figure 2, the authors identify neurons in which no response is elicited by light stimulation of ChAT-neurons, but the application of DAMGO (mOR agonist) un-silences these neurons. Are there baseline differences in the electrophysiological or morphological properties of these "silent" neurons compared to the responsive neurons?  

      Unfortunately, we did not routinely measure intrinsic properties of the recorded postsynaptic neurons, nor did we systematically recover biocytin fills to assess morphology. Therefore, it remains unclear whether the neurons in which there were no or minimal AMPAR-mediated EPSCs are distinct from the ones displaying measurable responses. The IPR is home to GABAergic SST neurons, which are the most numerous neuron type in this IPN subdivision. Although heavily outnumbered by the SST neurons, there are additionally VGluT3+ glutamatergic neurons in IPN. The Reviewer is likely referring to a recent study investigating synaptic transmission specifically onto SST+ and VGluT3+ neurons in IPN, demonstrating that mHB cholinergic mediated glutamatergic input is “weaker” onto the glutamatergic neurons. Furthermore, in some instances synaptic transmission onto this latter population can be “unsilenced” by GABAB receptor activation, in a similar manner to that seen with mOR activation in this manuscript when IPR neurons are blindly targeted (Stinson & Ninan, 2025). Using a similar strategy as in this recent study (Stinson & Ninan, 2025), we now include experiments in which the ChATChR2 mouse was crossed with a SSTCre:Ai14 mouse. This allowed for recording of postsynaptic EPSCs in directly identified SST IPR neurons. We demonstrate that DAMGO can indeed increase glutamatergic EPSCs, and in 2 of the cells in which maximal LED light activation evoked no appreciable AMPAR EPSC, DAMGO clearly “unsilenced” transmission. Thus, our additional analyses directly demonstrate that our original observations concerning mOR modulation extend to the mHb cholinergic AMPAR mediated input onto IPR SST neurons. These additional data are in the revised manuscript (Figure 3D-F, I).

      Future experimentation will be required to determine if the propensity of encountering a “silent” input that can be converted to robust synaptic transmission by mOR differs between these two cell types. Furthermore, it will be of interest to investigate if any differences exist in the magnitude of the cholinergic input or the mOR mediated potentiation of co-transmission between postsynaptic SST GABA and glutamatergic neuronal subtypes.

      Reviewer #2 (Public review)

      Weaknesses: 

      The genetic strategy used to target the mHb-IPN pathway (constitutive expression in all ChAT+ and Tac1+ neurons) is not specific to this projection.  

      This is an important point. We are acutely aware that conditional expression of ChR2 using transgenic Cre driver lines does not confer specificity to mHB as the source of the synaptic input in IPN. This is particularly relevant considering that one of the novel observations here relates to a previously unidentified functional input from TAC1 neurons to the IPR. At this juncture we would like to point the Reviewer to the publicly available Connectivity Atlas provided by the Allen Brain Institute (https://connectivity.brain-map.org/). With reference to mHB TAC1 neuronal output, targeted viral injection into the habenula of Tac1Cre mice allows conditional expression of EGFP in SP neurons, as evidenced by the predominant expression of reported fluorescence in dorsal mHB (see Author response image 2 a,b below). Tracing the axonal projections to the IPN clearly demonstrates dense fibers in IPL, as expected, but also arborization in IPR (Author response image 2 a,c). This pattern is reminiscent of that seen in the transgenic Tac1Cre:ai9 or ai32 mice used in the current study (Figs. 1c, 2a, 5c). Closer inspection of the fibers in the IPR reveals putative synaptic bouton-like structures, as we have shown in Fig. 5a,b (Author response image 2 d below).

      Author response image 2.

      Stereotaxic viral injection into mHB of Tac1Cre mice, taken from the Allen Brain Connectivity Atlas (Link to Connectivity Atlas for mHb SP neuronal projection pattern)

      These anatomical data suggest that part of the synaptic input to the IPR originates from mHB TAC1 neurons, although we cannot fully discount additional synaptic input from other brain areas that may impinge on the IPR. Indeed, as the Reviewer points out, it is evident that other regions including the nucleus incertus send outputs to the IPN (Bueno et al., 2019; Liang et al., 2024; Lima et al., 2017). However, it is unclear if neuronal inputs from these alternate sources (Bueno et al., 2019; Liang et al., 2024; Lima et al., 2017) are glutamatergic in nature and mediated by a TAC1/OPRM1-expressing neuronal population. Nevertheless, we have now modified text in the discussion to highlight the limitations of using a transgenic strategy (pg 12, para 1).

      In addition, a braking mechanism involving Kv1.2 has not been identified.

      It is unclear what the Reviewer is referring to here. Most of our experiments pertaining to the brake on cholinergic transmission by potassium channels use low concentrations of 4-AP (50-100 µM), which have been used to block Shaker Kv1 channels, although at these concentrations there are additional actions at other K+ channels such as Kv3. However, we essentially demonstrate that dendrotoxin, a selective Kv1.1 and Kv1.2 antagonist, replicates the 4-AP effects. We have now also included RNAseq data demonstrating the relative expression levels of Kv1 channel mRNA in mHb ChAT neurons (KCNA1 through KCNA6; Figure 6b). The complete absence of KCNA1 transcripts, together with a high expression level of KCNA2 transcripts, strongly suggests a central role of Kv1.2 in unmasking nAChR mediated synaptic transmission.

      Reviewer #3 (Public review)

      Weaknesses:  

      The significance of the ratio of AMPA versus nACh EPSCs shown in Figure 6 is unclear since nAChR EPSCs measured in the K+ channel blockers are compared to AMPA EPSCs in control (presumably 4-AP would also increase AMPA EPSCs). 

      We understand the Reviewer’s concern regarding the calculation of nicotinic/AMPA ratios, since the two components are measured under differing conditions, i.e. in the presence and absence of 4-AP, respectively. As the reviewer correctly points out, 4-AP likely increases the amplitude of the AMPA receptor mediated EPSC. However, our intention in calculating this ratio was not to measure the relative strengths of fast glutamatergic vs cholinergic transmission onto a given postsynaptic IPN neuron per se. Rather, we used the ratio as a means to normalize the size of the nicotinic receptor EPSC to the strength of the light stimulation (using the AMPA EPSC as the normalizing factor) in each individual recording. This permits a more meaningful comparison across cells/slices/mice. We apologize for the confusion and have amended the text in the results section to reflect this (pg 9; para 2).
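
      The normalization described above can be sketched as follows. The function name and the amplitude values are hypothetical and serve only to illustrate the arithmetic; they are not data from the manuscript.

```python
def normalized_nachr_response(nachr_epsc_pa, ampa_epsc_pa):
    """Express the nAChR EPSC amplitude relative to the AMPA EPSC amplitude
    recorded in the same cell, so that the AMPA EPSC acts as a per-recording
    proxy for the strength of the light stimulation."""
    if ampa_epsc_pa == 0:
        raise ValueError("AMPA EPSC amplitude must be non-zero to normalize")
    # Inward currents are negative, so compare absolute amplitudes.
    return abs(nachr_epsc_pa) / abs(ampa_epsc_pa)


# Hypothetical recordings (pA): the second cell receives stronger stimulation,
# so both of its raw amplitudes are twice as large.
cell_1 = normalized_nachr_response(nachr_epsc_pa=-50.0, ampa_epsc_pa=-200.0)
cell_2 = normalized_nachr_response(nachr_epsc_pa=-100.0, ampa_epsc_pa=-400.0)
```

      In this toy case the raw nAChR amplitudes differ two-fold, yet the normalized values are identical, which is the sense in which the ratio allows comparison across cells, slices and mice.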

      The mechanistic underpinnings of the most novel results are not pursued. For example, the experiments do not provide new insight into the differential effects of evoked and spontaneous glutamate/Ach release by Gi/o coupled mORs, nor the differential threshold for glutamate versus Ach release.

      Our major goal in the current manuscript was to provide a much-needed roadmap outlining the effects of opioids in the habenulo-interpeduncular axis. Of course, a full understanding of the mechanisms underlying such complex opioid actions at the molecular level will be of great value. We feel that this is beyond the scope of this already quite result-dense manuscript, but it will be essential if directed manipulation of the circuit is to be leveraged to alter maladaptive behaviors associated with addiction/emotion during adolescence and in adulthood.

      The authors note that blocking Kv1 channels typically enhances transmitter release by slowing action potential repolarization. The idea that Kv1 channels serve as a brake for Ach release in this system would be strengthened by showing that these channels are the target of neuromodulators or that they contribute to activity-dependent regulation that allows the brake to be released. 

      Understanding the exact mechanisms that can potentially titrate Kv1.2 availability, and hence nAChR transmission, would be essential to shed light on the in vivo conditions under which this arm of neurotransmission can be modulated. However, we feel that detailed mechanistic interrogation constitutes significant additional work, albeit work that future studies should aim to achieve. Thus, it presently remains unclear which physiological or pathological scenarios result in attenuation of Kv1.2 to subsequently promote nAChR mediated transmission but, as mentioned in the existing discussion, future work to decipher such mechanisms would be of great value.

      Reviewer #1 (Recommendations for the authors): 

      Overall I find this to be a very interesting and exciting paper, presenting novel findings that provide clarity for a problem that has persisted in the IPN field: that of the conundrum that light-evoked cholinergic signaling was challenging to observe despite the abundance of nAChRs in the IPN. 

      Major concerns: 

      (1) The n is quite low in most cases, and in many instances, data from one figure are replotted in another figure. Given that the findings presented here are expected in the normal condition, it should not be difficult to increase the n. A more robust number of observations would strengthen the novel findings presented here. 

      Please refer to the response to the public review above.

      (2) In general, I find the organization of the figures somewhat disjointed. Sometimes it feels as if parts of the information presented in the results are split between figures, where it would make more sense to be together in a figure. For example, all the histology for each of the lines is in Figure 1, but only ephys data for one line is included there. It would be more logical to include the histology and ephys data for each line in its own figure. It would also be helpful to show the overlap of mOR expression with Tac1-Cre and ChAT-Cre terminals in the IPN. Likewise, the summarized Tac1Cre:Ai32 IPR data is in Figure 4, but the individual data is in Figure 5. 

      We introduce both the ChAT and TAC1 Cre lines in Figure 1 as an overview, particularly for those readers who are not entirely familiar with the distinct afferent systems operating within the habenulo-interpeduncular pathway. However, in compliance with the Reviewer’s suggestion, we have now restructured the figures. In the revised manuscript, the functional data pertaining to the various transmission modalities mediated by the distinct afferent systems impinging on the subdivisions of the IPN tested are now split into their own dedicated figures as follows:

      Figure 2. 

      mOR effect on TAC1 neuronal glutamatergic output in IPL.

      Figure 3. 

      mOR effect on CHAT neuronal glutamatergic output in IPR.

      Figure 5. 

      mOR effect on TAC1 neuronal glutamatergic output in IPR.

      Figure 8.

      mOR effect on CHAT neuronal cholinergic output in IPC.

      Supp. Fig. 1 mOR effect on CHAT neuronal glutamatergic output in IPC.

      We thank the Reviewer for their suggestions regarding the style of the manuscript. The restructuring has now resulted in a much better flow of the presented data.

      (3) The discussion is largely satisfactory. However, a little more discussion of the integrative function of the IPN is warranted given the opposing effects of MOR activation in the Tac vs ChAT terminals, particularly in the context of both opioids and natural rewards. 

      We thank the reviewer for this comment. However, we feel the discussion is rather lengthy as is and therefore we refrained from including additional text.  

      Minor concerns: 

      (1)  The methods are missing key details. For example, the stock numbers of each of the strains of mice appear to have been left out. This is of particular importance for this paper as there are key differences between the ChAT-Cre lines that are available that would affect observed electrophysiological properties. As the authors indicate, the ChAT-ChR2 mice overexpress VAChT, while the ChAT-IRES-Cre mice do not have this problem. However, as presented it is unclear which mice are being used. 

      We apologize for the omission - the catalog numbers of the mice employed have now been included in the methods section.

      We have now clearly indicated in each figure panel (single trace examples and pooled data) which mice the data are taken from; in some instances the pooled data are from the two CHAT mouse strains employed. Despite the tendency of the ChAT-ChR2 mice to demonstrate more pronounced nAChR-mediated transmission (Fig. 7h), we justify pooling the data since we see no statistically significant difference between the lines in the effect of mOR activation on potentiating either AMPA or nAChR EPSCs (please refer to the response to Reviewer 2, Minor Concern point 2).

      (2) Likewise, antibody dilutions used for staining are presented as both dilution and concentration, which is not typical. 

      We thank the reviewer for pointing out this inconsistency. We have amended the text in the methods to include only the working dilution for all antibodies employed in the study.

      (3) There are minor typos throughout the manuscript. 

      All typos have been corrected.

      Reviewer #2 (Recommendations for the authors): 

      The authors provide a thorough investigation into the subregion, and cell-type effect of mu opioid receptor (MOR) signaling on neurotransmission in the medial habenula to interpeduncular nucleus circuit (mHb-IPN). This circuit largely comprises two distinct populations of neurons: mHb substance P (Tac1+) and cholinergic (ChAT+) neurons. Corroborating prior work, the authors report that Tac1+ neurons preferentially innervate the lateral IPN (IPL) and rostral IPN (IPR), while ChAT+ neurons preferentially innervate the central IPN (IPC) and IPR. The densest expression of MOR is observed in the IPL and MOR agonists produce a canonical presynaptic depression of glutamatergic neurotransmission in this region. Interestingly, MOR signaling in the ChAT+ mHb projection to the IPR potentiates light-evoked glutamate and acetylcholine-mediated currents (EPSC), and this effect is mediated by a MOR-induced inhibition of Kv2.1 channels. 

      Major concerns: 

      (1) The method used for expressing channelrhodopsin (ChR2) into cholinergic and neurokinin neurons in the mHb (Ai32 mice crossed with Cre-driver lines) has limitations because all Tac1+/ChAT+ inputs to the IPN express ChR2 in this mouse. Importantly, the IPN receives inputs from multiple brain regions besides the IPN-containing neurons capable of releasing these neurotransmitters (PMID: 39270652). Thus, it would be important to isolate the contributions of the mHb-IPN pathway using virally expressed ChR2 in the mHb of Cre driver mice. 

      Please refer to the response to the public review above. 

      (2) Figure 4: The authors conclude that the sEPSC recorded from IPR originate from Tac1+ mHbIPR projections. However, this cannot be stated conclusively without additional experimentation. For instance, an optogenetic asynchronous release experiment. For these experiments it would also be important to express ChR2 virus in the mHb in Tac1- and ChAT-Cre mice since glutamate originating from other brain regions could contribute to a change in asynchronous EPSCs induced by DAMGO. 

      This is a well-taken point. The incongruent effect of DAMGO on evoked CHAT neuronal EPSC amplitude and sEPSC frequency prompted us to consider the possibility of a differing effect of DAMGO on a secondary input. We agree that we do not show directly whether the sEPSCs originate from a TAC1 neuronal population. Therefore, we have tempered our wording with regard to the origin of the sEPSCs and have also restructured the figure in question, moving the sEPSC data into the supplemental data (Supplemental Fig. 2).

      (3) Figure 5D: It would be useful to provide a quantitative measure in a few mice of mOR fluorescence across development (e.g. integrated density of fluorescence in IPR). 

      We have now included mOR expression density across development (Fig. 6). Interestingly, the adult expression levels of mOR in the IPR are essentially reached at a very early developmental age (P10), yet we see stark differences in the role of mOR activation in modulating glutamatergic transmission mediated by mHB cholinergic neurons. Note: since we processed adult tissue (i.e. >P40) for these developmental analyses, we also used these slices to analyze the relative mOR expression density between the subdivisions of the IPN specifically in adults (Fig. 1).

      (4) Figure 6B: It would be useful to quantify the expression of Kcna2 in ChAT and Tac1 neurons (e.g. using FISH). 

      We thank the Reviewer for this suggestion. We have now included mRNA expression levels from a publicly available 10X RNA sequencing dataset provided by the Allen Brain Institute (Figure 7b).

      (5) It would be informative to examine what the effects of MOR activation are on mHb projections to the IPC (central).

      In response to this suggestion, we have now included additional data from putative IPC cells that clearly demonstrate a DAMGO-elicited potentiation of AMPAR EPSCs similar to that seen in IPR. These data are included in the revised manuscript (Supplemental Fig. 1; Fig. 8i).

      (6) What is the proposed link between MOR activation and the inhibition of Kv1.2 (e.g. beta-Arrestin signaling, G beta-gamma interaction with Kv1.2, PKA inhibition?) 

      We apologize for any confusion. We do not directly test whether the potentiation of EPSCs upon mOR activation occurs via inhibition of Kv1.2. Although we have not directly tested this possibility, we find it an unlikely underlying cellular mechanism, especially for the potentiation of the cholinergic arm of neurotransmission, since in the presence of DNQX/APV the activation of mOR does not result in the emergence of any nAChR EPSC (see Supplementary Fig. 3a-c).

      Minor concerns: 

      (1) Methods: Jackson lab ID# for used mouse strains is missing. 

      We apologize for this omission and have now included the mouse strain catalog numbers.

      (2) The authors use data from both ChAT-Cre x Ai32 and ChAT-ChR2 mice. It would be helpful to show some comparisons between the lines to justify merging data sets for some of the analyses as there appear to be differences between the lines (e.g. Figure 6G). 

      This is a well-taken point. We have now provided a figure for the Reviewer (see below) that illustrates the lack of a significant difference in the mOR-mediated potentiation of both mHB CHAT neuronal AMPAR and nAChR transmission between the two mouse lines employed, despite a divergence in the extent of glutamatergic vs cholinergic transmission shown in Fig. 7g (previously Figure 6g). We have chosen not to include these data in the revised manuscript.

      Author response image 3.

      Comparison of the mOR-mediated (500 nM DAMGO) potentiation of evoked AMPAR (a) and nAChR (b) EPSCs in IPR between ChATCre:Ai32 and ChAT-ChR2 mice.

      (3)  Line 154: How was it determined that the EPSC is glutamatergic? 

      We apologize for any confusion. In the revised manuscript we now clearly point to the relevant figures (see Supplementary Figs. 2a and 3) in the Results section (pg. 4, para 2; pg. 7, para 1; pg. 8, para 2) where we determine that both the sEPSCs and the ChAT-mediated light-evoked EPSCs recorded under baseline conditions are totally blocked by DNQX and hence are exclusively AMPAR events.

      (4) It would be helpful to discuss the differences between GABA-B mediated potentiation of mHbIPN signaling and the current data in more detail. 

      We are unclear as to what differences the Reviewer is referring to. At least from the perspective of ChAT neuronal mediated synaptic transmission, other groups (and in the current study; Fig. 7h) have clearly shown that GABA<sub>B</sub> activation markedly potentiates synaptic transmission like mOR activation. Nevertheless, based on our novel findings it would be of interest to determine whether the influence of GABA<sub>B</sub> is inhibitory onto the TAC mediated input in IPR and whether there is a developmental regulation of this effect as we demonstrate upon mOR activation. These additional comparisons between the effect of the two Gi-linked receptors may shed light onto the similarity, or lack thereof, regarding the underlying cellular mechanisms. We now have included a few sentences in the discussion to highlight this (pg 11, para 1).

      Reviewer #3 (Recommendations for the authors): 

      The abstract was confusing at first read due to the complex language, particularly the sentence starting with... Further, specific potassium channels... 

      The authors might want to consider simplifying the description of the experiments and the results to clarify the content of the manuscript for readers who may only read the abstract. 

      We have altered the wording of the abstract and hope it is now more reader friendly.

      The opposite effect of mOR activation on spontaneous EPSCs versus electrical or ChR2-evoked EPSCs is very interesting and raises the issue of which measure is most physiologically relevant. For example, it is unclear whether sEPSCs arise primarily from cholinergic neurons (that are spontaneously active in the slice, Figure 3), and if so, does mOR activation suppress or enhance cholinergic neuron excitability and/or recruitment by ChR2? While a full analysis of this question is beyond the scope of this manuscript, the assumption that glutamate release assayed by electrical/ChR2 evoked transmission is the most physiologically relevant might merit some discussion since sEPSCs presumably also reflect action-potential dependent glutamate release. One wonders whether mORs hyperpolarize cholinergic neurons to reduce spontaneous spiking yet enhance fiber recruitment by ChR2 or an electrical stimulus (i.e. by removing Na channel inactivation). The authors have clearly stated that they do not know where the mORs are located, and that the effects arising from disinhibition are likely complex. But they also might discuss whether glutamate release following synchronous activation of a fiber pathway by ChR2 or electrode is more or less physiologically relevant than glutamate release assayed during spontaneous activity. It seems likely that an equivalent experiment to Figure 3D, E using spontaneous spiking of IPR neurons would show that spiking is reduced by mOR activation. 

      We thank the Reviewer for this comment. As pointed out, it would be of interest to dissect the “network” effect of mOR activation but, as the Reviewer acknowledges, this is beyond the scope of the current manuscript. The Reviewer is correct in postulating that mOR activation results in hyperpolarization of mHB ChAT neurons. A recent study (Singhal et al., 2025) demonstrates that a subpopulation of ChAT neurons undergoes a reduction in firing frequency following DAMGO application. This is corroborated by our own observations, although we chose not to include these data in our current manuscript (but see below).

      Additionally, the Reviewer questions whether ChR2/electrical stimulation is physiological. This is a well-taken point and, of course, the simultaneous activation of potentially all possible axonal release sites is not the mode under which the circuit operates. Nevertheless, our data clearly demonstrate the ability of mORs to modulate release under these circumstances, which must reflect an impact on spontaneous action potential-driven evoked release. Although the suggested experiment could shed light on the synaptic outcomes of mOR activation on ES coupling of downstream IPN neurons, interpretation of the outcome would be confounded by the fact that postsynaptic IPN neurons also express mORs. Thus, we would not be able to isolate the effects of presynaptic changes in modulating ES coupling from any direct postsynaptic effect on the recorded cell when in current clamp.

      Together, these additional sites of action of mOR (i.e. mHB ChAT somatodendritic and postsynaptic IPN neuron) only serve to further highlight the complex nature of opioid actions on the habenulo-interpeduncular axis, warranting future work to fully understand their physiological and pathological effects on the axis as a whole.

      The idea that Kv2.1 channels serve as a brake raises the question of whether they contribute to activity-dependent action potential broadening to facilitate Ach release during trains of stimuli. 

      This is an interesting suggestion and one that we had considered ourselves. Indeed, as the Reviewer is likely aware and as mentioned in the manuscript, previous studies have shown that nAChR signaling can be revealed under conditions of multiple stimulations given at relatively high frequencies. We therefore attempted to perform high-frequency stimulation (20 stimulations at 25 Hz and 50 Hz) in the presence of the ionotropic glutamate receptor antagonists DNQX and APV. We have now included these data in the revised manuscript (Supplementary Fig. 3b). As shown, this failed to engage nAChR-mediated synaptic transmission in our hands. Interestingly, there is evidence from reduced expression systems demonstrating that Kv1.2 channels undergo use-dependent potentiation (Baronas et al., 2015), in contrast to that seen with other K+ channels. Whether this is the case for the axonal Kv1.2 channels on mHB axonal terminals in situ is not known, but this may explain the inability to reveal nAChR EPSCs upon delivery of such stimulation paradigms.

      References 

      Baronas, V. A., McGuinness, B. R., Brigidi, G. S., Gomm Kolisko, R. N., Vilin, Y. Y., Kim, R. Y., … Kurata, H. T. (2015). Use-dependent activation of neuronal Kv1.2 channel complexes. J Neurosci, 35(8), 3515-3524. doi:10.1523/JNEUROSCI.4518-13.2015

      Bueno, D., Lima, L. B., Souza, R., Goncalves, L., Leite, F., Souza, S., … Metzger, M. (2019). Connections of the laterodorsal tegmental nucleus with the habenular-interpeduncular-raphe system. J Comp Neurol, 527(18), 3046-3072. doi:10.1002/cne.24729

      Liang, J., Zhou, Y., Feng, Q., Zhou, Y., Jiang, T., Ren, M., … Luo, M. (2024). A brainstem circuit amplifies aversion. Neuron. doi:10.1016/j.neuron.2024.08.010

      Lima, L. B., Bueno, D., Leite, F., Souza, S., Goncalves, L., Furigo, I. C., … Metzger, M. (2017). Afferent and efferent connections of the interpeduncular nucleus with special reference to circuits involving the habenula and raphe nuclei. J Comp Neurol, 525(10), 2411-2442. doi:10.1002/cne.24217

      Singhal, S. M., Szlaga, A., Chen, Y. C., Conrad, W. S., & Hnasko, T. S. (2025). Mu-opioid receptor activation potentiates excitatory transmission at the habenulo-peduncular synapse. Cell Rep, 44(7), 115874. doi:10.1016/j.celrep.2025.115874

      Stinson, H. E., & Ninan, I. (2025). GABA(B) receptor-mediated potentiation of ventral medial habenula glutamatergic transmission in GABAergic and glutamatergic interpeduncular nucleus neurons. bioRxiv. doi:10.1101/2025.01.03.631193

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary: 

      Seon and Chung's study investigates the hypothesis that individuals take more risks when observed by others because they perceive others to be riskier than themselves. To test this, the authors designed an innovative experimental paradigm where participants were informed that their decisions would be observed by a "risky" player and a "safe" player. Participants underwent fMRI scanning during the task. 

      Strengths: 

      The research question is sound, and the experimental paradigm is well-suited to address the hypothesis. 

      Weaknesses:

      I have several concerns. Most notably, the manuscript is difficult to read in parts, and I suggest a thorough revision of the writing for clarity, as some sections are nearly incomprehensible. Additionally, key statistical details are missing, and I have reservations about the choice of ROIs.

      We appreciate the reviewer’s interest in and positive assessment of our work, and we thank the reviewer for the constructive feedback. In the current revision, we have revised the manuscript for clarity and added previously omitted statistical details. Furthermore, in the response letter, we have also provided additional explanations to clarify our approach, including the rationale for the choice and use of ROIs.

      Reviewer #2 (Public review): 

      Summary: 

      This study aims to investigate how social observation influences risky decision-making. Using a gambling task, the study explored how participants adjusted their risk-taking behavior when they believed their decisions were being observed by either a risk-averse or risk-seeking partner. The authors hypothesized that individuals would simulate the choices of their observers based on learned preferences and integrate these simulated choices into their own decision-making. In addition to behavioral experiments, the study employed computational modeling to formalize decision processes and fMRI to identify the neural underpinnings of risky decision-making under social observation. 

      Strengths: 

      The study provides a fresh perspective on social influence in decision-making, moving beyond the simple notion that social observation leads to uniformly riskier behavior. Instead, it shows that individuals adjust their choices depending on their beliefs about the observer's risk preferences, offering a more nuanced understanding of how social contexts shape decision-making. The authors provide evidence using comprehensive approaches, including behavioral data based on a well-designed task, computational modeling, and neuroimaging. The three models are well selected to compare at which level (e.g., computing utility, risk preference shift, and choice probability) the social influence alters one's risky decision-making. This approach allows for a more precise understanding of the cognitive processes underlying decision-making under social observation. 

      Weaknesses: 

      While the neuroimaging results are generally consistent with the behavioral and computational findings, the strength of the neural evidence could be improved. The authors' claims about the involvement of the TPJ and mPFC in integrating social information are plausible, but further analysis, such as model comparisons at the neuroimaging level, is needed to decisively rule out alternative interpretations that other computational models suggest. 

      We appreciate the reviewer’s interest in and positive assessment of our work, and we thank the reviewer for the constructive feedback. In the current revision, we have included neural results from additional analyses, which we believe provide stronger support for our proposed computational model.

      Reviewer #3 (Public review): 

      Summary: 

      This is an important paper using a novel paradigm to examine how observation affects the social contagion of risk preferences. There is a lot of interest in the field about the mechanisms of social influence, and adding in the factor of whether observation also influences these contagion effects is intriguing.

      Strengths:

      (1) There is an impressive combination of a multi-stage behavioural task with computational modelling and neuroimaging.

      (2) The analyses are well conducted and the sample size is reasonable. 

      Weaknesses: 

      (1) Anatomically it would be helpful to more explicitly distinguish between dmPFC and vmPFC. Particularly at the end of the introduction when mPFC and vmPFC are distinguished, as the vmPFC is in the mPFC. 

      (2) The authors' definition of ROIs could be elaborated on further. They suggest that peaks are selected from neurosynth for different terms, but were there not multiple peaks identified within a functional or anatomical brain area? This section could be strengthened by confirming with anatomical ROIs where available, such as the atlases here http://www.rbmars.dds.nl/lab/CBPatlases.html and the Harvard-Oxford atlases. 

      (3) How did the authors ensure there were enough trials to generate a reliable BOLD signal? The scanned part of the study seems relatively short. 

      (4) It would be helpful to add whether any brain areas survived whole-brain correction. 

      (5) There is a concern that mediation cannot be used to make causal inferences and much larger samples are needed to support claims of mediation. The authors should change the term mediation in order to not imply causality (they could talk about indirect effects instead) and highlight that the mediation analyses are exploratory as they would not be sufficiently powered (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2843527/). 

      (6) The authors may want to speculate on lifespan differences in this susceptibility to risk preferences given recent evidence that older adults are relatively more susceptible to impulsive social influence (Zhu et al, 2024, comms psychology). 

      We appreciate the reviewer’s interest in and positive assessment of our work, and we thank the reviewer for the constructive feedback. In the response letter below, we address each of the reviewer’s comments, including clarifications regarding the ROIs and the limitations of the current study in interpreting the results.

      Reviewer #1 (Recommendations for the authors):

      (1) The neuroimaging hypotheses seem post hoc to me. First, the term "social inference" is used very loosely. In line 103 the authors mentioned that TPJ has been reported to be involved in inferring other's intentions and learning about others. However, in their task, it is not clear where inference is needed. All participants need to do is recall others' "preferences", rather than inferring a hidden variable or hidden intention. In addition, in some of the studies that the authors have cited (e.g., Park et al. 2021), the hippocampus is the focus of the inference, which gets no mention here.

      How does solving this task require inference (as defined by the authors: inferring others' intentions)? And why do they choose TPJ while inference is not needed in this task?

      We regret any confusion and would like to take this chance to clarify our hypothesis on social inference. As the reviewer pointed out, participants were indeed instructed to predict the demonstrators’ choices, through which we expected them to learn the demonstrators’ preferences. Our computational model suggests that during the main phase of the task, i.e., the Observed phase, participants simulated others’ choices based on these previously learned risk preferences. The gamble choices they encountered (payoffs and associated probabilities) did not overlap with those in the Learning phase, and therefore we expected that the cognitive process triggered by the social context involved active simulation (what we describe as making inferences about others) rather than simple ‘recall’ of previously learned information. In line with this reasoning, we hypothesized that the TPJ, a brain region previously implicated in simulating others’ actions and intentions, would play a key role during the Observed phase.

      Regarding the role of the hippocampus, the paper we cited by BoKyung Park et al. (2021), titled “The role of right temporoparietal junction in processing social prediction error across relationship contexts”, highlights the involvement of the rTPJ but does not mention the hippocampus. We are aware of the study by Seongmin A. Park et al. (2021), “Inferences on a multidimensional social hierarchy use a grid-like code”, which shows the involvement of the hippocampus and entorhinal cortex in making inferences about multidimensional social hierarchies; we believe the reviewer may have mistakenly assumed that we cited this article. As the study showed, the involvement of the hippocampus—and the use of its grid-like representation of social information—is likely tied to the multidimensional nature of task states. In our study, the hippocampus was not included as an ROI because we had no specific rationale to hypothesize that such grid-like representations would be recruited by our task.

      (2) Social influence can be motivated informationally (to improve accuracy) or normatively (to be aligned with others). To me, it seems that the authors have studied the latter because, first, there is no objectively correct response in this task and, second, participants changed their risk preference according to the preference of the observing partner. This distinction has not been made throughout the manuscript. This is important because the two processes (informational and normative) are supported by different neural processes and it is extremely useful to understand the neural basis of the process the authors are studying.

      We thank the reviewer for the opportunity to clarify the anticipated role of social influence in our study. As the reviewer pointed out, the gambling task used in our study does not have objectively correct or incorrect answers, and naturally, any social influence present during the task would align with normative social influence. To clarify this point, we have revised the discussion section as follows:

      [Page 9, Line 345]

      Observational learning and mimicry of others’ behavior are patterns commonly found in social animals, including nonhuman primates (Van de Waal et al., 2013). Such behaviors are thought to be driven either by a motivation to acquire additional information (‘informational conformity’) or by a motivation to align with group norm (‘normative conformity’), even when doing so does not necessarily lead to better outcomes (e.g., higher accuracy) (Cialdini & Goldstein, 2004). Given that there are no objectively correct or incorrect answers in the gambling task used in our study, the observed social influence is more consistent with normative conformity. However, we cannot rule out the possibility that individuals developed false beliefs about a particular observing partner—namely, that the partner had greater control over or insight into the gambling task. Future studies are needed to directly investigate whether individuals’ beliefs about others modulate informational social influence—that is, their motivation to use social information to gain additional insight by inferring others’ potential choices.

      (3) From Line 160 onward, the authors report several findings without providing any effect sizes or statistics. Please add effect size and statistics for each finding.

      We thank the reviewer for pointing this out. We have now added the corresponding effect sizes and statistical values for the reported findings, beginning from Line 160 in the revised manuscript.

      (4) Line 270: "In particular, bilateral TPJ, brain regions not implicated in the Solo phase, positively tracked trial-by-trial model-estimated decision probabilities". How can the authors conclude that TPJ is not involved in the solo phase? As far as I understood from the text, TPJ was not included as one of the ROIs for analysis of the Solo phase. If it was included, it should be mentioned in the text and there should be a direct comparison between the effect sizes of the solo and the observer phase. If not, "not implicated in the Solo phase" is not justified and should be removed.

      We apologize for the confusion. As the reviewer correctly pointed out, the TPJ was not included among the ROIs in our analysis of the Solo phase data; therefore, its involvement during the Solo phase was never directly assessed using an ROI-based approach.

      To examine brain responses during the Observed phase, we first assessed whether regions that tracked decision probabilities during the Solo phase—vmPFC, vStr, and dACC—were also engaged in the Observed phase. The involvement of the TPJ during the Observed phase was revealed through a subsequent whole-brain analysis. To clarify this point, we now have revised the corresponding part as follows:

      [Page 8, Line 276]

      In particular, bilateral TPJ, brain regions not implicated in the Solo phase, positively tracked trial-by-trial model-estimated decision probabilities

      → Notably, bilateral TPJ showed significant positive tracking of decision probabilities ~

      (5) I am a bit puzzled about the PPI analysis. Is the main finding increased connectivity within mPFC in the observing condition? PPI is often done between two separate brain regions. I am not sure what it means that connectivity within mPFC increases in one condition compared to another. What was the motivation for this analysis? Can you also please explain what it means?

      As the reviewer noted, psychophysiological interaction (PPI) analyses examine functional connectivity between brain regions as modulated by a psychological factor. To clarify our result, the reported ‘mPFC-mPFC connectivity’ refers to functional connectivity between the mPFC region responsive to the presence of an observing partner and an adjacent, anatomically distinct region within the mPFC. Note that we have revised the manuscript to refer to this region more specifically as the dorsomedial prefrontal cortex (dmPFC). Please see our response to Reviewer 3, Comment 1, for further details.

      During the Observed phase of our task, social information was processed at two distinct time points. First, at the beginning of each decision trial, individuals were cued with the presence (or absence) of an observing partner (‘Partner presentation’). Second, the gamble options, as well as the observing partner’s identity, were revealed (‘Options revealed’). Because participants had previously learned about the observing partner’s risk preferences, we expected them to simulate the choice the partner would likely make. We hypothesized that if individuals indeed simulated the partner’s choice and incorporated this information into their decision-making process, the brain region involved in recognizing the partner’s presence (dmPFC<sub>contrast</sub>) would be functionally connected to the region responsible for integrating social information into the final decision (TPJ). Our results showed that the two regions were functionally connected via an indirect path through an anatomically adjacent cluster within the mPFC (dmPFC<sub>PPI</sub>). Given that the recognition of the partner’s presence and the simulation of their choice occurred at two distinct time points, we interpreted the functional connectivity between the two dmPFC clusters (dmPFC<sub>contrast</sub> and dmPFC<sub>PPI</sub>) as evidence that the dmPFC<sub>PPI</sub> remained engaged during the decision process to support simulation, rather than being involved solely in the passive recognition of the social context (i.e., observed vs not observed). Note that, consistent with this interpretation, functional connectivity was stronger in individuals who showed greater reliance on social information ('Social reliance' parameter in our model).

      To avoid confusion, we have now labeled the two dmPFC clusters as dmPFC<sub>contrast</sub>—the seed region identified at partner presentation—and dmPFC<sub>PPI</sub>—the target region identified in the PPI analysis.

      [Page 8, Line 284]

      This cue was intended to dissociate neural responses to the social context per se (i.e., the presence of an observing partner), which we hypothesized would initiate social processing, from the neural processes involved in incorporating this information during the subsequent decision-making phase.

      [Page 8, Line 291]

      We tested whether the dmPFC was also involved in incorporating social information during the decision process under social observation, particularly among individuals who relied more heavily on simulating others’ behavior.

      [Page 8, Line 297]

      We confirmed that the functional connectivity between the dmPFC<sub>contrast</sub>, which is sensitive to cues regarding the presence of an observing partner, and its adjacent, anatomically distinct region within the dmPFC (‘dmPFC<sub>PPI</sub>’ hereafter; x = 3, y = 50, z = 5, k<sub>E</sub> = .74, cluster-level P<sub>FWE, SVC</sub> = 0.011; Fig. 4a, b, Table S5) was positively associated with individuals’ social reliance.

      (6) In Line 107 the authors say "excitatory stimulation of the TPJ improved social cognition". Improved social cognition is too general and unspecific. Please be more specific.

      We agree that the term ‘social cognition’ was too general and unspecific. In the revised manuscript, we have specified that the improvement was observed in tasks specifically involving the control of self-other representation, as demonstrated by Santiesteban et al. (2012).

      [Page 4, Line 106]

      Corroborating with these neuroimaging data, excitatory stimulation of the TPJ improved social cognition (Santiesteban et al., 2012),~

      → Corroborating these neuroimaging findings, excitatory stimulation of the TPJ improved social cognition involving the control of self-other representation (Santiesteban et al., 2012),~

      Writing:

      We thank the reviewer for their thorough evaluation of our manuscript. We have now made the necessary revisions in accordance with the provided comments.

      (7) Line 75: "one risky options" should be one risky option.

      [Page 3, Line 74]

      between one safe (i.e., guaranteed payoff) and one risky options.

      between a safe option (i.e., guaranteed payoff) and a risky option.

      (8) Line 82: were given with the same set of gamble should be "were given the same set of gamble".

      [Page 3, Line 81]

      In the third phase (‘Observed phase’), individuals were given with the same set of gamble choices they faced in the Solo phase,

      In the third phase (‘Observed phase’), individuals were given the same set of gamble choices they faced in the Solo phase,~

      (9) Line 63: and that the extent of such influence depends on the identity of the observer. It is not clear what the authors mean by the "identity of observer". Does it mean the preference of the observer?

      Van Hoorn et al. (2018) showed that the degree of social influence varies depending on whether individuals are being observed by parents or by peers. While one might attribute this difference to divergent preferences typically held by parents and peers, it is important to note that other factors may also differ between these social groups. To avoid overinterpretation while preserving the original meaning, we have revised the sentence as follows:

      [Page 3, Line 61]

      However, recent studies showed that the unidirectional influence of social others’ presence may be also observed in adults (Otterbring, 2021), and that the extent of such influence depends on the identity of the observer (Van Hoorn et al., 2018).  

      However, recent studies showed that the unidirectional influence of social others’ presence can also be observed in adults (Otterbring, 2021), and that the extent of this influence depends on the observer’s identity—specifically, whether the observer is a parent or a peer (Van Hoorn et al., 2018).

      (10) Line 103: "including inferring others' intention and in learning about others." An "in" is missing right before inferring.

      [Page 4, Line 101]

      The temporoparietal junction (TPJ) is another region known to play an important role in social cognitive functions, including inferring others’ intention and in learning about others (Behrens et al., 2008; Boorman et al., 2013; Charpentier et al., 2020; Park et al., 2021; Samson et al., 2004; Saxe & Kanwisher, 2003; Saxe & Kanwisher, 2013; Van Overwalle, 2009; Young et al., 2010).

      The temporoparietal junction (TPJ) is another region known to play an important role in a range of social cognitive functions, including simulating others’ intention and choices, as well as learning about others (Behrens et al., 2008; Boorman et al., 2013; Charpentier et al., 2020; Park et al., 2021; Samson et al., 2004; Saxe & Kanwisher, 2003; Saxe & Kanwisher, 2013; Van Overwalle, 2009; Young et al., 2010).

      (11) 106: "Corroborating with these neuroimaging data." It should be "corroborating these neuroimaging data".

      [Page 4, Line 106]

      Corroborating with these neuroimaging data, ~

      Corroborating these neuroimaging findings, ~

      (12) Lines 113-115. It is not clear what the authors are trying to say here.

      We have now revised the sentence as follows:

      [Page 4, Line 112]

      We hypothesized that even if others’ choices are not explicitly presented, simple presence of social others may trigger inference about others’ potential choices, and the same set of brain regions will play an important role in value-based decision-making.

      We hypothesized that, even in the absence of explicit information about others’ choices, the mere presence of social others could lead participants to conform to the option they believe others would choose. To do so, participants would need to simulate others’ potential choices, particularly when option values vary across trials. As a result, we propose that the same brain regions involved in simulating others’ decisions would also be engaged during value-based decision-making in the presence of social observers.

      (13) Line 151: This sentence is too long and hard to follow:

      We have now revised the sentence as follows:

      [Page 5, Line 154]

      Furthermore, individuals’ prediction responses on subsequent 10 prediction trials where no feedback was provided (Fig. 2b) as well as self-reports about the perceived riskiness of the partners collected at the end of the Learning phase (Fig. 1d) consistently showed that they were able to distinguish one partner from the other, and correctly estimate the partners’ risk preferences (Predicted risk preference: t(42) = -11.46, P = 1.66e-14; Self-report: t(42) = -35.83, P = 4.10e-33).

      Furthermore, individuals’ prediction responses during the subsequent 10 trials without feedback consistently indicated that they could distinguish between the two partners and accurately estimate each partner’s risk preferences (t(42) = -11.46, P = 1.66e-14; Fig. 2b). Self-reported ratings of the partners’ perceived riskiness, collected after the Learning phase, further supported this finding (t(42) = -35.83, P = 4.10e-33; Fig. 1d).

      (14) Line 178: This sentence is very hard to follow. I am not sure what the authors were trying to say here. Please clarify.

      We have now revised the sentence as follows:

      [Page 5, Line 183]

      Various previous studies examined the impacts of social context on decision-making processes, but the suggested mechanisms by which individuals were affected by the social information depended on how the information was presented.

      → Previous studies have shown that social context can influence decision-making processes. However, the underlying mechanisms proposed have varied depending on how the social information was presented.

      (15) Line 183: "when individuals were given with the chances" should be "when individuals were given the chance".

      [Page 5, Line 187]

      On the contrary, when individuals were given with the chances~

      On the contrary, when individuals were given the chances~

      (16) Line 192: "are sensitive to the identity of the currently observing partner...". Do the authors mean are sensitive to the preferences of the currently observing partner? If so, please clarify, it is hard to follow.

      We have now revised the sentence as follows:

      [Page 5, Line 195]

      We hypothesized that if individuals are sensitive to the identity of the currently observing partner, they would take into account the learned preferences of others in computing their choices rather than simply in guiding the direction how to change their own preferences.

      → We hypothesized that if individuals are sensitive to the learned preferences of the observing partner, they would use this information to simulate the partner’s likely choices, rather than simply aligning their own preferences with those of the partner.

      Reviewer #2 (Recommendations for the authors):

      (1) The current neuroimaging findings appear to support the decision processes of all three models. I recommend that the authors provide more detailed evidence of model comparisons in the neuroimaging analysis. This should go beyond simply comparing the goodness of fit of neural activity.

      We acknowledge that neuroimaging data alone often do not provide conclusive evidence for specific information processing. In our study, we examined brain regions that track decision probabilities and are associated with social cognition, such as simulating others’ choice tendencies. Because these processes are general and not tied to a specific computational model, neural responses supporting the occurrence of such processes cannot be used to rule out alternative decision models. For this reason, our approach prioritized a rigorous behavioral model comparison as a critical first step before probing the neural substrates underlying the proposed mechanism. Our behavioral model comparisons, including both quantitative fit indices and qualitative pattern predictions, indicated that the proposed model best accounted for participants' decision patterns across task conditions.

      More importantly, to further validate the model, we conducted a model recovery analysis (see Fig. S2b in SI), which confirmed that our model can be reliably distinguished from alternative accounts even when behavioral differences are subtle. This result suggests that our model captures unique and meaningful characteristics of the decision process that are not equally well explained by competing models.

      With this behavioral foundation, our neuroimaging analyses were designed not to serve as independent model arbiters, but rather to examine whether brain activity in regions of interest reflected the computations specified by the best-fitting model. We believe this two-step approach—first establishing behavioral validity, then linking model-derived variables to neural data—offers a principled framework for identifying the cognitive and neural mechanisms of decision-making.

      Nevertheless, per the reviewer’s suggestion, we further examined whether there is neural encoding of both the participant’s own utility and the observer’s utility—serving as potential neural evidence to differentiate our model from the two alternative models. Please see below for our response to Reviewer 2’s Comment (2).

      (2) Specifically, if participants are combining their own and simulated choices at the level of choice probability, we would expect to see neural encoding of both their own utility and the observer's utility. These may be observed in different areas of the mPFC, as demonstrated by Nicolle et al. (Neuron, 2012). In that study, decisions simulating others' choices were associated with activity in the dorsal mPFC, while one's own decisions were encoded in the vmPFC. On the contrary, if the brain encodes decision values based on the shifted risk preference, rather than encoding each decision's value in separate brain areas, this would support the alternative model.

      We thank the reviewer for this constructive comment. In our Social reliance model, we assumed that the decision probability based on an individual’s own risk preferences, as well as that based on the observing partner’s risk preferences, both contribute to the individual’s final choice. As the reviewer suggested, neural evidence that differentiates our model from the two alternative models—the Risk preference change model and the Other-conferred utility model—would involve demonstrating neural encoding of both the participant’s own utility and the observer’s utility.

      The utility differences between chosen and unchosen options from the two perspectives—self and observer—were highly correlated, preventing us from including both as regressors in the same design matrix. Instead, we defined ROIs along the ventral-to-dorsal axis of the mPFC, and examined whether each ROI more strongly reflected one’s own utility or that of the observer. Based on the meta-analysis by Clithero and Rangel (2014), we defined the most ventral mPFC ROI (ROI1) as a 10 mm-radius sphere centered at coordinate [x=-3, y=41, z=-7], a region previously associated with subjective value. From this ventral seed, we defined four additional spherical ROIs (10 mm radius each) at 12 mm intervals along the ventral-to-dorsal axis, resulting in five ROIs in total: ROI2 [x=-3, y=41, z=5], ROI3 [x=-3, y=41, z=17], ROI4 [x=-3, y=41, z=29], ROI5 [x=-3, y=41, z=41].

      Consistent with Nicolle et al. (2012), the representation of one’s own utility (labelled as ‘Own subjective value’) and that of the observer (‘Observer’s subjective value’) was organized along the ventral-to-dorsal axis of the mPFC. Specifically, utility signals from the participant’s own perspective (SV<sub>chosen, self</sub> – SV<sub>unchosen, self</sub>) were most prominently represented in the ventral-most ROIs (blue), whereas utility signals from the observer’s perspective (SV<sub>chosen, observer</sub> – SV<sub>unchosen, observer</sub>) were most strongly represented in the dorsal-most ROIs (orange).
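
      The ROI placement described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it only reproduces the stated geometry (a ventral seed at [x=-3, y=41, z=-7], 12 mm steps along z, 10 mm radius), and the function names `roi_centers` and `in_roi` are hypothetical.

```python
from math import dist

def roi_centers(seed=(-3, 41, -7), step_mm=12, n_rois=5):
    """Return sphere centers stepping dorsally (increasing z) from the seed."""
    x, y, z = seed
    return [(x, y, z + i * step_mm) for i in range(n_rois)]

def in_roi(point_mm, center_mm, radius_mm=10.0):
    """True if a coordinate (in mm) lies within the sphere around center_mm."""
    return dist(point_mm, center_mm) <= radius_mm

centers = roi_centers()
for i, c in enumerate(centers, start=1):
    # Prints the five centers, ROI1 (most ventral) to ROI5 (most dorsal)
    print(f"ROI{i}: x={c[0]}, y={c[1]}, z={c[2]}")
```

      With the default arguments, the generated centers match the five coordinates listed in the response (z = -7, 5, 17, 29, 41).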

      (3) Additionally, the authors may be able to detect neural signals related to conflict when the decisions of the individual and the observer differ, compared to when the decisions are congruent. These neural signatures would only be present if social influences are integrated at the choice level, as suggested by the authors.

      If individuals simulate the choices that others might make, they may compare them with the choices they would have made themselves. To investigate this possibility, we categorized task trials as Conflict or No-conflict trials based on greedy choice predictions derived from a softmax decision rule. Conflict trials were those in which the choice predicted from the participant’s own risk preference differed from that predicted for the observer, whereas No-conflict trials involved the same predicted choice from both perspectives. A contrast between Conflict and No-conflict trials revealed that the dACC and dlPFC—regions previously associated with conflict monitoring and cognitive control (Shenhav et al., 2013)—were sensitive to differences in choice tendencies between the self and observer perspectives.

      Author response image 1.

      dACC and dlPFC are associated with the discrepancy between participants’ own choice tendencies and those of observing partners, as estimated based on prior beliefs about the partners’ risk preferences.

      As the reviewer suggested, these results provide evidence in support of the Social Reliance model, which posits that participants simulate the observer's choice and integrate it with their own.
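
      The Conflict/No-conflict labelling described in this response can be illustrated with a short sketch. Several pieces are assumptions made only for illustration and are not taken from the manuscript: the power-utility form, the inverse-temperature parameter `beta`, and all function and parameter names. The "greedy" prediction is taken here to mean choosing whichever option the softmax assigns a probability above 0.5.

```python
import math

def utility(amount, prob, rho):
    # Assumed power utility (illustrative only; the manuscript's exact form may differ)
    return prob * amount ** rho

def p_risky(safe_amount, risky_amount, risky_prob, rho, beta=1.0):
    """Two-option softmax probability of choosing the risky gamble."""
    du = utility(risky_amount, risky_prob, rho) - utility(safe_amount, 1.0, rho)
    return 1.0 / (1.0 + math.exp(-beta * du))

def greedy_choice(safe_amount, risky_amount, risky_prob, rho, beta=1.0):
    """Most likely option under the softmax rule for a given risk preference."""
    p = p_risky(safe_amount, risky_amount, risky_prob, rho, beta)
    return "risky" if p > 0.5 else "safe"

def is_conflict(safe_amount, risky_amount, risky_prob, rho_self, rho_observer, beta=1.0):
    """Conflict trial: predicted choices differ between self and observer."""
    own = greedy_choice(safe_amount, risky_amount, risky_prob, rho_self, beta)
    other = greedy_choice(safe_amount, risky_amount, risky_prob, rho_observer, beta)
    return own != other
```

      For example, under these assumptions a trial offering a sure 10 versus a 50% chance of 30 would be labelled a Conflict trial for a risk-averse participant (`rho_self=0.5`) observed by a risk-seeking partner (`rho_observer=1.2`).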

      (4) Incorporating these additional analyses would provide stronger evidence for distinguishing between the models.

      We again thank the reviewer for these constructive suggestions. Based on the new set of analyses and results, we have made the necessary revisions as noted above. We agree that these revisions provide stronger evidence for distinguishing between the models.

      Reviewer #3 (Recommendations for the authors):

      (1) Anatomically it would be helpful to more explicitly distinguish between dmPFC and vmPFC. Particularly at the end of the introduction when mPFC and vmPFC are distinguished, as the vmPFC is in the mPFC.

      We appreciate the reviewer’s suggestion regarding the anatomical distinction between the dmPFC and vmPFC, particularly in relation to our use of the term “mPFC.” We acknowledge that the dmPFC and vmPFC are subregions of the broader mPFC. In our original manuscript, we referred to one region as mPFC in line with prior studies highlighting its role in social cognition and contextual processing (Behrens et al., 2008; Sul et al., 2015; Wittmann et al., 2016). However, in response to the reviewer’s comment and to more clearly distinguish this region from the ventral portion of the mPFC (i.e., vmPFC), which is canonically associated with subjective valuation, we have now revised the manuscript to refer to this region as the dmPFC. This terminology better reflects its association with social cognition, including model-estimated social reliance and sensitivity to social cues in our study.

      (2) The authors' definition of ROIs could be elaborated on further. They suggest that peaks are selected from neurosynth for different terms, but were there not multiple peaks identified within a functional or anatomical brain area? This section could be strengthened by confirming with anatomical ROIs where available, such as the atlases here http://www.rbmars.dds.nl/lab/CBPatlases.html and the Harvard-Oxford atlases.

      We appreciate the opportunity to clarify how our ROIs were defined. To identify the ROIs, we drew upon both prior literature and results from a term-based meta-analysis using Neurosynth. For each meta-map, we applied an FDR-corrected threshold of p < 0.01 and a cluster extent threshold of k ≥ 100 voxels to identify distinct functional clusters. For each cluster, we constructed a spherical ROI (radius = 10 mm) centered on its center of gravity. Note that for each anatomically distinct brain region, only a single center of gravity was identified and used to define the ROI. The resulting ROIs were subsequently used for small volume correction (SVC) in the second-level fMRI analyses.

      For brain regions associated with decision-making processes, we obtained a meta-analytic activation map associated with the term “decision” from Neurosynth. After applying an FDR-corrected threshold of p < 0.001 and a cluster extent threshold of k ≥ 100 voxels, we identified five distinct clusters: vmPFC [x = -3, y = 38, z = -10]; right vStr [x = 12, y = 11, z = -7]; left vStr [x = -12, y = 8, z = -7]; dACC [x = 3, y = 26, z = 44]; and left Insula [x = -30, y = 23, z = -1]. To identify brain regions involved in decision-making under social observation, we used the Neurosynth meta-map associated with the term “social”, applying the same criteria (FDR p < 0.001, k ≥ 100). This analysis revealed several clusters, including bilateral TPJ: right TPJ [x = 51, y = -52, z = 14]; left TPJ [x = -51, y = -58, z = 17]. To isolate brain regions more specifically associated with social processing rather than valuation, we also constructed a conjunction map using the meta-maps for the terms “social” and “value.” We identified clusters present in the “social” map, but not in the “value” map. This analysis yielded, among others, a cluster in the dmPFC [x = 0, y = 50, z = 14].

      To clarify our ROI analysis methods, we have now revised the manuscript to include more detailed information about the procedures used, as follows:

      [Page 19, Line 746]

      Region-of-interest (ROI) analyses. To define ROIs for the neural analyses conducted in the Observed phase, we used significant clusters identified during the Solo phase. Specifically, regions showing significant activation for Prob(chosen) in the DM0 (thresholded at P < 0.001) were selected as ROIs. Three ROI clusters were defined: the vStr (peak voxel at [x = 3, y = 14, z = -10], k<sub>E</sub> = 9), vmPFC (peak voxel at [x = -3, y = 62, z = -13], k<sub>E</sub> = 99), and dACC (peak voxel at [x = 12, y = 32, z = 29], k<sub>E</sub> = 118). These ROIs were then applied in the Observed phase analyses to test whether similar neural representations are also engaged in social contexts.

      Term-based meta-analytic maps from Neurosynth for small volume correction. To reduce the likelihood of false positives arising from random significant activations and to enhance sensitivity within regions of theoretical interest, small volume correction (SVC) was applied using term-based meta-analytic maps from Neurosynth. This approach allows for hypothesis-driven correction by restricting statistical testing to anatomically and functionally defined ROIs. Specifically, three meta-analytic maps were generated using Neurosynth’s term-based analyses (Yarkoni et al., 2011), with a false discovery rate (FDR) corrected P < 0.01 and a cluster size > 100 voxels. For each resulting cluster, we defined a spherical ROI with a 10 mm radius centered on the cluster’s center of gravity. For each anatomically distinct brain region, only a single center of gravity was identified and used to define the corresponding ROI.

      First, to identify regions encoding final decision probabilities during the Solo phase and enhance sensitivity, we used the meta-map associated with the term “decision” to identify neural substrates of value-based decision-making. This yielded three clusters: vmPFC ([x = -3, y = 38, z = -10]), vStr ([x = 12, y = 11, z = -7]), and dACC ([x = 3, y = 26, z = 44]) (Fig. 3a, S7). Second, to examine social processing during the Observed phase, we used the meta-map associated with the term “social” to identify brain regions typically involved in social cognition. This analysis revealed several clusters, including the rTPJ ([x = 51, y = -52, z = 14]) and lTPJ ([x = -51, y = -58, z = 17]) (Fig. 3c, S8a). Third, to define an ROI involved in processing social cues independent of valuation, we used a meta-map associated with “social” but excluding “value”, isolating regions specific to non-valuation-related social cognition. This analysis revealed several clusters, including one in the dmPFC ([x = 0, y = 50, z = 14]) (Fig. 3d, 4a, S8b).

      (3) How did the authors ensure there were enough trials to generate a reliable BOLD signal? The scanned part of the study seems relatively short.

      We appreciate the reviewer’s concern regarding the number of trials and the potential implications for the reliability of the resulting BOLD signals. While we did not conduct formal statistical tests to determine the optimal number of trials, our task design, in general, followed well-established principles in functional neuroimaging. Specifically, we employed a jittered event-related design and used both temporal and dispersion derivatives in the GLM analyses. These strategies are widely recognized for enhancing the efficiency of BOLD signal deconvolution and improving model fit by accounting for inter-subject and inter-regional variability in the hemodynamic response function (HRF). Furthermore, the number of trials per condition in our study was comparable to those reported in previous publications (20-30 trials) that employed similar gambling paradigms to examine individual differences in the neural substrates of value-based decision-making (Chung et al., 2015; Chung et al., 2020).

      (4) It would be helpful to add whether any brain areas survived whole-brain correction.

      No brain regions survived whole-brain correction. Nevertheless, as described in the introduction, we had strong a priori hypotheses. Based on these hypotheses, we defined term-based ROIs using Neurosynth, and conducted small volume correction analyses. Per the reviewer’s suggestion, we have added information indicating that no brain regions survived whole-brain correction, as follows:

      [Page 8, Line 281]

      No additional regions survived whole-brain correction.

      (5) There is a concern that mediation cannot be used to make causal inferences and much larger samples are needed to support claims of mediation. The authors should change the term mediation in order to not imply causality (they could talk about indirect effects instead) and highlight that the mediation analyses are exploratory as they would not be sufficiently powered (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2843527/).

      We acknowledge the reviewer’s concerns regarding the causal interpretation of mediation analysis results. Per this comment, we have revised the manuscript as follows to avoid overinterpreting these results and to refrain from implying any causal inference.

      [Page 9, Line 327]

      Given that our sample size is smaller than the recommended threshold for detecting mediation effects (Fritz & MacKinnon, 2007), this significant indirect effect should be interpreted with caution, particularly with respect to causal inference.

      (6) The authors may want to speculate on lifespan differences in this susceptibility to risk preferences given recent evidence that older adults are relatively more susceptible to impulsive social influence (Zhu et al, 2024, comms psychology).

      We thank the reviewer for the thoughtful suggestion—we believe the referenced work is Zhilin Su et al. (2024). As noted in our manuscript, all participants in the current study were young adults aged between 18 and 29 years. Given this limited age range, our dataset does not provide sufficient variability to directly examine age-related differences across the lifespan. However, we are planning a follow-up study using the same task with older adult participants, which we believe will provide a valuable opportunity to address this important gap in understanding susceptibility to social influence across the lifespan.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations for authors):

      (1) Motivation for studying SUL1 in RLS

      Considering that the regulation of cellular metabolism in response to nutrient availability is crucial for cell survival and lifespan, and that several organic nutrient transporters have been implicated in aging, we believe that transporters of specific nutrients can transduce signals downstream to control genes responsible for survival. However, the impact of inorganic nutrient transporters, including those for phosphate and sulfate, on longevity remains largely unexplored. In addition, separate work from our group used a LASSO model derived from multi-omics data on yeast aging and identified SUL1 as a key candidate regulator of lifespan, which prompted our interest.

      (2) Discrepancy with prior RLS data (PMID: 26456335)

      The previous study (PMID: 26456335) examined a limited number of cells (n=25), which may have contributed to the variability in its results. To enhance the reliability of our work, we have expanded the number of cells measured for the sul1Δ strain to 400 (see Figure 1A). For the other mutant strains, the number of cells was increased to 200 (see Figure 1B). These data confirm the reproducibility of the lifespan extension observed in the sul1Δ strain.

      (3) Mechanistic link between sulfate transport and lifespan

      Sulfate absorption assays were performed on the WT, SUL1Δ, SUL2Δ, and SUL1<sup>E427Q</sup> strains (Figure 1C). Compared to the wild type (WT), the SUL1Δ, SUL2Δ, and SUL1<sup>E427Q</sup> strains exhibited delayed intracellular sulfate uptake. However, there was no significant difference in the final concentration of intracellular sulfur ions among the groups. This result reinforces our conclusion that the extended lifespan of SUL1Δ is not associated with sulfate transport.

      (4) Testing the RLS of SUL1ΔMSN4Δ double mutants

      The replicative lifespan of the SUL1ΔMSN4Δ double mutant was further analyzed (shown in the following supplementary figure). The lifespan extension of SUL1Δ was not abolished by the knockout of MSN4, supporting the hypothesis that MSN2 may serve as the downstream transcription factor responsible for the increased lifespan of SUL1Δ.

      Author response image 1.

      Replicative life span of MSN4 deletion mutants in WT and SUL1Δ strains.

      Reviewer #2 (Recommendations for authors):

      (1) Inconsistent WT lifespan in Figure 1B

      All lifespan measurements were conducted under controlled conditions (30°C, 2% glucose). The revised Figure 1C illustrates that across three independent experiments (n=200 cells), the average lifespan of wild-type (WT) cells was 29.1 generations, which is comparable to the average lifespan of 25.6 generations reported in Figure 1A after data expansion (n=400 cells). The difference may be attributed to experimental variability across multiple trials; however, it does not compromise the validity of our conclusions.

      (2) Sulfate level measurements

      Intracellular sulfate levels were measured by quantitatively assessing the sulfate concentrations in wild-type (WT), SUL1Δ, SUL2Δ, and SUL1<sup>E427Q</sup> cells, as detailed in the methods section (Figure 1C). The results indicated that all mutant strains showed delayed sulfur uptake, but there was no significant difference in the final concentration of intracellular sulfur ions among the groups.

      (3) RNA-seq for non-lifespan-extending mutants

      RNA-seq data for the SUL2Δ and SUL1<sup>E427Q</sup> mutants can be found in Supplementary Figure 1. These mutants do not exhibit a significant upregulation of stress-response genes, such as HSP12 and TPS1, which reinforces the specificity of the pathways induced by SUL1Δ.

      (4) Improved Msn2/4 imaging

      Figure 3C and Supplementary Figure 4A present high-resolution confocal images (using a 63× objective lens) of cell nuclei labeled with MSN2-GFP and DAPI. The GFP intensity within the nucleus was normalized against the DAPI signal to account for differences in nuclear size.

      Reviewer #3 (Recommendations for authors):

      (1) Nuclear size normalization

      The verification data for MSN2 and MSN4 were re-evaluated through DAPI signal normalization. The revised figures are presented in Figure 3C and Supplementary Figure 4A.

      (2) Strain nomenclature

      All strain names (e.g., SUL1Δ) were updated to follow SGD guidelines.

      (3) Grammar and formatting

      We have carefully revised the text to improve readability, and the manuscript was proofread by a native English speaker. Citations (e.g., "trehalose (Lillie and Pringle, 1980)") and spacing errors were corrected.

      (4) Microscopy resolution

      In the revised figures (Figures 3C, 3E, 4B, 4E, Supplementary Figures 3A, 4A, 4C), all fluorescence images are displayed as separate channels (EGFP, DAPI, BF). Scale bars and arrows have been added to the figures for clarity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The authors use electrophysiological and behavioral measurements to examine how animals could reliably determine odor intensity/concentration across repeated experiences. Because stimulus repetition leads to short-term adaptation evidenced by reduced overall firing rates in the antennal lobe and firing rates are otherwise concentration-dependent, there could be an ambiguity in sensory coding between reduced concentration or more recent experience. This would have a negative impact on the animal's ability to generate adaptive behavioral responses that depend on odor intensities. The authors conclude that changes in concentration alter the constituent neurons contributing to the neural population response, whereas adaptation maintains the 'activated ensemble' but with scaled firing rates. This provides a neural coding account of the ability to distinguish odor concentrations even after extended experience. Additional analyses attempt to distinguish hypothesized circuit mechanisms for adaptation but are inconclusive. A larger point that runs through the manuscript is that overall spiking activity has an inconsistent relationship with behavior and that the structure of population activity may be the more appropriate feature to consider.

      To my knowledge, the dissociation of effects of odor concentration and adaptation on olfactory system population codes was not previously demonstrated. This is a significant contribution that improves on any simple model based on overall spiking activity. The primary result is most strikingly supported by visualization of a principal components analysis in Figure 4. However, there are some weaknesses in the data and analyses that limit confidence in the overall conclusions.

      We thank the reviewer for evaluating our work and highlighting its strengths and deficiencies. We have revised the manuscript with expanded behavioral datasets and additional analyses that we believe convincingly support our conclusion. 

      (1) Behavioral work interpreted to demonstrate discrimination of different odor concentrations yields inconsistent results. Only two of the four odorants follow the pattern that is emphasized in the text (Figure 1F). Though it's a priori unlikely that animals are incapable of distinguishing odor concentrations at any stage in adaptation, the evidence presented is not sufficient to reach this conclusion.

      We have expanded our dataset and now show that the behavioral response is significantly different for high and low concentration exposures of the same odorant. This was observed for all four odorants in our study (refer to Revised Fig. 1F).

      (2) While conclusions center on concepts related to the combination of activated neurons or the "active ensemble", this specific level of description is not directly demonstrated in any part of the results. We see individual neural responses and dimensional reduction analyses, but we are unable to assess to what extent the activated ensemble is maintained across experience.

      We have done several additional analyses (see provisional response). Notably, we have corroborated our dimensionality reduction and correlation analysis results with a quantitative classification analysis that convincingly demonstrates that odor identity and intensity of the odorant can be decoded from the ensemble neural activity, and this could be achieved in an adaptation-invariant fashion (refer to Revised Supplementary Fig. 4). 
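      To illustrate the idea behind such a classification analysis, here is a minimal sketch. It is not the classifier used in the manuscript, which is not specified in this response; it assumes a nearest-template decoder with cosine similarity, and all firing rates and the 0.4 scaling factor are hypothetical values chosen for illustration. Adaptation is modelled as a pure scaling of the first-trial pattern, whereas the lower intensity recruits a different combination of neurons.

```python
import math

def cosine(u, v):
    """Cosine similarity between two response vectors (insensitive to overall scaling)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def decode(response, templates):
    """Return the label of the template whose pattern is most similar to the response."""
    return max(templates, key=lambda label: cosine(response, templates[label]))

# Hypothetical ensemble templates (spikes/s for 5 PNs); not data from the study.
templates = {
    "hex(H)": [20.0, 5.0, 0.0, 12.0, 1.0],
    "hex(L)": [6.0, 0.0, 9.0, 2.0, 7.0],  # a different set of neurons is active
}

# Assumption: adaptation scales the high-intensity pattern without changing its shape.
adapted_high = [0.4 * r for r in templates["hex(H)"]]

decoded = decode(adapted_high, templates)
print("summed rate, adapted high:", sum(adapted_high))
print("summed rate, low intensity:", sum(templates["hex(L)"]))
print("decoded label:", decoded)
```

      In this toy case the adapted high-intensity response has a lower summed rate than the low-intensity response, so total spiking alone would be ambiguous, yet the pattern-based decoder still assigns it to the high-intensity class.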

      (3) There is little information about the variance or statistical strength of results described at the population level. While the PCA presents a compelling picture, the central point that concentration changes and adaptation alter population responses across separable dimensions is not demonstrated quantitatively. The correlation analysis that might partially address this question is presented to be visually interpreted with no additional testing.

      We have included a plot that compares the odor-evoked responses across all neurons (mean ± variance) at both intensity levels for each odorant (Revised Supplementary Fig. 5). This plot clearly shows how the ensemble neural activity profile varies with odor intensity and how these response patterns are robustly maintained across trials. 

      (4) Results are often presented separately for each odor stimulus or for separate datasets including two odor stimuli. An effort should be made to characterize patterns of results across all odor stimuli and their statistical reliability. This concern arises throughout all data presentations.

      We had to incorporate a 15-minute window between presentations of odorants to reset adaptation. Because of this, we were unable to record extracellular responses to all four odorants at two intensities in a single experiment (~3.5 hours of recording for just 2 odorants at two intensities, with one odorant at the higher intensity repeated at the end; Fig. 2a). Therefore, we recorded two datasets. Each dataset captured the responses of ~80 PNs to two odorants at two intensities, with one odorant at the higher concentration repeated at the end of the experiment to show the repeatability of changes due to adaptation. 

      (5) The relevance of the inconclusive analysis of inferred adaptation mechanisms in Figure 2d-f and the single experiment including a complex mixture in Figure 7 to the motivating questions for this study are unclear.

      Figure 2d-f has been revised. While we agree that the adaptation mechanisms are not fully clear, there is a trend that the most active PNs are the neurons that change the most across trials. This change and the response in the first trial are negatively correlated, indicating that vesicle depletion could be an important contributor to the observed results. However, neurons that adapt strongly at higher intensities are not the ones that adapt at lower intensities. This complicates the understanding of how neural responses vary with intensities and the adaptation that happens due to repetition. This has been highlighted in the revised manuscript. 

      Regarding Figure 7, we wanted to examine the odor-specificity of the changes that happen due to repeated encounters of an odorant. Specifically, we wondered whether the neural response reduction and behavioral enhancements reflect a global, non-specific state change in the olfactory system brought about by the repetition of any odorant, or whether the observed neural and behavioral response changes are odor-specific.

      (6) Throughout the description of the results, typical standards for statistical reporting (sample size, error bars, etc.) are not followed. This prevents readers from assessing effect sizes and undermines the ability to assign a confidence to any particular conclusion.

      We have revised the manuscript to fix these issues and included sample size and error bars in our plots.  

      Reviewer #2 (Public Review):

      Summary:

      The authors' main goal was to evaluate how both behavioral responses to odor, and their early sensory representations are modified by repeated exposure to odor, asking whether the process of adaptation is equivalent to reducing the concentration of an odor. They open with behavioral experiments that actually establish that repeated odor presentation increases the likelihood of evoking a behavioral response in their experimental subjects - locusts. They then examine neural activity patterns at the second layer of the olfactory circuit. At the population level, repeated odor exposure reduces total spike counts, but at the level of individual cells there seems to be no consistent guiding principle that describes the adaptation-related changes, and therefore no single mechanism could be identified.

      Both population vector analysis and pattern correlation analysis indicate that odor intensity information is preserved through the adaptation process. They make the closely related point that responses to an odor in the adapted state are distinct from responses to a lower concentration of the same odor. These analyses are appropriate, but the point could be strengthened by explicitly using some type of classification analysis to quantify the adaptation effects. For example, a confusion matrix might show whether there is a gradual shift in odor representations, or whether there are trials where representations change abruptly.

      Strengths:

      One strength is that the work has both behavioral read-out of odor perception and electrophysiological characterization of the sensory inputs and how both change over repeated stimulus presentations. It is particularly interesting that behavioral responses increase while neuronal responses generally decrease. Although the behavioral effect could occur fully downstream of the sensory responses the authors measure, at least those sensory responses retain the core features needed to drive behavior despite being highly adapted.

      Weaknesses:

      Ultimately, no clear conceptual framework arises to understand how PN responses change during adaptation: neither the mechanism (vesicle depletion versus changes in lateral inhibition) nor even a qualitative description of those changes is established. Perhaps this is because much of the analysis is focused on the entire population response, while different mechanisms may operate on different cells, making it difficult to understand things at the single-PN level.

      From the x-axis scale in Fig 2e,f it appeared to me that they do not observe many strong PN responses to these stimuli, everything being < 10 spikes/sec. So perhaps a clearer effect would be observed if they managed to find the stronger responding PNs than captured in this dataset.

      We thank the reviewer for their evaluation of our work. Indeed, our work does not clarify the mechanism that underlies the adaptation over trials, or how this mechanism accounts for the adaptation observed at two different intensities of the same odorant. However, as we highlight in the revised manuscript, there is some evidence for the vesicle depletion hypothesis. For the plots shown in Fig. 2, the firing rates were calculated after averaging across time bins and trials, hence the lower firing rates. The peak firing rates of the most active neurons are ~100 Hz. So, we are certain that we are collecting responses from a representative ensemble of neurons in this circuit.
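      The gap between a low averaged rate and a high peak rate can be shown with a short numerical sketch. The bin width (50 ms), window length (4 s), trial count and spike counts below are assumptions made for illustration only and are not taken from the recordings.

```python
# Hypothetical PN: 50 ms bins over a 4 s analysis window, 3 trials.
BIN_S = 0.05   # assumed bin width in seconds
N_BINS = 80    # assumed number of bins (4 s / 50 ms)

def make_trial(burst_bins, spikes_per_burst_bin):
    """Build a spike-count vector that is silent except in the given bins."""
    counts = [0] * N_BINS
    for b in burst_bins:
        counts[b] = spikes_per_burst_bin
    return counts

# A brief onset transient: 5 spikes in one or two 50 ms bins, silence elsewhere.
trials = [make_trial([2, 3], 5), make_trial([2, 3], 5), make_trial([3], 5)]

# Convert counts to instantaneous rates (spikes/s) per bin.
rates = [[c / BIN_S for c in trial] for trial in trials]

peak_rate = max(max(r) for r in rates)
mean_rate = sum(sum(r) for r in rates) / (len(rates) * N_BINS)

print(f"peak rate: {peak_rate:.1f} Hz")
print(f"rate averaged over bins and trials: {mean_rate:.2f} Hz")
```

      Under these assumptions, a neuron that reaches about 100 Hz in a single bin still averages only a few spikes per second over the whole window, which is the scale shown on the x-axis the reviewer refers to.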

      Reviewer #3 (Public Review):

      Summary:

      How does the brain distinguish stimulus intensity reduction from response reductions due to adaptation? Ling et al study whether and how the locust olfactory system encodes stimulus intensity and repetition differently. They show that these stimulus manipulations have distinguishable effects on population dynamics.

      Strengths:

      (1) Provides a potential strategy with which the brain can distinguish intensity decrease from adaptation: while both conditions reduce overall spike counts, intensity decrease can also change which neurons are activated, whereas adaptation only changes the response magnitude without changing the active ensemble.

      (2) By interleaving a non-repeated odor, they show that these changes are odor-specific and not a non-specific effect.

      (3) Describes how proboscis orientation response (POR) changes with stimulus repetition. Unlike the spike counts, POR increases in probability with stimulus repetition. The data portray the variability across subjects in a clear way.

      We thank the reviewer for the summary and for highlighting the strengths of our work.

      Weaknesses:

      (1) Behavior

      a. While the "learning curve" of the POR is nicely described, the behavior itself receives very little description. What are the kinematics of the movement, and do these vary with repetition? Is the POR all-or-nothing or does it vary trial to trial?

      The behavioral responses were monitored in unconditioned/untrained locusts. Hence, these are innate responses to the odorants. These innate responses are usually brief and occur after the onset of the stimulus. However, there is variability across locusts and trials (refer to Revised Supplementary Fig. 1). When the same odorant is conditioned with food reward, the POR responses become more stereotyped and occur rapidly within a few hundred milliseconds. 

      Author response image 1.

      POR response dynamics in a conditioned locust. The palps were painted in this case (left panel), and the distance between the palps was tracked as a function of time (right panel).

      b. What are the reaction times? This can constrain what time window is relevant in the neural responses. E.g., if the reaction time is 500 ms, then only the first 500 ms of the ensemble response deserves close scrutiny. Later spikes cannot contribute.

      This is an interesting point. We had done this analysis for conditioned POR responses. For innate POR, as we noted earlier, there is variability across locusts. Many responses occur rapidly after odor onset (<1 s), while some responses do occur later during odor presentation and in some cases after odor termination. It is important to note that these dynamical aspects of the POR response, while super interesting, should occur at a much faster time scale compared to the adaptation that we are reporting across trials or repeated encounters of an odorant.

      c. The behavioral methods are lacking some key information. While references are given to previous work, the reader should not be obligated to look at other papers to answer basic questions: how was the response measured? Video tracking? Hand scored?

      We agree and apologize for the oversight. We have revised the methods and added a video to show the POR responses. Videos were hand-scored. 

      d. Can we be sure that this is an odor response? Although airflow out of the olfactometer is ongoing throughout the experiment, opening and closing valves usually creates pressure jumps that are likely to activate mechanosensors in the antennae.

      Interesting. We have added a new Supplementary Fig. 2 that shows that the POR even to presentations of paraffin oil (solvent; control) is negligible. This should confirm that the POR is a behavioral response to the odorant. 

      Furthermore, all other potential confounds identified by the reviewer are present for every odorant and every concentration presented. However, the POR varies in an odor-identity- and intensity-specific manner. 

      e. What is the baseline rate of PORs in the absence of stimuli?

      Almost zero. 

      f. What can you say about the purpose of the POR? I lack an intuition for why a fly would wiggle the maxillary palps. This is a question that is probably impossible to answer definitively, but even a speculative explanation would help the reader better understand.

      The locusts use these finger-like maxillary palps to grab a grass blade while eating. Hence, we believe that this might be a preparatory response to feeding. We have noted that the PORs are elicited more by food-related odorants. Hence, we think it is a measure of odor appetitiveness. This has been added to the manuscript. 

      (2) Physiology

      a. Does stimulus repetition affect "spontaneous" activity (i.e., firing in the interstimulus interval)? To study this question, in Figures 2b and c, it would be valuable to display more of the prestimulus period, and a quantification of the stability or lability of the inter-stimulus activity.

      Done. Yes, the spontaneous activity does appear to change in an odor-specific manner. We have done some detailed analysis of the same in this preprint:

      Ling D, Moss EH, Smith CL, Kroeger R, Reimer J, Raman B, Arenkiel BR. Conserved neural dynamics and computations across species in olfaction. bioRxiv [Preprint]. 2023 Apr 24:2023.04.24.538157. doi: 10.1101/2023.04.24.538157. PMID: 37162844; PMCID: PMC10168254

      b. When does the response change stabilize? While the authors compare repetition 1 to repetition 25, from the rasters it appears that the changes have largely stabilized after the 3rd or 4th repetition. In Figure 5, there is a clear difference between repetitions 1-3 or so and the rest. Are successive repetitions more similar than more temporally separated repetitions (e.g., is rep 13 more similar to 14 than to 17)? I was not able to judge this based on the dendrograms of Figure 5. If the responses do stabilize as it appears, it would be more informative to focus on the dynamics of the first few repetitions.

      The reviewer makes an astute observation. Yes, the changes in firing rates are larger in the first three trials (Fig. 3c). The ensemble activity patterns, though, are relatively stable across all trials as indicated by the PCA plots and classification analysis results.

      Author response image 2.

      Correlation as a function of trial number. All correlations were made with respect to the odor-evoked responses in the last odor trial of hex(H) and bza(H).

      c. How do temporal dynamics change? Locust PNs have richly varied temporal dynamics, but how these may be affected is not clear. The across-population average is poorly suited to capture this feature of the activity. For example, the PNs often have an early transient response, and these appear to be timed differently across the population. These structures will be obscured in a cross population average. Looking at the rasters, it looks like the initial transient changes its timing (e.g., PN40 responses move earlier; PN33 responses move later.). Quantification of latency to first spike after stimulus may make a useful measure of the dynamics.

      As noted earlier, to keep our story simple in this manuscript, we have only focused on the variations across trials (i.e., much slower response dynamics). We did this as we are not recording neural and behavioral responses from the same locust. We plan to do this and directly compare the neural and behavioral dynamics in the same locust.

      d. How legitimate is the link between POR and physiology? While their changes can show a nice correlation, the fact that the data were taken from separate animals makes them less compelling than they would be otherwise. How feasible is it to capture POR and physiology in the same prep?

      This would be most helpful, but I suspect may be too technically challenging to be within scope.

      The antennal lobe activity is the input about the volatile chemicals encountered by the locust. The POR is a behavioral output. Hence, we believe that examining the correlation between the olfactory system's input and output is a valid approach. However, we have only compared the mean trends in neural and behavioral datasets, and dynamics on a much slower timescale. We are currently developing the capability to record neural responses in behaving animals. This turned out to be a bit more challenging than we had envisioned. We plan to do the fine-grained comparisons of the neural and behavioral dynamics recommended by this reviewer in those preparations.

      Further, we will also be able to examine whether the variability in behavioral responses could be predicted from neural activity changes in that prep.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This manuscript investigated the mechanism underlying boundary formation necessary for proper separation of vestibular sensory end organs. In both chick and mouse embryos, it was shown that a population of cells abutting the sensory (marked by high Sox2 expression) /nonsensory cell populations (marked by Lmx1a expression) undergo apical expansion, elongation, alignment and basal constriction to separate the lateral crista (LC) from the utricle. Using Lmx1a mouse mutant, organ cultures, pharmacological and viral-mediated Rock inhibition, it was demonstrated that the Lmx1a transcription factor and Rock-mediated actomyosin contractility is required for boundary formation and LC-utricle separation.

      Strengths:

      Overall, the morphometric analyses were done rigorously and revealed novel boundary cell behaviors. The requirement of Lmx1a and Rock activity in boundary formation was convincingly demonstrated.

      Weaknesses:

      However, the precise roles of Lmx1a and Rock in regulating cell behaviors during boundary formation were not clearly fleshed out. For example, phenotypic analysis of Lmx1a was rather cursory; it is unclear how Lmx1a, expressed in half of the boundary domain, controls boundary cell behaviors and prevents cell mixing between Lmx1a+ and Lmx1a- compartments. Well-established mechanisms and molecules for boundary formation were not investigated (e.g. differential adhesion via cadherins, cell repulsion via ephrin-Eph signaling). Moreover, within the boundary domain, it is unclear whether apical multicellular rosettes and basal constrictions are drivers of boundary formation, as the boundary can still form when these cell behaviors are inhibited. Involvement of other cell behaviors, such as radial cell intercalation and oriented cell division, also warrants consideration. With these lingering questions, the mechanistic advance of the present study is somewhat incremental.

      We have acknowledged the lingering questions this referee points out in our Discussion and agree that the roles of differential cell adhesion and cell intercalation would be worth exploring in further studies. Despite these remaining questions, the conceptual advances are significant, since this study provides the first evidence that a tissue boundary forms in between segregating sensory organs in the inner ear (there are only a handful of embryonic tissues in which a tissue boundary has been found in vertebrates) and highlights the evolutionary conservation of this process. This work also provides a strong descriptive basis for any future study investigating the mechanisms of tissue boundary formation in the mouse and chicken embryonic inner ear. 

      Reviewer #2 (Public review):

      Summary:

      Chen et al. describe the mechanisms that separate the common pan-sensory progenitor region into individual sensory patches, which presage the formation of the sensory epithelium in each of the inner ear organs. By focusing on the separation of the anterior and then lateral cristae, they find that long supra-cellular cables form at the interface of the pansensory domain and the forming cristae. They find that at these interfaces, the cells have a larger apical surface area, due to basal constriction, and Sox2 is down-regulated. Through analysis of Lmx1 mutants, the authors suggest that while Lmx1 is necessary for the complete segregation of the sensory organs, it is likely not necessary for the initial boundary formation, and the down-regulation of Sox2.

      Strengths:

      The manuscript adds to our knowledge and provides valuable mechanistic insight into sensory organ segregation. Of particular interest are the cell biological mechanisms: The authors show that contractility directed by ROCK is important for the maintenance of the boundary and segregation of sensory organs.

      Weaknesses:

      The manuscript would benefit from a more in-depth look at contractility - the current images of PMLC are not too convincing. Can the authors look at p or ppMLC expression in an apical view? Are they expressed in the boundary along the actin cables? Does Y-27632 inhibit this expression?

      The authors suggest that one role for ROCK is the basal constriction. I was a little confused about basal constriction. Are these the initial steps in the thinning of the intervening nonsensory regions between the sensory organs? What happens to the basally constricted cells as this process continues?

      In our hands, the PMLC immunostaining gave a punctate staining in epithelial cells and was difficult to image and interpret in whole-mount preparations, which did not allow us to investigate its specific association to the actin-cable-like structures. It is a very valuable suggestion to try alternative methods of fixation to improve the quality of the staining and images in future work. 

      The basal constriction of the cells at the border of the sensory organs was not always clearly visible in freshly-fixed samples, and was absent in the majority of short-term organotypic cultures in control medium, which made it impossible to ascertain the role of ROCK in its formation using pharmacological approaches in vitro (see Figure 7 and corresponding Result section).  On the other hand, the overexpression of a dominant-negative form of ROCK (RCII-GFP) in ovo using RCAS revealed a persistence of basal constriction in transfected cells despite a disorganisation of the boundary domain (Figure 8). We conclude from these experiments that ROCK activity is not necessary for the formation and maintenance of the basal constriction. We also remain uncertain about the exact role of this basal constriction. It could be either a cause or consequence of the expansion of the apical surface of cells in the boundary domain, it could contribute to the limitation of cell intermingling and the formation of the actin-cable-like structure at the interface of Lmx1a-expressing and non-expressing cells, and may indeed prefigure some of the further changes in cell morphology occurring in non-sensory domains separating the sensory organs (cell flattening and constrictions of the epithelial walls in between sensory organs). 

      The steps the authors explore happen after boundaries are established. This correlates with a down-regulation of Sox2, and the formation of a boundary. What is known about the expression of molecules that may underlie the apparent interfacial tension at the boundaries? Is there any evidence for differential adhesion or for Eph-Ephrin signalling? Is there a role for Notch signalling or a role for Jag1 as detailed in the group's 2017 paper?

      Great questions. It is indeed likely that some form of differential cell tension and/or adhesion participates in the formation and maintenance of this boundary, and we have mentioned in the discussion some of the usual suspects (e.g. cadherins and Eph/ephrin signalling), although it is beyond the scope of this paper to determine their roles in this context. 

      As we have discussed in this paper and in our 2017 study (see also Ma and Zhang, Development, 2015 Feb 15;142(4):763-73. doi: 10.1242/dev.113662), we believe that Notch signalling maintains prosensory character, and that its down-regulation by Lmx1a/b expression is required for the specification of the non-sensory domains in between segregating sensory organs. Although we have not tested this directly in this study, any disruption in Notch signalling would be expected to indirectly affect the formation or maintenance of the boundary domain. 

      A comment on whether cellular intercalation/rearrangements may underlie some of the observed tissue changes.

      We have not addressed this topic directly in the present study but we have included a brief comment on the potential implication of cellular intercalation and rearrangements in the discussion: “It is also possible that the repositioning of cells through medial intercalation could contribute to the straightening of the boundary as well as the widening of the nonsensory territories in between sensory patches.”

      The change in the long axis appears to correlate with the expression of Lmx1a (Fig 5d). The authors could discuss this more. Are these changes associated with altered PCP/Vangl2 expression?

      We are not sure about the first point raised by the referee. We have quantified cell elongation and orientation in Lmx1a-GFP heterozygous and homozygous (null) mice, and our results suggest that the elongation of the cells occurs throughout the boundary domain, and is probably not dependent on Lmx1a expression (boundary cells are in fact more elongated in the Lmx1a mutant).  We have not investigated the expression of components of the planar cell polarity pathway. This is a very interesting suggestion, worth exploring in further studies.

      Reviewer #3 (Public review):

      Summary:

      Lmx1a is an orthologue of apterous in flies, which is important for dorsal-ventral border formation in the wing disc. Previously, this research group has described the importance of the chicken Lmx1b in establishing the boundary between sensory and non-sensory domains in the chicken inner ear. Here, the authors described a series of cellular changes during border formation in the chicken inner ear, including alignment of cells at the apical border and concomitant constriction basally. The authors extended these observations to the mouse inner ear and showed that these morphological changes occurred at the border of Lmx1a positive and negative regions, and these changes failed to develop in Lmx1a mutants. Furthermore, the authors demonstrated that the ROCK-dependent actomyosin contractility is important for this border formation and blocking ROCK function affected epithelial basal constriction and border formation in both in vitro and in vivo systems.

      Strengths:

      The morphological changes described during border formation in the developing inner ear are interesting. Linking these changes to the function of Lmx1a and ROCK dependent actomyosin contractile function are provocative.

      Weaknesses:

      There are several outstanding issues that need to be clarified before one could pin the morphological changes observed being causal to border formation and that Lmx1a and ROCK are involved.

      We have addressed the specific comments and suggestions of the reviewer below. We wish, however, to point out that we do not think that ROCK activity is required for the formation or maintenance of the basal constriction at the interface of Lmx1a-expressing and non-expressing cells (see previous answer to referee #2).

      Reviewer #1 (Recommendations for the authors):

      Specific comments:

      (1) Figures 1 and 2, and related text. Based on the whole-mount images shown, the anterior otocyst appeared to be a stratified epithelium with multiple cell layers. If so, it should be clarified whether the x-y views in the "apical" and "basal" planes are from cells residing in the apical and basal layers, respectively. Moreover, it would be helpful to include a "stage 4", a later stage to show if and when basal constrictions resolve.

      In fact, at these early stages of development, the otic epithelium is “pseudostratified”: it is formed by a single layer of irregularly shaped cells, each extending from the base to the apical aspect of the epithelium, but with their nuclei residing at distinct positions along this basal-apical axis as mitotic cells progress through the cell cycle. The nuclei divide at the surface of the epithelium, then move back to the most basal planes within daughter cells during interphase. This process, known as interkinetic nuclear migration, has been well described in the embryonic neural tube and occurs throughout the developing otic epithelium (e.g. Orr, Dev Biol. 1975, 47, 325-340; Ohta et al., Dev Biol. 2010 Sep 15;347(2):369–381. doi: 10.1016/j.ydbio.2010.09.002). Consequently, the nuclei visible in apical or basal planes in x-y views belong to cells extending from the base to the apex of the epithelium, but which are at different stages of the cell cycle. 

      We have not included a late stage of sensory organ segregation in this study (apart from a P0 stage in the mouse inner ear, see Figure 4) since data about later stages of sensory organ morphogenesis are available in other studies, including our Mann et al. eLife 2017 paper describing Lmx1a-GFP expression in the embryonic mouse inner ear.

      (2) Related to above, the observed changes in cell organization raised the possibility that the apical multicellular rosettes and basal constrictions observed in Stage 3 (and 2) could be intermediates of radial cell intercalations, which would lead to expansion of the space between sensory organs and thinning of the boundary domains. To see if it might be happening, it would be helpful to include DAPI staining to show the overall tissue architecture at different stages and use optical reconstruction to assess the thickness of the epithelium in the presumptive boundary domain over time.

      We agree with this referee. Besides cell addition by proliferation and/or changes in cell morphology, radial cell intercalations could indeed contribute to the spatial segregation of inner ear sensory organs (a brief statement on this possibility was added to the Discussion). It is clear from images shown in Figure 4 (and from other studies) that the non-sensory domain separating the cristae from the utricle gets flatter and its cells also enlarge as development proceeds. We do not think that DAPI staining is required to demonstrate this. Perhaps the best way to show that radial cell intercalations occur would be to perform live imaging of the otic epithelium, but this is technically challenging in the mouse or chicken inner ear. An alternative model system might be the zebrafish inner ear, in which some live-imaging data have shown a progressive down-regulation of Jag1 expression during sensory organ segregation (and a flattening of “boundary domains”), suggesting a conservation of the basic mechanisms at play (Ma and Zhang, Development, 2015 Feb 15;142(4):763-73. doi: 10.1242/dev.113662).

      (3) Similarly, it would be helpful to include the DAPI counterstain in Figures 4, 7, and 8 to show the overall tissue architecture.

      We do not have DAPI staining for these particular images but in most cases, Sox2 immunostaining gives a decent indication of tissue morphology. 

      (4) Figure 2(z) and Figure 4d. The arrows pointing at the basal constrictions are obstructing the view of the basement membrane area, making it difficult to appreciate the morphological changes. They should be moved to the side. Can the authors comment whether they saw evidence for radial intercalations (e.g. thinning of the boundary domain) or partial unzippering of adjoining compartments along the basal constrictions?

      The arrows in Figure 2(z) and Figure 4d have been moved to the side of the panels. 

      See previous comment. Besides the presence of multicellular rosettes, we have not seen direct evidence of radial cell intercalation; this would be best investigated using live imaging. As development proceeds, the epithelial domain separating adjoining sensory organs becomes wider. The cells that compose it gradually enlarge and flatten, as can be seen for example at P0 in the mouse inner ear (Figure 4g).

      (5) Figures 3 and 5, and related text. It should be clarified whether the measurements were all taken from the surface cells. For Fig. 3e and 5d, the mean alignment angles of the cell long axis in the boundary regions should be provided in the text.

      The sensory epithelium in the otocyst is pseudostratified; hence, the measurements were taken from the surface of all epithelial cells labelled for F-actin.

      We have added histograms representing the angular distribution of the cell long axis orientations in the boundary region to Figure 3 and Figure 5 Supplementary 1. We believe that this type of representation is more informative than the numerical value of the mean alignment angles of the cell long axis for defined sub-domains. 
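
      For readers who want to relate the two representations discussed here, the following is a minimal sketch of how an angular histogram and a mean alignment angle could be computed from long-axis orientations. It is not the authors' analysis pipeline: the function names, the 30° bin width, and the convention that orientations are undirected axes in the range 0-180° are all assumptions made for illustration. Because an axis at 10° and one at 170° are nearly parallel, the mean is taken by doubling the angles before averaging.

```python
import math


def axial_mean_deg(angles_deg):
    """Mean orientation of undirected axes (0-180 degrees).

    Uses the angle-doubling approach for axial data: each angle is doubled,
    averaged as a unit vector, and the resulting direction is halved.
    """
    s = sum(math.sin(math.radians(2 * a)) for a in angles_deg)
    c = sum(math.cos(math.radians(2 * a)) for a in angles_deg)
    return (math.degrees(math.atan2(s, c)) / 2.0) % 180.0


def orientation_histogram(angles_deg, bin_width=30):
    """Count orientations per bin over 0-180 degrees.

    bin_width is a hypothetical choice and must divide 180 evenly.
    """
    if 180 % bin_width != 0:
        raise ValueError("bin_width must divide 180 evenly")
    counts = [0] * (180 // bin_width)
    for a in angles_deg:
        counts[int((a % 180.0) // bin_width)] += 1
    return counts
```

      A histogram keeps information that a single mean discards: two populations of cells oriented at right angles to each other can produce an ill-defined mean while still showing two clear peaks in the binned distribution.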

      (6) It would be helpful to also quantify basal constrictions using the cell skeleton analysis. In addition, it would be helpful to show x-y views of cell morphology at the level of basal constrictions in the mouse tissue, similar to the chick otocyst shown in Figure 2.

      The data that we have collected do not allow a precise quantification of basal constrictions with cell skeleton analysis, due to the generally fuzzy nature of F-actin staining in the basal planes of the epithelium. However, we have followed the referee’s advice and analysed F-actin staining in x-y views in the Lmx1a-GFP knock-in (heterozygous) mice. We found that the first signs of basal F-actin enrichment and multicellular actin-cable-like structures at the interface of Lmx1a-positive and negative cells are visible at E11.5, and that F-actin staining in the basal planes increases in intensity and extent at E13.5 (shown in new Figure 4 – Supplementary Figure 1).

      (7) Figure 5 and related text. It would be informative to analyze Lmx1a mutants at early stages (E11-E13) to pinpoint cell behavior defects during boundary formation.

      We chose the E15 stage because it is one at which we can unequivocally recognize and easily image and analyse the boundary domain from a cytoarchitectural point of view. We recognize that it would have been worth including earlier stages in this analysis but have not been able to perform these additional studies due to time constraints and unavailability of biological material. 

      (8) Figure 5-Figure S1, the quantifications suggest that Lmx1a loss had both cell-autonomous and non-autonomous effects on boundary cell behaviors. This is an interesting finding, and its implication should be discussed.

      It is well-known that the absence of Lmx1a function induces a very complex (and variable) phenotype in terms of inner ear morphology and patterning defects. It is also clear from this study that the absence of Lmx1a causes non-cell-autonomous defects in the boundary domain, and we have already mentioned this in the discussion: “Finally, the patterning abnormalities in Lmx1a<sup>GFP/GFP</sup> samples occurred in both GFP-positive and negative territories, which points at some type of interaction between Lmx1a-expressing and non-expressing cells, and the possibility that the boundary domain is also a signalling centre influencing the differentiation of adjacent territories.”

      (9) Figure 6 and related text. To correlate myosin II activity with boundary cell behaviors, it would be important to immunolocalize pMLC in the boundary domain in whole-mount otocyst preparations from stage 1 to stage 3.

      We tried to perform the suggested immunostaining experiments, but in our hands at least, the antibody used did not produce good quality staining in whole-mount preparations. We have therefore included images of sectioned otic tissue, which show some enrichment in pMLC immunostaining at the interface of segregating organs (Figure 6).

      (10) Figures 7 and 8. A caveat of long-term Rock inhibition is that it can affect cell proliferation and differentiation of both sensory and non-sensory cells, which would cause secondary effects on boundary formation. This caveat was not adequately addressed. For example, does Rock signaling control either the rate or the orientation of cell division to promote boundary formation? Together with the mild effect of acute Rock inhibition, the precise role of Rock signaling in boundary formation remains unclear.

      We absolutely agree that the exact function of ROCK could not be ascertained in the in vitro experiments, for the reasons we have highlighted in the manuscript (no clear effect in short-term treatments, great level of tissue disorganisation in long-term treatments). This prompted us to turn to an in ovo approach. The picture remains uncertain in relation to the role of ROCK in regulating cell division/intercalation, but we have at least been able to show a requirement for the maintenance of an organized and regular boundary.

      (11) Figure 8. RCII-GFP likely also has non-autonomous effects on cell apical surface area. In 8d, it would be informative to include cell area quantifications of the GFP control for comparison.

      It is possible that some non-autonomous effects are produced by RCII-GFP expression, but these were not the focus of the present study and are not particularly relevant in the context of large patches of overexpression, as obtained with RCAS vectors. 

      We have added cell surface area quantifications of the control RCAS-GFP construct for comparison (Figure 8e).

      (12) The significance of the presence of cell divisions shown in Figure 9 is unclear. It would be informative to include some additional analysis, such as a) quantify orientation of cell divisions in and around the boundary domain and b) determine whether patterns of cell division in the sensory and nonsensory regions are disrupted in Lmx1a mutants.

      These are indeed fascinating questions, but they would require considerable work to answer and are beyond the scope of this paper.

      Minor comments:

      (1) Figure 1. It should be clarified whether e', h' and k' are showing cortical F-actin of surface cells. Do the arrowheads in i' and l' correspond to the position of either of the arrowheads in h' and k', respectively?

      The epithelium in the otocyst is pseudostratified. Therefore, images e’, h’, k’ display F-actin labelling on the surface of tissue composed of a single cell layer. We have added arrows to images e”, h”, and k” to indicate the corresponding position of z-projections and included an appropriate explanation in the legend of Figure 1: “Black arrows on the side of images e”, h”, and k” indicate the corresponding position of z-projections.”

      (2) Figure 3-Figure S1. Please mark the orientation of the images shown.

      We labelled the sensory organs in the figure to allow for recognizing the orientation. 

      (3) Figure 4. Orthogonal reconstructions should be labeled (z) to be consistent with other figures.

      We have corrected the labelling in the orthogonal reconstruction to (z). 

      (4) Figure 4g. It is not clear what is in the dark area between the two bands of Lmx1a+ cells next to the utricle and the LC. Are those cells Lmx1a negative? It is unclear whether a second boundary domain formed or the original boundary domain split into two between E15 and P0? Showing the E15 control tissue from Figure 5 would be more informative than P0.

      In this particular sample there seems to be a folding of the tissue (visible in z-reconstructions) that could affect the appearance of the projection shown in 4g. We believe the P0 is a valuable addition to the E15 data, showing a slightly later stage in the development of the vestibular organs.

      (5) Figure 5a, e. Magnified regions shown in b and f should be boxed correspondingly.

      This figure has been revised. We realized that the previous low-magnification shown in (e) (now h) was from a different sample than the one shown in the high-magnification view. The new figure now includes the right low-magnification sample (in h) and the regions shown in the high-magnification views have been boxed.

      (6) Figure 8f, h, j. Magnified regions shown in g, i and k should be boxed correspondingly.

      The magnified regions were boxed in Figure 8 f, h, and j. Additionally, black arrows have been placed next to images 8g", 8i", and 8k" to highlight the positions of the z-projections. An appropriate explanation has also been added to the figure legend.

      (9) Figure 8. It would be helpful to show merged images of GFP and F-actin, to better appreciate cell morphology of GFP+ and GFP- cells.

      As requested, we have added images showing overlap of GFP and F-actin channels in Figure 8.

      Reviewer #2 (Recommendations for the authors):

      The pMLC staining could be improved. Two decent antibodies are the p-MLC and pp-MLC antibodies from CST. pp-MLC works very well after TCA fixation as detailed in https://www.researchsquare.com/article/rs-2508957/latest . As phalloidin does not work well after TCA fixation, afadin works very well for segmenting cells.

      If the authors do not wish to repeat the pMLC staining, the details of the antibody used should be mentioned.

      We used mouse IgG1 Phospho-Myosin Light Chain 2 (Ser19) from Cell Signaling Technology (catalogue number #3675) in our immunohistochemistry for pMLC. This is one of the two antibodies recommended by reviewer #2. Information about this antibody has now been included in the Materials and Methods. This antibody has been referenced in many manuscripts but unfortunately, in our hands at least, it did not perform well in whole-mount preparations.

      A statement on the availability of the data should be included.

      We have included a statement on the data availability: “All data generated or analysed during this study is available upon request.”

      Reviewer #3 (Recommendations for the authors):

      Outstanding issues:

      (1) Morphological description: The apical alignment of epithelial cells at the border is clear but not the upward pull of the basal lamina. Very often, it seems to be the Sox2 staining that shows the upward pull better than the F-actin staining. Perhaps, adding an anti-laminin staining to indicate the basement membrane may help.

      Indeed, the upward pull of the basement membrane is not always very clear. We performed some anti-laminin immunostaining on mouse cryosections and provide below (Author response image 1) an example of such an experiment. The results appear to confirm an upward displacement of the basement membrane in the region separating the lateral crista from the utricle in the E13 mouse inner ear, but given the preliminary nature of these experiments, we believe that these results do not warrant inclusion in the manuscript. The term “pull” somewhat implies that the epithelial cells are responsible for the upward movement of the basement membrane, but since we do not have direct evidence that this is the case, we have replaced “pull” by “displacement” throughout the text.

      (2) It is not clear how well the cellular changes are correlated with the timing of border formation as some of the ages shown in the study seem to be well after the sensory patches were separated and the border was established.

      For some experiments (for example E15 in the comparison of mouse Lmx1a-GFP heterozygous and homozygous inner ear tissue; E6 for the RCAS experiments), the early stages of boundary formation are not covered because we decided to focus our analysis on the late consequences of manipulating Lmx1a/ROCK activity in terms of sensory organ segregation. The dataset is more comprehensive for the control developmental series in the chicken and mouse inner ear. 

      (3) The Lmx1a data, as they currently stand could be explained by Lmx1a being required for non-sensory development and not necessarily border formation. Additionally, the relationship between ROCK and Lmx1a was not investigated. Since the investigators have established the molecular mechanisms of Lmx1 function using the chicken system previously, the authors could try to correlate the morphological events described here with the molecular evidence for Lmx1 functioning during border formation in the same chicken system. Right now, only the expression of Sox2 is used to correlate with the cellular events, and not Lmx1, Jag1 or notch.

      These are valid points. Exploring in detail the epistatic relationships between Notch signalling/Lmx1a/ROCK/boundary formation in the chicken model would indeed be very interesting but would require extensive work using both gain- and loss-of-function approaches, combined with the analysis of multiple markers (Jag1/Sox2/Lmx1b/PMLC/F-actin, etc.). At this point, and in agreement with the referee’s comment, we believe that Lmx1a is above all required for the adoption of the non-sensory fate. The loss of Lmx1a function in the mouse inner ear produces defects in the patterning and cellular features of the boundary domain, but these may be late consequences of the abnormal differentiation of the non-sensory domains that separate sensory organs. Furthermore, ROCK activity does not appear to be required for Sox2 expression (i.e. adoption or maintenance of the sensory fate) since the overexpression of RCII-GFP does not prevent Sox2 expression in the chicken inner ear. This fits with a model in which Notch/Lmx1a regulate cell differentiation whilst ROCK acts independently or downstream of these factors during boundary formation.

      Specific comments:

      (1) Figure 1. The downregulation of Sox2 is consistent between panels h and k, but not between panels e and h. The orthogonal sections showing basal constriction in h' and k' are not clear.

      The downregulation is noticeable along the lower edge of the crista shown in h; the region selected for the high-magnification view sits at an intermediate level of segregation (and Sox2 downregulation). 

      The basal constriction is not very clear in h, but becomes easier to visualize in k. We have displaced the arrow pointing at the constriction, which hopefully helps. 

      (2) Figure 2. Where was the Z axis taken from? One seems to be able to imagine the basal constriction better in the anti-Sox2 panel than the F-actin panel. A stain outlining the basement membrane better could help.

      Arrows have been added on the side of the horizontal views to mark the location of the z-reconstruction. See our previous replies to comments addressing the upward displacement of the basement membrane.

      (3) Figure 4

      I question the ROI being chosen in this figure, which seems to be in the middle of a triad between LC, prosensory/utricle and the AC, rather than between AC and LC. If so, please revise the title of the figure. This could also account for the better evidence of the apical alignment in the upper part of the f panel.

      We have corrected the text. 

      In this figure, the basal constriction is a little clearer in the orthogonal cuts, but it is not clear where these sections were taken from.

      We have added black arrows next to images 4c’, 4f’, and 4i’ to indicate the positions of the z-projections.

      By E13.5, the LC is a separate entity from the utricle, it makes one wonder how well the basal constriction is correlated with border formation. The apical alignment is also present by P0, which raises the question that the apical alignment and basal restriction may be more correlated with differentiation of non-sensory tissue rather than associated with border formation.

      We agree that E13.5 is a relatively late stage, and the basal constriction was not always very pronounced. The new data included in the revised version include images of basal planes of the boundary domain at E11.5, which reveal F-actin enrichment and the formation of an actin-cable-like structure (Figure 4 – Supplementary Figure 1). Furthermore, the chicken dataset shows that the changes in cell size and alignment, and the formation of actin-cable-like structures, precede sensory patch segregation and are visible when Sox2 expression starts to be downregulated in prospective non-sensory tissue (Figure 1, Figure 2). Considering the results from both species, we conclude that these localised cellular changes occur relatively early in the sequence of events leading to sensory patch segregation, as opposed to being a late consequence of the differentiation of the non-sensory territories.

      I don't follow the (x) cuts for panels h and i, as to where they were taken from and why there seems to be an epithelial curvature and what it was supposed to represent.

      We have added black arrows next to the panels 4c’, 4f’, and 4i’ to indicate the positions of the z-projections and modified the legend accordingly. The epithelial curvature is probably due to the folding of the tissue bordering the sensory organs during the manipulation/mounting of the tissue for imaging.

      (4) Figure 5. The control images do not show the apical alignment and the basal constriction well. This could be because the chosen age, E15, was a little late. Unfortunately, the lack of clarity in the control results makes it difficult to illustrate the lack of cellular changes in the mutant. The only take-home message that one could extract from this figure is a mild mixing of Sox2 and Lmx1a-GFP cells in the mutant and not much else. Also, please indicate the level where (x) was taken from.

      Black arrows have been placed next to images 5e and 5l to highlight the positions of the z-projections. The stage E15 chosen for analysis was appropriate to compare the boundary domains once segregation is normally completed. We believe the results show some differences in the cellular features of the boundary domain in the Lmx1a-null mouse, and we have in fact quantified this using Epitool in Figure 5 – Suppl. Fig 1. Cells are more elongated and better aligned in the Lmx1a-null than in the heterozygous samples.

      (5) Figure 7. I think the cellular disruption caused by the ROCK inhibitor, shown in q', is too severe to be able to pin to a specific effect of ROCK on border formation. In that regard, the ectopic expression of the dominant negative form of ROCK using RCAS approach is better, even though because it is a replication competent form of RCAS, it is still difficult to correlate infected cells to functional disruption.

      We used a replication-competent construct to induce a large patch of infection, increasing our chances of observing a defect in sensory organ segregation and boundary formation. We agree that this approach does not allow us to control the timing of overexpression, but the mosaicism in gene expression, allowing us to compare in the same tissue large regions with/without perturbed ROCK activity, proved more informative than the pharmacological/in vitro experiments.

      (6) Figure 8. Outline the ROI of i in h, and k in j. Outline in k the comparable region in k'. In k", F-actin staining is not uniform. Indicate where (x) was taken from in K.

      The magnified regions were boxed in Figure 8 f, h, and j. The region outlined in panels k’-k” has also been outlined in the corresponding region of panel k. Additionally, black arrows have been placed next to images 8g", 8i", and 8k" to highlight the positions of the z-projections. An appropriate explanation has also been added to the figure legend.

      Minor comments:

      (1) P.18, 1st paragraph, extra bracket at the end of the paragraph.

      Bracket removed

      (2) P.22, line 11, in ovo may be better than in vivo in this case.

      We agree, this has been corrected. 

      (3) P.25, be consistent whether it is GFP or EGFP.

      Corrected to GFP.

      (4) P.26, line 5. Typo on "an"

      Corrected to “and”

      Author response image 1.

      Expression of Laminin and Sox2 in the E13 mouse inner ear. a-a’’’) Low magnification view of the utricle, the lateral crista, and the non-sensory (Sox2-negative) domain separating these. Laminin staining is detected at relatively high levels in the basement membrane underneath the sensory patches. At higher magnification (b-b’’’), an upward displacement of the basement membrane (arrow) is visible in the region of reduced Sox2 expression, corresponding to the “boundary domain” (bracket). 

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Weaknesses:

      (1) Analysis of transcript expression is limited to the CT-peptide encoding gene, while no gene expression analysis was attempted for the three identified receptors. Differences in the activation of downstream signaling pathways between the three receptors are also questionable due to a lack of clarity in the statistical analysis and variation in the control and experimental data in heterologous assays. Together, this makes it difficult to propose a mechanism underlying differences in the functions of the two CT-like peptides in muscle control and growth regulation.

      We appreciate the reviewer's rigorous critique. The manuscript has been comprehensively revised as follows:

      (1) For the expression analysis of the three identified receptors, the updated results are presented in Figure 5, with detailed descriptions in Results section 2.4 (lines 287-290) and Materials and Methods section 4.5 (line 767).

      (2) For the statistical tests and methodological clarity, statistical tests were indeed performed for all experiments. However, we acknowledge that the original labeling methods required enhanced methodological clarity, and we apologize for any confusion caused. All figures have been revised to improve the visibility of differences, and statistical test information has been added to both the figure legends and the Materials and methods section “4.10 Statistical Analysis” (line 900-910).

      (3) For the variation in the control and experimental data, the minor observed variations in control conditions across experiments primarily arise from two methodological factors: 1) Each experimental set used cells transfected with distinct receptor subtypes (e.g., AjPDFR1 vs. AjPDFR2), inherently introducing baseline variability due to differential receptor expression profiles. 2) Independent cell culture batches were employed for replicate experiments to ensure biological reproducibility. Importantly, these minor variations did not compromise the statistical significance of downstream signaling differences (p < 0.01 for all comparative analyses). Therefore, differences in the activation of downstream signaling pathways between the three receptors are reliable.

      (2) The authors also suggest a putative orexigenic role for the CT-like peptidergic system in feeding behavior. This effect is not well supported by the experimental data provided, as no detailed analysis of feeding behavior was carried out (only indirect measurements were performed that could be influenced by other peptidergic effects, such as on muscle relaxation) and no statistically significant differences were reported in these assays.

      We thank the reviewer for these valuable comments. Our revised manuscript now includes the following multidimensional analyses to strengthen the evidence for the orexigenic role of AjCT2. Firstly, in sea cucumbers, the mass of remaining bait is a common indicator of feeding condition. After long-term AjCT2 injection, this value was significantly decreased compared with the control group during phase V (Figure 8A-figure supplement 1), which indicates that AjCT2 promotes feeding in A. japonicus. Correspondingly, in long-term loss-of-function experiments (newly added in the revised manuscript), the remaining bait in the siAjCTP1/2-1 group was significantly increased compared with the siNC group from phase II to IV (Figure 10B). The detailed descriptions of these supplementary experiments have been added to Results Section 2.6 (lines 390-396) and Materials and Methods Section 4.9 (lines 879-888).

      Secondly, after 24 days of continuous injections of siAjCTP1/2-1, we monitored the feeding behavior of these sea cucumbers over three consecutive days. Each day, we removed residual bait and feces, then repositioned fresh food at the tank center. We calculated the aggregation percentage (AP) of sea cucumbers around the food during the feeding peak (2:00-4:00) each day, which is the most reliable indicator of feeding behavior in this species. The results showed that the AP in the siAjCTP1/2-1 group was significantly lower than that in the control group. Post-dissection observations revealed reduced intestinal food content and significant intestinal degeneration in the siAjCTP1/2-1 group (the figure has been added below). These results indicate that long-term functional loss of AjCT2 reduces food intake and influences the feeding behavior of A. japonicus.
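
      As an illustration of the quantity described above, a minimal sketch of an aggregation-percentage calculation follows. The response does not give the formula, so the definition used here (individuals inside the feeding area divided by all individuals in the tank, times 100) is an assumption, as are the function names; any counts passed in are the caller's own and nothing here reproduces the study's data.

```python
import statistics


def aggregation_percentage(n_in_feeding_area, n_total):
    """Percentage of animals inside the feeding area (assumed definition of AP)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_in_feeding_area <= n_total:
        raise ValueError("n_in_feeding_area must lie between 0 and n_total")
    return 100.0 * n_in_feeding_area / n_total


def summarise_daily_ap(daily_counts, n_total):
    """Mean and sample standard deviation of AP across observation days.

    daily_counts holds one count per day of animals in the feeding area;
    at least two days are needed for a standard deviation.
    """
    values = [aggregation_percentage(n, n_total) for n in daily_counts]
    return statistics.mean(values), statistics.stdev(values)
```

      With three observation days per group, as in the figure legend (n=3 days), each group contributes three AP values, which is why the summary reports a mean and standard deviation rather than a single percentage.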

      In response to the comment regarding “No statistically significant differences were reported in these assays”, we have modified the figures to clearly visualize the differences and added statistical test details in both the figure legends and the Materials and Methods section “4.10 Statistical analysis” (lines 900–910).

      Author response image 1.

      The feeding behavior of A. japonicus after long-term loss-of-function of AjCT2. (A) A record of feeding behavior. The red arrow refers to the food and the red box represents the feeding area. The numbers in the figure represent individuals entering the feeding area. (B) The aggregation percentage (AP) of sea cucumbers around the food during the feeding peak (2:00-4:00) (n=3 days). (C) The degenerated intestine of a sea cucumber after 24 days of siAjCTP1/2-1 injection. Data in the graph represent the mean ± standard deviation. *Significant differences between groups (p < 0.05). Control: siNC injection group; CT-SiRNA: siAjCTP1/2 injection group.

      (3) Overall, details regarding statistical analyses are not (clearly) specified in the manuscript, and there are several instances where statements are not supported by literature evidence.

      We thank the reviewer for these comments. Again, we sincerely apologize for the confusion caused. To clarify, statistical tests were performed for all experiments, but the original labeling was unclear. We have revised all figures to enhance the visibility of differences and provided detailed statistical test information in both the figure legends and the Materials and Methods section titled “4.10 Statistical Analysis” (lines 900–910). Additionally, we have supplemented the revised manuscript with further literature evidence to support our statements: (1) citations to Furuya et al. (2000), Johnson et al. (2005), Jékely (2013) and Mirabeau et al. (2013) have been added to clarify the foundational studies on DH31 and DH31 receptors in invertebrates (lines 73-74); (2) Conzelmann et al. (2013) and Furuya et al. (2000) were cited to validate the presence of two different types of CT-related peptides in protostomes: CT-type peptides (with an N-terminal disulphide bridge) and DH31-type peptides (lacking this feature) (lines 78-79); (3) Johnson et al. (2005) was referenced to support the dual ligand-receptor interactions of DH31 in Drosophila, specifically its binding to both CG17415 (a CTR/CLR-related protein) and CG13758 (the PDF receptor) (line 94); (4) Johnson et al. (2005) and Goda et al. (2019) were cited to reinforce the functional significance of dual DH31 receptor pathways in Drosophila, as extensively studied in prior research (lines 95-97).

      Reviewer #2 (Public review):

      Weaknesses:

      (1) The authors claim that A. japonicus CTs activate "PDF" receptors and suggest that this cross-talk is evolutionarily ancient since a similar phenomenon also exists in the fly Drosophila melanogaster. These conclusions are not fully supported for several reasons. The authors perform phylogenetic analysis to show that the two "PDF" receptors form an independent clade. This clade is sister to the clade comprising CT receptors. This phylogenetic analysis suffers from several issues. Firstly, the phylogenies lack bootstrap support. Secondly, the resolution of the phylogeny is poor because representative members from diverse phyla have not been included. For instance, insect or other protostomian PDF receptors have not been included so how can the authors distinguish between "PDF" receptors or another group of CT receptors? Thirdly, no in vivo evidence has been presented to support that CT can activate "PDF" receptors in vivo.

      We thank the reviewer for these constructive comments. As suggested, we expanded our taxon sampling to include more representative members across diverse phyla and reanalyzed the phylogenetic relationships (including bootstrap tests) in Figure 1C. The revised analysis revealed two distinct clades: one containing CTR/CLR-type receptors and the other PDF-type receptors. Specifically, AjCTR clustered within the CTR/CLR-type receptor group, while AjPDFR1 and AjPDFR2 were placed in the PDF-type receptor clade. The full species names for all taxa are provided in Supplementary Table 2.

      To provide in vivo evidence supporting CT-mediated activation of "PDF" receptors, we conducted the following experiments. Firstly, we confirmed that AjPDFR1 and AjPDFR2 were functional receptors of AjCT1 and AjCT2 (Figures 2, 3 and 4). Secondly, injection of AjCT2 and siAjCTP1/2-1 in vivo induced corresponding changes in AjPDFR1 and AjPDFR2 expression levels in the intestine (Figures 8C, 9A, 9B and 9C).

      (2) The source of CT which mediates the effects on longitudinal muscles and intestine is unclear. Is it autocrine or paracrine signaling by CT from the same tissue or is it long-range hormonal signaling?

      Thank you for this feedback. We have now analysed CT-type neuropeptide expression in A. japonicus using immunohistochemistry with the antiserum to the A. rubens CT-type peptide ArCT, which has previously been shown to cross-react with CT-type neuropeptides in other echinoderms (Aleotti et al., 2022). We have added related descriptions in the following sections: Results (section 2.4, lines 299-336), Discussion (section 3.3, lines 545-554) and Materials and methods (section 4.6, lines 785-817). Consistent with this previous finding, the ArCT antiserum labelled neuronal cells and fibers in the central and peripheral nervous system and in the digestive system of A. japonicus (Figure 6). The specificity of immunostaining was confirmed by performing pre-absorption tests with the ArCT antigen peptide (Figure 6-figure supplement 1). The detection of immunostaining in the innervation of the intestine is consistent with PCR results and the relaxing effect of AjCT2 on intestine preparations. Interestingly, no immunostaining was observed in longitudinal muscle, which is inconsistent with the detection of AjCT1/2 transcripts in this tissue. This may reflect differences in the sensitivity of the methods employed to detect transcripts (PCR) and mature peptide (immunohistochemistry). The absence of ArCT-like immunoreactivity in the longitudinal muscles suggests that AjCT1 and AjCT2 may exert relaxing effects on this tissue in vivo via hormonal signaling mechanisms. However, because AjCT1/2 expression in the longitudinal muscles may be below the detection threshold of the ArCT antibodies, we cannot rule out the possibility that AjCT1/2 are released within the longitudinal muscles physiologically.

      (3) Pharmacology experiments showing the effects of CT1 and CT2 on ACh-induced contractions were performed. Sample traces have been provided but no traces with ACh alone have been included. How long do ACh-induced contractions persist? These controls are necessary to differentiate between the eventual decay of ACh effects and relaxation induced by CT1 and CT2. The traces also do not reflect the results portrayed in dose-response curves. For instance, in Figure 6B, maximum relaxation is reported for 10-6M. Yet, the trace hardly shows any difference before and after the addition of 10-6M peptide. The maximum effect in the trace appears to be after the addition of 10-8M peptide.

      Thank you for the reviewer’s comments. As requested, we have included representative traces of ACh-induced contraction of longitudinal muscle and intestinal preparations (Figure 7-figure supplement 1B and 1C). Notably, the positive control (ACh) maintained contraction for at least 15 minutes, consistent with its known pharmacological properties. Regarding Figure 7B (previously Figure 6B), the trace illustrates the cumulative effects of successive neuropeptide treatments at increasing concentrations. A gradual reduction in response amplitude was observed at the highest peptide concentration, likely reflecting receptor desensitization, a phenomenon previously reported for neuropeptide Y and oxytocin (Tsurumaki et al., 2003; Arrowsmith and Wray, 2014). These results are now explicitly described in Results Section 2.5 (lines 340-345 and 348-352) and discussed in Section 3.3 (lines 569-574). In response to the reviewer’s suggestion, we further tested the pharmacological effects of AjCT2 at 10⁻⁶ M. As shown in Figure 7-figure supplement 1A, this concentration induced maximal relaxation, confirming its dose-dependent efficacy.

      (4) I am unsure how differences in wet mass indicate feeding and growth differences since no justification has been provided. Couldn't wet mass also be influenced by differences in osmotic balance, a key function of calcitonin-like peptides in protostomian invertebrates? The statistical comparisons have not been included in Figure 7B.

      We appreciate the reviewer's insightful comments. We fully agree that wet mass alone is an inadequate indicator of feeding and growth differences. Consequently, we reassessed A. japonicus growth using two established metrics, weight gain rate (WGR) and specific growth rate (SGR), to compare experimental and control groups. Notably, the high-concentration AjCT2 injection group exhibited statistically significant increases in both WGR and SGR relative to controls (Figure 8A). This indicates a putative physiological role of AjCT2 signaling in enhancing feeding efficiency and growth performance in A. japonicus. Detailed methods are provided in Materials and methods Section 4.8 (lines 847-851), with corresponding results presented in Results Section 2.6 (lines 370-375). In addition, Cong et al. (2024) reported a holotocin-induced osmoregulatory function in A. japonicus, manifested by significant wet weight elevation and body bloating. Our AjCT2 treatment produced no such phenotypic alterations, suggesting that AjCT2 likely does not participate in osmotic balance regulation, at least under these experimental conditions. Thus, the observed WGR and SGR enhancements following AjCT2 administration are unlikely to have been caused by osmoregulatory effects.
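
      For readers unfamiliar with the two growth metrics, the following minimal sketch shows how WGR and SGR are conventionally computed in aquaculture studies. The formulas are the standard textbook definitions and are an assumption here, since the exact equations of Methods Section 4.8 are not reproduced in this response; the function names and example weights are hypothetical.

```python
import math


def weight_gain_rate(initial_weight, final_weight):
    """WGR (%) = (Wt - W0) / W0 * 100 (conventional definition, assumed)."""
    return (final_weight - initial_weight) / initial_weight * 100.0


def specific_growth_rate(initial_weight, final_weight, days):
    """SGR (%/day) = (ln Wt - ln W0) / days * 100 (conventional definition, assumed)."""
    return (math.log(final_weight) - math.log(initial_weight)) / days * 100.0


if __name__ == "__main__":
    # Hypothetical weights in grams and a hypothetical trial length in days.
    w0, wt, days = 10.0, 12.0, 24
    print(f"WGR = {weight_gain_rate(w0, wt):.2f} %")
    print(f"SGR = {specific_growth_rate(w0, wt, days):.3f} %/day")
```

      Because both metrics are normalized to the initial weight, they allow comparison between groups whose starting masses differ, which a raw wet-mass comparison does not.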

      (5) While the authors succeeded in knocking down CT, the physiological effects of reduced CT signaling were not examined.

      Thank you for the reviewer’s comment. Following the reviewer’s suggestion, we have performed additional experiments to investigate the physiological effects of long-term reduced CT signaling. These include measuring the dry weight of remaining bait and excrement, calculating the weight gain rate and specific growth rate, and testing the expression levels of three growth factors (AjMegf6, AjGDF-8 and AjIgf) to further assess AjCT2’s role in feeding and growth. The results demonstrated that weight gain rate and specific growth rate were significantly decreased in the siAjCTP1/2-1 group (Figure 10A). Correspondingly, except in phase I, the siAjCTP1/2-1 group exhibited a significant increase in remaining bait and a decrease in excrement during phases II-VI (Figure 10B). Furthermore, the growth inhibitory factor AjGDF-8 was significantly up-regulated and the growth promoting factor AjMegf6 was significantly down-regulated in the siAjCTP1/2-1 group (Figure 10C). These findings further support the potential physiological role of AjCT2 signaling in promoting feeding and growth in A. japonicus. The added results are presented in Figure 10, with related descriptions in Section 2.6 (Results, lines 390-396), Section 3.4 (Discussion, lines 597-603) and Section 4.9 (Materials and Methods, lines 879-888).

      Reviewer #1 (Recommendations for the authors):

      (1) The abstract states that loss-of-function tests (RNAi knockdown) reveal a potential physiological role for AjCT2 signaling in promoting feeding and growth in A. japonicus. However, RNAi knockdown was only followed by analysis of transcript expression of CT-like receptors and not by the assessment of feeding or growth.

      Thank you for this helpful feedback. In the revised manuscript, we have supplemented the experiments to investigate the physiological effects of long-term reduced CT signaling, as suggested by the reviewer. These include measuring the dry weight of remaining bait and excrement, calculating the weight gain rate and specific growth rate, and testing the expression levels of the three growth factors (AjMegf6, AjGDF-8 and AjIgf) to further assess the function of AjCT2 on feeding and growth in A. japonicus. The results are as follows:

      (1) The weight gain rate and specific growth rate in the siAjCTP1/2-1 group were significantly decreased (Figure 10A).

      (2) Correspondingly, except in phase I, the siAjCTP1/2-1 group had significantly increased remaining bait and decreased excrement during phases II-VI (Figure 10B).

      (3) The growth inhibitory factor AjGDF-8 was significantly up-regulated, while the growth promoting factor AjMegf6 was significantly down-regulated in the siAjCTP1/2-1 group (Figure 10C).

      These findings further support the potential physiological role of AjCT2 signaling in promoting feeding and growth in A. japonicus. We have incorporated these results into Figure 10 and added related descriptions in the following sections: Results (section 2.6, lines 390-396), Discussion (section 3.4, lines 597-603) and Materials and methods (section 4.9, lines 879-888).

      Regarding the original statement in the abstract, “Furthermore, in vivo pharmacological experiments and loss-of-function tests revealed a potential physiological role for AjCT2 signaling in promoting feeding and growth in A. japonicus,” we consider that this sentence accurately summarizes our findings now that the supporting experiments have been added. We have therefore retained it in the revised manuscript while supplying the missing experimental details as requested.

      (2) Information on the statistical tests that were performed is lacking for most experiments. It is recommended to include this information in the figure legends, in addition to the methods section. Details on the phylogenetic analysis (parameters and statistics used) and calculation of half maximal effective concentrations (calculation methods and confidence intervals) also need to be included in the manuscript.

      Thank you for this constructive feedback. As the reviewer suggested, statistical test information has been incorporated into both the figure legends and the “4.10 Statistical Analysis” subsection of the Materials and methods (lines 900-910). Specifically:

      (1) Phylogenetic analysis details (parameters and statistical approaches) are now provided in Materials and methods section 4.2 (lines 675-682);

      (2) Bootstrap test results supporting the phylogenetic trees have been added to Figure 1B and 1C;

      (3) Half-maximal effective concentration (EC₅₀) calculations, including methodologies and confidence intervals, are documented in both the Figure 2B legend and the “4.10 Statistical Analysis” section (lines 900-910).
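
      As an illustration of what an EC₅₀ estimate involves, the sketch below fits a Hill (logistic) dose-response curve by a coarse grid search over log10(EC₅₀). This is not the procedure documented in Section 4.10, which is not reproduced in this response; the fixed bottom/top plateaus, the unit Hill slope, the search range, and the synthetic data are all assumptions chosen for illustration.

```python
def hill(conc, bottom, top, ec50, slope=1.0):
    """Hill equation: response at ligand concentration `conc` (mol/L)."""
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** slope)


def fit_log_ec50(concs, responses, bottom, top, slope=1.0):
    """Return the log10(EC50) in [-12, -3] (step 0.01) minimizing squared error.

    The plateaus and slope are held fixed (an assumption); a full analysis
    would normally fit them as well and report confidence intervals.
    """
    best_log, best_sse = None, float("inf")
    for i in range(-1200, -299):  # -12.00 ... -3.00 in steps of 0.01
        log_ec50 = i / 100.0
        ec50 = 10.0 ** log_ec50
        sse = sum(
            (hill(c, bottom, top, ec50, slope) - r) ** 2
            for c, r in zip(concs, responses)
        )
        if sse < best_sse:
            best_log, best_sse = log_ec50, sse
    return best_log


if __name__ == "__main__":
    # Synthetic, noise-free data generated with a hypothetical EC50 of 1e-8 M.
    concs = [10.0 ** e for e in range(-11, -4)]
    responses = [hill(c, 0.0, 100.0, 1e-8) for c in concs]
    print("estimated log10(EC50):", fit_log_ec50(concs, responses, 0.0, 100.0))
```

      Fitting on the log-concentration scale mirrors how dose-response curves are usually plotted, with ligand concentrations spaced in decades.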

      (3) In some figures (e.g. Figure 5A, 7A), the n number indicated does not match the number of data points shown in the figure panel. It is not clear what n represents here. In Figure 6B, an x-axis label is missing. In some figure legends (e.g. Figure 4 - Figure Supplement 1), the error bars and significance levels are not defined.

      We apologize for this error. We have corrected all errors related to "n" in the manuscript’s figure legends. In addition, the x-axis label has been added to Figure 7B (previously Figure 6B), and error bars and significance levels are now clearly defined in all figure legends.

      (4) It would be useful to explain what the difference is between the Cre and SRE luciferase assay and why these two assays were used to study receptor-activated signaling cascades. The source of the synthetic peptides is mentioned, but it is recommended to also state the purity of the synthetic peptides.

      Thank you for the valuable comments. As stated in the Introduction (lines 66-69), “binding of CT to CTR in the absence of RAMPs can activate signaling via several downstream pathways, including cAMP accumulation, Ca<sup>2+</sup> mobilization, and ERK activation.” Based on this established mechanism, we selected the cAMP and Ca²⁺ signaling pathways as readouts for studying receptor-activated cascades, with the following rationale: the CRE-Luc reporter system detects activation of the cAMP response element, and the SRE-Luc reporter system serves as an indicator of intracellular Ca²⁺ levels. In the CRE-Luc assay, when the receptor is activated by a ligand, it couples with Gαs protein to activate the cAMP/PKA signaling pathway. The accumulation of cAMP leads to the activation of PKA, which in turn enhances the transcription of CRE-containing genes. Therefore, a significant increase in CRE-Luc activity directly correlates with cAMP accumulation. Similarly, SRE-Luc activity reflects dynamic changes in intracellular Ca<sup>2+</sup> levels. We have added this explanation to Materials and methods section 4.4 (lines 715-721). The purity of the synthetic peptides was >95%, and we have also added this information to section 4.4 (line 715), as the reviewer suggested.

      (5) In Figure 3B, it is difficult to see receptor internalization in response to the application of synthetic CT-like peptides, and a control condition (without peptide application) is lacking.

      Thank you for the reviewer’s comment. The control condition (without peptide application) has been added in Figure 3-figure supplement 1, which shows the localization of pEGFP-N1/receptors in the cell membrane. Upon stimulation with synthetic CT-like peptides (Materials and methods section 2.3), the receptors exhibit clear internalization into the cytoplasm, as can be seen by comparison with Figure 3B.

      (6) Differences in the activation of downstream signaling cascades between the three receptors are questionable because there is substantial variation in the experimental data and control conditions in different experiments (for example, in Figures 3A and 4A). To better represent this variation, it is recommended to plot individual data points onto the bar graphs in all figures and to nuance the interpretation of putative differences in downstream signaling of different receptors. Differences in the physiological roles of CT-like peptides may be explained by various mechanisms, including differences in peptide/receptor expression or in the potency of peptides to activate different receptors in vivo. It would be useful to elaborate on these different explanations in the discussion.

      We appreciate the reviewer's critical assessment. The observed variations in control conditions across experiments (e.g., Figures 3A and 4A) primarily arise from two methodological factors: ① Each experimental set used cells transfected with distinct receptor subtypes (e.g., AjPDFR1 vs. AjPDFR2), inherently introducing baseline variability due to differential receptor expression profiles. ② Independent cell culture batches were employed for replicate experiments to ensure biological reproducibility. Importantly, these minor variations did not compromise the statistical significance of downstream signaling differences (p < 0.01 for all comparative analyses). In addition, following the reviewer’s suggestion, we have plotted individual data points onto the bar graphs in all figures.

      Also following the reviewer’s suggestion, we have expanded the discussion on receptor-specific signaling cascades in Section 3.4 (lines 589-609). The key points are as follows. In vivo pharmacological assays demonstrated that only high concentrations of AjCT2 significantly enhanced feeding and growth rates in A. japonicus. In contrast, neither a low concentration of AjCT2 nor any concentration of AjCT1 (low or high) induced detectable effects. Moreover, long-term knockdown of AjCTP1/2 supported the essential role of AjCT2 in regulating feeding and growth in this species. To identify the receptor mediating AjCT2’s feeding- and growth-promoting effects, we selected AjPDFR2 based on its distinct activation profile: AjCT2 selectively activated AjPDFR2, inducing downstream ERK1/2 phosphorylation, whereas AjCT1 exhibited no activity toward this receptor. Given this receptor specificity, we performed AjPDFR2 knockdown experiments, which revealed phenotypic changes consistent with those in AjCTP1/2 knockdown animals, including significantly reduced WGR and SGR, alongside increased remaining bait and diminished excrement output compared to controls. Collectively, these results support a model wherein AjCT2 promotes feeding and growth in A. japonicus via AjPDFR2-dependent activation of the cAMP/PKA/ERK1/2 and Gαq/Ca²⁺/PKC/ERK1/2 cascades. Considering the inherent complexity of neuropeptide signaling systems, which involve multiple GPCR subtypes coupled to diverse signaling cascades, ligands bound to the same receptor may activate distinct G protein subtypes within a single cell (Møller et al., 2003; Mendel et al., 2020). Receptor activation modes may be modulated by structural polymorphisms or binding site diversity (Wong et al., 2000; Changeux, 2010), as well as by the differential efficacy of peptides in activating receptors in vivo.

      (7) For the peptide injection experiments, it is recommended to explain the different animal groups in the results section. In addition, injection in the control condition seems to have a small effect on the wet weight. Therefore, it would be useful to compare control-injected and peptide-injected groups after injection.

      Thank you for the reviewer’s comments. We have provided an expanded explanation of the animal group classifications in Section 2.6 (lines 367–375). We fully agree that a comparative analysis between the experimental and control groups post-injection is essential. However, since wet weight measurement is suboptimal for demonstrating feeding and growth variations, we re-evaluated the data using two validated metrics: weight gain rate (WGR) and specific growth rate (SGR) of A. japonicus. The results revealed that the high-concentration AjCT2 injection group exhibited significantly elevated weight gain rate and specific growth rate compared to the control group, suggesting a potential role of AjCT2 signaling in promoting feeding and growth in A. japonicus. These results are presented in Figure 8A, with detailed descriptions in Results Section 2.6 (lines 370–375) and methodology in Materials and Methods Section 4.8 (lines 847-851).

      (8) Regarding the RNAi knockdown experiments, it is not clear from the methods section what the siNC control exactly is, and how the interference rate is calculated.

      Thank you for this comment. The siNC control was an siRNA that does not target any gene in A. japonicus, and interference rates were quantified using the 2<sup>-ΔΔCT</sup> method to assess siRNA inhibition efficiency. These methodological details have been incorporated into Materials and Methods Section 4.9 (lines 866–867 and 874-876) for clarity.
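
      To make the calculation concrete, the sketch below shows the standard 2^-ΔΔCT relative-expression computation. Expressing the interference rate as (1 - relative expression) × 100 is an assumption about how the rate is derived, since the exact formula of Section 4.9 is not reproduced in this response; the function names and Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression of the target gene by the 2^-ddCt method.

    dCt   = Ct(target) - Ct(reference gene), computed per group
    ddCt  = dCt(treated) - dCt(control)
    ratio = 2 ** (-ddCt)
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** (-(delta_treated - delta_control))


def interference_rate(rel_expr):
    """Knockdown efficiency in percent, assumed to be (1 - ratio) * 100."""
    return (1.0 - rel_expr) * 100.0


if __name__ == "__main__":
    # Hypothetical mean Ct values: siRNA-treated group vs. siNC control group.
    ratio = relative_expression(26.0, 18.0, 25.0, 18.0)
    print(f"relative expression = {ratio:.2f}")
    print(f"interference rate   = {interference_rate(ratio):.1f} %")
```

      Normalizing to a reference gene within each group before comparing groups removes differences in the amount of input cDNA between samples.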

      Reviewer #2 (Recommendations for the authors):

      (1) Both the phylogenies are missing bootstrap tests. Please include this analysis. The phylogenetic analyses should also include other Family B ligands and receptors from both vertebrates and invertebrates because it is widely assumed that PDF is related to VIP given their shared roles in circadian clock and gut regulation. Therefore, this analysis needs to be more comprehensive than currently presented. Drosophila melanogaster receptors have also been excluded in spite of the Drosophila PDFR exhibiting ligand promiscuity. The legend should also include the full species names of the various taxa (or modify the figure to include full names) instead of referring to another table. The supplementary table was not available to this reviewer.

      Thank you for the reviewer’s constructive comments. According to the reviewer’s suggestion, we have incorporated the VIPRs and Drosophila melanogaster receptors into the comparative analysis and reanalyzed the phylogenies in Figure 1C, and both phylogenies included bootstrap tests (Figure 1B, 1C) in the revised manuscript. The full species names of the various taxa are listed in supplementary tables 1 and 2 in the revised manuscript.

      (2) Expression data indicate that AjCTP1/2 is expressed in both the longitudinal muscles and intestine. What are the cell types that express AjCTP1/2? Given that the authors show an effect of CT1 and CT2 on both of these tissues, it would be important to know whether this is local regulation (paracrine or autocrine) vs long-distance hormonal control by the nervous system. This can be addressed by performing in situ hybridization or immunohistochemistry of CT (using Asterias rubens CT antibody: https://doi.org/10.3389/fnins.2018.00382) on these tissues.

      Thank you for this feedback. We have now analysed CT-type neuropeptide expression in A. japonicus using immunohistochemistry with the antiserum to the A. rubens CT-type peptide ArCT, which has previously been shown to cross-react with CT-type neuropeptides in other echinoderms (Aleotti et al., 2022). We have added related descriptions in the following sections: Results (section 2.4, lines 299-336), Discussion (section 3.3, lines 545-554) and Materials and methods (section 4.6, lines 785-817). Consistent with this previous finding, the ArCT antiserum labelled neuronal cells and fibers in the central and peripheral nervous system and in the digestive system of A. japonicus (Figure 6). The specificity of immunostaining was confirmed by performing pre-absorption tests with the ArCT antigen peptide (Figure 6-figure supplement 1). The detection of immunostaining in the innervation of the intestine is consistent with PCR results and the relaxing effect of AjCT2 on intestine preparations. Interestingly, no immunostaining was observed in longitudinal muscle, which is inconsistent with the detection of AjCT1/2 transcripts in this tissue. This may reflect differences in the sensitivity of the methods employed to detect transcripts (PCR) and mature peptide (immunohistochemistry). The absence of ArCT-like immunoreactivity in the longitudinal muscles suggests that AjCT1 and AjCT2 may exert relaxing effects on this tissue in vivo via hormonal signaling mechanisms. However, because AjCT1/2 expression in the longitudinal muscles may be below the detection threshold of the ArCT antibodies, we cannot rule out the possibility that AjCT1/2 are released within the longitudinal muscles physiologically.

      (3) While Drosophila DH31 can activate both PDF and DH31 receptors, the EC50 values differ drastically. Importantly, there is an independent gene encoding PDF which is a more sensitive ligand for the PDF receptor. This is in stark contrast to the situation presented here where the authors have yet to identify the PDF gene in their system. Outside Drosophila this cross signaling between the two systems has not been observed in any species. Based on this, I would argue that the ability of CTs to activate PDFR is not an evolutionary ancient property but rather an example of convergent evolution if supported by more evidence.

      We sincerely appreciate the reviewer's insightful comments. We agree that we cannot rule out the possibility that the ability of CT-type peptides to activate PDF-type receptors in Drosophila and A. japonicus has arisen independently. Therefore, we have modified the text in the discussion so that this alternative explanation for the effects of CT-type peptides on PDF-type receptors is also presented: “Alternatively, the ability of CT-type neuropeptides to act as ligands for PDF-type receptors in D. melanogaster and A. japonicus may have evolved independently. Further studies on a wider variety of both protostome (e.g. molluscs, annelids) and deuterostome taxa (e.g. other echinoderms, hemichordates) are needed to address this issue.”

      (4) AjCT1 and CT2 can activate the two PDF receptors ex vivo. However, their EC50 values are larger and the responses are lower compared to those seen for the CT receptor. Similar cross-talk between closely related peptide families is often observed in ex vivo systems (see: https://doi.org/10.1016/j.bbrc.2010.11.089 , https://doi.org/10.1073/pnas.162276199 , https://doi.org/10.1093/molbev/mst269 and others). However, very few signaling systems exhibit this type of cross-talk in vivo. Without any in vivo evidence, I suspect that the more likely possibility is that the bona fide endogenous ligand for PDF receptors remains to be discovered. The authors could, however, perform peptide and receptor knockdown experiments and show overlap in phenotypes following CT knockdown and PDFR knockdown to support their claim.

      We sincerely appreciate the reviewer's insightful critique. Following the reviewer’s suggestion, we performed additional AjCTP1/2 and AjPDFR2 knockdown experiments, measured the dry weight of remaining bait and excrement, and calculated the weight gain rate and specific growth rate to assess phenotypic changes. The results showed that weight gain rate and specific growth rate were significantly decreased in both knockdown groups (Figures 10A and 11B). Correspondingly, except in phase I, the siAjCTP1/2-1 group had significantly increased remaining bait and decreased excrement in phases II-VI (Figure 10B). In the siAjPDFR2-1 group, the remaining bait weight was significantly increased (except during phase I), while the weight of excrement was significantly decreased in phases V and VI (Figure 11C). Therefore, the AjCT and AjPDFR2 knockdown experiments showed overlapping phenotypes, providing evidence that AjCT acts as an endogenous ligand for PDFR. These results have been added as Figure 10 and Figure 11. The related descriptions have been added to Results section 2.6 (lines 390-396) and section 2.7 (lines 427-439) and to Materials and methods section 4.9 (lines 879-898). We acknowledge, however, that other peptides, in addition to AjCT1 and AjCT2, may also act as ligands for AjPDFR1 and AjPDFR2 in vivo, and ongoing studies in the Chen (OUC) and Elphick (QMUL) labs are attempting to address this issue.

      (5) Why are receptor transcripts upregulated following peptide injection? Usually, increased ligand levels/signaling result in a compensatory decrease in receptor levels. These negative feedback loops maintain optimum signaling levels. Since the authors have successfully implemented RNAi for this CT precursor, what are the phenotypes on growth and feeding?

      We thank the reviewer for raising these critical points. Our responses are as follows. First, our findings align with established mechanisms of neuropeptide-induced receptor modulation (see Tiptanavattana et al., 2022). Second, following the reviewer’s suggestion, we have performed additional experiments to examine the growth and feeding phenotypes resulting from long-term reduced CT signaling, including measuring the dry weight of remaining bait and excrement, calculating the weight gain rate and specific growth rate, and testing the expression levels of three growth factors (AjMegf6, AjGDF-8 and AjIgf). The results showed that weight gain rate and specific growth rate were significantly decreased in the siAjCTP1/2-1 group (Figure 10A). Correspondingly, except in phase I, the siAjCTP1/2-1 group had more remaining bait and less excrement in phases II-VI (Figure 10B). Furthermore, the growth inhibitory factor AjGDF-8 was significantly up-regulated and the growth promoting factor AjMegf6 was significantly down-regulated in the siAjCTP1/2-1 group (Figure 10C). We have added these results in Figure 10, with detailed descriptions in Results section 2.6 (lines 390-396) and in Materials and methods section 4.9 (lines 879-888). After long-term continuous injections of siAjCTP1/2-1, we also recorded the feeding behavior of these sea cucumbers for three consecutive days. Each day, the remaining bait and feces were removed and fresh food was placed in the middle of the tank. We calculated the aggregation percentage (AP) of sea cucumbers around the food during the peak feeding period (2:00-4:00) each day, which is the best indicator of sea cucumber feeding behavior. The results showed that the AP in the siAjCTP1/2-1 group was significantly lower than that in the control group.
      After dissection, we also found that the intestines of the siAjCTP1/2-1 group contained less food and were significantly degenerated (see author response image 1). All of these results support the conclusion that long-term loss of AjCT2 function negatively influences the feeding and growth of A. japonicus.

      Other comments:

      (6) What criteria do the authors use to classify some proteins as "type", some as "like" and others as "related"? In my opinion, DH31 could be referred to as CT-like or CT-type. Please use one term for clarity unless there is a scientific explanation behind this terminology.

      Thank you for the reviewer’s comment. If you look at the paper by Cai et al. (2018) you will see in Figure 14 that CT-type peptides and DH31-type peptides are paralogous, probably due to a gene duplication in the common ancestor of the protostomes. The CT-related peptides in protostomes that have a disulphide bridge we would describe as CT-type because they have conserved a feature that is found in CT-type peptides in deuterostomes. Whereas the DH31 peptides we would describe as CT-like. But there is not a formal rule on this. It is possible the duplication event that gave rise to DH31 and CT-type peptides occurred in the common ancestor of the Bilateria but DH31-type signaling was lost in deuterostomes. On the other hand, if the gene duplication that gave rise to DH31-type peptides and CT-type peptides in protostomes did occur in a common ancestor of the protostomes, then DH31 and CT-type peptides in protostomes could be described as co-orthologs of CT-type peptides in deuterostomes. In this case, both CT peptides and DH31 peptides in protostomes could be described as CT-type. Here is a useful link for explanation of terms: https://omabrowser.org/oma/type/

      (7) Was genomic DNA removal step performed before cDNA synthesis for qRT-PCR?

      Thank you for the reviewer’s comment. The genomic DNA removal step was performed before cDNA synthesis for qRT-PCR and we have added the information in the section 4.5 (line 774-776).

      (8) Line 70: The presence of calcitonin-like peptides (DH31) and DH31 receptors in invertebrates was discovered long before the discoveries by Jekely 2013 and Mirabeau and Joly 2013. Please credit these original studies: https://pubmed.ncbi.nlm.nih.gov/10841553/ and https://pubmed.ncbi.nlm.nih.gov/15781884/.

      Thank you for the reviewer’s comment. We have credited these original studies in the revised manuscript.

      (9) Lines 72-74: Please cite https://pubmed.ncbi.nlm.nih.gov/24359412/.

      Thank you for the reviewer’s comment. We have cited it in the revised manuscript.

      (10) Line 87: Please cite https://pubmed.ncbi.nlm.nih.gov/15781884/.

      Thank you for the reviewer’s comment. We have cited it in the revised manuscript.

      (11) Lines 89-91: The functional significance of DH31 signalling to PDFR in Drosophila is known. See: https://pubmed.ncbi.nlm.nih.gov/15781884/ and https://pubmed.ncbi.nlm.nih.gov/30696873/. There are several studies that have shown the functions of DH31 signalling via DH31R.

      Thank you for the reviewer’s comment. We have corrected this and added all of these studies in the revised manuscript.

      (12) Figure 1 Supplement 1: The tertiary models for CT1 and CT2 look completely different. This prediction is not in line with both ligands activating the same receptor.

      Thank you for the reviewer’s comment. We have deleted this supplementary figure.

      (13) Figure 1 Supplement 3 legend: Please add panel labels next to the corresponding receptor.

      Thank you for the reviewer’s comment. We have added panel labels next to the corresponding receptors as you suggested.

      (14) Figure 2: What does CO refer to?

      Thank you for the reviewer’s comment. CO (Control) refers to the stimulation of HEK293T transfected cells with serum-free DMEM, and we have added the detailed information to the Figure 2 legend (lines 251-252).

      (15) Figure 3: Due to the low magnification of the cells, it is difficult to see the localization of the receptor. It would also be more appropriate to use a membrane marker rather than DAPI which does not label the cytoplasm or membrane where the receptor can be found.

      We appreciate the reviewer's insightful comment regarding the experimental controls. The baseline receptor localization data under non-stimulated conditions are presented in Figure 3—figure supplement 1, demonstrating constitutive membrane distribution of pEGFP-N1-tagged receptors. Upon stimulation with synthetic CT-like peptides, qualitative imaging analysis revealed significant ligand-induced receptor internalization into the cytoplasm (Figure 3B).

      (16) Figure 9: Please include PDF precursor and receptor as separate columns. Also, Drosophila CT/DH31 receptors have been characterized.

      Thank you for the reviewer’s comment. We have added the PDF precursor, predicted peptides and receptors as separate columns in Figure 12 of the revised manuscript. We have also corrected the erroneous summary of Drosophila CT/DH31 receptors as you suggested.

      (17) Table 1: It is not very clear why there are multiple columns for ERK1/2 with different outcomes.

      Thank you for the reviewer’s comment. Although cAMP/PKA or Gαq/Ca<sup>2+</sup>/PKC signaling is activated after ligand binding to receptors, the downstream ERK1/2 cascade is not necessarily activated. Therefore, Table 1 reports separately the activation status of cAMP/PKA and its downstream ERK1/2 cascade, and of Gαq/Ca<sup>2+</sup>/PKC and its downstream cascade. We have revised Table 1 to make this clearer in the revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary: As TDP-43 mislocalization is a hallmark of multiple neurodegenerative diseases, the authors seek to identify pathways that modulate TDP-43 levels. To do this, they use a FACS based genome wide CRISPR KD screen in a Halo tagged TDP-43 KI iPSC line. Their screen identifies a number of genetic modulators of TDP-43 expression including BORC which plays a role in lysosome transport.

      Strengths:

      Genome wide CRISPR based screen identifies a number of modulators of TDP-43 expression to generate hypotheses regarding RNA BP regulation and perhaps insights into disease.

      Weaknesses:

      It is unclear how altering TDP-43 levels may relate to disease where TDP-43 is not altered in expression but mislocalized. This is a solid cell biology study, but the relation to disease is not clear without providing evidence of BORC alterations in disease or manipulation of BORC reversing TDP-43 pathology in disease.

      We thank the reviewer for this comment and have expanded the discussion of the role TDP-43 may play in the BORCS8-associated neurodegenerative disorder and of how understanding the effect of lysosome localization on TDP-43 levels may help patients (lines 313-321).

      The mechanisms by which BORC and lysosome transport modulate TDP-43 expression are unclear. Presumably, this may be through altered degradation of TDP protein but this is not addressed.

      We agree with the reviewer that understanding the mechanism by which lysosome transport regulates TDP-43 levels is important and plan to examine this in future studies.

      Previous studies have demonstrated that TDP-43 levels can be modulated by altering lysosomal degradation so the identification of lysosomal pathways is not particularly novel.

      We thank the reviewer for this comment and have updated the text to make this clearer (lines 310-313). What hasn’t been observed previously is a change in lysosome localization affecting TDP-43 levels.

      It is unclear whether this finding is specific to TDP-43 levels or whether lysosome localization may more broadly impact proteostasis in particular of other RNA BPs linked to disease.

      We agree that this is an interesting question and something that should be investigated in future studies.

      Unclear whether BORC depletion alters lysosome function or simply localization.

      We thank the reviewer for this comment. Lysosome function related to protein turnover has not yet been examined in the literature after loss of BORC, but other aspects of lysosome function (including lipid metabolism and autophagic flux) have been shown to be disrupted upon loss of BORC. We have updated the discussion to address this (lines 292-296).

      Reviewer #2 (Public review):

      Summary: The authors employ a novel CRISPRi FACS screen and uncover the lysosomal transport complex BORC as a regulator of TDP-43 protein levels in iNeurons. They also find that BORC subunit knockouts impair lysosomal function, leading to slower protein turnover and implicating lysosomal activity in the regulation of TDP-43 levels. This is highly significant for the field given that a) other proteins could also be regulated in this way, b) understanding mechanisms that influence TDP-43 levels is significant given that its dysregulation is considered a major driver of several neurodegenerative diseases and c) the novelty of the proposed mechanism.

      Strengths:

      The novelty and information provided by the CRISPRi screen. The authors provide evidence indicating that BORC subunit knockouts impair lysosomal function, leading to slower protein turnover and implicating lysosomal activity in the regulation of TDP-43 levels and show a mechanistic link between lysosome mislocalization and TDP-43 dysregulation. The study highlights the importance of localized lysosome activity in axons and suggests that lysosomal dysfunction could drive TDP-43 pathologies associated with neurodegenerative diseases like FTD/ALS. Further, the methods and concepts will have an impact on the larger community as well. The work also sets up further work to understand the somewhat paradoxical findings that even though the tagged TDP-43 protein is reduced in the screen, it does not alter cryptic exon splicing and there is a longer TDP-43 half-life with BORC KD.

      Weaknesses:

      While the data is very strong, the work requires some additional clarification.

      We thank the reviewer for these comments. Our detailed responses are included below in the “recommendations for authors” section.

      Reviewer #3 (Public review):

      Summary: In this work, Ryan et al. have performed a state-of-the-art full genome CRISPR-based screen of iNeurons expressing a tagged version of TDP-43 in order to determine expression modifiers of this protein. Unexpectedly, using this approach the authors have uncovered a previously undescribed role of the BORC complex in affecting the levels of TDP-43 protein, but not mRNA expression. Taken together, these findings represent a very solid piece of work that will certainly be important for the field.

      Strengths:

      BORC is a novel TDP-43 expression modifier that has never been described before and it seemingly acts on regulating protein half-life rather than transcriptome level. It has been long known that different labs have reported different half-lives for TDP-43 depending on the experimental system but no work has ever explained these discrepancies. Now, the work of Ryan et al. has for the first time identified one of these factors which could account for these differences and play an important role in disease (although this is left to be determined in future studies).

      The genome-wide CRISPR screening has been shown to yield novel results with high reproducibility and could eventually be used to search for expression modifiers of many other proteins involved in neurodegeneration or other diseases.

      Weaknesses:

      The fact that TDP-43 mRNA does not change following BORCS6 KD is based on a single qRT-PCR that does not really cover all possibilities. For example, the mRNA total levels may not change but the polyA sites may have switched from the highly efficient pA1 to the less efficient and nuclear retained pA4. There are therefore a few other experiments that could have been performed to make this conclusion more compelling, maybe also performing RNAscope experiments to make sure that no change occurred in TDP-43 mRNA localisation in cells.

      We thank the reviewer for this comment. To address this point, we performed an analysis of polyA sites on our RNA sequencing data using REPAC and did not find a change in TDP-43 polyadenylation after BORC KD (Figure S6C). Other transcripts do have altered polyA sites, which are summarized in Figure S6C. We also performed HCR FISH for TARDBP mRNA in TDP-43 and BORC KD neurons. While we did not see a difference in RNA localization (see A below, numbers on brackets indicate p-values), we also were not able to detect a significant difference in total TARDBP mRNA levels upon TDP-43 KD (see B below, numbers on brackets indicate p-values), suggesting that some of the signal detected is non-specific to TARDBP. Because of this, we cannot conclusively say that BORC KD does not alter TARDBP mRNA localization using the available tools.

      Author response image 1.

      Even assuming that the mRNA does not change, no explanation for the change in TDP-43 protein half-life has been proposed by the authors. This will presumably be addressed in future studies: for example, are mutants that lack different domains of TDP-43 equally affected in their half-lives by BORC KD? Alternatively, can a mass-spec be attempted to see whether TDP-43 PTMs change following BORCS6 KD?

      We agree with the reviewer that these are important experiments that could be done in the future to further examine the mechanism by which loss of BORC alters TDP-43 half-life. We examined our proteomics data for differential phosphorylation and ubiquitination in NT vs BORC KD (Figure S7G-H). We were unable to detect PTMs on TDP-43, so we cannot say if they contribute to the change in TDP-43 half-life we observed.

      Reviewer #1 (Recommendations for the authors):

      Recommendations are detailed in the public review.

      Reviewer #2 (Recommendations for the authors):

      Ryan et al. employ a CRISPRi FACS screen and uncover the lysosomal transport complex BORC as a regulator of TDP-43 protein levels in iNeurons. The authors provide strong evidence indicating that BORC subunit knockouts impair lysosomal function, leading to slower protein turnover and implicating lysosomal activity in the regulation of TDP-43 levels. The authors then provided additional evidence of TDP-43 perturbations under lysosome-inhibiting drug conditions, underscoring a mechanistic link between lysosome mislocalization and TDP-43 dysregulation. The study highlights the importance of localized lysosome activity in axons and suggests that lysosomal dysfunction could drive TDP-43 pathologies associated with neurodegenerative diseases like FTD/ALS. The work is exciting and could be highly informative for the field.

      Concerns: There are some disconnects between the figures and the main text that can benefit from refining of the figures to align better with the main text. This does not require additional experiments other than perhaps Figure 4B. The impact of the work could be further discussed - it is an interesting disconnect between the fact BORC KD causes decreased IF of the Halo-tagged TDP-43 and lysosomal transport, however this reduction does not impact cryptic exon expression and also increases TDP-43 half life (and of other proteins). It is a very interesting and potentially informative part of the manuscript.

      We thank the reviewer for their detailed reading of our manuscript. We have endeavored to better match the figures and the text and have added more discussion of the impact of the work.

      Minor:

      (1) Suggestion: relating to the statement "Gene editing was efficient, with almost all selected clones correctly edited." - please provide values or %.

      We updated the text to remove the statement about the editing efficiency, instead saying we identified a clone that was correct for both sequence and karyotype (lines 83-85).

      (2) Relating to Figure 1A: Please provide clarification regarding tagging strategy with the halotag - e.g. why in front of exon2.

      We updated the figure legend to reflect that the start codon for TDP-43 is in exon 2, hence why we placed the HaloTag there.

      (3) Relating to Figure S1: A and B seems to have been swapped.

      We thank the reviewer for catching this mistake and have fixed the figure/text.

      (4) Relating to Figure 1B: figure legend does not indicate grayscale coloring of TDP-43 signal.

      We have added text in the figure legend to indicate that the Halo signal is shown in grayscale in the left-hand panels.

      (5) Relating to Figure 1C: can the authors clarify abbreviation for 'NT' in text and legend.

      We thank the reviewer for catching this and have indicated in the text and figure legend that NT refers to the non-targeting sgRNA that was used as a control for comparison to the TDP-43 KD sgRNA.

      (6) Relating to figure 2B and S2A: main text mentioned "Non-targeting Guides" however the figure does not show non-targeting guides to confirm.

      We thank the reviewer for catching this oversight. We updated the legends for these figures to indicate that the non-targeting (NT) guides are shown in gray on the rank plot. They cluster towards the middle, more horizontal portion of the graphs, showing that the more vertical sections of the graph are hits.

      (7) Suggestion: To make it easier on the reader, please provide overlap numbers for the following statement ..."In comparing the top GO terms associated with genes that increase or decrease Halo-TDP-43 levels in iNeurons, we found that almost none altered Halo-TDP-43 levels in iPSCs...".

      We thank the reviewer for this comment and have updated the text to indicate that only a single term is shared between the iPSC and iNeuron screens (lines 113-117).

      (8) Relating to the statement "We cloned single sgRNA plasmids for 59 genes that either increased or decreased Halo-TDP-43 in iNeurons but not in iPSCs." Can the authors provide a list of the 59 genes.

      We have included a new column in the supplemental table S1 indicating the result of the Halo microscopy validation to hopefully clarify which genes lead to a validated phenotype and which did not.

      (9) Relating to the statement "To rule out the possibility of neighboring gene or off-target effects of CRISPRi, as has been reported previously15, we examined the impact of BORC knockout (KO) on TDP-43 levels. Using the pLentiCRISPR system, which expresses the sgRNA of interest on the same plasmid as an active Cas916 we found that KO of BORCS7 using two different sgRNAs decreased TDP-43 levels by immunofluorescence (Figure 5C-D)." Please provide clarification as to why BORCS7 was chosen out of all the BORCS? From the data presentation thus far (Figure 4B & 5A), the reader might have anticipated testing BORCS6 for panels 5C-D.

      We thank the reviewer for this comment. We tried a couple of BORCs with the pLentiCRISPR system, but BORCS7 was the only one we were convinced we got functional knockout for based on lysosome localization. We think that either the guides were not ideal for the other BORC components we tried, or we did not get efficient gene editing across the population of cells tested. Because we had previously been working with knock down and CRISPRi guides are not the same as CRISPR knock out guides, we couldn’t use the existing guide sequences we know work well for BORC. Since loss of one BORC gene causes functional loss of the complex and restricts lysosomes to the soma, we did not feel it necessary to assay all 8 genes.

      (10) Relating to the statement "We treated Halo-TDP-43 neurons with various drugs that disrupt distinct processes in the lysosome pathway and asked if Halo-TDP-43 levels changed. Chloroquine (decreases lysosomal acidity), CTSBI (inhibits cathepsin B protease), ammonium chloride (NH4Cl, inhibits lysosome-phagosome fusion), and GPN (ruptures lysosomal membranes) all consistently decreased Halo-TDP-43 levels (Figure 6A-B, S5A-C)" Please provide interpretations for Figures S5A and S5C in text.

      We thank the reviewer for catching this oversight and have updated the text accordingly (lines 183-191).

      (11) Relating to figure 6E: please provide in legend what the different colors used correlate with (i.e. green/brown for BORCS7 KD)?

      We thank the reviewer for pointing this out. These colors were mistakenly left in the figure from a version looking to see if the observed effects were driven by a single replicate rather than a consistent change (each replicate has a slightly different color). As the colors are intermingled and not separated, we concluded the effect was not driven by a single replicate. The colors have been removed from the updated figure for simplicity.

      (12) Relating to the statement "We observed a similar trend for many proteins in the proteome (Figure 8B)" This statement can benefit from stating which trend the authors are referring to, it is currently unclear from the volcano plot shown for Figure 8B.

      We thank the reviewer for catching this and have updated the text accordingly.

      (13) Relating to the statement "For almost every gene, we observed an increase or decrease in Halo-TDP-43 levels without a change in Halo-TDP-43 localization or compartment specific level changes (Figure 4B)." Please provide: (1) the number of genes examined, (2) additional clarification of "localization" and "compartment specific" level changes, (3) some quantification and or additional supporting data of the imaging results. Figures 5A-B presents with the same concern relating to the comment "To determine if results from Halo-TDP-43 expression assays also applied to endogenous, untagged TDP-43 levels, we selected 22 genes that passed Halo validation and performed immunofluorescence microscopy for endogenous (untagged) TDP-43 (Figure 4D-G,5A-B, S4E-F)." please clarify further.

      We thank the reviewer for requesting this clarification. This statement refers to all 59 genes tested by Halo imaging; only one (MFN2) showed any hints of aggregation or changes in localization, while every other gene (58) showed what appeared to be global changes in Halo-TDP-43 levels. We were initially intrigued by the MFN2 phenotype; however, we were unable to replicate it on endogenous TDP-43 and thus concluded that this might be an effect specific to the tagged protein. The images shown in Figure 4B are representative of the changes we observed across all 59 genes tested (if changes were present). From the 59 genes for which we observed a change in Halo-TDP-43 levels by microscopy, we selected a smaller number to move forward to immunofluorescence for TDP-43. We picked a subset of genes from each of the different categories we had identified (mitochondria, m6A, ubiquitination, and some miscellaneous) to validate by immunofluorescence, thinking that genes in the same pathway would act similarly. We have added a column to the supplemental table S1 indicating which genes were tested by immunofluorescence and what the result was. We have also attempted to clarify the results section to make the above clearer.

      (14) Relating to the statement "To determine if results from Halo-TDP-43 expression assays also applied to endogenous, untagged TDP-43 levels, we selected 22 genes that passed Halo validation and performed immunofluorescence microscopy for endogenous (untagged) TDP-43 (Figure 4D-G, 5A-B, S4E-F). Of these, 18 (82%) gene knockdowns showed changes in endogenous TDP-43 levels (Figure 4D-G, S4E-F)." It is difficult to identify the 18 or 22 genes in the figures as described in the main text.

      We added columns to the supplemental table S1 listing the genes and the result in each assay.

      (15) Relating to figures S7A and 8A and the first part of the section "TDP-43, like the proteome, shows longer turnover time in BORC KD neurons" Can the authors provide clarification why the SunTag assay was performed with BORCS6 KD (S7A) but the follow-up experiment (8A) was performed with BORCS7 KD. Does BORCS6 KD show similar results as BORCS7 with the SunTag assay, and does TDP-43 protein abundance with BORCS7 KD show similar results as BORCS6?

      Because loss of any of the 8 BORC genes causes functional loss of BORC and lysosomes to be restricted to the peri-nuclear space, we used BORC KDs interchangeably. Additionally, all BORC KDs had similar effects on Halo-TDP-43 levels.

      Reviewer #3 (Recommendations for the authors):

      Adding more control experiments that TDP-43 mRNA is really not affected following BORC KD

      We performed a FISH experiment to examine TARDBP mRNA localization upon BORC KD but were unable to conclusively say whether BORC KD changes TARDBP mRNA localization (see above). We also analyzed our RNA sequencing experiment for alternative polyadenylation sites upon BORC KD. Results are in Figure S6C.

      Although this could be part of a future study, the authors should try and determine what are the changes to TDP-43 that drive a change in the half-life.

      We agree with the reviewer that these are important experiments and hope to figure this out in the future.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      The manuscript by Sayeed et al. uses a comprehensive series of multi-omics approaches to demonstrate that late-stage human cytomegalovirus (HCMV) infection leads to a marked disruption of TEAD1 activity, a concomitant loss of TEAD1-DNA interactions, and extensive chromatin remodeling. The data are thoroughly presented and provide evidence for the role of TEAD1 in the cellular response to HCMV infection.

      However, a key question remains unresolved: is the observed disruption of TEAD1 activity a direct consequence of HCMV infection, or could it be secondary to the broader innate antiviral response? In this respect, the study would benefit from experiments that assess the effect of TEAD1 overexpression or knockdown/deletion on HCMV replication dynamics. Such functional assays could help delineate whether TEAD1 perturbation directly influences viral replication or is part of a downstream/indirect cellular response, providing deeper mechanistic insights.

      To examine the effect of TEAD1 on HCMV, we performed an experiment in primary human foreskin fibroblasts (HFF) which were stably transduced with constitutive TEAD1. To constitutively express TEAD1, we cloned the open reading frame of TEAD1 into pLenti-puro (Plasmid #39481 from Addgene). We selected for transduced cells using puromycin. For these experiments, we first assessed two multiplicities of infection (MOI): 1 and 10 (Reviewer Response Figure 1). Based on the TEAD1 expression in these cells relative to non-transduced HFF cells, we performed HCMV infection experiments in cells transduced with TEAD1 lentivirus at an MOI of 1.

      For infections, we used a version of HCMV in which the C terminus of the capsid-associated tegument protein pUL32 (pp150) is tagged by enhanced green fluorescent protein (GFP) (PMID: 15708994). This experimental design allowed us to assess the impact of constitutive TEAD1 expression on HCMV infection. GFP and immediate early protein expression levels were measured 48 hours after infection by flow cytometry.

      After infecting parent cells (no constitutive TEAD1) and TEAD1 constitutively expressing cells with a GFP-positive HCMV at MOIs of 0.3 and 1, we identified equivalent GFP expression in the two conditions, indicating equivalent levels of HCMV infection 48 hours after initial infection (Reviewer Response Figure 1A). We also identified equivalent immediate early protein expression at 48 hours after infection, as measured both by percent positivity (Reviewer Response Figure 1B) and mean fluorescence intensity (Reviewer Response Figure 1C). At 96 hours with an MOI of 3, constitutive expression of TEAD1 led to a slight reduction in the expression of the HCMV proteins pp65 (encoded by UL83) and UL44 at 72 and 96 hours post initial infection (Reviewer Response Figure 1D). These results suggest that TEAD1 expression has minimal effects, if any, on the expression of these two late HCMV proteins in fibroblasts. Regulation of particular HCMV genes by TEAD1 is likely to be central for HCMV replication and reactivation in other specialized cell types relevant to viral pathogenesis and disease. However, definitive studies are beyond the scope of the current study.

      Author response image 1.

      Constitutive TEAD1 expression reduces expression of two HCMV late genes at 72 and 96 hours after infection. A-C. Primary human foreskin fibroblasts with and without constitutive TEAD1 expression were infected with pp150-GFP HCMV at a multiplicity of infection (MOI) of 0.3 or 1 and assessed 48 hours post infection. A. HCMV positive cells were quantified by measuring the percent of cells that were GFP positive. B. The percentages of immediate early (IE1/IE2) positive cells were quantified by flow cytometry. C. The mean fluorescence intensity of immediate early positive cells was quantified by flow cytometry. D. Primary human foreskin fibroblasts with and without constitutive TEAD1 expression were infected with pp150-GFP HCMV at an MOI of 1 and assessed by Western blot at various time points post infection. UL44 and pp65 are expressed late in the cascade of HCMV gene expression. TEAD1 expression levels and uncropped Westerns are provided in Supplemental Figure S8.

      Reviewer Response Methods:

      Flow cytometric analysis of viral entry and spread using GFP expression and HCMV immediate early (IE) protein staining

      Parental and TEAD1 transduced human foreskin fibroblasts were seeded into 12-well plates at 1.0 × 10<sup>5</sup> cells per well and either mock infected or infected with pp150-GFP HCMV (PMID: 15708994) at MOIs of 0.3 or 1 on the same day. Cells were trypsinized at appropriate time points and then neutralized with complete medium. Cell suspensions were spun down at 500g for 5 minutes, and the cell pellet was fixed in 70% ethanol for 30 minutes. Following fixation, cells were permeabilized in phosphate-buffered saline (PBS) containing 0.5% bovine serum albumin (BSA) and 0.5% Tween 20 for 10 minutes at 4°C, pelleted, and then stained with IE1/IE2 antibody (mAb810-Alexa Fluor 488) diluted in PBS supplemented with 0.5% BSA for 2 hours. Cells were washed with PBS supplemented with 0.5% BSA–0.5% Tween 20 and then resuspended in PBS. Cells were analyzed using a flow cytometer (BD Biosciences). Infected cells were also trypsinized at appropriate time points, neutralized in the appropriate media, and directly analyzed for GFP positivity on the flow cytometer.
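      As a rough illustration of the infection conditions above, the sketch below estimates how many infectious units a well requires at a given MOI and what fraction of cells would be expected to receive at least one infectious particle. It assumes that virions distribute across cells according to a Poisson distribution, which is a common textbook approximation and not something stated in these methods; the function names are hypothetical and chosen only for this example.

```python
import math


def infectious_units_needed(cells_per_well: float, moi: float) -> float:
    """Infectious units to add so that units / cells equals the target MOI."""
    return cells_per_well * moi


def expected_fraction_infected(moi: float) -> float:
    """Fraction of cells receiving >= 1 infectious unit.

    Assumption: the number of units per cell is Poisson-distributed with
    mean equal to the MOI, so P(at least one) = 1 - exp(-MOI).
    """
    return 1.0 - math.exp(-moi)


if __name__ == "__main__":
    cells = 1.0e5  # seeding density per well from the flow cytometry methods
    for moi in (0.3, 1.0):  # MOIs used for the flow cytometry infections
        units = infectious_units_needed(cells, moi)
        fraction = expected_fraction_infected(moi)
        print(f"MOI {moi}: {units:.0f} infectious units, "
              f"{fraction:.1%} of cells expected infected")
```

      Under this approximation a substantial share of cells remains uninfected in the first round even at an MOI of 1, which is one reason percent positivity is measured directly by flow cytometry and not inferred from the MOI.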

      Western blot analyses of HCMV protein expression in infected cells with and without constitutive TEAD1 expression

      TEAD1 transduced and parental human foreskin fibroblasts were seeded into 6-well cell culture plates at a density of 3.0 × 10<sup>5</sup> cells per well and either mock infected or infected with pp150-GFP HCMV (PMID: 15708994) at an MOI of 1. Whole-cell lysates were collected at various time points post-infection, separated by SDS-PAGE, and transferred to nitrocellulose for Western blot analysis. Western blots were probed with the following primary antibodies: anti-IE1/IE2 (Chemicon), anti-UL44 (kind gift of John Shanley), anti-pp65 (Virusys Corporation), and cellular β-actin antibody (Bethyl Laboratories). Next, each blot was incubated with appropriate horseradish peroxidase-conjugated anti-rabbit or anti-mouse IgG secondary antibodies. Chemiluminescence was detected and quantified using a C-DiGit blot scanner from Li-Cor.

      Reviewer #2 (Public review):

      Summary:

      This work uses genomic and biochemical approaches for HCMV infection in human fibroblasts and retinal epithelial cell lines, followed by comparisons and some validations using strategies such as immunoblots. Based on these analyses, they propose several mechanisms that could contribute to the HCMV-induced diseases, including closing of TEAD1-occupying domains and reduced TEAD1 transcript and protein levels, decreased YAP1 and phospho-YAP1 levels, and exclusion of TEAD1 exon 6.

      Strengths:

      The genomics experiments were done in duplicate and data analyses show good technical reproducibility. Data analyses are performed to show changes at the transcript and chromatin levels, followed by some Western blot validations.

      Weaknesses:

      This work, at the current stage, is quite correlative since no functional studies are done to show any causal links. For readers who are outside the field, some clarifications of the system and design need to be stated.

      Reviewer #2 (Recommendations for the authors):

      Here are some specific questions:

      (1) Since all current analyses are correlative, it is difficult to know which changes are of biological significance. For example, experiments manipulating TEAD transcription factor or YAP with effects on how cells respond to HCMV infection would significantly strengthen the conclusions, which are largely speculations now.

      Please see response to Reviewer 1, which highlights newly added functional assays that include the constitutive (forced) expression of TEAD1, as suggested.

      (2) How closely do these cell lines (human fibroblasts and retinal epithelial cell lines) resemble the cells actually infected in patients that lead to symptoms?

      In infected cells in patients, HCMV initially infects both fibroblasts and epithelial cells. HCMV penetrates fibroblasts by fusion at the cell surface but is endocytosed into epithelial cells (PMID: 18077432). Thus, most experimental studies of HCMV in vitro use primary human foreskin fibroblasts and a retinal epithelial cell line, as we do in this study.

      Additional information on primary human fibroblasts as a model of HCMV infection in humans

      There is a nice review article that provides the history of the study of the molecular biology of HCMV that describes how Stanley Plotkin from the Wistar Institute first identified human fibroblast HCMV infected cells (PMID: 24639214). The primary fibroblasts of the foreskin of neonates are available commercially (sometimes called HS68) and model neonatal HCMV infection. Neonatal HCMV, or Congenital Cytomegalovirus, is a leading cause of congenital infection and a significant cause of non-genetic hearing loss in the US (https://www.cdc.gov/cytomegalovirus/congenital-infection/index.html). While many infected newborns appear healthy at birth, a substantial percentage experience long-term health problems, including hearing loss, developmental delays, and vision problems (PMID: 39070527). 

      More information on ARPE-19 as a model of HCMV infection in humans

      HCMV retinitis is a leading cause of vision loss and results from HCMV infection of retinal cells. Retinal epithelial cells are the primary target for HCMV infection in the eye. The cell line ARPE-19 is derived from a primary human adult retinal pigment epithelium explant, is commonly used to study HCMV, and is thought to be physiologically relevant to the human infection (PMID: 8558129 and 28356702). When compared to primary retinal pigment epithelia, ARPE-19 cells develop a similar cellular and molecular phenotype to primary cells from adults and neonates (PMID: 28356702).

      (3) What is the rationale for using 48 hours' infection? Is this the typical timeframe for patients to develop symptoms?

      HCMV genes are expressed in a temporally controlled manner (PMID: 35417700). Early genes (within the first 4 hours) are involved in regulating transcription, while genes within 4-48 hours are involved in DNA replication and further transcriptional regulation. The 48 hour mark corresponds to the onset of significant viral replication and interactions between the virus and the host immune response. After 48 hours, late genes are expressed, which encode structural proteins as well as viral proteins that inhibit host anti-viral responses.  Most studies that focus on the role of HCMV’s early and immediate early genes are performed at 24 or 48 hours. Similarly, most studies that assess the initial innate immune response to HCMV are performed within the initial 48 hours after in vitro infection.

      In most people with healthy immune systems, there are no symptoms (PMID: 34168328). While 60% of people in developed countries and 90% of those in developing countries are serologically positive for past infection, it is challenging to study the kinetics of symptom development due to heterogeneity in the initial virion exposure, the cell types that are initially infected, and immune response. HCMV persists throughout the lifetime of the infected individual by establishing latent infection.

      Also, among all these large-scale global changes, what are primary and what are secondary?

      A kinetic study with many timepoints would be needed to identify the primary and secondary genomic changes associated with HCMV infection. These experiments, while exciting, are beyond the scope of this manuscript.

      (4) Fig.2: In addition to the changes for each cell type, comparison of unchanged, closed and opened with infection regions between the two cell types could be informative for commonalities and differences between cell types.

      This was a good suggestion.  We have added a new Supplemental Figure S2, which compares the differentially accessible regions between the two cell types:

      We have also added the following sentence to the Results section:

      “Comparison of differentially accessible chromatin between ARPE and HFF revealed that the vast majority of the HCMV-induced changes are specific to one of the two cell types (Supplemental Figure S2).”

      (5) "Of the 23,018 loops present in both infected and uninfected cells, only 10 are differential at a 2-fold cutoff and a false discovery rate (FDR) <0.01."

      We thank the reviewer for drawing our attention to the differential chromatin looping analysis.  Your comment prompted us to re-examine the methodologies we employed to identify differential chromatin looping events between uninfected and infected cells.  In the process, we realized that the relatively low resolution of chromatin looping assays such as HiChIP might require additional care in classifying a particular loop as shared or differential when comparing two experimental conditions. We have thus revamped our differential chromatin looping methodologies by adding 5kb “pads” to either end of each chromatin loop “anchor”.

      The corresponding passage now reads:

      “We next used the HiChIP data to identify HCMV-dependent differential chromatin looping events (see Methods). In total, uninfected cells have 143,882 loops. With HCMV infection, 90,198 of these loops are lost, and 44,045 new loops are gained (Supplemental Dataset 3). Because the number of altered loops was large, we repeated loop calling and differential analysis with FDR values less than 0.05, 0.01, and 0.001 (Supplemental Dataset 3). For all three cutoffs, the percentage of loops specific to an infection state were very similar. We also randomly downsampled the number of input pairs used for calling loops to verify that our results were not due to a difference in read depth (Supplemental Dataset 3). For the three smaller subsets of data, the number of loops specific to an infection state only changed slightly. The full quantification of each chromatin looping event and comparisons of events between conditions are provided in Supplemental Dataset 6.”
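
      The padded-anchor comparison described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Loop` record, the overlap rule (both anchors must overlap after each is extended by the pad), and the function names are all assumptions introduced here, and the pairwise comparison is written for clarity rather than speed.

```python
from typing import NamedTuple

PAD = 5_000  # 5 kb pad added to either end of each loop anchor (value from the response)


class Loop(NamedTuple):
    """Hypothetical loop record: two anchors on one chromosome."""
    chrom: str
    a_start: int
    a_end: int
    b_start: int
    b_end: int


def anchors_overlap(s1, e1, s2, e2, pad=PAD):
    """True if two anchors overlap once each is extended by `pad` on both ends."""
    return (s1 - pad) <= (e2 + pad) and (s2 - pad) <= (e1 + pad)


def same_loop(x, y, pad=PAD):
    """Assumed rule: loops match when both pairs of padded anchors overlap."""
    return (
        x.chrom == y.chrom
        and anchors_overlap(x.a_start, x.a_end, y.a_start, y.a_end, pad)
        and anchors_overlap(x.b_start, x.b_end, y.b_start, y.b_end, pad)
    )


def classify(uninfected, infected, pad=PAD):
    """Split loops into shared, lost (uninfected only) and gained (infected only)."""
    shared = [u for u in uninfected if any(same_loop(u, i, pad) for i in infected)]
    lost = [u for u in uninfected if not any(same_loop(u, i, pad) for i in infected)]
    gained = [i for i in infected if not any(same_loop(i, u, pad) for u in uninfected)]
    return shared, lost, gained
```

      Under this rule, two loop calls whose anchors are offset by a few kilobases are counted as shared rather than as one lost and one gained loop, which is the kind of misclassification the pads are meant to avoid at HiChIP resolution.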

      Are these cells asynchronous and how to determine whether certain changes are not due to cell cycle stage differences?

      Cells were plated to an identical density of cells per well before either mock or HCMV infection for this study. Based on the differentially expressed genes, cell cycle pathways were not amongst the top 50 enriched molecular pathways.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      LRRK2 protein is familially linked to Parkinson's disease by the presence of several gene variants that all confer a gain-of-function effect on LRRK2 kinase activity. 

      The authors examine the effects of BDNF stimulation in immortalized neuron-like cells, cultured mouse primary neurons, hIPSC-derived neurons, and synaptosome preparations from the brain. They examine an LRRK2 regulatory phosphorylation residue, LRRK2 binding relationships, and measures of synaptic structure and function. 

      Strengths: 

      The study addresses an important research question: how does a PD-linked protein interact with other proteins, and contribute to responses to a well-characterized neuronal signalling pathway involved in the regulation of synaptic function and cell health? 

      They employ a range of good models and techniques to fairly convincingly demonstrate that BDNF stimulation alters LRRK2 phosphorylation and binding to many proteins. Some effects of BDNF stimulation appear impaired in (some of the) LRRK2 knock-out scenarios (but not all). A phosphoproteomic analysis of PD mutant Knock-in mouse brain synaptosomes is included. 

      We thank this Reviewer for pointing out the strengths of our work. 

      Weaknesses: 

      The data sets are disjointed, conclusions are sweeping, and not always in line with what the data is showing. Validation of 'omics' data is very light. Some inconsistencies with the major conclusions are ignored. Several of the assays employed (western blotting especially) are likely underpowered, findings key to their interpretation are addressed in only one or other of the several models employed, and supporting observations are lacking. 

      We appreciate the Reviewer’s overall evaluation. In this revised version, we have provided several novel results that strengthen the omics data and the mechanistic experiments and make the conclusions in line with the data.

      As examples to aid reader interpretation: (a) pS935 LRRK2 seems to go up at 5 minutes but goes down below pre-stimulation levels after (at times when BDNF-induced phosphorylation of other known targets remains very high). This is ignored in favour of discussion/investigation of initial increases, and the fact that BDNF does many things (which might indirectly contribute to initial but unsustained changes to pLRRK2) is not addressed.  

      We thank the Reviewer for raising this important point, which we agree deserves additional investigation. Although phosphorylation does decrease below pre-stimulation levels, a reduction is also observed for ERK/AKT upon sustained exposure to BDNF in our experimental paradigm (figure 1F-G). This phenomenon is well known in response to a number of extracellular stimuli and can be explained by mechanisms related to cellular negative feedback regulation, receptor desensitization (e.g. phosphorylation or internalization), or cellular adaptation. The effect on pSer935, however, is peculiar as phosphorylation goes below the unstimulated level, as pointed out by the Reviewer. In contrast to ERK and AKT, whose phosphorylation is almost absent under unstimulated conditions (Figure 1F-G), the stoichiometry of Ser935 phosphorylation under unstimulated conditions is high. This observation is consistent with MS determination of the relative abundance of pSer935 (e.g. in whole brain LRRK2 is nearly 100% phosphorylated at Ser935, see Nirujogi et al., Biochem J 2021). Thus, we hypothesized that the modest increase in phosphorylation driven by BDNF likely reflects a saturation or ceiling effect, indicating that the phosphorylation level is already near its maximum under resting conditions. Prolonged BDNF stimulation would bring phosphorylation down below pre-stimulation levels through the negative feedback mechanisms (e.g. phosphatase activity) explained above.

      To test this hypothesis, we conducted an experiment in which cells were pretreated for 90 minutes with the LRRK2 inhibitor MLi-2 to reduce basal phosphorylation of S935. After MLi-2 washout, we stimulated with BDNF for different times. We used GFP-LRRK2 stable lines for this experiment, since the ceiling effect was particularly evident (Figure S1A) and this model has been used for the interactomic study. As shown below (and incorporated in Fig. S1B in the manuscript), LRRK2 responds robustly to BDNF stimulation both in terms of pSer935 and pRABs. Phosphorylation peaks at 5-15 mins, while it decreases to unstimulated levels at 60 and 180 minutes. Notably, while the peak of pSer935 at 5-15 mins is similar to the untreated condition (supporting that Ser935 is nearly saturated in unstimulated conditions), the phosphorylation of RABs during this time period exceeds unstimulated levels. These findings support the notion that, under basal conditions, RAB phosphorylation is far from saturation. The antibodies used to detect RAB phosphorylation are the following: RAB10 Abcam # ab230261 and RAB8 (pan RABs) Abcam # ab230260.

      Given the robust response of RAB10 phosphorylation upon BDNF stimulation, we further investigated RAB10 phosphorylation during BDNF stimulation in naïve SH-SY5Y cells. We confirmed that the increase in pSer935 is coupled to an increase in pT73-RAB10. Also in this case, RAB10 phosphorylation does not go below the unstimulated level, which aligns with the low pRAB10 stoichiometry in brain (Nirujogi et al., Biochem J 2021). This experiment adds the novel and exciting finding that BDNF stimulation increases LRRK2 kinase activity (RAB phosphorylation) in neuronal cells.

      Note that the new supplemental figure 1 now includes: A) a comparison of LRRK2 pS935 and total protein levels before and after RA differentiation; B) differentiated GFP-LRRK2 SH-SY5Y (unstimulated, BDNF, MLi-2, BDNF+MLi-2); C) the kinetics of the BDNF response in differentiated GFP-LRRK2 SH-SY5Y.

      (b) Drebrin coIP itself looks like a very strong result, as does the increase after BDNF, but this was only demonstrated with a GFP over-expression construct despite several mouse and neuron models being employed elsewhere and available for copIP of endogenous LRRK2. Also, the coIP is only demonstrated in one direction. Similarly, the decrease in drebrin levels in mice is not assessed in the other model systems, coIP wasn't done, and mRNA transcripts are not quantified (even though others were). Drebrin phosphorylation state is not examined.  

      We appreciate the Reviewer's suggestions and have provided additional experimental evidence supporting the functional relevance of the LRRK2-drebrin interaction.

      (1) As suggested, we performed qPCR and observed that 1 month-old KO midbrain and cortex express lower levels of Dbn1 as compared to WT brains (Figure 5G). This result is in agreement with the western blot data (Figure 5H). 

      (2) To further validate the physiological relevance of the LRRK2-drebrin interaction, we performed two experiments:

      i) Western blots looking at pSer935 and pRab8 (pan Rab) in Dbn1 WT and knockout brains. As reported and quantified in Figure 2I, we observed a significant decrease in pSer935 and a trend decrease in pRab8 in Dbn1 KO brains. This finding supports the notion that Drebrin forms a complex with LRRK2 that is important for its activity, e.g. upon BDNF stimulation. 

      ii) Reverse co-immunoprecipitation of YFP-drebrin full-length, N-terminal domain (1-256 aa) and C-terminal domain (256-649 aa) (plasmids kindly received from Professor Phillip R. Gordon-Weeks, Worth et al., J Cell Biol, 2013) with Flag-LRRK2 co-expressed in HEK293T cells. As shown in supplementary Fig. S2C, we confirm that YFP-drebrin binds LRRK2, with the N-terminal region of drebrin appearing to be the major contributor to this interaction. This result is important as the N-terminal region contains the ADF-H (actin-depolymerising factor homology) domain and a coiled-coil region known to directly bind actin (Shirao et al., J Neurochem 2017; Koganezawa et al., Mol Cell Neurosci. 2017). Interestingly, both full-length Drebrin and its truncated C-terminal construct cause the same morphological changes in F-actin, indicating that Drebrin-induced morphological changes in F-actin are mediated by its N-terminal domains rather than its intrinsically disordered C-terminal region (Shirao et al., J Neurochem, 2017; Koganezawa et al., Mol Cell Neurosci. 2017). Given the role of LRRK2 in actin-cytoskeletal dynamics and its binding to multiple actin-related proteins (Fig. 2 and Meixner et al., Mol Cell Proteomics. 2011; Parisiadou and Cai, Commun Integr Biol 2010), these results suggest the possibility that LRRK2 controls actin dynamics by competing with drebrin for binding to actin, and they open new avenues for future studies.

      (3) To address the request for examining drebrin phosphorylation state, we decided to perform another phosphoproteomic experiment, leveraging a parallel analysis incorporated in our latest manuscript (Chen et al., Mol Therapy 2025). In this experiment, we isolated total striatal proteins from WT and G2019S KI mice and enriched the phospho-peptides. Unlike the experiment presented in Fig. 7, phosphopeptides were enriched from total striatal lysates rather than synaptosomal fractions, and phosphorylation levels were normalized to the corresponding total protein abundance. This approach was intended to avoid bias toward synaptic proteins, allowing for the analysis of a broader pool of proteins derived from a heterogeneous ensemble of cell types (neurons, glia, endothelial cells, pericytes etc.). We were pleased to find that this new experiment confirmed drebrin S339 as a differentially phosphorylated site, with a 3.7-fold higher abundance in G2019S Lrrk2 KI mice. The fact that this experiment showed an increase in phosphorylation stoichiometry in G2019S mice rather than a decrease is likely due to the normalization of each peptide by its corresponding total protein. Gene ontology analysis of differentially phosphorylated proteins using stringent term size (<200 genes) showed post-synaptic spines and presynaptic active zones as enriched categories (Fig. 3F). A SynGO analysis confirms both pre- and postsynaptic categories, with high significance for terms related to postsynaptic cytoskeleton (Fig. 3G). As noted, this is particularly interesting as the starting material was whole striatal tissue – not synaptosomes as previously – indicating that the most significant phosphorylation differences occur in synaptic compartments. This once again reinforces our hypothesis that LRRK2 has a prominent role in the synapse. 
Overall, we confirmed with an independent phosphoproteomic analysis that LRRK2 kinase activity influences the phosphorylation state of proteins related to synaptic function, particularly postsynaptic cytoskeleton. For clarity in data presentation, as mentioned by the Reviewers, we removed Figure 7 and incorporated this new analysis in figure 3, alongside the synaptic cluster analysis. 
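
      The effect of normalizing each phosphopeptide to its total protein can be illustrated with a small arithmetic sketch. The intensity values below are hypothetical numbers chosen only to show the direction of the effect; they are not measurements from this study, and the function names are introduced here for illustration.

```python
def stoichiometry(phospho_intensity, total_protein_intensity):
    """Phosphopeptide signal normalized to the abundance of its parent protein."""
    return phospho_intensity / total_protein_intensity


def fold_change(mutant, wild_type):
    """Ratio of a mutant value over the wild-type value."""
    return mutant / wild_type


# Hypothetical intensities (arbitrary units), NOT data from the manuscript.
wt = {"phospho": 10.0, "total": 100.0}
ki = {"phospho": 8.0, "total": 40.0}

# Without normalization, the phosphopeptide appears lower in the mutant (ratio below 1).
raw_fc = fold_change(ki["phospho"], wt["phospho"])

# After normalization to total protein, the same site appears higher in the mutant
# (ratio above 1), because the parent protein dropped more than the phosphopeptide did.
norm_fc = fold_change(
    stoichiometry(ki["phospho"], ki["total"]),
    stoichiometry(wt["phospho"], wt["total"]),
)
```

      In this toy case the sign of the change flips once total protein is taken into account, which is one way the two phosphoproteomic experiments could report opposite directions for the same site.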

      Altogether, three independent OMICs approaches – (i) experimental LRRK2 interactomics in neuronal cells, (ii) a literature-based LRRK2 synaptic/cytoskeletal interactor cluster, and (iii) a phospho-proteomic analysis of striatal proteins from G2019S KI mice (to model LRRK2 hyperactivity) – converge to synaptic actin-cytoskeleton as a key hub of LRRK2 neuronal function.

      (c) The large differences in the CRISPR KO cells in terms of BDNF responses are not seen in the primary neurons of KO mice, suggesting that other differences between the two might be responsible, rather than the lack of LRRK2 protein. 

      Considering that some variability is expected for these types of cultures and across different species, any difference in response magnitude and kinetics could be attributed to the levels of TrkB and downstream components expressed by the two cell types. 

      We are confident that differentiated SH-SY5Y cells provide a reliable model for our study, as we could translate the results obtained in SH-SY5Y cells to other models. However, to rule out the possibility that the more pronounced effect observed in SH-SY5Y KO cells with respect to Lrrk2 KO primary neurons was due to CRISPR off-target effects, we performed an off-target analysis. Specifically, we selected the first 8 putative off-targets exhibiting a CFD (Cutting Frequency Determination) off-target score >0.2. 

      As shown in supplemental file 1, sequence disruption was observed only in the LRRK2 on-target site in LRRK2 KO SH-SY5Y cells, while the 8 off-target regions remained unchanged across the genotypes and relative to the reference sequence. 
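
      The off-target selection step can be sketched as a simple filter. This is an illustration under stated assumptions: the `OffTarget` record and `select_off_targets` function are hypothetical names, and interpreting "the first 8" as the 8 highest-scoring candidates is an assumption, since the ranking criterion is not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OffTarget:
    """Hypothetical record for one predicted off-target site."""
    locus: str        # illustrative label, e.g. a genomic coordinate
    cfd_score: float  # CFD off-target score reported by a guide-design tool


def select_off_targets(candidates, min_cfd=0.2, top_n=8):
    """Keep candidates with CFD score above `min_cfd`, then take the `top_n` highest.

    Ranking by descending score is an assumption made for this sketch.
    """
    passing = [c for c in candidates if c.cfd_score > min_cfd]
    passing.sort(key=lambda c: c.cfd_score, reverse=True)
    return passing[:top_n]
```

      The selected loci would then be amplified and sequenced in both genotypes and compared with the reference sequence, as reported in supplemental file 1.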

      (d) No validation of hits in the G2019S mutant phosphoproteomics, and no other assays related to the rest of the paper/conclusions. Drebrin phosphorylation is different but unvalidated, or related to previous data sets beyond some discussion. The fact that LRRK2 binding occurs, and increases with BDNF stimulation, should be compared to its phosphorylation status and the effects of the G2019S mutation. 

      As illustrated in the response to point (b), we performed a new phosphoproteomics investigation – with total striatal lysates instead of striatal synaptosomes and normalization of phospho-peptides to total proteins – and found that S339 phosphorylation increases when LRRK2 kinase activity increases (G2019S). Regarding the request to validate drebrin phosphorylation, the main limitation is that there are no available antibodies against phosphorylated Ser339. While we tried Phos-tag gels with striatal lysates, we could not detect any reliable and specific signal with the same drebrin antibody used for western blot (Thermo Fisher Scientific: MA120377) due to technical limitations of the Phos-tag method. We are confident that phosphorylation at S339 has physiological relevance, as it was identified 67 times across multiple proteomic discovery studies and is among the most frequently phosphorylated sites in drebrin (https://www.phosphosite.org/proteinAction.action?id=2675&showAllSites=true).

      To infer a possible role of this phosphorylation, we looked at the predicted pathogenicity of substitutions at this site using AlphaMissense (Cheng et al., Science 2023). As shown in the supplementary figure (Fig. S3), amino acid substitutions within this site are predicted not to be pathogenic, in part because of the low confidence of the AlphaFold structure. 

      Ser339 in human drebrin is located just before the proline-rich region (PP domain) of the protein. This region is situated between the actin-binding domains and the C-terminal Homer-binding sequences and plays a role in protein-protein interactions and cytoskeletal regulation (Worth et al., J Cell Biol, 2013). Of interest, this region was previously shown to be the interaction site of afadin (AFDN), a protein involved in multiple cytoskeletal-related processes, including synapse formation and function by regulating puncta adherentia junctions, presynaptic differentiation, and cadherin complex assembly, which are essential for hippocampal excitatory synapses, spine formation, and learning and memory processes (Beaudoin, G. M., 3rd et al., J Neurosci, 2013). Of note, afadin is in the list of LRRK2 interacting proteins (https://www.ebi.ac.uk/intact/home), supporting a possible functional relevance of LRRK2-mediated drebrin phosphorylation in afadin-drebrin complex formation. This is now addressed in the Discussion section.

      The aim of this MS analysis in G2019S KI mice – now included in figure 3 – was to further validate the crucial role of LRRK2 kinase activity in the context of synaptic regulation, rather than to discover and characterize novel substrates. Consequently, Figure 7 has been eliminated. 

      Reviewer #2 (Public Review):  

      Taken as a whole, the data in the manuscript show that BDNF can regulate PD-associated kinase LRRK2 and that LRRK2 modifies the BDNF response. The chief strength is that the data provide a potential focal point for multiple observations across many labs. Since LRRK2 has emerged as a protein that is likely to be part of the pathology in both sporadic and LRRK2 PD, the findings will be of broad interest. At the same time, the data used to imply a causal throughline from BDNF to LRRK2 to synaptic function and actin cytoskeleton (as in the title) are mostly correlative and the presentation often extends beyond the data. This introduces unnecessary confusion. There are also many methodological details that are lacking or difficult to find. These issues can be addressed. 

      We appreciate the Reviewer’s positive feedback on our study. We also value the suggestion to present the data in a more streamlined and coherent way. In response, we have updated the title to better reflect our overall findings: “LRRK2 Regulates Synaptic Function through Modulation of Actin Cytoskeletal Dynamics.” Additionally, we have included several experiments that we believe enhance and unify the study.

      (1) The writing/interpretation gets ahead of the data in places and this was confusing. For example, the abstract highlights prior work showing that Ser935 LRRK2 phosphorylation changes LRRK2 localization, and Figure 1 shows that BDNF rapidly increases LRRK2 phosphorylation at this site. Subsequent figures highlight effects at synapses or with synaptic proteins. So is the assumption that LRRK2 is recruited to (or away from) synapses in response to BDNF? Figure 2H shows that LRRK2-drebrin interactions are enhanced in response to BDNF in retinoic acid-treated SH-SY5Y cells, but are synapses generated in these preps? How similar are these preps to the mouse and human cortical or mouse striatal neurons discussed in other parts of the paper (would it be anticipated that BDNF act similarly?) and how valid are SHSY5Y cells as a model for identifying synaptic proteins? Is drebrin localization to synapses (or its presence in synaptosomes) modified by BDNF treatment +/- LRRK2? Or do LRRK2 levels in synaptosomes change in response to BDNF? The presentation requires re-writing to stay within the constraints of the data or additional data should be added to more completely back up the logic. 

      We thank the Reviewer for the thorough suggestions and comments. We have extensively revised the text to accurately reflect our findings without overinterpreting. In particular, we agree with the Reviewer that differentiated SH-SY5Y cells are not identical to primary mouse or human neurons; however, both neuronal models respond to BDNF. Supporting our observations, it is known that SH-SY5Y cells respond to BDNF. In fact, a common protocol for differentiating SH-SY5Y cells involves BDNF in combination with retinoic acid (Martin et al., Front Pharmacol, 2022; Kovalevich et al., Methods Mol Biol, 2013). Additionally, it has been reported that SH-SY5Y cells can form functional synapses (Martin et al., Front Pharmacol, 2022). While we are aware that BDNF, drebrin or LRRK2 can also affect non-synaptic pathways, we focused on synapses when we moved to mouse models since: (i) MS and phosphoMS identified several cytoskeletal proteins enriched at the synapse; (ii) we and others have previously reported a role for LRRK2 in governing synaptic and cytoskeletal-related processes; (iii) the synapse is a critical site that becomes dysfunctional in the early stages of PD. We have now clarified and adjusted the text as needed. We have also performed additional experiments to address the Reviewer’s concern:

      (1) “Is the assumption that LRRK2 is recruited to (or away from) synapses in response to BDNF”? This is a very important point. There is consensus in the field that detecting endogenous LRRK2 in brain slices or in primary neurons via immunofluorescence is very challenging with the commercially available antibodies (Fernandez et al., J Parkinsons Dis, 2022). We established a method in our previous studies to detect LRRK2 biochemically in synaptosomes (Cirnaru et al., Front Mol Neurosci, 2014; Belluzzi et al., Mol Neurodegener., 2016). While these data indicate that LRRK2 is present in the synaptic compartments, it would be quite challenging to apply this method to the present study. In fact, applying acute BDNF stimulation in vivo and then isolating synaptosomes is a complex experiment beyond the timeframe of the revision due to the need for mouse ethical approvals. However, this is definitely an intriguing angle to explore in the future.

      (2) “Is drebrin localization to synapses (or its presence in synaptosomes) modified by BDNF treatment +/- LRRK2?” To try and address this question, we adapted a previously published assay to measure drebrin exodus from dendritic spines. During calcium entry and LTP, drebrin exits dendritic spines and accumulates in the dendritic shafts and cell body (Koganezawa et al., 2017). This facilitates the reorganization of the actin cytoskeleton (Shirao et al., 2017). Given the known role of drebrin and its interaction with LRRK2, we hypothesized that LRRK2 loss might affect drebrin relocalization during spine maturation.

      To test this, we treated DIV14 primary cortical neurons from Lrrk2 WT and KO mice with BDNF for 5 minutes, 15 minutes, and 24 hours, then performed confocal imaging of drebrin localization (Author response image 1). Neurons were transfected at DIV4 with GFP (cell filler) and PSD95 (dendritic spines) for visualization, and endogenous drebrin was stained with an anti-drebrin antibody. We then measured drebrin's overlap with PSD95-positive puncta to track its localization at the spine.

      In Lrrk2 WT neurons, drebrin relocalized away from spines after BDNF stimulation, with the exodus peaking at 15 minutes, and showed higher co-localization with PSD95 at 24 hours, indicating that spine remodeling occurred. In contrast, Lrrk2 KO neurons showed no drebrin exodus. These findings support the notion that LRRK2's interaction with drebrin is important for spine remodeling via BDNF. However, additional experiments with larger sample sizes are needed, which were not feasible within the revision timeframe (here n=2 experiments with independent neuronal preparations, n=4-7 neurons analyzed per experiment). Thus, we included the relevant figure as Author response image 1 but chose not to add it to the manuscript (figure 3).

      Author response image 1.

      Lrrk2 affects drebrin exodus from dendritic spines. Primary neurons from Lrrk2 WT and KO mice were transfected with GFP and PSD95 at DIV4, exposed to BDNF for different times (5 minutes, 15 minutes and 24 hours) at DIV14, and stained for endogenous drebrin. The amount of drebrin localizing in dendritic spines outlined by PSD95 was then assessed. The graph shows a pronounced decrease in drebrin content in WT neurons during short time treatments and an increase after 24 hours. KO neurons present no evident variations in drebrin localization upon BDNF stimulation. Scale bar: 4 μm.

      (2) The experiments make use of multiple different kinds of preps. This makes it difficult at times to follow and interpret some of the experiments, and it would be of great benefit to more assertively insert "mouse" or "human" and cell type (cortical, glutamatergic, striatal, gabaergic) etc. 

      We thank the Reviewer for pointing this out. We have now more clearly specified the cell type and species identity throughout the text to improve clarity and interpretation.

      (3) Although BDNF induces quantitatively lower levels of ERK or Akt phosphorylation in LRRK2KO preps based on the graphs (Figure 4B, D), the western blot data in Figure 4C make clear that BDNF does not need LRRK2 to mediate either ERK or Akt activation in mouse cortical neurons and in 4A, ERK in SH-SY5Y cells. The presentation of the data in the results (and echoed in the discussion) writes of a "remarkably weaker response". The data in the blots demand more nuance. It seems that LRRK2 may potentiate a response to BDNF that in neurons is independent of LRRK2 kinase activity (as noted). This is more of a point of interpretation, but the words do not match the images.  

      We thank the Reviewer for pointing this out. We have rephrased our data presentation to better convey our findings. We were not surprised to find that loss of LRRK2 causes only a reduction of ERK and AKT activation upon BDNF rather than a complete loss. This is because these pathways are complex and redundant and are activated by a number of cellular effectors. The fact that LRRK2 is one among many players whose function can be compensated by other signaling molecules is also supported by the phenotype of Lrrk2 KO mice, which is measurable at 1 month but disappears in adulthood (4 and 18 months) (figure 5).

      Moreover, we removed the sentence “Of note, 90 mins of Lrrk2 inhibition (MLi-2) prior to BDNF stimulation did not prevent phosphorylation of Akt and Erk1/2, suggesting that LRRK2 participates in BDNF-induced phosphorylation of Akt and Erk1/2 independently from its kinase activity but dependently from its ability to be phosphorylated at Ser935 (Fig. 4C-D and Fig. 1B-C)” since the MLi-2 treatment prior to BDNF stimulation was not quantified and our new data point to an involvement of LRRK2 kinase activity upon BDNF stimulation.

      (4) Figure 4F/G shows an increase in PSD95 puncta per unit length in response to BDNF in mouse cortical neurons. The data do not show spine induction/dendritic spine density/or spine morphogenesis as suggested in the accompanying text (page 8). Since the neurons are filled/express gfp, spine density could be added or spines having PSD95 puncta. However, the data as reported would be expected to reflect spine and shaft PSDs and could also include some nonsynaptic sites. 

      The Reviewer is right. We have rephrased the text to reflect an increase in postsynaptic density (PSD) sites, which may include both spine and shaft PSDs, as well as potential nonsynaptic sites.

      (5) Experimental details are missing that are needed to fully interpret the data. There are no electron microscopy methods outside of the figure legend. And for this and most other microscopy-based data, there are few to no descriptions of what cells/sites were sampled, how many sites were sampled, and how regions/cells were chosen. For some experiments (like Figure 5D), some detail is provided in the legend (20 segments from each mouse), but it is not clear how many neurons this represents, where in the striatum these neurons reside, etc. For confocal z-stacks, how thick are the optical sections and how thick is the stack? The methods suggest that data were analyzed as collapsed projections, but they cite Imaris, which usually uses volumes, so this is confusing. The guide (sgRNA) sequences that were used should be included. There is no mention of sex as a biological variable. 

      We thank the Reviewer for pointing out this missing information. We have now included:

      (1) EM methods (page 24)

      (2) Methods for ICC and confocal microscopy now incorporate the Z-stack thickness (0.5 μm x 6 = 3 μm) on page 23.

      (3) Methods for Golgi-Cox staining now incorporate the Z-stack thickness and number of neurons and segments per neuron analyzed. 

      (4) The sex of mice is mentioned in the material and methods (page 17): “Approximately equal numbers of males and females were used for every experiment”.

      (6) For Figures 1F, G, and E, how many experimental replicates are represented by blots that are shown? Graphs/statistics could be added to the supplement. For 1C and 1I, the ANOVA p-value should be added in the legend (in addition to the post hoc value provided). 

      The blots in figures 1F, G and E are representative of at least n=5 blots. The same readouts are part of figure 4, where quantifications are provided. We added the ANOVA p-value to the legends for figures 1C, 1I and 1K.

      (7) Why choose 15 minutes of BDNF exposure for the mass spec experiments when the kinetics in Figure 1 show a peak at 5 mins?  

      This is an important point. We repeated the experiment in GFP-LRRK2 SH-SY5Y cells (figure S1C) and included the 15 min time point. In addition to confirming that pSer935 increases similarly at 5 and 15 minutes, we also observed an increase in RAB phosphorylation at these time points. As mentioned in our response to Reviewer 1, we pretreated with MLi-2 for 90 minutes in this experiment to reduce the high basal phosphorylation stoichiometry of pSer935.

      (8) The schematic in Figure 6A suggests that iPSCs were plated, differentiated, and cultured until about day 70 when they were used for recordings. But the methods suggest they were differentiated and then cryopreserved at day 30, and then replated and cultured for 40 more days. Please clarify if day 70 reflects time after re-plating (30+70) or total time in culture (70). If the latter, please add some notes about re-differentiation, etc. 

      We thank the reviewer for the opportunity to clarify the iPSC methodology. In the submitted manuscript, 70DIV represents the total time in vitro, and the process involved a cryostorage event at 30DIV, with a thaw of the cells and a further 40 days of maturation before measurement. We have adjusted the methods in both the text and figure (new schematic) to clarify this. The cryopreservation step has been used in other iPSC methods to great effect (Drummond et al., Front Cell Dev Biol, 2020). Due to the complexity and length of the iPSC neuronal differentiation process, cryopreservation is a useful way to shorten experiments, make them easier to repeat, and reduce the considerable variation between differentiations. Each thawed batch of neurons, with its user-defined differences in culture conditions, can usefully be treated as a new and separate N compared to the next batch of neurons.

      (9) When Figures 6B and 6C are compared it appears that mEPSC frequency may increase earlier in the LRRK2KO preps than in the WT preps since the values appear to be similar to WT + BDNF. In this light, BDNF treatment may have reached a ceiling in the LRRK2KO neurons.

      We thank the reviewer for his/her comment and observations about the ceiling effects. It is indeed possible that the loss of LRRK2 and the application of BDNF could cause the same elevation in synaptic neurotransmission. In such a situation, the increased activity as a result of BDNF treatment would be masked by the increased activity  observed as a result of LRRK2 KO. To better visualize the difference between WT and KO cultures and the possible ceiling effect, we merged the data in one single graph.  

      (10) Schematic data in Figures 5A and C and Figures 5B and E are too small to read/see the data. 

      We thank the Reviewer for this suggestion. We have now enlarged figure 5A and moved the graph of figure 5D to supplemental figure S5, since this analysis of spine morphology is secondary to the one shown in figure 5C.

      Reviewer #1 (Recommendations For The Authors): 

      Please forgive any redundancy in the comments, I wanted to provide the authors with as much information as I had to explain my opinion. 

      Primary mouse cortical neurons at div14, 20% transient increase in S935 pLRRK2 5min after BDNF, which then declines by 30 minutes (below pre-stim levels, and maybe LRRK2 protein levels do also). 

      In differentiated SHSY5Y cells there is a large expected increase in pERK and pAKT that is sustained way above pre-stim for 60 minutes. There is a 50% initial increase in pLRRK2 (but the blot is not very clear and no double band in these cells), which then looks like reduced well below pre-stim by 30 & 60 minutes. 

      We thank the Reviewer for bringing up this important point. We have extensively addressed this issue in the public review rebuttal. In essence, the phosphorylation of Ser935 is near saturation under unstimulated conditions, as evidenced by its high basal stoichiometry, whereas Rab phosphorylation is far from saturation, showing an increase upon BDNF stimulation before returning to baseline levels. This distinction highlights that while pSer935 exhibits a ceiling effect due to its near-maximal phosphorylation at rest, pRab responds dynamically to BDNF, indicating low basal phosphorylation and a significant capacity for increase. Figure 1 in the rebuttal summarizes the new data collected.

      GFP-fused overexpressed LRRK2 coIPs with drebrin, and this is double following 15 min BDNF. Strong result.

      We thank the Reviewer.

      BDNF-induced pAKT signaling is greatly impaired, and pERK is somewhat impaired, in CRISPR LKO SHSY5Y cells. In mouse primaries, both AKT and Erk phosph is robustly increased and sustained over 60 minutes in WT and LKO. This might be initially less in LKO for Akt (hard to argue on a WB n of 3 with huge WT variability), regardless they are all roughly the same by 60 minutes and even look higher in LKO at 60. This seems like a big disconnect and suggests the impairment in the SHSy5Y cells might have more to do with the CRISPR process than the LRRK2. Were the cells sequenced for off-target CRISPR-induced modifications?  

      Following the Reviewer's suggestion – and as discussed in the public review section – we performed an off-target analysis. Specifically, we selected the first 8 putative off-targets exhibiting a CFD (Cutting Frequency Determination) off-target score >0.2. As shown in supplemental file 1, sequence disruption was observed only at the LRRK2 on-target site in LRRK2 KO SH-SY5Y cells, while the 8 off-target regions remained unchanged across the genotypes and relative to the reference sequence.
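      The selection rule described in this response (keep putative off-target sites whose score exceeds 0.2, then take the first 8) can be sketched as follows. This is only an illustration: the site labels and scores are placeholders, not data from the study, and ranking by descending score is an assumption about what "first 8" means.

```python
from typing import List, NamedTuple


class OffTargetSite(NamedTuple):
    site_id: str   # hypothetical label, not a real genomic locus
    score: float   # predicted off-target score in [0, 1]


def select_off_targets(sites: List[OffTargetSite],
                       threshold: float = 0.2,
                       max_sites: int = 8) -> List[OffTargetSite]:
    """Keep sites scoring strictly above `threshold`, highest first,
    and return at most `max_sites` of them."""
    passing = [s for s in sites if s.score > threshold]
    passing.sort(key=lambda s: s.score, reverse=True)
    return passing[:max_sites]


# Placeholder scores for illustration only; these are not values from the study.
example_scores = [0.91, 0.15, 0.45, 0.20, 0.33, 0.08,
                  0.67, 0.52, 0.29, 0.74, 0.21, 0.38]
candidates = [OffTargetSite(f"candidate_{i}", s)
              for i, s in enumerate(example_scores)]

selected = select_off_targets(candidates)
for site in selected:
    print(site.site_id, site.score)
```

      Each selected region would then be sequenced and compared against the reference, as the response describes; that comparison is not modelled here.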

      No difference in the density of large PSD-95 puncta in dendrites of LKO primary relative to WT, and the small (10%) increase seen in WT after BDNF might be absent in LKO (it is not clear to me that this is absent in every culture rep, and the data is not highly convincing). This is also referred to as spinogenesis, which has not been quantified. Why not is confusing as they did use a GFP fill... 

      The Reviewer is right that spinogenesis is not the appropriate term for the process analyzed. We replaced “spinogenesis” with “morphological alteration of dendritic protrusions” or “synapse maturation”, which is correlated with the number of PSD95-positive puncta (ElHusseini et al., Science, 2000).

      There is a difference in the percentage of dendritic protrusions classified as filopodia to more being classified as thin spines in LKO striatal neurons at 1 month, which is not seen at any other age, The WT filopodia seems to drop and thin spine percent rise to be similar to LKO at 4 months. This is taken as evidence for delayed maturation in LKO, but the data suggest the opposite. These authors previously published decreased spine and increased filopodia density at P15 in LKO. Now they show that filopodia density is decreased and thin spine density increased at one month. How is that shift from increased to decreased filopodia density in LKO (faster than WT from a larger initial point) evidence of impaired maturation? Again this seems accelerated? 

      We agree with the Reviewer that the initial interpretation was indeed confusing. To adhere closely to our data and avoid overinterpretation – as also suggested by Reviewer 2 – we revised the text and moved figure 5D to supplementary materials. In essence, our data point to alterations in the structural properties of dendritic protrusions in young KO mice, specifically a reduction in their size (head width and neck height) and a decrease in postsynaptic density (PSD) length, as observed with TEM. These findings suggest that LRRK2 is involved in morphological processes during spine development.

      Shank3 and PSD95 mRNA transcript levels were reduced in the LKO midbrain, only shank3 was reduced in the striatum and only PSD was reduced in the cortex. No changes to mRNA of BDNF-related transcripts. None of these mRNA changes protein-validated. Drebrin protein (where is drebrin mRNA?) levels are reduced in LKO at 1&4 but not clearly at 18 months (seems the most robust result but doesn't correlate with other measures, which here is basically a transient increase (1m) in thin striatal spines).  

      As illustrated before, we performed qPCR for Dbn1 and found that its expression is significantly reduced in the cortex and midbrain and non-significantly reduced in the striatum (1-month-old mice, a different cohort from the one used for the other analyses in figure 5).

      24h BDNF increases the frequency of mEPSCs on hIPSC-derived cortical-like neurons, but not LKO, which is already high. There are no details of synapse number or anything for these cultures and compares 24h treatment. BDNF increases mEPSC frequency within minutes PMC3397209, and acute application while recording on cells may be much more informative (effects of BDNF directly, and no issues with cell-cell / culture variability). Calling mEPSC "spontaneous electrical activity" is not standard.  

      We thank the reviewer for this point. We provided information about synapse number (Bassoon/Homer colocalization) in supplementary figure S7. The lack of response of LRRK2 KO cultures in terms of mEPSC is likely due to increased release probability, as the number of synapses does not change between the two genotypes.

      The pattern of LRRK2 activation is very disconnected from that of BDNF signalling onto other kinases. Regarding pLRRK2, s935 is a non-autophosph site said to be required for LRRK2 enzymatic activity, that is mostly used in the field as a readout of successful LRRK2 inhibition, with some evidence that this site regulates LRRK2 subcellular localization (which might be more to do with whether or not it is p at 935 and therefore able to act as a kinase).

      The authors imply BDNF is activating LRRK2, but really should have looked at other sites, such as the autophospho site 1292 and 'known' LRRK2 substrates like T73 pRab10 (or other e.g., pRab12) as evidence of LRRK2 activation. One can easily argue that the initial increase in pLRRK2 at this site is less consequential than the observation that BDNF silences LRRK2 activity based on p935 being sustained to being reduced after 5 minutes, and well below the prestim levels... not that BDNF activates LRRK2. 

      As described above, we have collected new data showing that BDNF stimulation increases LRRK2 kinase activity toward its physiological substrates Rab10 and Rab8 (using a pan-phospho-Rab antibody) (Figure 1 and Figure S1). Additionally, we have also extensively commented on the ceiling effect of pS935.

      BDNF does a LOT. What happens to network activity in the neural cultures with BDNF application? Should go up immediately. Would increasing neural activity (i.e., through depolarization, forskolin, disinhibition, or something else without BDNF) give a similar 20% increase in pS935 LRRK2? Can this be additive, or occluded? This would have major implications for the conclusions that BDNF and pLRRK2 are tightly linked (as the title suggests).  

      These are very valuable observations; however, they fall outside the scope and timeframe of this study. We agree that future research should focus on gaining a deeper mechanistic understanding of how LRRK2 regulates synaptic activity, including vesicle release probability and postsynaptic spine maturation, independently of BDNF.

      Figures 1A & H "Western blot analysis revealed a rapid (5 mins) and transient increase of Ser935 phosphorylation after BDNF treatment (Fig. 1B and 1C). Of interest, BDNF failed to stimulate Ser935 phosphorylation when neurons were pretreated with the LRRK2 inhibitor MLi-2" . The first thing that stands out is that the pLRRK2 in WB is not very clear at all (although we appreciate it is 'a pig' to work with, I'd hope some replicates are clearer); besides that, the 20% increase only at 5min post-BDNF stimulation seems like a much less profound change than the reduction from base at 60 and more at 180 minutes (where total LRRK2 protein is also going down?). That the blot at 60 minutes in H is representative of a 30% reduction seems off... makes me wonder about the background subtraction in quantification (for this there is much less pLRRK2 and more total LRRK2 than at 0 or 5). LRRK2 (especially) and pLRRK2 seem very sketchy in H. Also, total LRRK2 appears to increase in the SHSY5Y cell not the neurons, and this seems even clearer in 2 H. 

      To better visualize the dynamics of pS935 variation relative to time=0, we presented the data as the difference between t=0 and t=x. It clearly shows that pSer935 goes below prestimulation levels, whereas pRab10 does not. The large difference in the initial stoichiometry of these two phosphorylations is extensively discussed above.
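      The baseline-difference presentation described in this response (each time point expressed as its difference from t=0) amounts to the small calculation sketched below. The function name and the numbers are hypothetical placeholders chosen to show a transient rise followed by a drop below baseline; they are not measurements from the study.

```python
from typing import Dict


def change_from_baseline(signal_by_minute: Dict[int, float]) -> Dict[int, float]:
    """Return signal(t) - signal(0) for every time point t (in minutes).

    `signal_by_minute` maps time after stimulation to a normalized
    phospho/total ratio; it must contain the t=0 baseline.
    """
    if 0 not in signal_by_minute:
        raise ValueError("a t=0 baseline measurement is required")
    baseline = signal_by_minute[0]
    return {t: value - baseline
            for t, value in sorted(signal_by_minute.items())}


# Placeholder values for illustration only; not data from the study.
example_ratio = {0: 1.0, 5: 1.2, 30: 0.8, 60: 0.7}
deltas = change_from_baseline(example_ratio)

for minute, delta in deltas.items():
    direction = "above" if delta > 0 else "below" if delta < 0 else "at"
    print(f"t={minute:>2} min: {delta:+.2f} ({direction} baseline)")
```

      Plotting differences rather than raw ratios makes a drop below prestimulation levels visible as negative values, which is the point the response makes for pSer935.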

      That MLi2 eliminates pLRRK2 (and seems to reduce LRRK2 protein?) isn't surprising, but a 90min pretreatment with MLi-2 should be compared to MLi-2's vehicle alone (MLi-2 is notoriously insoluble and the majority of diluents have bioactive effects like changing activity)... especially if concluding increased pLRRK2 in response to BDNF is a crucial point (when comparing against effects on other protein modifications such as pAKT). This highlights a second point... the changes to pERK and pAKT are huge following BDNF (nothing to massive quantities), whereas pLRRK2 increases are 20-50% at best. This suggests a very modest effect of BDNF on LRRK in neurons, compared to the other kinases. I worry this might be less consequential than claimed. Change in S1 is also unlikely to be significant... 

      These comments have been thoroughly addressed in the previous responses. Regarding fig. S1, we added an additional experiment (Figure S1C) in GFP-LRRK2 cells showing robust activation of LRRK2 (pS935, pRabs) at the timepoint of MS (15 min).

      "As the yields of endogenous LRRK2 purification were insufficient for AP-MS/MS analysis, we generated polyclonal SH-SY5Y cells stably expressing GFP-LRRK2 wild-type or GFP control (Supplementary Fig. 1)" . I am concerned that much is being assumed regarding 'synaptic function' from SHSY5Y cells... also overexpressing GFP-LRRK2 and looking at its binding after BDNF isn't synaptic function.  

      We appreciate the reviewer’s comment. We would like to clarify that the interactors enriched upon BDNF stimulation predominantly fall into semantic categories related to the synapse and actin cytoskeleton. While this does not imply that these interactors are exclusively synaptic, it suggests that this tightly interconnected network likely plays a role in synaptic function. This interpretation is supported by several lines of evidence: (1) previous studies have demonstrated the relevance of this compartment to LRRK2 function; (2) our new phosphoproteomics data from striatal lysate highlight enrichment of synaptic categories; and (3) analysis of the latest GWAS gene list (134 genes) also indicates significant enrichment of synapse-related categories. Taken together, these findings justify further investigation into the role of LRRK2 in synaptic biology, as discussed extensively in the manuscript’s discussion section.

      Figure 2A isn't alluded to in text and supplemental table 1 isn't about LRRK2 binding, but mEPSCs. 

      We have now cited Figure 2A in the text and added supplementary .xls table 1, which refers to the Excel list of genes with modulated interaction upon BDNF (uploaded in the supplemental material).

      We also added the extension .xls for supplementary tables 2 and 3.

      Figure 2A is useless without some hits being named, and the donut plots in B add nothing beyond a statement that "35% of 'genes' (shouldn't this be proteins?) among the total 207 LRRK2 interactors were SynGO annotated" might as well [just] be the sentence in the text. 

      We have now included the names of the most significant hits, including cytoskeletal and translation-related proteins, as well as known LRRK2 interactors. We decided to retain the donut plots, as we believe they simplify data interpretation for the reader, reducing the need to jump back and forth between the figures and the text.

      Validation of drebrin binding in 2H is great... although only one of 8 named hits; could be increased to include some of the others. A concern alludes to my previous point... there is no appreciable LRRK2 in these cells until GFP-LRRK2 is overexpressed; is this addressed in the MS? Conclusions would be much stronger if bidirectional coIP of these binding candidates were shown with endogenous (GFP-ve) LRRK2 (primaries or hIPSCs, brain tissue?) 

      To address the Reviewer’s concerns to the best of our abilities, we have added a blot in Supplemental figure S1A showing how the expression levels of LRRK2 increase after RA differentiation. Moreover, we have included several new datasets further strengthening the functional link between LRRK2 and drebrin, including qPCR of Dbn1 in one-month-old Lrrk2 KO brains, western blots of Lrrk2 and Rab in Dbn1 KO brains, and co-IP with drebrin N- and C-terminal domains.

      Figures 3 A-C are not informative beyond the text and D could be useful if proteins were annotated. 

      To avoid overcrowding, proteins were annotated in A only, and the same network structure is reported for the synaptic and actin-related interactors.

      Figure 4. Is this now endogenous LRRK2 in the SHSY5Y cells? Again not much LRRK2 though, and no pLRRK shown. 

      We confirm that these are naïve SH-SY5Y cells differentiated with RA and LRRK2 is endogenous. We did not assess pS935 in this experiment, as the primary goal was to evaluate pAKT and pERK1/2 levels. To avoid signal saturation, we loaded less total protein (30 µg instead of the 80 µg typically required to detect pS935). pS935 levels were extensively assessed in Figure 1. This experimental detail has now been added in the material and methods section (page 18).

      In C (primary neurons) There is very little increase in pLRRK2 / LRRK2 at 5 mins, and any is much less profound a change than the reduction at 30 & 60 mins. I think this is interesting and may be a more substantial consequence of BDNF treatment than the small early increase. Any 5 min increase is gone by 30 and pLRRK2 is reduced after. This is a disconnect from the timing of all the other pProteins in this assay, yet pLRRK2 is supposed to be regulating the 'synaptic effects'? 

      The first part of the question has already been extensively addressed. Regarding the timing, one possibility is that LRRK2 is activated upstream of AKT and ERK1/2, a hypothesis supported by the reduced activation of AKT and ERK1/2 observed in LRRK2 KO cells, as discussed in the manuscript, and in MLi-2 treated cells (Author response image 2). Concerning the synaptic effects, it is well established that synaptic structural and functional plasticity occurs downstream of receptor activation and kinase signaling cascades. These changes can be mediated by both rapid mechanisms (e.g., mobilization of receptor-containing endosomes via the actin cytoskeleton) and slower processes involving gene transcription of immediate early genes (IEGs). Since structural and functional changes at the synapse generally manifest several hours after stimulation, we typically assessed synaptic activity and structure 24 hours post-stimulation.

      Akt Erk1&2 both go up rapidly after BDNF in WT, although Akt seems to come down with pLRRK2. If they aren't all the same Akt is probably the most different between LKO and WT but I am very concerned about an n=3 for wb, wb is semi-quantitative at best, and many more than three replicates should be assessed, especially if the argument is that the increases are quantitively different between WT v KO (huge variability in WT makes me think if this were done 10x it would all look same). Moreover, this isn't similar to the LKO primaries  "pulled pups" pooled presumably. 

      Despite some variability in the magnitude of the pAKT/pERK response in naïve SH-SY5Y cells, all three independent replicates consistently showed a reduced response in LRRK2 KO cells, yielding a highly significant result in the two-way ANOVA test. In contrast, the difference in response magnitude between WT and LRRK2 KO primary cultures was less pronounced, which justified repeating the experiments with n=9 replicates. We hope the Reviewer acknowledges the inherent variability often observed in western blot experiments, particularly when performed in a fully independent manner (different cultures and stimulations, independent blots).

      To further strengthen the conclusion that this effect is reproducible and dependent on LRRK2 kinase activity upstream of AKT and ERK, we probed the membranes in figure 1H with pAKT/total AKT and pERK/total ERK. All things considered and consistent with our hypothesis, MLi-2 significantly reduced BDNF-mediated AKT and ERK1/2 phosphorylation levels (Author response image 2). 

      Author response image 2.

      Western blot (same experiments as in figure 1) was performed using antibodies against phospho-Thr202/185 ERK1/2, total ERK1/2, phospho-Ser473 AKT and total AKT in retinoic acid-differentiated SH-SY5Y cells stimulated with 100 ng/mL BDNF for 0, 5, 30, 60 mins. MLi-2 was used at 500 nM for 90 mins to inhibit LRRK2 kinase activity.

      G lack of KO effect seems to be skewed from one culture in the plot (grey). The scatter makes it hard to read, perhaps display the culture mean +/- BDNF with paired bars. The fact that one replicate may be changing things is suggested by the weirdly significant treatment effect and no genotype effect. Also, these are GFP-filled cells, the dendritic masks should be shown/explained, and I'm very surprised no one counted the number (or type?) of protrusions, especially as the text describes this assay (incorrectly) as spinogenesis... 

      As suggested by the Reviewer we have replotted the results as bar graphs. Regarding the number of protrusions, we initially counted the number of GFP+ puncta in the WT and did not find any difference (Author response image 3). Due to our imaging setup (confocal microscopy rather than super-resolution imaging and Imaris 3D reconstruction), we were unable to perform a fine morphometric analysis. However, this was not entirely unexpected, as BDNF is known to promote both the formation and maturation of dendritic spines. Therefore, we focused on quantifying PSD95+ puncta as a readout of mature postsynaptic compartments. While we acknowledge that we cannot definitively conclude that each PSD95+ punctum is synaptically connected to a presynaptic terminal, the data do indicate an increase in the number of PSD95+ structures following BDNF stimulation.

      Author response image 3.

      GFP+ puncta per unit of neurite length (µm) in DIV14 WT primary neurons untreated or after 24 hours of BDNF treatment (100 ng/ml). No significant differences were observed (n=3).

      Figure 5. "Dendritic spine maturation is delayed in Lrrk2 knockout mice". The only significant change is at 1 month in KO which shows fewer filopodia and increased thin spines (50% vs wt). At 4 months the % of thin spines is increased to 60% in both... Filopodia also look like 4m in KO at 1m... How is that evidence for delayed maturation? If anything it suggests the KO spines are maturing faster. "the average neck height was 15% shorter and the average head width was 27% smaller, meaning that spines are smaller in Lrrk2 KO brains" - it seems odd to say this before saying that actually there are just MORE thin spines, the number of mature "mushroom' is same throughout, and the different percentage of thin comes from fewer filopodia. This central argument that maturation is delayed is not supported and could be backwards, at least according to this data. Similarly, the average PSD length is likely impacted by a preponderance of thin spines in KO... which if mature were fewer would make sense to say delayed KO maturation, but this isn't the case, it is the fewer filopodia (with no PSD) that change the numbers. See previous comments of the preceding manuscript. 

      We agree that thin spines, while often considered more immature, represent an intermediate stage in spine development. The data showing an increase in thin spines at 1 month in the KO mice, along with fewer filopodia, could suggest a faster stabilization of these spines, which might indeed be indicative of premature maturation rather than delayed maturation. This change in spine morphology may indicate that the dynamics of synaptic plasticity are affected. Regarding the PSD length, as the Reviewer pointed out, the increased presence of thin spines in KO might account for the observed changes in PSD measurements, as thin spines typically have smaller PSDs. This further reinforces the idea that the overall maturation process may be altered in the KO, but not necessarily delayed. 

      We rephrased the interpretation of these data and moved figure 5D to supplemental figure S4.

      "To establish whether loss of Lrrk2 in young mice causes a reduction in dendritic spines size by influencing BDNF-TrkB expression" - there is no evidence of this.  

      We agree and reorganized the text, removing this sentence.  

      Shank and PSD95 mRNA changes being shown without protein adds very little. Why is drebrin RNA not shown? Also should be several housekeeping RNAs, not one (RPL27)? 

      We measured Dbn1 mRNA, which shows a significant reduction in midbrain and cortex. Moreover, we have now normalized the transcript levels against the geometric mean of the relative abundance of three housekeeping genes (RPL27, actin, and GAPDH).
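      Normalizing a transcript against the geometric mean of several housekeeping genes can be sketched as below. The response does not state how relative abundance was computed, so this sketch assumes Ct values and 100% amplification efficiency (relative quantity = 2^-Ct); the Ct numbers are placeholders, not data from the study.

```python
import math
from statistics import geometric_mean
from typing import Dict


def relative_quantity(ct: float) -> float:
    """Relative abundance from a Ct value, assuming perfect doubling per cycle."""
    return 2.0 ** (-ct)


def normalized_expression(target_ct: float,
                          reference_cts: Dict[str, float]) -> float:
    """Target abundance divided by the geometric mean of the
    reference (housekeeping) gene abundances."""
    ref_quantities = [relative_quantity(ct) for ct in reference_cts.values()]
    return relative_quantity(target_ct) / geometric_mean(ref_quantities)


# Placeholder Ct values for illustration only; not data from the study.
reference_cts = {"RPL27": 18.0, "actin": 16.0, "GAPDH": 17.0}
wt_level = normalized_expression(24.0, reference_cts)
ko_level = normalized_expression(25.0, reference_cts)

# One extra cycle for the target at identical reference Cts halves the level.
print(f"KO / WT normalized expression: {ko_level / wt_level:.2f}")
```

      With this formula, the geometric mean of the reference quantities is equivalent to using the arithmetic mean of the reference Ct values, which is why a single outlying housekeeping gene has less influence than it would with a single-reference normalization.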

      Drebrin levels being lower in KO seems to be the strongest result of the paper so far (shame no pLRRK2 or coIP of drebrin to back up the argument). DrebrinA KO mice have normal spines, what about haploinsufficient drebrin mice (LKO seem to have half drebrin, but only as youngsters?)

      As extensively explained in the public review, we used Dbn1 KO mouse brains and were able to show reduced Lrrk2 activity.

      Figure 6. hIPSC-derived cortical neurons. The WT 'cortical' neurons have a very low mEPSC frequency at 0.2Hz relative to KO. Is this because they are more or less mature? What is the EPSC frequency of these cells at 30 and 90 days for comparison? Also, it is very very hard to infer anything about mEPSC frequency in the absence of estimates of cell number and more importantly synapse number. Furthermore, where are the details of cell measures such as capacitance, resistance, and quality control e.g., Ra? Table s1 seems redundant here, besides suggesting that the amplitude is higher in KO at base. 

      We agree that the developmental trajectory of iPSC-derived neurons is critical to accurately interpreting synaptic function and plasticity. In response, we have included additional data now presented in the supplementary figure S7 and summarize key findings below:

      At DIV50, both WT and LRRK2 KO neurons exhibit low basal mEPSC activity (~0.5 Hz) and no response to 24 h BDNF stimulation (50 ng/mL).

      At DIV70, WT neurons show very low basal activity (~0.2 Hz), which increases ~7.5-fold upon BDNF treatment (1.5 Hz; p < 0.001), with no change in synapse number. KO neurons display elevated basal activity (~1 Hz) similar to BDNF-treated WT neurons, with no further increase upon BDNF exposure (~1.3 Hz) and no change in synapse number.

      At DIV90, BDNF has no significant effect in either WT or KO neurons, indicating a possible saturation of plastic responses. The lack of BDNF response at DIV90 may be due to endogenous BDNF production or culture-based saturation effects. While these factors warrant further investigation (e.g., ELISA, co-culture systems), they do not confound the key conclusions regarding the role of LRRK2 in synaptic development and plasticity:

      LRRK2 Enables BDNF-Responsive Synaptic Plasticity. In WT neurons, BDNF induces a significant increase in neurotransmitter release (mEPSC frequency) with no reduction in synapse number. This dissociation suggests BDNF promotes presynaptic functional potentiation. KO neurons fail to show changes in either synaptic function or structure in response to BDNF, indicating that LRRK2 is required for activity-dependent remodeling.

      LRRK2 Loss Accelerates Synaptic Maturation. At DIV70, KO neurons already exhibit high spontaneous synaptic activity equivalent to BDNF-stimulated WT neurons. This suggests that LRRK2 may act to suppress premature maturation and temporally gate BDNF responsiveness, aligning with the differences in maturation dynamics observed in KO mice (Figure 5).  
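      The fold changes implied by the approximate DIV70 frequencies quoted above can be checked with a few lines of arithmetic. The frequencies are the rounded values given in this response (WT ~0.2 Hz to 1.5 Hz; KO ~1 Hz to ~1.3 Hz); the function name is only illustrative.

```python
def fold_change(baseline_hz: float, treated_hz: float) -> float:
    """Ratio of treated to baseline mEPSC frequency."""
    if baseline_hz <= 0:
        raise ValueError("baseline frequency must be positive")
    return treated_hz / baseline_hz


# Approximate DIV70 values quoted in the response (Hz).
wt_fold = fold_change(0.2, 1.5)
ko_fold = fold_change(1.0, 1.3)

print(f"WT: {wt_fold:.1f}-fold, KO: {ko_fold:.1f}-fold")
```

      Because the quoted frequencies are approximate, the ratios are indicative only; the statistical comparison reported by the authors is not reproduced here.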

      As suggested by the reviewer we reported the measurement of resistance and capacitance for all DIV (Table 1, supplemental material). A reduction in capacitance was observed in WT neurons at DIV90, which may reflect changes in membrane complexity. However, this did not correlate with differences in synapse number and is unlikely to account for the observed differences in mEPSC frequency. To control for cell number between groups, cell count prior to plating was performed (80k/cm2; see also methods) on the non-dividing cells to keep cell number consistent.

      The presence of BDNF in WT seems to make them look like LKO, in the rest of the paper the suggestion is that the LKO lack a response to BDNF. Here it looks like it could be that BDNF signalling is saturated in LKO, or they are just very different at base and lack a response.

      Knowing which is important to the conclusions, and acute application (recording and BDNF wash-in) would be much more convincing.

      We agree with the Reviewer’s point that saturation of BDNF could influence the interpretation of the data if it were to occur. However, it is important to note that no BDNF is present in the media in the baseline control and KO neuronal culture conditions. This is different from other culture conditions and allows us to investigate the effects of BDNF treatment. Thus, the increased mEPSC frequency observed in KO neurons compared to WT neurons is defined only by the deletion of the gene and not by other extrinsic factors, which were kept consistent between the groups. The lack of response or change in mEPSC frequency in KO is proposed to be a compensatory mechanism due to the loss of LRRK2. Of note, LRRK2 as a “synaptic break” has already been described (Beccano-Kelly et al., Hum Mol Gen, 2015). However, a comprehensive analysis of the underlying molecular mechanisms will require future studies beyond the scope of this paper.

      "The LRRK2 kinase substrates Rabs are not present in the list of significant phosphopeptides, likely due to the low stoichiometry and/or abundance" Likely due to the fact mass spec does not get anywhere near everything. 

      We removed this sentence in light of the new phosphoproteomic analysis.

      Figure 7 is pretty stand-alone, and not validated in any way, hard to justify its inclusion?  

      As extensively explained, we removed figure 7 and included the new phospho-MS as part of figure 3.

      Writing throughout shows a very selective and shallow use of the literature.  

      We extensively reviewed the citations.

      "while Lrrk1 transcript in this region is relatively stable during development" The authors reference a very old paper that barely shows any LRRK1 mRNA, and no protein. Others have shown that LRRK1 is essentially not present postnatally PMC2233633. This isn't even an argument the authors need to make. 

      We thank the reviewer and have included this more appropriate citation.

      Reviewer #2 (Recommendations For The Authors): 

      Cyfip1 (Fig 3A) is part of the WAVE complex (page 13). 

      We thank the reviewer and have now specified this.

      The discussion could be more focused. 

      We extensively revised the discussion to make it more focused.

      Note that we updated the Gene Ontology (GO) analyses to reflect the updated information present in g:Profiler.

      References.

      Nirujogi, R. S., Tonelli, F., Taylor, M., Lis, P., Zimprich, A., Sammler, E., & Alessi, D. R. (2021). Development of a multiplexed targeted mass spectrometry assay for LRRK2-phosphorylated Rabs and Ser910/Ser935 biomarker sites. The Biochemical journal, 478(2), 299–326. https://doi.org/10.1042/BCJ20200930

      Worth, D. C., Daly, C. N., Geraldo, S., Oozeer, F., & Gordon-Weeks, P. R. (2013). Drebrin contains a cryptic F-actin-bundling activity regulated by Cdk5 phosphorylation. The Journal of cell biology, 202(5), 793–806. https://doi.org/10.1083/jcb.201303005

      Shirao, T., Hanamura, K., Koganezawa, N., Ishizuka, Y., Yamazaki, H., & Sekino, Y. (2017). The role of drebrin in neurons. Journal of neurochemistry, 141(6), 819–834. https://doi.org/10.1111/jnc.13988

      Koganezawa, N., Hanamura, K., Sekino, Y., & Shirao, T. (2017). The role of drebrin in dendritic spines. Molecular and cellular neurosciences, 84, 85–92. https://doi.org/10.1016/j.mcn.2017.01.004

      Meixner, A., Boldt, K., Van Troys, M., Askenazi, M., Gloeckner, C. J., Bauer, M., Marto, J. A., Ampe, C., Kinkl, N., & Ueffing, M. (2011). A QUICK screen for Lrrk2 interaction partners--leucine-rich repeat kinase 2 is involved in actin cytoskeleton dynamics. Molecular & cellular proteomics: MCP, 10(1), M110.001172. https://doi.org/10.1074/mcp.M110.001172

      Parisiadou, L., & Cai, H. (2010). LRRK2 function on actin and microtubule dynamics in Parkinson disease. Communicative & integrative biology, 3(5), 396–400. https://doi.org/10.4161/cib.3.5.12286

      Chen, C., Masotti, M., Shepard, N., Promes, V., Tombesi, G., Arango, D., Manzoni, C., Greggio, E., Hilfiker, S., Kozorovitskiy, Y., & Parisiadou, L. (2024). LRRK2 mediates haloperidol-induced changes in indirect pathway striatal projection neurons. bioRxiv : the preprint server for biology, 2024.06.06.597594. https://doi.org/10.1101/2024.06.06.597594

      Cheng, J., Novati, G., Pan, J., Bycroft, C., Žemgulytė, A., Applebaum, T., Pritzel, A., Wong, L. H., Zielinski, M., Sargeant, T., Schneider, R. G., Senior, A. W., Jumper, J., Hassabis, D., Kohli, P., & Avsec, Ž. (2023). Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science (New York, N.Y.), 381(6664), eadg7492. https://doi.org/10.1126/science.adg7492

      Beaudoin, G. M., 3rd, Schofield, C. M., Nuwal, T., Zang, K., Ullian, E. M., Huang, B., & Reichardt, L. F. (2012). Afadin, a Ras/Rap effector that controls cadherin function, promotes spine and excitatory synapse density in the hippocampus. The Journal of neuroscience : the official journal of the Society for Neuroscience, 32(1), 99–110. https://doi.org/10.1523/JNEUROSCI.4565-11.2012

      Fernández, B., Chittoor-Vinod, V. G., Kluss, J. H., Kelly, K., Bryant, N., Nguyen, A. P. T., Bukhari, S. A., Smith, N., Lara Ordóñez, A. J., Fdez, E., Chartier-Harlin, M. C., Montine, T. J., Wilson, M. A., Moore, D. J., West, A. B., Cookson, M. R., Nichols, R. J., & Hilfiker, S. (2022). Evaluation of Current Methods to Detect Cellular Leucine-Rich Repeat Kinase 2 (LRRK2) Kinase Activity. Journal of Parkinson's disease, 12(5), 1423–1447. https://doi.org/10.3233/JPD-213128

      Cirnaru, M. D., Marte, A., Belluzzi, E., Russo, I., Gabrielli, M., Longo, F., Arcuri, L., Murru, L., Bubacco, L., Matteoli, M., Fedele, E., Sala, C., Passafaro, M., Morari, M., Greggio, E., Onofri, F., & Piccoli, G. (2014). LRRK2 kinase activity regulates synaptic vesicle trafficking and neurotransmitter release through modulation of LRRK2 macromolecular complex. Frontiers in molecular neuroscience, 7, 49. https://doi.org/10.3389/fnmol.2014.00049

      Belluzzi, E., Gonnelli, A., Cirnaru, M. D., Marte, A., Plotegher, N., Russo, I., Civiero, L., Cogo, S., Carrion, M. P., Franchin, C., Arrigoni, G., Beltramini, M., Bubacco, L., Onofri, F., Piccoli, G., & Greggio, E. (2016). LRRK2 phosphorylates pre-synaptic N-ethylmaleimide sensitive fusion (NSF) protein enhancing its ATPase activity and SNARE complex disassembling rate. Molecular neurodegeneration, 11, 1. https://doi.org/10.1186/s13024-015-0066-z

      Martin, E. R., Gandawijaya, J., & Oguro-Ando, A. (2022). A novel method for generating glutamatergic SH-SY5Y neuron-like cells utilizing B-27 supplement. Frontiers in pharmacology, 13, 943627. https://doi.org/10.3389/fphar.2022.943627

      Kovalevich, J., & Langford, D. (2013). Considerations for the use of SH-SY5Y neuroblastoma cells in neurobiology. Methods in molecular biology (Clifton, N.J.), 1078, 9–21. https://doi.org/10.1007/978-1-62703-640-5_2

      Drummond, N. J., Singh Dolt, K., Canham, M. A., Kilbride, P., Morris, G. J., & Kunath, T. (2020). Cryopreservation of Human Midbrain Dopaminergic Neural Progenitor Cells Poised for Neuronal Differentiation. Frontiers in cell and developmental biology, 8, 578907. https://doi.org/10.3389/fcell.2020.578907

      Tao, X., Finkbeiner, S., Arnold, D. B., Shaywitz, A. J., & Greenberg, M. E. (1998). Ca2+ influx regulates BDNF transcription by a CREB family transcription factor-dependent mechanism. Neuron, 20(4), 709–726. https://doi.org/10.1016/s0896-6273(00)81010-7

      El-Husseini, A. E., Schnell, E., Chetkovich, D. M., Nicoll, R. A., & Bredt, D. S. (2000). PSD95 involvement in maturation of excitatory synapses. Science (New York, N.Y.), 290(5495), 1364–1368.

      Glebov OO, Cox S, Humphreys L, Burrone J. Neuronal activity controls transsynaptic geometry. Sci Rep. 2016 Mar 8;6:22703. doi: 10.1038/srep22703. Erratum in: Sci Rep. 2016 May 31;6:26422. doi: 10.1038/srep26422. PMID: 26951792; PMCID: PMC4782104.

      Beccano-Kelly DA, Volta M, Munsie LN, Paschall SA, Tatarnikov I, Co K, Chou P, Cao LP, Bergeron S, Mitchell E, Han H, Melrose HL, Tapia L, Raymond LA, Farrer MJ, Milnerwood AJ. LRRK2 overexpression alters glutamatergic presynaptic plasticity, striatal dopamine tone, postsynaptic signal transduction, motor activity and memory. Hum Mol Genet. 2015 Mar 1;24(5):1336-49. doi: 10.1093/hmg/ddu543. Epub 2014 Oct 24. PMID: 25343991.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors use anatomical tracing and slice physiology to investigate the integration of thalamic (ATN) and retrosplenial cortical (RSC) signals in the dorsal presubiculum (PrS). This work will be of interest to the field, as the postsubiculum is thought to be a key region for integrating internal head direction representations with external landmarks. The main result is that ATN and RSC inputs drive the same L3 PrS neurons, which exhibit superlinear summation to near-coincident inputs. Moreover, this activity can induce bursting in L4 PrS neurons, which can pass the signals to LMN (perhaps gated by cholinergic input).

      Strengths:

      The slice physiology experiments are carefully done. The analyses are clear and convincing, and the figures and results are well-composed. Overall, these results will be a welcome addition to the field.

      We thank this reviewer for the positive comment on our work.

      Weaknesses:

      The conclusions about the circuit-level function of L3 PrS neurons sometimes outstrip the data, and their model of the integration of these inputs is unclear. I would recommend some revision of the introduction and discussion. I also had some minor comments about the experimental details and analysis.

      Specific major comments:

      (1) I found that the authors' claims sometimes outstrip their data, given that there were no in vivo recordings during behavior. For example, in the abstract, their results indicate "that layer 3 neurons can transmit a visually matched HD signal to medial entorhinal cortex", and in the conclusion they state "[...] cortical RSC projections that carry visual landmark information converge on layer 3 pyramidal cells of the dorsal presubiculum". However, they never measured the nature of the signals coming from ATN and RSC to L3 PrS (or signals sent to downstream regions). Their claim is somewhat reasonable with respect to ATN, where the majority of neurons encode HD, but neurons in RSC encode a vast array of spatial and non-spatial variables other than landmark information (e.g., head direction, egocentric boundaries, allocentric position, spatial context, task history to name a few), so making strong claims about the nature of the incoming signals is unwarranted.

      We agree of course that RSC does not only encode landmark information. We have clarified this point in the introduction (line 69-70) and formulated more carefully in the abstract (removed the word ‘landmark’ in line 17) and in the  introduction (line 82-83). In the discussion we explicitly state that ‘In our slice work we are blind to the exact nature of the signal that is carried by ATN and RSC axons’ (line 522-523).

      (2) Related to the first point, the authors hint at, but never explain, how coincident firing of ATN and RSC inputs would help anchor HD signals to visual landmarks. Although the lesion data (Yoder et al. 2011 and 2015) support their claims, it would be helpful if the proposed circuit mechanism was stated explicitly (a schematic of their model would be helpful in understanding the logic). For example, how do neurons integrate the "right" sets of landmarks and HD signals to ensure stable anchoring? Moreover, it would be helpful to discuss alternative models of HD-to-landmark anchoring, including several studies that have proposed that the integration may (also?) occur in RSC (Page & Jeffrey, 2018; Yan, Burgess, Bicanski, 2021; Sit & Goard, 2023). Currently, much of the Discussion simply summarizes the results of the study, this space could be better used in mapping the findings to the existing literature on the overarching question of how HD signals are anchored to landmarks.

      We agree with the reviewer on the importance of the question of how neurons integrate the “right” sets of landmarks and HD signals to ensure stable anchoring. Based on our results, we provide a schematic to illustrate possible scenarios, and we include it as a supplementary figure (Figure 1, to be included in the ms as Figure 7—figure supplement 2), as well as a new paragraph in the discussion section (line 516-531). We point out that critical information on the convergence and divergence of functionally defined inputs is still lacking, both for principal cells and interneurons.

      Interestingly, recent evidence from functional ultrasound imaging and electrical single cell recording demonstrated that visual objects may refine head direction coding, specifically in the dorsal presubiculum (Siegenthaler et al. bioRxiv 2024.10.21.619417; doi: https://doi.org/10.1101/2024.10.21.619417). The increase in firing rate for HD cells whose preferred firing direction corresponds to a visual landmark could be supported by the supralinear summation of thalamic HD signals and retrosplenial input described in our study. We include this point in the discussion (line 460-462), and hope that our work will spur further investigations.

      Reviewer #2 (Public Review):

      Richevaux et al investigate how anterior thalamic (AD) and retrosplenial (RSC) inputs are integrated by single presubicular (PrS) layer 3 neurons. They show that these two inputs converge onto single PrS layer 3 principal cells. By performing dual-wavelength photostimulation of these two inputs in horizontal slices, the authors show that in most layer 3 cells, these inputs summate supra-linearly. They extend the experiments by focusing on putative layer 4 PrS neurons, and show that they do not receive direct anterior thalamic nor retrosplenial inputs; rather, they are (indirectly) driven to burst firing in response to strong activation of the PrS network.

      This is a valuable study, that investigates an important question - how visual landmark information (possibly mediated by retrosplenial inputs) converges and integrates with HD information (conveyed by the AD nucleus of the thalamus) within PrS circuitry. The data indicate that near-coincident activation of retrosplenial and thalamic inputs leads to non-linear integration in target layer 3 neurons, thereby offering a potential biological basis for landmark + HD binding.

      The main limitations relate to the anatomical annotation of 'putative' PrS L4 neurons, and to the presentation of retrosplenial/thalamic input modularity. Specifically, more evidence should be provided to convincingly demonstrate that the 'putative L4 neurons' of the PrS are not distal subicular neurons (as the authors' anatomy and physiology experiments seem to indicate). The modularity of thalamic and retrosplenial inputs could be better clarified in relation to the known PrS modularity.

      We thank the reviewer for their important feedback. We discuss what defines presubicular layer 4 in horizontal slices, cite relevant literature, and provide new and higher resolution images. See below for detailed responses to the reviewer’s comments, in the section ‘recommendations to authors’.

      Reviewer #3 (Public Review):

      Summary:

      The authors sought to determine, at the level of individual presubiculum pyramidal cells, how allocentric spatial information from the retrosplenial cortex was integrated with egocentric information from the anterior thalamic nuclei. Employing a dual opsin optogenetic approach with patch clamp electrophysiology, Richevaux and colleagues found that around three-quarters of layer 3 pyramidal cells in the presubiculum receive monosynaptic input from both brain regions. While some interesting questions remain (e.g. the role of inhibitory interneurons in gating information flow through the different layers of presubiculum), this paper provides valuable insights into the microcircuitry of this brain region and the role that it may play in spatial navigation.

      Strengths:

      One of the main strengths of this manuscript was that the dual opsin approach allowed the direct comparison of different inputs within an individual neuron, helping to control for what might otherwise have been an important source of variation. The experiments were well-executed and the data was rigorously analysed. The conclusions were appropriate to the experimental questions and were well-supported by the results. These data will help to inform in vivo experiments aimed at understanding the contribution of different brain regions in spatial navigation and could be valuable for computational modelling.

      Weaknesses:

      Some attempts were made to gain mechanistic insights into how inhibitory neurotransmission may affect processing in the presubiculum (e.g. Figure 5) but these experiments were a little underpowered and the analysis carried out could have been more comprehensively undertaken, as was done for other experiments in the manuscript.

      We agree that the role of interneurons in landmark anchoring through convergence in the presubiculum requires further investigation. In our latest work on the recruitment of VIP interneurons we begin to address this point in slices (Nassar et al., 2024, Neuroscience, doi: 10.1016/j.neuroscience.2024.09.032); more work in behaving animals will be needed.

      Reviewer #1 (Recommendations For The Authors):

      Full comments below. Beyond the (mostly minor) issues noted below, this is a very well-written paper and I look forward to seeing it in print.

      Major comments:

      (1) I found that the authors' claims sometimes outstrip their data, given that there were no in vivo recordings during behavior. For example, in the abstract, their results indicate "that layer 3 neurons can transmit a visually matched HD signal to medial entorhinal cortex", and in the conclusion they state "[...] cortical RSC projections that carry visual landmark information converge on layer 3 pyramidal cells of the dorsal presubiculum". However, they never measured the nature of the signals coming from ATN and RSC to L3 PrS (or signals sent to downstream regions). Their claim is somewhat reasonable with respect to ATN, where the majority of neurons encode HD, but neurons in RSC encode a vast array of spatial and non-spatial variables other than landmark information (e.g., head direction, egocentric boundaries, allocentric position, spatial context, task history to name a few), so making strong claims about the nature of the incoming signals is unwarranted.

      Our study was motivated by the seminal work from Yoder et al., 2011 and 2015, indicating that visual landmark information is processed in PoS and from there transmitted to the LMN. Based on that, and in the interest of readability, we may have used an oversimplified shorthand for the type of signal carried by RSC axons. There are numerous studies indicating a role for RSC in encoding visual landmark information (Auger et al., 2012; Jacob et al., 2017; Lozano et al., 2017; Fischer et al., 2020; Keshavarzi et al., 2022; Sit and Goard, 2023); we agree of course that this is certainly not the only variable that is represented. We have therefore changed the text to make this point clear:

      Abstract, line 17: removed the word ‘landmark’

      Introduction, line 69: added “...and supports an array of cognitive functions including memory, spatial and non-spatial context and navigation (Vann et al., 2009; Vedder et al., 2017). ”

      Introduction, line 82: changed “...designed to examine the convergence of visual landmark information, that is possibly integrated in the RSC, and vestibular based thalamic head direction signals”.

      Discussion, line 522-523: added “In our slice work we are blind to the exact nature of the signal that is carried by ATN and RSC axons.”

      (2) Related to the first point, the authors hint at, but never explain, how coincident firing of ATN and RSC inputs would help anchor HD signals to visual landmarks. Although the lesion data (Yoder et al., 2011 and 2015) support their claims, it would be helpful if the proposed circuit mechanism was stated explicitly (a schematic of their model would be helpful in understanding the logic). For example, how do neurons integrate the "right" sets of landmarks and HD signals to ensure stable anchoring? Moreover, it would be helpful to discuss alternative models of HD-to-landmark anchoring, including several studies that have proposed that the integration may (also?) occur in RSC (Page & Jeffrey, 2018; Yan, Burgess, Bicanski, 2021; Sit & Goard, 2023). Currently, much of the Discussion simply summarizes the results of the study, this space could be better used in mapping the findings to the existing literature on the overarching question of how HD signals are anchored to landmarks.

      We suggest a physiological mechanism for inputs to be selectively integrated and amplified, based on temporal coincidence. Of course there are still many unknowns, including the divergence of connections from a single thalamic or retrosplenial input neuron. The anatomical connectivity of inputs will be critical, as well as the subcellular arrangement of synaptic contacts. Neuromodulation and changes in the balance of excitation and inhibition will need to be factored in. While it is premature to provide a comprehensive explanation for landmark anchoring of HD signals in PrS, our results have led us to include a schematic, to illustrate our thinking (Figure 1, see below).

      Do HD tuned inputs from thalamus converge on similarly tuned HD neurons only? Is divergence greater for the retrosplenial inputs? If so, thalamic input might pre-select a range of HD neurons, and converging RSC input might narrow down the precise HD neurons that become active (Figure 1). In the future, the use of activity dependent labeling strategies might help to tie together information on the tuning of pre-synaptic neurons, and their convergence or divergence onto functionally defined postsynaptic target cells. This critical information is still lacking, for principal cells, and also for interneurons. 

      Interneurons may have a key role in HD-to-landmark anchoring. SST interneurons support stability of HD signals (Simonnet et al., 2017) and VIP interneurons flexibly disinhibit the system (Nassar et al., 2024). Could disinhibition be a necessary condition to create a window of opportunity for updating the landmark anchoring of the attractor? Single PV interneurons might receive thalamic and retrosplenial inputs non-specifically. We need to distinguish the conditions for when the excitation-inhibition balance in pyramidal cells may become tipped towards excitation, and the case of coincident, co-tuned thalamic and retrosplenial input may be such a condition. Elucidating the principles of hardwiring of inputs, as for example, selective convergence, will be necessary. Moreover, neuromodulation and oscillations may be critical for temporal coordination and precise temporal matching of HD-to-landmark signals.

      We note that matching directional with visual landmark information based on temporal coincidence as described here does not require synaptic plasticity. Algorithms for dynamic control of cognitive maps without synaptic plasticity have been proposed (Whittington et al., 2025, Neuron): information may be stored in neural attractor activity, and the idea that working memory may rely on recurrent updates of neural activity might generalize to the HD system. We include these considerations in the discussion (line 497-501; 521-531) and hope that our work will spur further experimental investigations and modeling work.

      While the focus of our work has been on PrS, we agree that RSC also processes HD and landmark signals. Possibly the RSC registers a direction to a landmark rather than comparing it with the current HD (Sit & Goard, 2023). We suggest that this integrated information then reaches PrS. In contrast to RSC, PrS is uniquely positioned to update the signal in the LMN (Yoder et al., 2011), cf. discussion (line 516-520).

      Minor comments:

      (1) Fig 1 - Supp 1: It appears there is a lot of input to PrS from higher visual regions, could this be a source of landmark signals?

      Yes, higher visual regions projecting to PrS may also be a source of landmark information, even if the visual signal is not integrated with HD at that stage (Sit & Goard 2023). The anatomical projection from the visual cortex was first described by Vogt & Miller (1983), but not studied on a functional level so far.

      (2) Fig 2F, G: Although the ATN and RSC measurements look quite similar, there are no stats included. The authors should use an explicit hypothesis test.

      We now compare the distributions of amplitudes and of latencies, using the Mann-Whitney U test. No significant difference between the two groups was found. Added in the figure legend: 2F, “Mann-Whitney U test revealed no significant difference (p = 0.95)”. 2G, “Mann-Whitney U test revealed no significant difference (p = 0.13)”.

      (3) Fig 2 - Supp 2A, C: Again, no statistical tests. This is particularly important for panel A, where the authors state that the latencies are similar but the populations appear to be different.

      Inputs from ATN and RSC have a similar ‘jitter’ (latency standard deviation) and ‘tau decay’. We added in the Fig 2 - Supp 2 figure legend: A, “Mann-Whitney U test revealed no significant difference (p = 0.26)”. C, “Mann-Whitney U test revealed no significant difference (p = 0.87)”.

      As a complementary measure for the reviewer, we performed the Kolmogorov-Smirnov test which confirmed that the populations’ distributions for ‘jitter’ were not significantly different, p = 0.1533.
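
      For readers unfamiliar with the test, the rank-based comparison described above can be sketched as follows. This is only an illustration, not the authors' analysis code: the function names are hypothetical, and the p-value uses the normal approximation with tie and continuity corrections, which for small samples can differ from the exact p-value reported by statistics packages.

```python
import math
from collections import Counter


def midranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U, p), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)

    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return u, 1.0  # all values identical: no evidence of a difference
    # Continuity-corrected z score; clipped at 0 so that p never exceeds 1.
    z = max((abs(u - n1 * n2 / 2) - 0.5) / math.sqrt(variance), 0.0)
    p = math.erfc(z / math.sqrt(2))  # two-sided tail probability
    return u, p
```

      A call such as `mann_whitney_u(atn_values, rsc_values)`, where both arguments are hypothetical lists of per-cell measurements, returns the U statistic and an approximate two-sided p-value; a large p-value, as reported in the legends, indicates no detectable difference between the two input populations.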

      (4) Fig 4E, F: The statistics reporting is confusing, why are asterisks above the plots and hashmarks to the side?

      Asterisks refer to a comparison between ‘dual’ and ‘sum’ for each of the 5 stimulations in a Sidak multiple comparison test. Hashmarks refer to comparison of the nth stimulation to the 1st one within dual stimulation events (Friedman + Dunn’s multiple comparison test). We mention the two-way ANOVA p-value in the legend (Sum v Dual, for both Amplitude and Surface).
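
      To make the ‘dual’ versus ‘sum’ comparison concrete, here is a minimal sketch of the quantity being tested. It assumes, hypothetically, that the response amplitudes for each of the 5 stimulations are available as lists; the function name and the example values are illustrative and are not taken from the study.

```python
def summation_ratio(atn, rsc, dual):
    """Ratio of the measured dual response to the arithmetic sum of the
    separately evoked responses, for each stimulation in a train.

    A ratio above 1 indicates supralinear summation, a ratio of 1
    linear summation, and a ratio below 1 sublinear summation.
    """
    if not (len(atn) == len(rsc) == len(dual)):
        raise ValueError("all trains must have the same number of stimulations")
    return [d / (a + r) for a, r, d in zip(atn, rsc, dual)]


# Hypothetical EPSP amplitudes (mV) for a train of 5 stimulations.
atn_alone = [2.0, 1.5, 1.0, 1.0, 1.0]
rsc_alone = [2.0, 1.5, 1.0, 1.0, 1.0]
dual_stim = [4.0, 4.5, 3.0, 2.5, 2.0]

ratios = summation_ratio(atn_alone, rsc_alone, dual_stim)
```

      In this framing, the asterisks mark stimulations where ‘dual’ differs from ‘sum’ (ratio different from 1), whereas the hashmarks compare the nth dual response with the first one within the train.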

      (5) Fig 5C: I was confused by the 2*RSC manipulation. How do we know if there is amplification unless we know what the 2*RSC stim alone looks like?

      We now label the right panel in Fig 5C as “high light intensity” or “HLI”. Increasing the activation of Chrimson increases the amplitude of the summed EPSP, which now exceeds the threshold for amplification of synaptic events. Amplification refers to the shape of the plateau-like prolongation of the peak, most pronounced on the second EPSP, now indicated with an arrow. We also clarify this in the text (line 309-310).

      (6) Fig 6D (supplement 1): Typo, "though" should be "through"

      Yes, corrected (line 1015).

      (7) Fig 6G (supplement 1): Typo, I believe this refers to the dotted area in panel F, not panel A.

      Yes, corrected (line 1021).

      (8) Fig 7: The effect of muscarine was qualitatively described in the Results, but there is no quantification and it is not shown in the Figure. The results should either be reported properly or removed from the Results.

      We remove the last sentence in the Results.

      (9) Methods: The age and sex of the mice should be reported. Transgenic mouse line should be reported (along with stock number if applicable).

      We used C57BL6 mice with transgenic background (Ai14 mice, Jax no. 007914 reporter line) or C57BL6 wild type mice. This is now indicated in the Methods (lines 566-567).

      (10) Methods: If the viruses are only referred to with their plasmid number, then the capsid used for the viruses should be specified. For example, I believe the AAV-CAG-tomato virus used the retroAAV capsid, which is important to the experiment.

      Thank you for pointing this out. Indeed, the AAV-CAG-tdTom virus used the retroAAV capsid; this is now specified (line 575).

      (11) Data/code availability: I didn't see any sort of data/code availability statement, will the data and code be made publicly available?

      Data are stored on local servers at the SPPIN, Université Paris Cité, and are made available upon reasonable request. Code for intrinsic properties analysis is available on github (https://github.com/schoki0710/Intrinsic_Properties). This information is now included (line 717-720).

      (12) Very minor (and these might be a matter of opinion), but I believe "records" should be "recordings", and "viral constructions" should be "viral constructs".

      The text had benefited from proofreading by Richard Miles, who always preferred “records” to “recordings” in his writings. We choose to keep the current wording.

      Reviewer #2 (Recommendations For The Authors):

      Below are two major points that require clarification.

      (1) In the last set of experiments presented by the authors (Figs 6 onwards) they focus on 'putative L4' PrS cells. For several lines of evidence (outlined below), I am convinced that these neurons are not presubicular, but belong to the subiculum. I think this is a major point that requires substantial clarification, in order to avoid confusion in the field (see also suggestions on how to address this comment at the end of this section).

      Several lines of evidence support the interpretation that, what the authors call 'L4 PrS neurons', are distal subicular cells:

      (1.1) The anatomical location of the retrogradely-labelled cells (from mammillary bodies injections), as shown in Figs 6B, C, and Fig. 6_1B, very clearly indicates that they belong to the distal subiculum. The subicular-to-PrS boundary is a sharp anatomical boundary that follows exactly the curvature highlighted by the authors' red stainings. The authors could also use specific subicular/PrS markers to visualize this border more clearly - e.g. calbindin, Wfs-1, Zinc (though I believe this is not strictly necessary, since from the pattern of AD fibers, one can already draw very clear conclusions, see point 1.3 below).

      Our criteria to delimit the presubiculum are the following: First and foremost, we rely on the defining presence of antero-dorsal thalamic fibers that target specifically the presubiculum and not the neighbouring subiculum (Simonnet et al., 2017; Nassar et al., 2018; Simonnet and Fricker, 2018; Liu et al., 2021). This provides the precise outline of the presubicular superficial layers 1 to 3. It may have been confusing to the reviewer that our slicing angle gives horizontal sections. In fact, horizontal sections are favourable for identifying the layer structure of the PrS, based on DAPI staining and the variations in cell body size. The work by Ishihara and Fukuda (2016) illustrates in their Figure 12 that the presubicular layer 4 lies below the presubicular layer 3, and forms a continuation with the subiculum (Sub1). Their Figure 4 indicates with a dotted line the “generally accepted border between the (distal) subiculum and PreS”, and it runs from the proximal tip of superficial cells of the PrS toward the white matter, along the radial direction of the cortical tissue. We agree with this definition. Others have sliced coronally (Cembrowski et al., 2018), which renders a different visualization of the border region with the subiculum.

      Second, let me explain the procedure for positioning the patch electrode in electrophysiological experiments on horizontal presubicular slices. Louis Richevaux, the first author, who carried out the layer 4 cell recordings, took great care to stay very close (<50 µm) to the lower limit of the zone where the GFP labeled thalamic axons can be seen. He was extremely meticulous about the visualization under the microscope, using LED illumination, for targeting. The electrophysiological signature of layer 4 neurons with initial bursts (but not repeated bursting, in mice) is another criterion to confirm their identity (Huang et al., 2017). Post-hoc visualization of their morphology showed their apical dendrites running toward the pia, sometimes crossing through layer 3, sometimes going around the proximal tip, avoiding the thalamic axons (Figure 6D). For example, the cell in Figure 6, suppl. 1 panel D, has an apical dendrite that runs through layer 3 and layer 1.

      Third, retrograde labeling following stereotaxic injection into the LMN is another criterion to define PrS layer 4. This approach is helpful for visualization, and is based on the defining axonal projection of layer 4 neurons (Yoder and Taube, 2011; Huang et al., 2017). Due to the technical challenge to stereotaxically inject only into LMN, the resultant labeling may not be limited to PrS layer 4. We cannot entirely exclude some overflow of retrograde tracers (B) or retrograde virus (C) to the neighboring MMN. This would then lead to co-labeling of the subiculum. In the main Figure 6, panels B and C, we agree that for this reason the red labelled cell bodies likely include also subicular neurons, on the proximal side, in addition to L4 presubicular neurons. We now point out this caveat in the main text (line 324-326) and in the methods (line 591-592).

      (1.2) Consistent with their subicular location, neuronal morphologies of the 'putative L4 cells' are selectively constrained within the subicular boundaries, i.e. they do not cross to the neighboring PrS (maybe a minor exception in Figs. 6_1D2,3). By definition, a neuron whose morphology is contained within a structure belongs to that structure.

      From a functional point of view, for the HD system, the most important criterion for defining presubicular layer 4 neurons is their axonal projection to the LMN (Yoder and Taube 2011). From an electrophysiological standpoint, it is the capacity of layer 4 neurons to fire initial bursts (Simonnet et al., 2013; Huang et al., 2017).  Anatomically, we note that the expectation that the apical dendrite should go straight up into layer 3 might not be a defining criterion in this curved and transitional periarchicortex. Presubicular layer 4 apical dendrites may cross through layer 3 and exit to the side, towards the subiculum (This is the red dendritic staining at the proximal end of the subiculum, at the frontier with the subiculum, Figure 6 C).

      (1.3) As acknowledged by the authors in the discussion (line 408): the PrS is classically defined by the innervation domain of AD fibers. As Figure 6B clearly indicates, the retrogradely-labelled cells ('putative L4') are convincingly outside the input domain of the AD; hence, they do not belong to the PrS.

      The reviewer is mistaken here: the deep layers 4 and 5/6 indeed do not lie in the zone innervated by the thalamic fibers (Simonnet et al., 2017; Nassar et al., 2018; Simonnet and Fricker, 2018), but they still belong to the presubiculum. The presubicular deep layers are located below the superficial layers, next to, and in continuation of, the subiculum. This is in agreement with work by Yoder and Taube 2011; Ishihara and Fukuda 2016; Boccara, … Witter, 2015; Peng et al., 2017 (Fig 2D); Yoshiko Honda et al., 2022 (Marmoset, Fig 2A); Balsamo et al., 2022 (Figure 2B).

      (1.4) Along with the above comment: in my view, the optogenetic stimulation experiments are an additional confirmation that the 'putative L4 cells' are subicular neurons, since they do not receive AD inputs at all (hence, they are outside of the PrS); they are instead only indirectly driven upon strong excitation of the PrS. This indirect activation is likely to occur via PrS-to-Subiculum 'back-projections', the existence of which is documented in the literature and also nicely shown by the authors (see Figure 1_1 and line 109).

      See above. Only superficial layers 1-3 of the presubiculum receive direct AD input.

      (1.5) The electrophysiological properties of the 'putative L4 cells' are consistent with their subicular identity, i.e. they show a sag current and they are intrinsically bursty.

      Presubicular layer 4 cells also show bursting behaviour and a sag current (Simonnet et al., 2013; Huang et al., 2017).

      From the above considerations, and the data provided by the authors, I believe that the most parsimonious explanation is that these retrogradely-labelled neurons (from mammillary body injections), referred to by the authors as 'L4 PrS cells', are indeed pyramidal neurons from the distal subiculum.

      We agree that the retrograde labeling is likely not limited to the presubicular layer 4 cells, and we now indicate this in the text (lines 324-326). However, the portion of retrogradely labeled neurons that lies directly below layer 3 should be considered part of the presubiculum.

      I believe this is a fundamental issue that deserves clarification, in order to avoid confusion/misunderstandings in the field. Given the evidence provided, I believe that it would be inaccurate to call these cells 'L4 PrS neurons'. However, I acknowledge the fact that it might be difficult to convincingly and satisfactorily address this issue within the framework of a revision. For example, it is possible that these 'putative L4 cells' might be retrogradely-labelled from the Medial Mammillary Body (a major subicular target) since it is difficult to selectively restrict the injection to the LMN, unless a suitable driver line is used (if available). The authors should also consider the possibility of removing this subset of data (referring to putative L4), and instead focus on the rest of the story (referring to L3)- which I think by itself, still provides sufficient advance.

      We agree with the reviewer that it is difficult to provide a satisfactory answer. To some extent, the reviewer’s comments target the nomenclature of the subicular region. This transitional region between the hippocampus and the entorhinal cortex has been notoriously ill defined, and the criteria are somewhat arbitrary for determining exactly where to draw the line. Based on the thalamic projection, presubicular layers 1-3 can now be precisely outlined, thanks to the use of viral labeling. But the presubicular layer 4 had been considered to be cell-free in early works, and termed ‘lamina dissecans’ (Boccara 2010), as the limit between the superficial and deep layers. Then it became of great interest to us and to the field, when the PrS layer 4 cells were first identified as LMN projecting neurons (Yoder and Taube 2011). This unique back-projection to the upstream region of the HD system is functionally very important, closing the loop of the Papez circuit (mammillary bodies - thalamus - hippocampal structures).

      We note that the reviewer does not doubt our results, but rather questions the naming conventions. We therefore maintain our data. We agree that in the future a genetically defined mouse line would help to better pin down this specific neuronal population.

      We thank the reviewer for sharing their concerns and giving us the opportunity to clarify our experimental approach to target the presubicular layer 4. We hope that these explanations will be helpful to the readers of eLife as well.

      (2) The PrS anatomy could be better clarified, especially in relation to its modular organization (see e.g. Preston-Ferrer et al., 2016; Ray et al., 2017; Balsamo et al., 2022). The authors present horizontal slices, where cortical modularity is difficult to visualize and assess (tangential sections are typically used for this purpose, as in classical work from e.g. barrel cortex). I am not asking the authors to validate their observations in tangential sections, but just to be aware that cortical modules might not be immediately (or clearly) apparent, depending on the section orientation and thickness. The authors state that AD fibers were 'not homogeneously distributed' in L3 (line 135) and refer to 'patches of higher density in deep L3' (line 136). These statements are difficult to support unless more convincing anatomy and  . I see some L3 inhomogeneity in the green channel in Fig. 1G (last two panels) and also in Fig. 1K, but this seems to be rather upper L3. I wonder how consistent the pattern is across different injections and at what dorsoventral levels this L3 modularity is observed (I think sagittal sections might be helpful). If validated, these observations could point to the existence of non-homogeneous AD innervation domains in L3 - hinting at possible heterogeneity among the L3 pyramidal cell targets. Notably, modularity in L2 and L1 is not referred to. The authors state that AD inputs 'avoid L2' (line 131) but this statement is not in line with recent work (cited above) and is also not in line with their anatomy data in Fig. 1G, where modularity is already quite apparent in L2 (i.e. there are territories avoided by the AD fibers in L2) and in L1 (see for example the last image in Fig. 1G). This is the case also for the RSC axons (Fig. 1H) where a patchy pattern is quite clear in L1 (see the last image in panel H). Higher-mag pictures might be helpful here. 
These qualitative observations imply that AD and RSC axons probably bear a precise structural relationship relative to each other, and relative to the calbindin patch/matrix PrS organization that has been previously described. I am not asking the authors to address these aspects experimentally, since the main focus of their study is on L3, where RSC/AD inputs largely converge. Better anatomy pictures would be helpful, or at least a better integration of the authors' (qualitative) observations within the existing literature. Moreover, the authors' calbindin staining in Fig. 1K is not particularly informative. Subicular, PaS, MEC, and PrS borders should be annotated, and higher-resolution images could be provided. The authors should also check the staining: MEC appears to be blank but is known to strongly express calb1 in L2 (see 'island' by Kitamura et al., Ray et al., Science 2014; Ray et al., frontiers 2017). As additional validation for the staining: I would expect that the empty L2 patches in Figs. 1G (last two panels) would stain positive for Calbindin, as in previous work (Balsamo et al. 2022).

      We now provide a new figure showing the pattern of AD innervation in PrS superficial layers 1 to 3, with different dorso-ventral levels and higher magnification (Figure 2). Because our work was aimed at identifying connectivity between long-range inputs and presubicular neurons, we chose to work with horizontal sections that preserve well the majority of the apical dendrites of presubicular pyramidal neurons. We feel it is enriching for the presubicular literature to show the cytoarchitecture from different angles and to show patchiness in horizontal sections. The non-homogeneous AD innervation domains (‘microdomains’) in L3 were consistently observed across different injections in different animals.

      Author response image 1.

      Thalamic fiber innervation pattern. A, ventral, and B, dorsal horizontal section of the Presubiculum containing ATN axons expressing GFP. Patches of high density of ATN axonal ramifications in L3 are indicated as “ATN microdomains”. Layers 1, 2, 3, 4, 5/6 are indicated. C, High magnification image (63x optical section; different animal).

      We also provide a supplementary figure with images of horizontal sections of calbindin staining in PrS, with a larger crop, for the reviewer to check (Figure 3, see below). We thank the reviewer for pointing out recent studies using tangential sections. Our results agree with the previous observation that AD axons are found in calbindin negative territories (cf Fig 1K). Calbindin+ labeling is visible in the PrS layer 2 as well as in some patches in the MEC (Figure 3 panel A). Calbindin staining tends to not overlap with the territories of ATN axonal ramification. We indicate the inhomogeneities of anterior thalamic innervation that form “microdomains” of high density of green labeled fibers, located in layer 1 and layer 3 (Figure 3, Panel A, middle). Panel B shows another view of a more dorsal horizontal section of the PrS, with higher magnification, with a big Calbindin+ patch near the parasubiculum.

      The “ATN+ microdomains” possess a high density of axonal ramifications from ATN, and have been previously documented in the literature. They are consistently present. Our group had shown them in the article by Nassar et al., 2018, at different dorsoventral levels (Fig 1 C (dorsal) and 1D (ventral) PrS). See also Simonnet et al., 2017, Fig 2B, for an illustration of the typical variations in densities of thalamic fibers, and supplementary Figure 1D. Also Jiayan Liu et al., 2021 (Figure 2 and Fig 5) show these characteristic microzones of dense thalamic axonal ramifications, with more or less intense signals across layers 1, 2, and 3.  While it is correct that thalamic axons can be seen to cross layer 2 to ramify in layer 1, we maintain that AD axons typically do not ramify in layer 2. We modify the text to say, “mostly” avoiding L2 (line 130).

      The reviewer is correct in pointing out that the 'patches of higher density in deep L3' are not only in the deep L3, as in the first panel in Fig 1G, but in the more dorsal sections they are also found in the upper L3. We change the text accordingly (line 135-136) and we provide the layer annotation in Figure 1G. We further agree with the reviewer that RSC axons also present a patchy innervation pattern. We add this observation in the text (line 144).

      It is not yet clear whether anatomical microzones of dense ATN axon ramifications in L3 fulfill the criteria for functional modularity, as is the case for the calbindin patch/matrix PrS organization (Balsamo et al., 2022). As the reviewer points out, this will require more information on the precise structural relationship of AD and RSC axons relative to each other, as well as functional studies. Interestingly, we note a degree of variation in the amplitudes of oEPSCs from different L3 neurons (Fig. 2F, discussion lines 420; 428), which might reflect the local anatomo-functional micro-organization.

      Minor points:

      (1) The pattern of retrograde labelling, or at least the way it is referred to in the results (lines 104ff), seems to imply some topography of AD-to-PrS projections. Is this the case? How consistent are these patterns across experiments and individual injections? Was there variability in injection sites along the dorso-ventral and possibly antero-posterior PrS axes, which could account for a possibly topographical AD-to-PrS input pattern? It would be nice to see a DAPI signal in Fig. 1B since the AD stands out quite clearly in DAPI (Nissl) alone.

      Yes, we find a consistent topography for the AD-to-PrS projection, for similar injection sites in the presubiculum. The coordinates for retrograde labeling were, as indicated, -4.06 (AP), 2.00 (ML) and -2.15 mm (DV). Because all injections used these coordinates, we cannot report on possible variations for different injection sites.

      (2) Fig. 2_2KM: this figure seems to show the only difference the authors found between AD and RS input properties. The authors could consider moving these data into main Fig. 2 (or exchanging them with some of the panels in F-O, which instead show no difference between AD and RSC). Asterisks/stats significance is not visible in M.

      For space reasons we leave the panels of Fig. 2_2KM in the supplementary section. We increased the size of the asterisk in M.

      (3) The data in Fig. 1_1 are quite interesting, since some of the PrS projection targets are 'non-canonical'. Maybe the authors could consider showing some injection sites, and some fluorescence images, in addition to the schematics. Maybe the authors could acknowledge that some of these projection targets are 'putative' unless independently verified by e.g. retrograde labeling. Unspecific white matter labelling and/or spillover is always a potential concern.

      We now include the image of the injection site for the data in Fig. 1_1 as supplementary Fig. 1_2. Figure 1_1 shows the retrogradely labeled areas upstream of the Presubiculum.

      Author response image 2.

      Retrobeads were injected in the right Presubiculum.

      (4) The authors speculate that the near-coincident summation of RS + AD inputs in L3 cells could be a potential mechanism for the binding of visual + HD information in PrS. However, landmarks are learned, and learning typically implies long-term plasticity. As the authors acknowledge in the discussion (lines 493ff), GluR1 is not expressed in PrS cells. What alternative mechanisms could the authors envision? How could the landmark-update process occur in PrS if it is not locally stored? RSC could also be involved (Jakob et al), as acknowledged in the introduction; the authors should keep this possibility open in the discussion as well.

      A similar point has been raised by Reviewer 1; please see our answer to their point 2. Briefly, our results indicate that HD-to-landmark updating is a multi-step process. RSC may be one of the places where landmarks are learned. The subsequent temporal mapping of HD to landmark signals in PrS might be plasticity-free, as matching directional with visual landmark information based on temporal coincidence does not necessarily require synaptic plasticity. It seems likely that there is no local storage and no change in synaptic weights in PrS. The landmark-anchored HD signals reach LMN via L4 neurons, sculpting network dynamics across the Papez circuit. One possibility is that the trace of a landmark that matches HD may be stored as patterns of neural activity that could guide navigation (cf. El-Gaby et al., 2024, Nature). Clearly, more work is needed to understand how the HD attractor is updated on a mechanistic level. Recent work in prefrontal cortex mentions “activity slots” and delineates algorithms for dynamic control of cognitive maps without synaptic plasticity (Whittington et al., 2025, Neuron): information may be stored in neural attractor activity, and the idea that working memory may rely on recurrent updates of neural activity might generalize to the HD system. We include these considerations in the discussion (lines 499-503; 523-533) and also point to alternative models (lines 518-522), including modeling work in the retrosplenial cortex.

      (5) The authors state that (lines 210ff) their cluster analysis 'provided no evidence for subpopulations of layer 3 cells (but see Balsamo et al., 2022)' implying an inconsistency; however, Balsamo et al also showed that the (in vivo) ephys properties of the two HD cell 'types' are virtually identical, which is in line with the 'homogeneity' of L3 ephys properties (in slice) in the authors' data. Regarding the possible heterogeneity of L3 cells: the authors report inhomogeneous AD innervation domains in L3 (see also main comment 2) and differences in input summation (some L3 cells integrate linearly, some supra-linearly; lines 272) which by itself might already imply some heterogeneity. I would therefore suggest rewording the statements to clarify what the lack of heterogeneity refers to.

      We agree. In line 212 we now state “cluster analysis (Figure 2D) provided no evidence for subpopulations of layer 3 cells in terms of intrinsic electrophysiological properties (see also Balsamo et al., 2022).”

      (6) n=6 co-recorded pairs are mentioned at line 348, but n=9 at line 366. Do these numbers refer to the same dataset? Please correct or clarify.

      Line 349 refers to a set of 6 co-recorded pairs (n=12 neurons) in double-injected mice, with Chronos injected in ATN and Chrimson in RSC (cf. Fig. 7E). The 9 pairs mentioned in line 367 refer to another type of experiment, in which we depolarized layer 3 neurons to induce action potential firing while recording neighboring layer 4 neurons to assess connectivity. Line 367 now reads: “In n = 9 paired recordings, we did not detect functional synapses between layer 3 and layer 4 neurons.”

      Reviewer #3 (Recommendations For The Authors):

      Questions for the authors/points for addressing:

      I found that the slice electrophysiology experiments were not reported with sufficient detail. For example, in Figure 2, I am assuming that the voltage clamp experiments were carried out using the Cs-based recording solution, while the current clamp experiments were carried out using the K-Gluc intracellular solution. However, this is not explicitly stated and it is possible that all of these experiments were performed using the K-Gluc solution, which would give slightly odd EPSCs due to incomplete space/voltage clamp. Furthermore, the method states that gabazine was used to block GABA(A) receptor-mediated currents, but not when this occurred. Was GABAergic neurotransmission blocked for all measurements of EPSC magnitude/dynamics? If so, why not block GABA(B) receptors? If not blocking GABAergic transmission for measuring EPSCs, why not? This should be stated explicitly either way.

      The addition of drugs or a difference in solution is indicated in the figure legend and/or in the figure itself, as well as in the methods. We now state explicitly: “In a subset of experiments, the following drugs were used to modulate the responses to optogenetic stimulations; the presence of these drugs is indicated in the figure and figure legend, whenever applicable.” (line 632). A Cs-based internal solution and gabazine were used in Figure 5; this is now indicated in the Methods section (line 626). All other experiments were performed using K-Gluc as an internal solution and ACSF.

      Methods: The experiments involving animals are incompletely reported. For example, were both sexes used? The methods state "Experiments were performed on wild‐type and transgenic C57Bl6 mice" - what transgenic mice were used and why is this not reported in detail (strain, etc)? I would refer the authors to the ARRIVE guidelines for reporting in vivo experiments in a reproducible manner (https://arriveguidelines.org/).

      We now added this information in the methods section, subsection “Animals” (line 566-567). Animals of both sexes were used. The only transgenic mouse line used was the Ai14 reporter line (no phenotype), depending on the availability in our animal facility.

      For experiments comparing ATN and RSC inputs onto the same neuron (e.g. Figure 2 supplement 2 G - J), are the authors certain that the observed differences (e.g. rise time and paired-pulse facilitation on the ATN input) are due to differences in the synapses and not a result of different responses of the opsins? Refer to https://pubmed.ncbi.nlm.nih.gov/31822522/ from Jess Cardin's lab. This could easily be tested by switching which opsin is injected into which nucleus (a fair amount of extra work) or comparing the Chrimson synaptic responses with those evoked using Chronos on the same projection, as used in Figure 2 (quite easy as authors should already have the data).

      We actually did switch the opsins across the two injection sites. In Figure 2 - supplement 2G-J, the values linked by a dashed line result from recordings in the switched configuration with respect to the original configuration (in full lines, Chronos injected in RSC and Chrimson in ATN). The values from switched configuration followed the trend of the main configuration and were not statistically different (Mann-Whitney U test).

      Statistical reporting: While the number of cells is generally reported for experiments, the number of slices and animals is not. While slice ephys often treat cells as individual biological replicates, this is not entirely appropriate as it could be argued that multiple cells from a single animal are not independent samples (some sort of mixed effects model that accounts for animals as a random effect would be better). For the experiments in the manuscript, I don't think this is necessary, but it would certainly reassure the reader to report how many animals/slices each dataset came from. At a bare minimum, one would want any dataset to be taken from at least 3 animals from 2 different litters, regardless of how many cells are in there.

      Our slice electrophysiology experiments include data from 38 successfully injected animals: 14 animals injected in ATN, 20 animals injected in RSC, and 4 double injected animals. Typically, we recorded 1 to 3 cells per slice. We now include this information in the text or in the figure legends (line 159, 160, 297, 767, 826, 831, 832, 839, 845, 901, 941).
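      The reviewer's concern about treating cells from one animal as independent samples can be illustrated with a minimal sketch. It does not fit the mixed-effects model the reviewer mentions; it shows a cruder alternative, which is to average cells within each animal and treat the animal as the unit of analysis. The animal identifiers and amplitude values are hypothetical and are not data from the manuscript.

```python
from collections import defaultdict
from statistics import mean


def animal_level_means(records):
    """Collapse per-cell measurements to one mean value per animal.

    records: iterable of (animal_id, value) pairs, one pair per recorded cell.
    Returns a dict mapping animal_id to the mean of that animal's cells.
    """
    per_animal = defaultdict(list)
    for animal_id, value in records:
        per_animal[animal_id].append(value)
    return {animal: mean(values) for animal, values in per_animal.items()}


# Hypothetical EPSC amplitudes (pA); identifiers and values are illustrative only.
records = [
    ("m1", 100.0), ("m1", 140.0),
    ("m2", 90.0),
    ("m3", 210.0), ("m3", 190.0), ("m3", 200.0),
]

means = animal_level_means(records)
n_animals = len(means)              # number of independent units
grand_mean = mean(means.values())   # mean of the animal-level means
```

      Averaging within animals discards within-animal variability, so it is conservative; a mixed-effects model with animal as a random effect would retain the cell-level information.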

      For the optogenetic experiments looking at the summation of EPSPs (e.g. figure 4), I have two questions: why were EPSPs measured and not EPSCs? The latter would be expected to give a better readout of AMPA receptor-mediated synaptic currents. And secondly, why was 20 Hz stimulation used for these experiments? One might expect theta stimulation to be a more physiologically-relevant frequency of stimulation for comparing ATN and RSC inputs to single neurons, given the relevance with spatial navigation and that the paper's conclusions were based around the head direction system. Similarly, gamma stimulation may also have been informative. Did the authors try different frequencies of stimulation?

      Question 1. The current clamp configuration allows us to measure EPSP amplification/prolongation by NMDA or persistent Na currents (cf. Fricker and Miles 2000), which might contribute to supralinearity.

      Question 2. In a previous study from our group on the AD to PrS connection (Nassar et al., 2018), no significant difference in EPSC dynamics was observed between stimulations at 10 Hz versus 30 Hz. We therefore chose 20 Hz. This value is in the range of HD cell firing (Taube 1995, 1998: peak firing rates of 18 to 24 spikes/sec in RSC and 41 spikes/sec in AD, although mean firing rates might be lower; Blair and Sharp 1995). In hindsight, we agree that it would have been useful to include 8 Hz or 40 Hz stimulations.

      The GABA(A) antagonist experiments in Figure 5 are interesting but I have concerns about the statistical power of these experiments - n of 3 is absolutely borderline for being able to draw meaningful conclusions, especially if this small sample of cells came from just 1 or 2 animals. The number of animals used should be stated and/or caution should be applied when considering the potential mechanisms of supralinear summation of EPSPs. It looks like the slight delay in RSC input EPSP relative to ATN that was in earlier figures is not present here - could this be the loss of feedforward inhibition?

      The current clamp experiments in the presence of QX314 and a Cs gluconate based internal solution were preceded by initial experiments using puff applications of glutamate to the recorded neurons (not shown). Results from those experiments had pointed towards a role for TTX resistant sodium currents and for NMDA receptor activation as a factor favoring the amplification and prolongation of glutamate induced events. They inspired the design of the dual wavelength stimulation experiments shown in Figure 5, and oriented our discussion of the results. We agree of course that more work is required to dissect the role of disinhibition for EPSP amplification. This is however beyond the present study.

      Concerning the EPSP onset delays following RSC input stimulation: in this set of experiments, we compensated for the notoriously longer delay to EPSP onset following RSC axon stimulation by shifting the photostimulation (red) of RSC fibers to -2 ms relative to the onset of photostimulation of ATN fibers (blue). This experimental trick improved the alignment of the onset of the postsynaptic response, as shown in the figure below for the reviewer.

      Author response image 3.

      In these experiments, the onset of RSC photostimulation was advanced by 2 ms (to -2 ms relative to the onset of ATN photostimulation), in an attempt to better align the EPSP onset with the one evoked by ATN stimulation.

      We have inserted a sentence in the results to indicate that the experiments illustrated in Figure 5 were performed in only a small sample of 3 cells that came from 2 mice (line 297), so caution should be applied. In the discussion we now formulate this more carefully: “From a small sample of cells it appears that EPSP amplification may be facilitated by a reduction in synaptic inhibition (n = 3; Figure 5)” (line 487).

      Figure 7: I appreciate the difficulties in making dual recordings from older animals, but no conclusion about the RSC input can legitimately be made with n=1.

      Agreed. We want to avoid any overinterpretation, and point out in the results section that the RSC stimulation data are from a single cell pair. The sentence now reads: “... layer 4 neurons occurred after firing in the layer 3 neuron, following ATN afferent stimuli, in 4 out of 5 cell pairs. We also observed this sequence when RSC input was activated, in one tested pair.” (lines 347-349)

      Minor points:

      Line 104: 'within the two subnuclei that form the anterior thalamus' - the ATN actually has three subdivisions (AD, AV, AM) so this should state 'two of the three nuclei that form the anterior thalamus...'

      Corrected, line 103

      Line 125: should read "figure 1F" and not "figure 2F".

      Corrected, line 124

      Line 277-280: Why were two different posthoc tests used on the same data in Figures 3E & F?

      We used Sidak’s multiple comparisons test to compare Sum vs. Dual for each event (two different configurations at each time point; asterisks), and Friedman’s and Dunn’s tests to compare the nth EPSP amplitude to the first one for Dual events (same configuration between time points; hashmarks). We give two-way ANOVA results in the legend.
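      For readers unfamiliar with the Sum vs. Dual comparison, the quantity being tested can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function name and the amplitude values are hypothetical, and the statistical tests named above are not reproduced. "Sum" is taken to be the arithmetic sum of the EPSP amplitudes evoked by each input alone, and "Dual" the amplitude recorded when both inputs are stimulated together; a ratio above 1 would indicate supralinear summation.

```python
def summation_ratios(dual, atn_alone, rsc_alone):
    """Return Dual / (ATN + RSC) for each EPSP in a stimulation train.

    dual, atn_alone, rsc_alone: equal-length sequences of EPSP amplitudes (mV),
    one entry per stimulus in the train.
    A ratio of 1 means linear summation; a ratio above 1 means supralinear.
    """
    if not (len(dual) == len(atn_alone) == len(rsc_alone)):
        raise ValueError("all amplitude sequences must have the same length")
    ratios = []
    for dual_amp, atn_amp, rsc_amp in zip(dual, atn_alone, rsc_alone):
        expected_sum = atn_amp + rsc_amp  # the arithmetic "Sum" configuration
        if expected_sum == 0:
            raise ValueError("expected sum is zero; ratio is undefined")
        ratios.append(dual_amp / expected_sum)
    return ratios


# Hypothetical amplitudes (mV) for a three-pulse train; not data from the study.
atn = [2.0, 2.0, 2.0]
rsc = [1.0, 1.5, 2.0]
dual = [3.0, 4.2, 6.0]

ratios = summation_ratios(dual, atn, rsc)
supralinear = [r > 1.0 for r in ratios]
```

      In this made-up train the first EPSP sums linearly and the later ones supralinearly, which is the kind of time-point-dependent difference that a per-event Sum vs. Dual comparison is designed to detect.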

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Major concerns:

      (1) Is the direct binding of MCAK to the microtubule cap important for its in vivo function?

      a. The authors claim that their "study provides mechanistic insights into understanding the end-binding mechanism of MCAK". I respectfully disagree. My concern is that the paper offers limited insights into the physiological significance of direct end-binding for MCAK activity, even in vitro. The authors estimate that in the absence of other proteins in vitro, ~95% of MCAK molecules arrive at the tip by direct binding in the presence of ~ physiological ATP concentration (1 mM). In cells, however, the major end-binding pathway may be mediated by EB, with the direct binding pathway contributing little to none. This is a reasonable concern because the apparent dissociation constant measured by the authors shows that MCAK binding to microtubules in the presence of ATP is very weak (69 uM). This concern should be addressed by 1) calculating relative contributions of direct and EB-dependent pathways based on the affinities measured in this and other published papers and estimated intracellular concentrations. Although there are many unknowns about these interactions in cells, a modeling-based analysis may be revealing. 2) Recapitulating these pathways in vitro using purified proteins is also feasible. Ideally, some direct evidence should be provided, e.g. based on MCAK function-separating mutants (GDP-Pi tubulin binding vs. catalytic activity at the curled protofilaments), that the contribution of direct MCAK binding to the microtubule cap is significant in the presence of EB.

      We thank the reviewer for the thoughtful comments.

      (1) We think that the end-binding affinity of MCAK makes a significant contribution to its cellular functions. To elucidate this concept, we now use a simple model shown in Supplementary Appendix-2 (see pages 49-51, lines 1246-1316). In this model, we simplified MCAK and EB1 binding to microtubule ends by considering only these two proteins while neglecting other factors (e.g. XMAP215). Specifically, we considered two scenarios: one in which both proteins freely diffuse in the cytoplasm and another where MCAK is localized to specific cellular structures, such as the centrosome or centromere. Based on the modeling results, we argue that MCAK's functional impact at microtubule ends derives both from its intrinsic end-binding capacity and its ability to strengthen the EB1-mediated end association pathway.

      (2) We agree with the reviewer that MCAK exhibiting a lower end-binding affinity (69 µM) is indeed intriguing, as one might intuitively expect a stronger affinity, e.g. in the nanomolar range. Several factors may contribute to this observation. First, this could be partly due to the in vitro system employed, which may not perfectly replicate in vivo conditions, especially when considering cellular processes quantitatively. Variations in medium composition can significantly influence the binding state. For example, reducing salt concentration leads to a marked increase in MCAK’s binding affinity (Helenius et al., 2006; Maurer et al., 2011; McHugh et al., 2019). Additionally, while numerous binding events with short durations were detected, we excluded transient interactions from our analysis to facilitate quantification. This likely leads to an underestimation of the on-rate and, consequently, the binding affinity. Moreover, to minimize the interference of purification tags (His-tag), we ensured their complete removal during protein sample preparation. Previous studies reported that retaining the His-tag of MAPs affects the binding affinity to microtubules (Maurer et al., 2011; Zhu et al., 2009). Finally, a low affinity is not necessarily unexpected. Considering the microtubule end as a receptor with multiple binding sites for MCAK, the overall binding affinity is in the nanomolar range (260 nM). This does not necessarily contradict MCAK being a microtubule dynamics regulator as only a few MCAK molecules may suffice to induce microtubule catastrophe (as discussed on page 13, lines 408-441).

      (3) Ideally, we would search for mutants that specifically interfere with the binding of GDP-Pi-tubulin or the curled protofilaments. However, the mutant we tested significantly impacts the overall affinity of MCAK to microtubules (both end and lattice), making it challenging to isolate and discuss the function of MCAK with respect to the binding to GDP-Pi-tubulin alone. Additionally, we also think that the GDP-Pi-tubulin in the EB cap and the tubulin in the curved protofilaments may share structural similarities. For instance, the tubulin dimers in both states may be less compact compared to those in the lattice, which could explain why MCAK recognizes both simultaneously (Manka and Moores, 2018). However, this remains a conjecture, as there is currently no direct evidence to support it.
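
      One way to read the multi-site argument in point (2) above is as a simple scaling: if the microtubule end is treated as N equivalent, independent sites, the apparent dissociation constant at low occupancy is roughly the per-site value divided by N. The sketch below is only an illustrative check, not the authors' calculation. It assumes that N is set by the ~160 nm end-binding region mentioned later in this response and by the conversion of 1625 dimers per µm used by Reviewer #2; the function name is hypothetical.

```python
# Back-of-the-envelope check under stated assumptions; not the authors' model.
KD_SITE_UM = 69.0      # per-site end-binding K_d quoted in the response (µM)
DIMERS_PER_UM = 1625   # 13 protofilaments x 125 dimers per µm (8 nm dimers)
END_REGION_UM = 0.160  # assumed length of the end-binding region (µm)


def apparent_kd_nm(kd_site_um, n_sites):
    """Apparent K_d (nM) for n independent, equivalent sites at low occupancy."""
    return kd_site_um * 1000.0 / n_sites


n_sites = DIMERS_PER_UM * END_REGION_UM          # about 260 sites
kd_apparent = apparent_kd_nm(KD_SITE_UM, n_sites)
print(f"{n_sites:.0f} sites, apparent K_d of about {kd_apparent:.0f} nM")
```

      Under these assumptions the apparent value comes out near 265 nM, the same order as the 260 nM quoted above; a different choice of N would shift it proportionally.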

      b. As mentioned in the Discussion, preferential MCAK binding to tubulins near the MT tip may enhance MCAK targeting of terminal tubulins AFTER the MCAK has been "delivered" to the distal cap via the EB-dependent mechanism. This is a different targeting mechanism than the direct MCAK-binding. However, the measured binding affinity between MCAK and GMPCPP tubulins is so weak (69 uM), that this effect is also unlikely to have any impact because the binding events between MCAK and microtubule should be extremely rare. Without hard evidence, the arguments for this enhancement are very speculative.

      Please see our response to the comment No. 1. Additionally, we have revised our discussion to discuss the end-binding affinity of MCAK as well as its physiological relevance (please see page 13, lines 408-441; and see Supplementary Appendix-2 in pages 49-51, lines 1246-1316).

      (2) The authors do not provide sufficient justification and explanation for their investigation of the effects of different nucleotides in MCAK binding affinity. A clear summary of the nucleotide-dependent function of MCAK (introduction with references to prior affinity measurements and corresponding MCAK affinities), the justifications for this investigation, and what has been learned from using different nucleotides (discussion) should be provided. My take on these results is that by far the strongest effect on microtubule wall and tip binding is achieved by adding any adenosine, whereas differences between different nucleotides are relatively minor. Was this expected? What can be learned from the apparent similarity between ATP and AMPPNP effects in some assays (Fig 1E, 4C, etc) but not others (Fig 1D,F, etc)?

      We thank the reviewer for this suggestion. We have revised the manuscript accordingly, and the main points of our response are given below.

      (1) The experiment investigating the effects of different nucleotides on MCAK binding affinity was inspired by the previous studies demonstrating that kinesin-13 interactions with microtubules are highly dependent on their adenosine-bound states. For example, kinesin-13s tightly bind microtubules and prefer to form protofilament curls or rings with tubulin in the AMPPNP state, whereas kinesin-13s are considered to move along the microtubule lattice via one-dimensional diffusion in the ADP·Pi state (Asenjo et al., 2013; Benoit et al., 2018; Friel and Howard, 2011; Helenius et al., 2006). Based on these observations, we wondered whether MCAK's adenosine-bound states might similarly affect its binding preference for growing microtubule ends. We have made the motivation clear in the revised manuscript (please see page 7, lines 199-209).

      (2) Our main finding regarding the effects of nucleotides is that MCAK shows differential end-binding affinity and preference based on its nucleotide state. First, MCAK shows the greatest preference for growing microtubule ends in the ATP state, supporting the idea that diffusive MCAK (MCAK·ATP) can directly bind to growing microtubule ends. Second, MCAK·ATP also demonstrates a binding preference for GTPγS microtubules and the ends of GMPCPP microtubules. The similar trends in binding preference suggest that the affinity for GDP·Pi-tubulin and GTP-tubulin likely underpins MCAK’s preference for growing microtubule ends. To clarify these points, we have added further discussion to the manuscript (please see page 8, lines 230-233; page 9, lines 258-270; and pages 13-14, lines 443-458).

      (3) It is not clear why the authors decided to use these specific mutant MCAK proteins to advance their arguments about the importance of direct tip binding. Both mutants are enzymatically inactive. Both show roughly similar tip interactions, with some (minor) differences. Without a clear understanding of what these mutants represent, the provided interpretations of the corresponding results are not convincing.

      We thank the reviewer for this comment. In the revised manuscript, we no longer draw conclusions about the importance of end-binding based on the mutant data. Instead, we think that the mutant data provide insights into the structural basis of the end-binding preference. Therefore, we have rewritten the results in this section to more accurately reflect these findings (please see page 10, lines 295-327).

      (4) GMPCPP microtubules are used in the current study to represent normal dynamic microtubule ends, based on some published studies. However, there is no consensus in the field regarding the structure of growing vs. GMPCPP-stabilized microtubule ends, which additionally may be sensitive to specific experimental conditions (buffers, temperature, age of microtubules, etc). To strengthen the authors' argument, Taxol-stabilized microtubules should be used as a control to test if the effects are specific. Additionally, the authors should consider the possibility that stronger MCAK binding to the ends of different types of microtubules may reflect MCAK-dependent depolymerization events on a very small scale (several tubulin rows). These nano-scale changes to tubulins and the microtubule end may lead to the accumulation of small tubulin-MCAK aggregates, as is seen with other MAPs and slowly depolymerizing microtubules. These effects for MCAK may also depend on specific nucleotides, further complicating the interpretation. This possibility should be addressed because it provides a different interpretation than presented in the manuscript.

      Regarding the two points raised here, our thoughts are as follows.

      (1) The end of GMPCPP-stabilized microtubules differs from that of growing microtubules, with the most obvious known difference being the absence of the region enriched in GDP-Pi-tubulin. We consider the end of GMPCPP microtubules as an analogue of the distal tip of growing microtubules, based on two key features: (1) curled protofilaments and (2) GMPCPP-tubulin, a close analogue of GTP-tubulin. Notably, both features are present at the ends of both GMPCPP-stabilized and growing microtubules. Moreover, we agree with the suggestion to use taxol-stabilized microtubules as a control. Because taxol-stabilized microtubules lack the second feature (GTP-like tubulin), this control allows us to isolate the effect of the first feature. Therefore, we conducted this experiment, and our data showed that MCAK exhibits only a mild binding preference for the ends of taxol-stabilized microtubules, which is much less pronounced than for the ends of GMPCPP microtubules. This observation supports the idea that GMPCPP-stabilized ends closely resemble the growing ends of microtubules.

      (2) The reviewer suggested that stronger MCAK binding to the ends of different types of microtubules might reflect MCAK-dependent depolymerization events on a very small scale. This is an insightful possibility, which we had overlooked in the original manuscript. Fortunately, we performed the experiments at single-molecule concentrations. Upon reviewing the raw data, we found that under ATP conditions, the binding events of MCAK were not cumulative (see Author response image 1 below) and showed no evidence of local accumulation of MCAK-tubulin aggregates.

      Author response image 1.

      Representative kymograph showing GFP-MCAK binding at the ends and lattice of GMPCPP microtubules in the presence of 1 mM ATP (10 nM GFP-MCAK), corresponding to Fig. 5A. Arrow: end-binding of MCAK. Vertical bar: 1 s; horizontal bar: 2 µm.

      (5) It would be helpful if the authors provided microtubule polymerization rates and catastrophe frequencies for assays with dynamic microtubules and MCAK in the presence of different nucleotides. The video recordings of microtubules under these conditions are already available to the authors, so it should not be difficult to provide these quantifications. They may reveal that microtubule ends are different (or not) under the examined conditions. It would also help to increase the overall credibility of this study by providing data that are easy to compare between different labs.

      We thank the reviewer for this suggestion. In the revised manuscript, we have provided data on the growth rates, which are similar across the different nucleotide states (Fig. s1). However, due to the short duration of our recordings (usually 5 minutes, but with a high frame rate, 10 fps), we did not observe many catastrophe events, which prevented us from quantifying catastrophe frequency using the current dataset. Since we measured the binding kinetics of MCAK during the growing phase of microtubules, the similar growth rates and microtubule end morphologies suggest that the microtubule ends are comparable across the different conditions.

      Reviewer #1 (Recommendations For The Authors):

      a. Please provide more details about how the microtubule-bound molecules were selected for analysis (include a description of scripts, selection criteria, and filters, if any). Fig 1A arrows do not provide sufficient information.

      We first measured the fluorescence intensity of each binding event. A probability distribution of these intensities was then constructed and fitted with a Gaussian function. A binding event was considered to correspond to a single molecule if its intensity fell within μ±2σ of the distribution. The details of the single-molecule screening process are now provided in the revised manuscript (see page 17, lines 574-583).
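
      The μ±2σ screening described above can be sketched as follows. This is a minimal illustration rather than the authors' script: it estimates μ and σ from sample moments instead of fitting a Gaussian to the binned distribution, and the function name and intensity values are made up.

```python
from statistics import mean, stdev


def select_single_molecules(intensities, n_sigma=2.0):
    """Keep events whose intensity lies within mu +/- n_sigma * sigma.

    mu and sigma are sample moments here, a simplification of fitting a
    Gaussian function to the intensity distribution.
    """
    mu = mean(intensities)
    sigma = stdev(intensities)
    low, high = mu - n_sigma * sigma, mu + n_sigma * sigma
    kept = [x for x in intensities if low <= x <= high]
    return kept, mu, sigma


# Made-up intensities (A.U.); the last value mimics a brighter aggregate.
events = [280, 295, 300, 305, 310, 290, 315, 298, 302, 640]
kept, mu, sigma = select_single_molecules(events)
print(f"kept {len(kept)} of {len(events)} events")
```

      With a real dataset, a fit to the histogram would be less sensitive to bright outliers than the sample moments used in this sketch.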

      b. Evidence that MCAK is dimeric in solution should be provided (gel filtration results, controls for Figs1A - bleaching, or comparison with single GFP fluorophore).

      In the revised manuscript, we provide the gel filtration results of purified MCAK and other proteins used in this study. The elution volume of the peak for GFP-MCAK corresponded to a molecular weight range between 120 kDa (EB1-GFP dimer) and 260 kDa (XMAP215-GFP-his6), suggesting that GFP-MCAK exists as a dimer (~220 kDa) under the experimental conditions (please see Fig. s1 and page 5, lines 104-105). In addition, we measured the fluorescence intensity of both MCAK<sup>sN+M</sup> and MCAK. MCAK<sup>sN+M</sup> is a monomeric mutant that contains the neck domain and motor domain (Wang et al., 2012). The average intensity of MCAK<sup>sN+M</sup> is 196 A.U., about 65% of that of MCAK (300 A.U.). These two measurements suggest that the purified MCAK used in this study exists as dimers (see Fig. s1).

      c. Evidence that MCAK on microtubules represents single molecules should be provided (distribution of GFP brightness with controls - GFP imaged under identical conditions). Since assay buffers include detergent, which is not desirable, all controls should be done using the same assay conditions. The authors should rule out that their main results are detergent-sensitive.

      (1) Regarding whether MCAK on microtubules represents single molecules: please refer to our responses to the two points above.

      (2) To rule out an effect of Tween-20 (0.0001%, v/v), we performed additional control experiments. The results showed that it has no significant effect on the microtubule-binding affinity of MCAK (see Author response image 2 below).

      Author response image 2.

      Tween-20 (0.0001%, v/v) has no significant effect on the microtubule-binding affinity of MCAK. (A) Representative projection images of GFP-MCAK (5 nM) binding to taxol-stabilized GDP microtubules in the presence of 1 mM AMPPNP with or without Tween-20. The upper panel shows the results of the control experiments performed without MCAK. Scale bar: 5 µm. (B) Statistical quantification of the binding intensity of GFP-MCAK on GDP microtubules with or without Tween-20 (53 microtubules from 3 assays and 70 microtubules from 3 assays, respectively). Data are presented as mean ± SEM. Statistical comparisons were performed using the two-tailed Mann-Whitney U test with Bonferroni correction; n.s., not significant.

      d. How did the authors plot single-molecule intensity distributions? I am confused as to why the intensity distribution for single molecules in Fig 1D and 2A looks so perfectly smooth, non-pixelated, and broader than expected for GFP wavelength. Please provide unprocessed original distributions, pixel size, and more details about how the distributions were processed.

      In the revised manuscript, we provided unprocessed original data in Fig. 1B and Fig. 2A. We thank the reviewer for pointing out this problem.

      e. Many quantifications are based on a limited number of microtubules and the number of molecules is not provided, starting from Fig 1D and down. Please provide detailed statistics and explain what is plotted (mean with SEM?) on each graph.

      We performed a thorough inspection of the manuscript and corrected the identified issues.

      f. Plots with averaged data should be supplemented with error bars and N should be provided in the legend. E.g. Fig 1C - average position of MT and peak positions.

      We agree with the reviewer. In the revised manuscript, we have made the changes accordingly (e.g. Fig. 2C).

      g. Detailed information should be provided about protein constructs used in this work including all tags. The use of truncated proteins or charged/bulky tags can modify protein-microtubule interactions.

      We agree with the reviewer. In the revised manuscript, we provide the information of all constructs (see Fig. s1 and the related descriptions in Methods, pages 15-16, lines 476-534).

      h. Line 515: We estimated that the accuracy of microtubule end tracking was ~6 nm by measuring the standard error of the distribution of the estimated error in the microtubule end position. - evidence should be provided using the conditions of this study, not the reference to the prior work by others.

      i. Line 520: We estimated that the accuracy of the measured position was ~2 nm by measuring the standard error of the fitting peak location". Please provide evidence.

      Point h-i: we now provide detailed descriptions of how to estimate tracking and measurement accuracy and error in our work. Please see pages 18-19, lines 626-645.

      j. Kymographs in Fig 5G are barely visible. Please provide single-channel greyscale images. What are the dim molecules diffusing on this microtubule?

      We have incorporated the changes suggested by the reviewer. We think that some of the dim signals may result from stochastic background noise, while others likely represent transient binding of MCAK. The exposure time in our experiments was approximately 0.05 seconds; if the binding duration were shorter than this, the signal would be lower (i.e. the “dim” signals). It is important to note that in this study, we selected binding events lasting at least 2 consecutive frames, meaning that transient binding events were not included. This point has been clarified in the Methods section (see page 17, lines 573-583).
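
      The "at least 2 consecutive frames" criterion mentioned above can be illustrated with a short sketch. It assumes that each candidate spot has already been reduced to a list of frame indices in which signal was detected; the function name and the example indices are hypothetical, and this is not the authors' analysis code.

```python
def binding_events(frames_detected, min_frames=2):
    """Group frame indices into runs of consecutive frames and keep only
    runs that last at least min_frames. Returns (first, last) frame pairs."""
    events, run = [], []
    for frame in sorted(frames_detected):
        if run and frame == run[-1] + 1:
            run.append(frame)          # extends the current run
        else:
            if len(run) >= min_frames:
                events.append((run[0], run[-1]))
            run = [frame]              # start a new run
    if len(run) >= min_frames:
        events.append((run[0], run[-1]))
    return events


# Hypothetical detections: frames 3 and 20 are single-frame (transient) signals.
detected = [3, 10, 11, 12, 20, 30, 31]
events = binding_events(detected)
print(events)
```

      Single-frame detections are dropped by construction, which is why short-lived interactions do not enter the on-rate estimate.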

      k. Please provide a methods description for Fig 6. Did the buffer include 1 mM ATP? The presence of ATP would make these conditions more physiological. ATP concentration should be stated clearly in the main text or figure legend.

      The buffer contains ATP. In the revised manuscript, we have provided the methods for the experiments of microtubule dynamics assay, as well as the analysis of microtubule lifetimes and catastrophe frequency (see page 17, lines 561-572 and page 20, lines 685-690).

      l. Line 104: experiment was performed in BRB80 supplemented with 50 mM KCl and 1 mM ATP, providing a nearly physiological ion strength. Please provide a reference or add your calculations in Methods.

      We have provided references on page 5, lines 101-104 of our manuscript.

      m. What was the MCAK concentration in Figure 4? Did the microtubule shorten under any of these conditions?

      In these experiments, we used a very low concentration of MCAK and taxol-stabilized microtubules, so no microtubule shortening was observed. ATP: 10 nM GFP-MCAK; AMPPNP: 1 nM GFP-MCAK; ADP: 10 nM GFP-MCAK; APO state: 0.1 nM GFP-MCAK.

      Other criticism:

      Text improvements are recommended in the Discussion. For example, line 348: Fourth, the loss of the binding preference.. suggests that the binding preference .. is required for the optimal .. preference.

      We thank the reviewer for pointing this out. In the revised manuscript, we conducted a thorough revision and review of the text.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Chen et al. investigate the localization of microtubule kinesin-13 MCAK to the microtubule ends. MCAK is a prominent microtubule depolymerase whose molecular mechanisms of action have been extensively studied by a number of labs over the last ~twenty years. Here, the authors use single-molecule approaches to investigate the precise localization of MCAK on growing microtubules and conclude that MCAK preferentially binds to a GDP-Pi-tubulin portion of the microtubule end. The conclusions are speculative and not well substantiated by the data, making the impact of the study in its current form rather limited. Specifically, greater effort should be made to define the region of MCAK binding on microtubule ends, as well as its structural characteristics. Given that MCAK has been previously shown to effectively tip-track growing microtubule ends through an established interaction with EB proteins, the physiological relevance of the present study is unclear. Finally, the manuscript does not cite or properly discuss a number of relevant literature references, the results of which should be directly compared and contrasted to those presented here.

      We thank the reviewer for the comments. As these suggestions are more thoroughly expressed in the following comments for authors, we will provide the responses in the corresponding sections, as shown below.

      Reviewer #2 (Recommendations For The Authors):

      Significant concerns:

      (1) Establishing the precise localization of MCAK wrt microtubule end is highly non-trivial. More details should be provided, including substantial supplementary data. In particular, the authors claim ~6 nm accuracy in microtubule end positioning - this should be substantiated by data showing individual overlaid microtubule end intensity profiles as well as fits with standard deviations etc. Furthermore, to conclude that MCAK binds behind XMAP215, the authors should look at the localization of the two proteins simultaneously, on the same microtubule end. Notably, EB binding profiles are well known to exponentially decay along the microtubule lattice - this is not very apparent from the presented data. If MCAK's autonomous binding pattern matches that of EB, we should be seeing an exponentially-decaying localization for MCAK as well? However, averaged MCAK signals seem to only be fitted to Gaussian. Note that the EB binding region (i.e. position and size of the EB comet) can be substantially modulated by increasing the microtubule growth rate - this can be easily accomplished by increasing tubulin concentrations or the addition of XMAP215 (e.g. see Maurer et al. Cur Bio 2014). Thus to establish that MCAK on its own binds the same region as EB, experiments that directly modulate the size and the position of this region should be added.

      (1) We thank the reviewer for this comment. Regarding the accuracy in microtubule end positioning, we now provide more details, and please see pages 18-19, lines 625-645 in the revised manuscript.

      (2) Regarding the relative localization of XMAP215 and MCAK, we performed additional experiments to record their colocalizations simultaneously, on the same microtubule end. Our results showed that MCAK predominantly binds behind XMAP215, with 14.5% appearing within the XMAP215’s binding region. Please see Fig. 2.D-E and lines 184-197 in the revised manuscript.

      (3) Regarding the exponential decay of the EB1 signal along microtubules, we observed that the position probability distribution measured in the present study follows a Gaussian distribution, and the expected exponential decay was not apparent. Since the exponential decay is thought to result from the time delay between tubulin polymerization and GTP hydrolysis, slower polymerization is expected to reduce this latency (Maurer et al., 2014). In our experiments, the growth rate was relatively low (~0.7 µm/min), much slower than the rate observed in cells, where the comet-shaped EB1 signal is most pronounced. A previous study has shown that the exponential decay of EB1 is more pronounced at growth rates exceeding 3 µm/min in vitro (Maurer et al., 2014). Therefore, we think that the relatively slow growth may account for the observed non-exponential distribution of the EB1 signals. The same reason may also explain the distribution of MCAK.

      (4) We agree with the reviewer’s suggestion that altering microtubule growth rate is a valid and effective approach to regulate the EB cap length. However, the conclusion that MCAK binds to the EB region is supported by three lines of evidence: (1) the localization of MCAK at the ends of microtubules, (2) new experimental data showing that MCAK binds to the proximal end of the XMAP215 site, and (3) the tendency of MCAK to bind GTPγS microtubules, similar to EB1. Based on these findings, we did not pursue additional experiments to modify the length of the EB cap.

      (2) Even if MCAK indeed binds behind XMAP215, there is no evidence that this region is defined by the GDP-Pi nucleotide state; it could still be curved protofilaments. GTPyS is an analogue of GTP - to what extent GTPyS microtubules exactly mimic the GDP-Pi-tubulin state remains controversial. Furthermore, nucleotide sensing for EB is thought to be achieved through its binding at the interface of four tubulin dimers. However MCAK's binding site is distinct, and it has been shown to recognize intradimer tubulin curvature. Thus it is not clear how MCAK would sense the nucleotide state. On the other hand, there is mounting evidence that the morphology of the growing microtubule end can be highly variable, and that curved protofilaments may be protruding off the growing ends for tens of nanometers or more, previously observed both by EM as well as by fluorescence (e.g. Mcintosh, Moores, Chretien, Odde, Gardner, Akhmanova, Hancock, Zanic labs). Thus, to establish that MCAK indeed localizes along the closed lattice, EM approaches should be used.

      First, we conducted additional experiments that demonstrate MCAK indeed binds behind XMAP215, supporting the conclusion that MCAK interacts with the EB cap (please see Fig. 2 in the revised manuscript). Second, our argument that MCAK preferentially binds to GDP-Pi tubulin is based on two observations: (1) the binding regions of MCAK overlap with those of EB1, and (2) MCAK preferentially binds to GTPγS microtubules, which are considered a close analogue of GDP-Pi tubulin. Third, understanding the structural basis of how MCAK senses the nucleotide state of tubulin is beyond the scope of the present study. However, inspired by the reviewer’s suggestion, we looked into the structure of the MCAK-tubulin complex. The L2 loop of MCAK makes direct contact with the interdimer interface (Trofimova et al., 2018; Wang et al., 2017), which could provide a structural basis for recognizing the changes induced by GTP hydrolysis. While this remains a hypothesis, it is certainly a promising direction for future research. Fourth, we agree with the reviewer that an EM approach would be ideal for establishing that MCAK localizes along the closed lattice. However, this is not the focus of the current study. Instead, we argue that MCAK binds to the EB cap, where at least some lateral interactions are likely to have formed.

      (3) The physiological relevance of the study is rather questionable: MCAK has been previously established to be able to both diffuse along the microtubule lattice (e.g. Helenius et al.) as well as hitchhike on EBs (Gouveia et al.). Given the established localization of EBs to growing microtubule ends in cells, and apparently higher affinity of MCAK for EB vs. the microtubule end itself (although direct comparisons with the literature have not been reported here), the relevance of MCAK's autonomous binding to dynamic microtubule ends is dubious.

      We thank the reviewer for raising the importance of physiological relevance. Please refer to our response to comment No. 1 of Reviewer 1. Briefly, we think that the end-binding affinity of MCAK makes a significant contribution to its cellular functions. To elucidate this concept, we now use a simple model shown in Supplementary Appendix-2 (see pages 49-51, lines 1246-1316). In this model, we simplified MCAK and EB1 binding to microtubule ends by considering only these two proteins while neglecting other factors (e.g. XMAP215). Specifically, we considered two scenarios: one in which both proteins freely diffuse in the cytoplasm and another where MCAK is localized to specific cellular structures, such as the centrosome or centromere. Based on the modeling results, we argue that MCAK's functional impact at microtubule ends derives both from its intrinsic end-binding capacity and its ability to strengthen the EB1-mediated end association pathway.

      (4) Finally, the study seriously lacks discussion of and comparison with the existing literature on this topic. There are major omissions in citing relevant literature, such as e.g. landmark study by Kinoshita et al. Science 2001. Several findings reported here directly contradict previous findings in the literature. Direct comparison with e.g. Gouveia et al findings, Helenius et al. findings, and others need to be included. For example, Gouveia et al reported that EB is necessary for MCAK plus-end-tracking in vitro (please see Figure 1 of their manuscript). The authors should discuss how they reconcile the differences in their findings when compared to this earlier study.

      We thank the reviewer for this helpful suggestion. In the revised manuscript, we have updated the text and included comparative discussions with other relevant studies in the Discussion section. Specifically, we added comparisons with the research on XMAP215 on page 14, lines 459-472 (Barr and Gergely, 2008; Kinoshita et al., 2001; Tournebize et al., 2000). Additionally, we have compared our findings with those of Gouveia et al. and Helenius et al. regarding MCAK's preference for binding microtubule ends on page 6, lines 145-157 and page 13, lines 408-441, respectively (Gouveia et al., 2010; Helenius et al., 2006).

      Additional specific comments:

      Figure 1

      Gouveia et al. (Figure 1) reported that MCAK does not autonomously preferentially localize to growing tips. Specifically, Gouveia et al. found equal association rates of MCAK to both the lattice and the tip in the presence of EB3delT, an EB3 construct that does not directly interact with MCAK. How can these findings be reconciled with the results presented here?

      We are uncertain why no difference between the on-rates to the lattice and the end was observed in the study by Gouveia et al. Even when considering only the known affinity of MCAK for curved protofilaments at the distal tip of growing microtubules, we would still expect to observe an end-binding preference. After carefully comparing the experimental conditions, we nevertheless identified some differences. First, we used a 160 nm tip size to calculate the on-rate (k<sub>on</sub>), whereas Gouveia et al. used a 450 nm tip. Using a longer tip size would naturally lead to a smaller k<sub>on</sub> value. Note that we chose 160 nm for several reasons: (i) a previous cryo-electron tomography study has shown that the sheet structures of dynamic microtubule ends have an average length of around 180 nm (Guesdon et al., 2016); (ii) analysis of fluorescence signals at dynamic microtubule ends has demonstrated that the taper length at the microtubule end is less than 180 nm (Maurer et al., 2014); (iii) in the present study, we estimated that the length of MCAK's end-binding region is approximately 160 nm. Second, in Gouveia et al., single-molecule binding events were recorded in the presence of 75 nM EB3ΔT, which could potentially create a crowded environment at the tip, reducing MCAK binding. Third, as mentioned in our response to Reviewer 1, we took great care to minimize the interference from purification tags (e.g., His-tag) by ensuring their complete removal during protein preparation. Previous studies reported that retaining the His-tag of MAPs led to a significant increase in binding to microtubules (Maurer et al., 2011; Zhu et al., 2009). We believe that some of the factors mentioned above, or their combined effects, may account for the differences between these two observations.
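
      The dependence of the on-rate on the chosen tip size can be made concrete with a short sketch. It assumes the on-rate is normalized per tubulin dimer within the tip window, using the conversion of 1625 dimers per µm cited by Reviewer #2, and that the same number of events is attributed to the tip in both cases. The event count, concentration, and observation time are hypothetical, and the function is not taken from either study.

```python
DIMERS_PER_UM = 1625  # conversion cited by Reviewer #2 (13 protofilaments, 8 nm dimers)


def k_on_per_dimer(n_events, conc_nm, tip_len_nm, obs_time_s):
    """On-rate in events per (nM * s * dimer) for a tip window of tip_len_nm."""
    n_dimers = DIMERS_PER_UM * tip_len_nm / 1000.0
    return n_events / (conc_nm * obs_time_s * n_dimers)


# Hypothetical measurement: 50 tip events at 10 nM over 300 s.
n_events, conc_nm, obs_time_s = 50, 10.0, 300.0
k160 = k_on_per_dimer(n_events, conc_nm, 160.0, obs_time_s)
k450 = k_on_per_dimer(n_events, conc_nm, 450.0, obs_time_s)
print(f"k_on(450 nm) / k_on(160 nm) = {k450 / k160:.2f}")
```

      Under these assumptions the 450 nm window yields an on-rate about 0.36 times the 160 nm value, so the window choice alone can change the estimate nearly threefold. If additional events fall in the wider window, the reduction would be smaller.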

      1C shows the decay of tubulin signal over several hundred nm - should show individual traces? How aligned? Doesn't this long decay suggest protruding protofilaments? (E.g. Odde/Gardner work).

      (1) In the revised manuscript, we now show individual traces (e.g. in Fig. 1B and Fig. 2A). The average trace for the tubulin signal with standard deviation is shown in Fig. 2C.

      (2) In every frame, the microtubule lattice was modeled as a Gaussian wall and its end as a half-Gaussian. The peak position of the half-Gaussian in each frame was used to align and average the microtubule end signals over the dwell time. The averaged half-Gaussian peak position of the microtubule end then served as the reference for measuring the intensity profile of each single-molecule binding event in every frame (see page 18, lines 607-624).

      (3) We think that the decay of tubulin signal results from the convolution of the tapered end structure and the point spread function. In the revised manuscript, we have updated the Figures to provide unprocessed original data in Fig. 1B and Fig. 2A.

      Please show absolute numbers of measurements in 1C (rather than normalized distribution only).

      In the revised manuscript, we have included the raw data for both tubulin and MCAK signals as part of the methods description. In Fig. 1, using normalized values allows for the simultaneous representation of microtubule and protein signals on a unified graph.

      How do the results in 1D-G compare with the previous literature? Particularly comparison of on-rates between this study and the Gouveia et al? Assuming 1 um = 1625 dimers, it appears that in the presence of EB3, the on-rate of MCAK to the tips reported in Gouveia et al. is an order of magnitude higher than reported here in the absence of EB3 (4.3 x 10E-4 vs. 2 x 10E-5). If so, and given the robust presence of EB proteins at growing microtubule ends in cells, this would invalidate the potential physiological relevance of the current study. Note that the dwell times measured in Gouveia et al. are also longer than those measured here.

      Note that in Gouveia et al., the concentration of mCherry-EB3 was 75 nM, about 187.5 times that of MCAK (0.4 nM). This concentration ratio does not necessarily hold in cells. Regarding the physiological relevance of the end-binding affinity of MCAK itself, please refer to our response to point No. 1 of Reviewer 1.

      Notably, Helenius et al reported a diffusion constant for MCAK of 0.38 um^2/s, which is more than an order of magnitude higher than reported here. The authors should comment on this!

      In the revised manuscript, we have provided an explanation for the difference in diffusion coefficient. Please see page 6, lines 142-157. In short, low-salt conditions facilitate rapid diffusion of MCAK.

      Figure 2:

      This figure is critical and really depends on the analysis of the tubulin signal. Note significant variability in tubulin signal between presented examples in 2A. Also, while 2C looks qualitatively similar, there appears to be significant variability over the several hundred nm from the tip along the lattice. This is the crucial region; statistical significance testing should be presented. More detailed info, including SDs etc. is necessary.

      In the revised manuscript, we have provided raw data in Fig. 1B and Fig. 2A. Additionally, we have provided a statistical analysis of the tubulin signals (Fig. 2C) and performed significance tests. Please see page 5, lines 111-116 and page 7, lines 179-183 for detailed descriptions.

      Insights into the morphology of microtubule ends based on TIRF imaging have been previously gained in the literature, with reports of extended tip structures/protruding protofilaments (see e.g. Coombes et al. Cur Bio 2013, based on the methods of Demchouk et al. 2011). Such analysis should be performed here as well, if we are to conclude that nucleotide state alone, as opposed to the end morphology, specifies MCAK's tip localization.

      We appreciate the reviewer’s suggestion and agree that it provides a valid optical microscopy-based approach for estimating microtubule end morphology. However, this method does not establish a direct correlation between microtubule end morphology and tubulin nucleotide status. Therefore, we think that refining the measurement of microtubule end morphology would not necessarily improve our understanding of tubulin nucleotide status at MCAK binding sites. Based on the available data in the present study, there are two main pieces of evidence supporting the idea that MCAK can sense tubulin nucleotide status: (1) the binding regions of MCAK and EB overlap significantly, and (2) MCAK shows a clear preference for binding to GTPγS microtubules, similar to EB1 (we provide a new control to support this, Fig. s4). Of course, we do not consider this evidence to be complete. As the reviewer has pointed out here and in other suggestions, future work should aim to further distinguish the nucleotide status of tubulin in the dynamic versus non-dynamic regions at the ends of microtubules, and to investigate the structural basis by which MCAK recognizes tubulin nucleotide status.

      EB comet profile should be clearly reproduced. MCAK should follow the comet profile.

      Please see our 3<sup>rd</sup> response to point 1 of this reviewer.

      The conclusion that the MCAK binding region is larger than XMAP215 is not firm, based on the data presented. The authors state that 'the binding region of MCAK was longer than that of XMAP215'. What is the exact width of the region of the XMAP215 localization and how much longer is the MCAK end-binding region? Is this statistically significant?

      We have revised this part in the revised manuscript (page 6, lines 167-172). The position probability distributions of MCAK and XMAP215 were significantly different (K-S test, p< 10<sup>-5</sup>), and the binding region of MCAK (FWHM=185 nm) was significantly longer than that of XMAP215 (FWHM=123 nm).
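
      For readers unfamiliar with the two quantities reported here, the following is a minimal sketch of how a two-sample K-S statistic and a Gaussian FWHM can be computed. It is not the authors' analysis code. The function names are hypothetical, the FWHM relation assumes a Gaussian profile (the response does not state how the FWHM values were obtained), and the sketch returns only the K-S statistic, not the p-value.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # Both empirical CDFs are step functions that change only at data points,
    # so checking every observed value is sufficient to find the maximum gap.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def gaussian_fwhm(sigma):
    """Full width at half maximum of a Gaussian with standard deviation sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma


if __name__ == "__main__":
    # Made-up positions (nm from the tip), purely for illustration.
    print(ks_statistic([10, 40, 80, 120], [80, 120, 160, 200]))
    print(gaussian_fwhm(50.0))
```

      A statistic near 0 means the two position distributions are nearly indistinguishable, and a value of 1 means they do not overlap at all. Converting the statistic to a p-value additionally depends on the two sample sizes.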

      MCAK localization with AMPPNP should also be performed here. Even low concentrations of MCAK have been shown to induce microtubule catastrophe/end depolymerization. This will dramatically affect microtubule end morphology, and thus apparent positioning of MCAK at the end.

      In the end positioning experiment, we used a low concentration of MCAK (1 nM). Under this condition, microtubule dynamics remained unchanged, and the morphology of the microtubule ends was comparable across different conditions (with EB1, MCAK or XMAP215). Additionally, in the revised manuscript, we present a new experiment in which we recorded the localization of both MCAK and XMAP215 on the same microtubule. The results support the conclusion regarding their relative localization: most MCAK is found at the proximal end of the XMAP215 binding region, while approximately 15% of MCAK is located within the XMAP215 binding region. Please see Fig. 2D-E and page 7, lines 184-197 for the corresponding descriptions.

      Figure 3:

      For clearer presentation, projections showing two microtubule lattice types on the same image (in e.g. two different colors) should be shown first without MCAK, and then with MCAK.

      We thank the reviewer for this suggestion. We have adjusted the figure accordingly. Please see Fig. 4 in the revised manuscript.

      Please comment on absolute intensity values - scales seem to be incredibly variable.

      The fluorescence value presented here is the result of multiple images being summed. Therefore, the difference in absolute values is influenced not only by the binding affinity of MCAK in different states to microtubules, but also by the number of images used. In this analysis, we are not comparing MCAK in different states, but rather evaluating the binding ability of MCAK in the same state on different types of microtubules.

      Given that the authors conclude that MCAK binding mimics that of EB, EB intensity measurements and ratios on different lattice substrates should be performed as a positive control.

      We performed additional experiments with EB1 and, in the revised manuscript, provide the data as a positive control (please see Fig. s4).

      Figure 4:

      MCAK-nucleotide dependence of GMPCPP microtubule-end binding has been previously established (see e.g. Helenius et al, others?) - what is new here? Need to discuss the literature. This would be more appropriate as a supplemental figure?

      In the present study, we reproduced the GMPCPP microtubule-end binding of MCAK in the AMPPNP state, as shown in several previous reports (Desai et al., 1999; Hertzer et al., 2006). Here, we also quantified the end to lattice binding preference, and our results showed that the nucleotide state-dependence shows the same trend as the binding preference of MCAK to the growing microtubule ends. Therefore, we prefer to keep this figure in the main text (Fig. 5).

      Figure 5:

      Please note that both MCAK mutants show an additional two orders of magnitude lower microtubule binding on-rates when compared to wt MCAK. This makes the analysis of preferential binding substrate for these mutants dubious.

      We agree with this point and have rewritten this part. Please see page 10, lines 295-327, in the revised manuscript.

      Figure 6:

      Combined effects of XMAP215 and XKCM1 (MCAK) have been previously explored in the landmark study by Kinoshita et al. Science 2001, which should be cited and discussed. Also note that Moriwaki et al. JCB 2016 explored the combined effects of XMAP215 and MCAK - which should be discussed here and compared to the current results.

      We agree with the reviewer. We have revised the discussion on this part. Please see page 11, lines 329-342 and page 14, lines 459-472 in the revised manuscript.

      Please report quantification for growth rate and lifetime.

      In the revised manuscript, we provide all these data. Please see pages 11-12, lines 343-374.

      To obtain any new quantitative information on the combined effects of the two proteins, at the very minimum, the authors should perform a titration in protein concentration.

      We agree with the reviewer on this point. In our pilot experiments, we performed titration experiments to determine the appropriate concentrations of MCAK and XMAP215, respectively. We selected 50 nM for XMAP215, as it clearly enhances the growth rate and exhibits a mild promoting effect on catastrophe, two key effects of XMAP215 reported in previous studies (Brouhard et al., 2008; Farmer et al., 2021). Reducing the XMAP215 concentration eliminates the catastrophe-promoting effect, while increasing it would not enhance the growth rate much further. For MCAK, we chose 20 nM, as it effectively promotes catastrophe; increasing the concentration beyond this point leads to no microtubule growth, at least in the MCAK-only condition. Without microtubule growth, it would be difficult to quantify the parameters of microtubule dynamics, hindering a clear comparison of the combined versus individual effects. Therefore, we think that the concentrations used in this study are appropriate and representative. In the revised manuscript, we make this point clearer (see page 11, lines 329-342).

      Finally, the writing could be improved for overall clarity.

      We thank the reviewer for pointing this out. We have thoroughly revised and reviewed the text of the manuscript.

      Reviewer #3 (Public Review):

      The authors revisit an old question of how MCAK goes to microtubule ends, partially answered by many groups over the years. The authors seem to have omitted the literature on MCAK in the past 10-15 years. The novelty is limited due to what has previously been done on the question. Previous work showed MCAK targets to microtubule plus-ends in cells through association with EB proteins and Kif18b (work from Wordeman, Medema, Walczak, Welburn, Akhmanova) but none of their work is cited.

      We thank the reviewer for the suggestion. Some of the referenced work has already been cited in our manuscript, such as studies on the interaction between MCAK and EB1. However, other relevant literature had not been properly cited. In the revised manuscript, we have added further discussion on this topic in the context of existing findings. Please refer to pages 3-4, lines 68-85, and page 13, lines 425-441.

      It is not obvious in the paper that these in vitro studies only reveal microtubule end targeting, rather than plus end targeting. MCAK diffuses on the lattice to both ends and its conformation and association with the lattice and ends has also been addressed by other groups-not cited here. I want to particularly highlight the work from Friel's lab where they identified a CDK phosphomimetic mutant close to helix4 which reduces the end preference of MCAK. This residue is very close to the one mutated in this study and is highly relevant because it is a site that is phosphorylated in vivo. This study and the mutant produced here suggest a charge-based recognition of the end of microtubules.

      Here the authors analyze this MCAK recognition of the lattice and microtubule ends, with different nucleotide states of MCAK and in the presence of different nucleotide states for the microtubule lattice. The main conclusion is that MCAK affinity for microtubules varies in the presence of different nucleotides (ATP and analogs) which was partially known already. How different nucleotide states of the microtubule lattice influence MCAK binding is novel. This information will be interesting to researchers working on the mechanism of motors and microtubules. However, there are some issues with some experiments. In the paper, the authors say they measure MCAK residency of growing end microtubules, but in the kymographs, the microtubules don't appear dynamic - in addition, in Figure 1A, MCAK is at microtubule ends and does not cause depolymerization. I would have expected to see depolymerization of the microtubule after MCAK targeting. The MCAK mutants are not well characterized. Do they still have ATPase activity? Are they folded? Can the authors also highlight T537 and discuss this?

      Finally, a few experiments are done with MCAK and XMAP215, after the authors say they have demonstrated the binding sites overlap. The data supporting this statement were not obvious and the conclusions that the effect of the two molecules are additive would argue against competing binding sites. Overall, while there are some interesting quantitative measurements of MCAK on microtubules - in particular in relation to the nucleotide state of the microtubule lattice - the insights into end-recognition are modest and do not address or discuss how it might happen in cells. Often the number of events is not recorded. Histograms with large SEM bars are presented, so it is hard to get a good idea of data distribution and robustness. Figures lack annotations. This compromises therefore their quantifications and conclusions. The discussion was hard to follow and needs streamlining, as well as putting their work in the context of what is known from other groups who produced work on this in the past few years.

      We thank the reviewer for the comments. Regarding the physiological relevance of the end-binding of MCAK itself, please refer to our response to the point No.1 of reviewer 1. Moreover, as we feel that other suggestions are more thoroughly expressed in the following comments for authors, we will provide the responses in the corresponding sections, as shown below.

      Reviewer #3 (Recommendations For The Authors):

      Why, on dynamic microtubules, is MCAK at microtubule plus ends and does not cause a catastrophe?

      At these concentrations (10 nM MCAK with 16 µM tubulin in Fig. 1; 1 nM MCAK with 12 µM tubulin in Fig. 2), MCAK has little effect on microtubule dynamics in our experiments. Using TIRFM, we were able to observe individual MCAK binding events. Based on these observations, we think that under the current experimental conditions, a single binding event of MCAK is insufficient to induce microtubule catastrophe; rather, catastrophe likely requires cumulative changes resulting from multiple binding events.

      Do the MCAK mutants still have ATPase activity?

      The ATPase activities of MCAK<sup>K525A</sup> and MCAK<sup>V298S</sup> are both reduced to about 1/3 of the wild-type (Fig. s6).

      The intensities of GFP are not all the same on the microtubule lattice (eg 1A). See blue and white arrowheads. The authors could be looking at multiple molecules of GFP-MCAK instead of single dimers. How do they account for this possibility?

      In the revised manuscript, we provide the gel filtration result for the purified MCAK. The position of the peak corresponds to ~220 kDa, demonstrating that the purified MCAK in solution is dimeric (please see Fig. s1 and page 5, lines 101-103). We measured the fluorescence intensity of each binding event. A probability distribution of these intensities was then constructed and fitted with a Gaussian function. A binding event was considered to correspond to a single molecule if its intensity fell within μ±2σ of the distribution. The details of the single-molecule screening process are provided in the revised manuscript (see page 17, lines 574-583).

      In addition, we measured the fluorescence intensity of both MCAK<sup>sN+M</sup> and MCAK. MCAK<sup>sN+M</sup> is a monomeric mutant that contains the neck domain and motor domain (Wang et al., 2012). The average intensity of MCAK<sup>sN+M</sup> is 196 A.U., about 65% of that of MCAK (300 A.U.), suggesting that MCAK is a dimer (see Fig. s1). Moreover, we think that some of the dim signals may result from stochastic background noise, while others likely represent transient binding of MCAK. The exposure time in our experiments was approximately 0.05 seconds; if the binding duration were shorter than this, the signal would be lower. It is important to note that in this study, we specifically selected binding events lasting at least 2 consecutive frames, meaning that transient binding events were not included. This point has been clarified in the Methods section (see page 17, lines 568-569 and lines 574-583).
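
      The screening criteria described in this response can be illustrated with a minimal sketch. It is not the authors' code, and it makes three assumptions. First, the function name and the `(intensity, n_frames)` event format are hypothetical. Second, the sample mean and standard deviation stand in for the parameters of the Gaussian fitted to the intensity distribution. Third, the frame-count filter is applied before the intensity window is computed, an order the response does not specify.

```python
from statistics import mean, stdev


def screen_single_molecules(events, min_frames=2, n_sigma=2.0):
    """Keep events that last at least `min_frames` frames and whose intensity
    lies within mean +/- n_sigma * SD of the remaining events.

    `events` is a list of (intensity, n_frames) tuples (hypothetical format).
    """
    # Step 1: drop transient events shorter than the minimum duration.
    lasting = [e for e in events if e[1] >= min_frames]
    # Step 2: estimate mu and sigma (stand-in for the Gaussian fit).
    intensities = [e[0] for e in lasting]
    mu, sigma = mean(intensities), stdev(intensities)
    low, high = mu - n_sigma * sigma, mu + n_sigma * sigma
    # Step 3: keep only events inside the mu +/- n_sigma * sigma window.
    return [e for e in lasting if low <= e[0] <= high]


if __name__ == "__main__":
    # Made-up intensities (A.U.) and durations (frames), for illustration only.
    demo = [(300, 3), (310, 4), (290, 2), (305, 5), (295, 3),
            (302, 2), (298, 4), (900, 3), (150, 1)]
    print(screen_single_molecules(demo))
```

      In this toy input the one-frame event is removed by the duration filter and the unusually bright event falls outside the intensity window, so neither is kept.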

      Could the authors provide a kymograph of an MT growing, in the presence of MCAK+AMPPNP? Can MCAK track the cap?

      Under single-molecule conditions, we observed a single MCAK molecule briefly binding to the end of the microtubule. However, we did not record if MCAK at high concentrations could track microtubule ends under AMPPNP conditions.

      In the experiments in Figure 6, the authors should also show the localization of MCAK and XMAP215 at microtubule plus ends in their kymographs to show the two molecules overlap.

      Regarding the relative localization of XMAP215 and MCAK, we conducted additional experiments to record both proteins simultaneously at the same microtubule end. Our results show that MCAK predominantly binds behind XMAP215, with 14.5% of MCAK binding within the XMAP215 binding region. Please see Fig. 2D-E and page 7, lines 184-197 in the revised manuscript. However, we argue that the effects of XMAP215 and MCAK are additive, and their binding sites do not necessarily need to overlap for these effects to occur.

      The authors do not report what statistical tests are done in their graphs, and one concern is over error propagation of their data. Instead of bar graphs, showing the data points would be helpful.

      We have now shown all data points in the revised manuscript.

      MCAK+AMPPNP accumulates at microtubule ends. Appropriate quotes from previous work should be provided.

      We have made the revisions accordingly. Please see page 9, lines 273-276.

      Controls are missing. An SEC profile for all purified proteins should be presented. Also, the authors need to explain if they report the dimeric or monomeric concentration of MCAK, XMAP215, etc...

      We have provided the gel filtration results for all purified proteins in the revised manuscript (Fig. s1). Moreover, we now make it clear that the concentrations of MCAK and EB1 are monomeric concentrations. Please see the legend for Fig. 1, line 893 in the revised manuscript.

      Figure 1: the microtubules don't look dynamic at all. This is also why the authors can record MCAK at microtubule ends, because their structure is not changing.

      The microtubules are dynamic, but they may appear non-dynamic due to the relatively slow growth rate and the high frame rate at which we are recording. We propose that individual binding events of MCAK induce structural changes at the nanoscopic or molecular scale, which are not detectable using TIRFM.

      I recommend the authors measure the Kon and Koff for single GFP-MCAK mutant molecules and provide the information alongside their normalized and averaged binding intensities of GFP-MCAK in Fig 5. Showing data points instead of bar graphs would be better.

      (1) We measured k<sub>on</sub> and dwell time for mutants at growing microtubule end. However, we did not perform single-molecule tracking for MCAK’s binding on stabilized microtubules. This is mainly because the superimposed signal on the stable microtubule already indicates the changes in the mutant's binding affinity to different microtubule structures, and moreover, the binding of the mutants is highly transient, making accurate single-molecule tracking and calculations difficult.

      (2) In the revised figure, we have included the data points in all plots.

      When discussing how Kinesin-13 interacts with the lattice, the authors should quote the papers that report the organization of full-length Kinesin-13 on tubulin heterodimers: Trofimova et al, 2018; McHugh et al 2019; Benoit et al, 2018. It would reinforce their model and account for the full-length protein, rather than just the motor domain.

      We thank the reviewer for the suggestion. In our manuscript, we have cited papers on full-length Kinesin-13 to discuss the interaction between MCAK and the curved structure at microtubule ends. Additionally, we have used the MCAK-tubulin crystal structure (PDB ID: 5MIO) in Fig. 6, as it depicts human MCAK, which is consistent with the protein used in our study. This structure illustrates the interaction sites between MCAK and the tubulin dimer, guiding our mutation studies on specific residues. Thus, we prefer to use this structure (PDB ID: 5MIO) in Fig. 6.

      Figure 5A. What type of model is this? A PDB code is mentioned. Is this from an X-ray structure? If so, mention it.

      We have now included the structural information in the figure legend (see page 37, line 1045).

      Figure 5B. It is not possible to distinguish the different microtubule lattices (GTPyS, GDP, and GMPCPP). The experiment needs to be better labelled.

      We thank the reviewer for this comment. We have now rearranged the figure for better clarity (see Fig. 6).

      Figure 5D: what are the statistical tests? I don't understand "The statistical comparisons were made versus the corresponding value of GFP-MCAK".

      We have made this point clearer in the revised manuscript (see page 38, lines 1078-1080).

      What is the "EB cap"? This needs explaining.

      We now provide an explanation; please see page 4, lines 87-89 in the revised manuscript.

      Work from Friel and co-workers showed MCAK T537E did not have depolymerizing activity and a reduced affinity for microtubule ends. The work of the authors should be discussed with respect to this previously published work.

      We thank the reviewer for this suggestion. In the revised manuscript, we have added discussions on this (see page 10, lines 303-307).

      The concentration of protein used in the assays is not always described.

      We have checked throughout the manuscript and made revisions accordingly.

      "Having revealed the novel binding sites of MCAK in dynamic microtubule ends " should be on "we wondered how MCAK may work ..with EB1". This is not addressed so should be removed. Instead, they can quote the work from Akhmanova's lab. Realistically this section should be rephrased as there are other plus-end targeting molecules that compete with MCAK, not just XMAP215 and EB1.

      We have rephrased this section as suggested by this reviewer to be more specific. Please see page 11, lines 329-342.

      What is AMPCPP?

      It should be “AMPPNP”

      Typos in Figure 5.

      Corrected

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      We thank the reviewer for his/her very positive comments.

      Reviewer #2 (Public review):

      We thank the reviewer for his/her positive evaluation. We plan to add RNAseq data of yeast wild-type and JDP mutant strains as more direct readout for the role of Apj1 in controlling Hsf1 activity. We agree with the reviewer that our study includes one major finding: the central role of Apj1 in controlling the attenuation phase of the heat shock response. In accordance with the reviewer we consider this finding highly relevant and interesting for a broad readership. We agree that additional studies are now necessary to mechanistically dissect how the diverse JDPs support Hsp70 in controlling Hsf1 activity. We believe that such analysis should be part of an independent study but we will indicate this aspect as part of an outlook in the discussion section of a revised manuscript.

      Reviewer #3 (Public review):

      We thank the reviewer for his/her suggestions. We agree that it is sometimes difficult to distinguish direct effects of JDP mutants on heat shock regulation from indirect ones, which can result from the accumulation of misfolded proteins that titrate Hsp70 capacity. We also agree that an in vitro reconstitution of Hsf1 displacement from DNA by Apj1/Hsp70 will be important, also to dissect Apj1 function mechanistically. We will add this point as outlook to the revised manuscript.

      Reviewer #1 (Recommendations for the authors): 

      (1) Can the authors submit the raw translatome data to a standard repository? Also, the data should be summarized in a supplemental Excel table. 

      We submitted the raw translatome data to the NCBI Gene Expression Omnibus and added the analyzed data sets (shown in Figures 1 and 5) as Supplementary Tables S4/S5 (Excel sheets). We additionally included RNAseq analysis of yeast WT and JDP mutant strains grown at 25°C, complementing and confirming our former translatome analysis (new Figure 5, Figure Supplement 2). The corresponding transcriptome raw data were also deposited at the NCBI Gene Expression Omnibus, and the analyzed data are available as Supplementary Table S7.

      (2) MW indicators need to be added to the Western Blot figures. 

      We added molecular weight markers to the Western Blot figures.

      (3) Can the authors please include the sequences of the primers used in all the RT-qPCR experiments? They mention they are in the supplemental information, but I couldn't locate them. 

      We added the sequences of the RT-qPCR primers as Supplementary Table S4.

      (4) Given the clear mechanism proposed, it would be nice if the authors could provide a nice summary figure. 

      We followed the suggestion of the reviewer and illustrate our main finding as new Figure 7.

      Reviewer #2 (Recommendations for the authors): 

      (1) As mentioned above, a co-IP experiment between Hsf1 and Ssa1/2 in APJ1 and apj1∆ cells, utilizing Hsf1 alleles with and without the two known binding sites, would cement the assignment of Apj1 in the Hsf1 regulatory circuit. 

      We agree with the reviewer that Hsf1-Ssa1/2 pulldown experiments, as done by Pincus and colleagues (1), will further specify the role of Apj1 in targeting Hsp70 to Hsf1 during the attenuation phase of the heat shock response. We have made extensive attempts at such pulldown experiments to document dissociation of Ssa1/2 from Hsf1 upon heat shock in yeast wild-type cells. While we could specifically detect Ssa1/2 upon Hsf-HA1 pulldown, our results after heat shock were highly variable and inconclusive and did not allow us to probe for a role of Apj1 or the two known Ssa1/2 binding sites in the phase-specific targeting. We now discuss the potential roles of the two distinct Ssa1/2 binding sites for phase-specific regulation of Hsf1 activity in the revised manuscript (page 12, lines 17-21).

      (2) Experiments in Figure 3 nicely localize CHIP reactions with known HSEs. A final confirmatory experiment utilizing a mutated HSE (another classic experiment in the field) would cement this finding and validate the motif and reporter-based analysis. 

      We thank the reviewer for this meaningful suggestion. We have performed a comparable experiment using the non-Hsf1-regulated gene BUD3, which lacks HSEs, as a reference. We engineered a counterpart, termed “BUD3 HS-UAS”, which bears inserted HSEs, derived from the native UAS of HSP82, within the BUD3 UAS. We show that BUD3<sup>+</sup> lacking HSEs is not occupied by Hsf1 and Apj1 under either non-stress or heat shock conditions, while BUD3-HSE is clearly occupied under both, paralleling Hsf1 and Apj1 occupancy of HSP82 (Figure 3E). We have renamed the engineered allele to “BUD3-HSE” to clarify the experimental design and output.

      (3) Page 8 - the ydj1-4xcga allele is introduced without explaining why it's needed, since ydj1∆ cells are viable. The authors should acknowledge the latter fact, then justify why the RQC depletion approach is preferred. Especially since the ydj1∆ mutant appears in Figure 5B. 

      ydj1∆ cells are viable, yet they grow extremely slowly at 25°C and hardly at all at 30°C, making them difficult to handle. The RQC-mediated depletion of Ydj1 in ydj1-4xcga cells allows for solid growth at 30°C, facilitating strain handling and analysis of Ydj1 function. Importantly, ydj1-4xcga cells are still temperature-sensitive and exhibit the same deregulation of the heat shock response upon combination with apj1∆ as observed for ydj1∆ cells. Thus, ydj1 knockout and knockdown cells do not differ in the relevant phenotypes reported here, and we performed most of the analysis with ydj1-4xcga cells due to their growth advantage. We added a corresponding explanation to the text (page 8, lines 13-14).

      (4) The authors raise the possibility that Sis1, Apj1, and Ydj1 may all be competing for access to Ssa1/2 at different phases of the HSR, and that access may be dictated by conformational changes in Hsf1. Given that there are at least two known Hsp70 binding sites that have negative regulatory activity in Hsf1, the possibility that domain-specific association governs the different roles should be considered. It is also unclear how the JDPs are associating with Hsf1 differentially if all binding is through Ssa1/2. 

      We thank the reviewer for the comment and will add the possibility of specific roles of the identified Hsp70 binding sites in regulating Hsf1 activity at the different phases of the heat shock response to the discussion section. Binding of Ssa1/2 to substrates (including Hsf1) is dependent on J-domain proteins (JDPs), which differ in substrate specificity. It is tempting to speculate that the distinct JDPs recognize different sites in Hsf1 and are responsible for mediating the specific binding of Ssa1/2 to either N- or C-terminal sites in Hsf1. Thus, the specific binding of a JDP to Hsf1 might dictate the binding of Ssa1/2 to either binding site. We discuss this aspect in the revised manuscript (page 12, lines 17-21).

      (5) Figure 6 - temperature sensitivity of hsf1 and ydj1 mutants has been linked to defects in the cell wall integrity pathway rather than general proteostasis collapse. This is easily tested via plating on osmotically supportive media (i.e., 1M sorbitol) and should be done throughout Figure 6 to properly interpret the results.

      Our data indicate proteostasis breakdown in ydj1 cells by showing strongly altered localization of Sis1-GFP, pointing to massive protein aggregation (Figure 6 – Figure Supplement 1D).

      We followed the suggestion of the reviewer and performed spot tests in the presence of 1 M sorbitol (see figure below). The presence of sorbitol improves growth of ydj1-4xcga mutant cells at increased temperatures, in agreement with the remark of the reviewer. We do not think, however, that growth rescue by sorbitol points to specific defects of the ydj1 mutant in cell wall integrity. Sorbitol functions as a chemical chaperone and has been shown to have protective effects on cellular proteostasis and to rescue phenotypes of diverse point mutants in yeast cells by facilitating folding of the respective mutant proteins and suppressing their aggregation (2-4). Thus, sorbitol can broadly restore proteostasis, which can also explain its effects on growth of ydj1 mutants at increased temperatures. The readout of the spot test with sorbitol is therefore ambiguous, and we prefer not to show it in the manuscript.

      Author response image 1.

      Serial dilutions of indicated yeast strains were spotted on YPD plates without and with 1 M sorbitol and incubated at indicated temperatures for 2 days.

      Reviewer #3 (Recommendations for the authors): 

      (1) Line 154: Can the authors, by analysis, offer an explanation for why HSR attenuation varies between genes for the sis1-4xcga strain? Is it, for example, a consequence of using a hypomorph rather than a knockout, an mRNA turnover issue, or that Hsf1 has different affinities for the HSEs in the promoters? 

      We used the sis1-4xcga knock-down strain because Sis1 is essential for yeast viability. The point raised by the reviewer is highly valid, and we thought extensively about the diverse consequences of Sis1 depletion on levels of, e.g., translated BTN2 (minor impact) and HSP104 (strong impact) mRNA. We have since performed transcriptome analysis and confirmed the specific impact of Sis1 depletion on HSP104 mRNA levels, while BTN2 mRNA levels remained much less affected (new Figure 5 - Figure Supplement 2A/B). We compared numbers and spacings of HSEs in the respective target genes but could not identify obvious differences. Hsf1 occupancy within the UAS region of both BTN2 and HSP104 is very comparable at three different time points of a 39°C heat shock (0, 5 and 120 min), arguing against different Hsf1 affinities for the respective HSEs (5). The molecular basis for the target-specific derepression upon Sis1 depletion thus remains to be explored. We added a corresponding comment to the revised version of the manuscript (page 12, lines 3-8).

      (2) Line 194: The analysis of ChIP-seq is not very elaborated in its presentation. How specific is this interaction? Can it be ruled out by analysis that it is simply the highly expressed genes after the HS that lead to Apj1 appearing there? More generally: Can the data in the main figure be presented to give a more unbiased genome-wide view of the results?

      Overall, we observed a low number of Apj1 binding events in the UAS of genes. The interaction of Apj1 with HSEs is specific, as we do not observe Apj1 binding to the UAS of well-expressed non-heat shock genes. Similarly, Apj1 does not bind to ARS504 (Figure S3 – Figure Supplement 1). We extended the description of our ChIP-seq analysis procedures leading to the identification of HSEs as Apj1 target sites to make the data analysis easier to understand. We additionally re-analysed the two Apj1 binding peaks that did not reveal an HSE in our original analysis. Using a modified setting, we can identify a slightly degenerate HSE in the promoter region of the two genes (TMA10, RIE1) and changed Figure 3C accordingly. Notably, TMA10 is a known target gene of Hsf1. The expanded analysis further documents the specificity of the Apj1 binding peaks.

      (3) Line 215. Figure 3. The clear anticorrelation is puzzling. Presumably, Apj1 binds Hsf1 as a substrate, and then a straight correlation is expected: When Hsf1 substrate levels decrease at the promoters, also Apj1 signal is predicted to decrease. What explanations could there be for this? Is it, for example, that Hsf1 is not always available as a substrate on every promoter, or is Apj1 tied up elsewhere in the cell/nucleus early after HS? 

      We propose that Apj1 binds HSE-bound Hsf1 only after clearance of nuclear inclusions, which form upon heat stress. Apj1 thereby couples the restoration of nuclear proteostasis to the attenuation of the heat shock response. This explains the delayed binding of Apj1 to HSEs (via Hsf1), while Hsf1 shows highest binding upon activation of the heat shock response (early timepoints). Notably, the binding efficiencies of Hsf1 and Apj1 (% input) differ greatly: we determine strong binding of Hsf1 five min post heat shock (30-40% of input), whereas at most 3-4% of the input is pulled down with Apj1 (60 min post heat shock) (Figure 3D). Even at this late timepoint, 10-20% of the input is pulled down with Hsf1. The different kinetics and pulldown efficiencies suggest that Apj1 displaces Hsf1 from HSEs, and accordingly Hsf1 stays bound to HSEs in apj1Δ cells (Figure 4). This activity of Apj1 explains the anti-correlation: increased targeting of Apj1 to HSE-bound Hsf1 will lower the absolute levels of HSE-bound Hsf1. What we observe in the ChIP experiment at the individual timepoints is a snapshot of this reaction. Accordingly, at the last timepoint analyzed (120 min after heat shock), we observe low binding of both Hsf1 and Apj1, as the heat shock response has been shut down.

      (4) Line 253: "Sis-depleted".  

      We have corrected the mistake.

      (5) Line 332: Fig. 6C SIS1 OE from pRS315. A YIP would have been better, 20% of the cells will typically not express a protein with a CEN/ARS of the pRS-series so the Sis1 overexpression phenotype may be underestimated and this may impact on the interpretation. 

      We agree with the reviewer that Yeast Integrating Plasmids (YIp) represent the gold standard for complementation assays. We are not aware of a study showing that 20% of cells harboring pRS-plasmids do not express the encoded protein. The results shown in Fig. 8C/D demonstrate that even strong overproduction of Sis1 cannot restore Hsf1 activity control. This interpretation would not be affected even if a certain percentage of these cells did not express Sis1. Nevertheless, we added a comment to the respective section pointing to the possibility that the Sis1 effect might be underestimated due to variations in Sis1 expression (page 11, lines 15-19).

      (6) Figure 1C. Since n=2, a more transparent way of showing the data is the individual data points. It is used elsewhere in the manuscript, and I recommend it. 

      We agree that showing individual data points can enhance transparency, particularly with small sample sizes. However, the log2 fold change (log2FC) values presented in Figure 1C and other figures derived from ribosome profiling and RNAseq experiments were generated using the DESeq2 package. The DESeq2 pipeline is widely used for analyzing differential gene expression and is known for its statistical robustness. It performs differential expression analysis based on a model that incorporates normalization, dispersion estimation, and shrinkage of fold changes. The pipeline automatically accounts for biological and technical variability and for batch effects, thereby improving the reliability of results. These log2FC values are not directly calculated from log-transformed normalized counts of individual samples but are instead estimated from a fitted model comparing group means. Therefore, DESeq2 log2FC values cannot be shown for individual replicates.
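
      To make the distinction concrete, the following is a minimal sketch and not the DESeq2 implementation. It assumes a median-of-ratios size-factor normalization followed by a plain ratio of group means, and it omits the negative binomial model fitting, dispersion estimation, and shrinkage that DESeq2 performs. The function names, the `pseudocount` parameter, and the toy counts are hypothetical. It only illustrates that a group-level log2 fold change is one value per gene and comparison, with no per-replicate counterpart.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    counts: dict mapping gene -> list of raw counts (one entry per sample).
    Only genes with nonzero counts in every sample contribute.
    """
    n_samples = len(next(iter(counts.values())))
    # Log of the geometric mean of each usable gene across samples.
    log_geo_mean = {
        gene: sum(math.log(c) for c in row) / n_samples
        for gene, row in counts.items()
        if all(c > 0 for c in row)
    }
    return [
        math.exp(median(math.log(counts[g][j]) - lg
                        for g, lg in log_geo_mean.items()))
        for j in range(n_samples)
    ]


def group_mean(values, labels, wanted):
    """Mean of the values whose label equals `wanted`."""
    picked = [v for v, lab in zip(values, labels) if lab == wanted]
    return sum(picked) / len(picked)


def group_log2fc(counts, groups, treated="treated", control="control",
                 pseudocount=0.5):
    """One log2 fold change per gene, computed from group means.

    Simplification (assumption): a ratio of normalized group means,
    not a fitted negative binomial model.
    """
    factors = size_factors(counts)
    result = {}
    for gene, row in counts.items():
        normalized = [c / f for c, f in zip(row, factors)]
        mean_treated = group_mean(normalized, groups, treated)
        mean_control = group_mean(normalized, groups, control)
        result[gene] = math.log2(
            (mean_treated + pseudocount) / (mean_control + pseudocount)
        )
    return result


# Toy counts for illustration only (two replicates per group).
toy_counts = {
    "A": [100, 100, 200, 200],
    "B": [50, 50, 50, 50],
    "C": [30, 30, 30, 30],
}
toy_groups = ["control", "control", "treated", "treated"]
print(group_log2fc(toy_counts, toy_groups))
```

      The output has one entry per gene, not one per replicate, which is why replicate-level points cannot be overlaid on such a value without switching to a different quantity such as normalized counts.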

      (7) Figure 1D. Please add the number of minutes on the X-axis. Figure legend: "Cycloheximide" is capitalized.  

      We revised the figure and figure legend as recommended.

      (8) Several figure panels: Statistical tests and SD error bars for experiments performed in duplicates simply feel wrong for this reviewer. I do recognize that parts of the community are calculating, in essence, quasi-p-values using parametric methods for experiments with far too low sample numbers, but I recommend not doing so. In my opinion, better to show the two data points and interpret with caution.

      We followed the advice of the reviewer and removed statistical tests for experiments based on duplicates.

      References

      (1) Krakowiak, J., Zheng, X., Patel, N., Feder, Z. A., Anandhakumar, J., Valerius, K. et al. (2018) Hsf1 and Hsp70 constitute a two-component feedback loop that regulates the yeast heat shock response eLife 7,

      (2) Guiberson, N. G. L., Pineda, A., Abramov, D., Kharel, P., Carnazza, K. E., Wragg, R. T. et al. (2018) Mechanism-based rescue of Munc18-1 dysfunction in varied encephalopathies by chemical chaperones Nature communications 9, 3986

      (3) Singh, L. R., Chen, X., Kozich, V., and Kruger, W. D. (2007) Chemical chaperone rescue of mutant human cystathionine beta-synthase Mol Genet Metab 91, 335-342

      (4) Marathe, S., and Bose, T. (2024) Chemical chaperone - sorbitol corrects cohesion and translational defects in the Roberts mutant bioRxiv 10.1101/2024.09.04.610945

      (5) Pincus, D., Anandhakumar, J., Thiru, P., Guertin, M. J., Erkine, A. M., and Gross, D. S. (2018) Genetic and epigenetic determinants establish a continuum of Hsf1 occupancy and activity across the yeast genome Mol Biol Cell 29, 3168-3182

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      This manuscript assesses the differences between young and aged chondrocytes. Through transcriptomic analysis and further assessments in chondrocytes, GATA4 was found to be increased in aged chondrocyte donors compared to young donors. Subsequent mechanistic analysis with lentiviral vectors, siRNAs, and a small molecule was used to study the role of GATA4 in young and old chondrocytes. Lastly, an in vivo study was used to assess the effect of GATA4 expression on osteoarthritis progression in a DMM mouse model.

      Strengths:

      This work linked the overexpression of GATA4 to NF-kB signaling pathway activation, alterations to the TGF-b signaling pathway, and found that GATA4 increased the progression of OA compared to the DMM control group. This indicates that GATA4 contributes to the onset and progression of OA in aged individuals.

      The authors thank the reviewer for reviewing our manuscript and providing insightful comments.

      Weaknesses:

      (1) A couple of sentences should be added to the introduction, to emphasize the role GATA4 plays, such as the alterations to the TGF-b signaling pathway and the increased activation of the NF-kB pathway. 

      As suggested, we have expanded on these signaling pathways in the Introduction to highlight the known functions of GATA4. Importantly, no previous study has reported a role for GATA4 in regulating the TGF-β pathway.

      “Many growth factors contribute to the chondro-supportive environment in the knee joint. Particularly, transforming growth factor-b (TGF-b) plays a key role in maintaining chondrocytes and replenishing ECM loss. However, during OA, TGF-b can induce catabolic processes in chondrocytes, resulting in matrix stiffening, osteophytes, and chondrocyte hypertrophy.[10-12]” (Lines 80-84)

      “Mechanistically, upregulation of GATA4 was shown to increase nuclear factor-kB (NF-kB) pathway activation.[14,15]  NF-κB is thought to amplify and potentially propagate cellular senescence during the aging process through the senescence-associated secretory phenotype (SASP), which could contribute to a low-grade state of chronic inflammation.[16]” (Lines 99-102)

      “When GATA4 was over expressed, we found that there were alterations to the TGF-b signaling pathway and activation of the NF-kB signaling pathway.” (Lines 106-108)

      (2) Figure 1F, the GATA4 histology image should be bigger.

      We have now increased the size of the image in revised Figure 1F.

      (3) Further discussion should be conducted regarding the reasoning as to why GATA4 increases the phosphorylation of SMAD1/5. 

      Thank you. The underlying mechanism of GATA4 activating SMAD1/5 has not been previously investigated. We have now elaborated on this in the discussion and have added more relevant publications.

      “Our study indicated that there was an observed decrease in chondrogenesis and an increase in hypertrophy-related genes following GATA4 overexpression (Figure 2G).” (Lines 572-574)

      “These previous studies and literature review inspired us to explore the potential association between GATA4 levels and the activation of SMAD1/5.” (Lines 587-588)

      “In this study, it was shown that GATA4 was necessary for bone morphogenic protein-6 (BMP-6) mediated IL-6 induction, in which there are multiple GATA binding domains on the IL-6 promoter. This work further showed that GATA4 interacts with SMAD 2, 3 and 4.[55] Studies have suggested that BMP pathways and GATA4 work synergistically to regulate SMAD signaling.[56] This information indicates that the involvement of GATA4 in the TGF-b signaling pathway is complex and further studies should be conducted to better assess this relationship.” (Lines 594-599)

      (4) More information should be included to clarify why GATA4 is thought to be linked to DNA damage and the pathway that is associated with that. 

      We have now included further information in the discussion to clarify the association between DNA damage and GATA4 upregulation.

      “The study by Kang et al. demonstrated that the suppression of p62 following DNA damage leads to GATA4 accumulation due to the lack of autophagy.[13] DNA damage is known to increase with age.[71] Therefore, we believe that DNA damage due to aging is a key driver of the upregulation of GATA4 in old chondrocytes.” (Lines 642-646)

      (5) Please add further information regarding the limitations of the animal study conducted in this work and future plans to assess this. 

      We have included more limitations of the animal study that was conducted in this work and have expanded on the future plans to use inducible GATA4 expression in transgenic mouse lines to study the role of GATA4 overexpression in OA onset and progression.

      “Third, during our in vivo work, the intraarticular injection of GATA4 lentivirus was not chondrocyte-specific. Therefore, the injection also allowed for other cell types to overexpress GATA4. Future work should be conducted using transgenic mouse lines for cartilage-specific inducible overexpression or depletion of Gata4 to further investigate the role of GATA4 in chondrocytes.” (Lines 666-670)

      (6) In Figure 5, GATA4 should be changed to Gata4 in the graphed portions for consistency. 

      Thanks. We have made the necessary adjustments throughout the manuscript.

      Reviewer #2 (Public review):

      (1) While it is convincing that GATA4 expression is elevated in elderly individuals, and that it has a detrimental impact on cartilage health, the authors might want to add further discussion on the variability among individual human donors, especially given the finding that the elevation of GATA4 was not observed in chondrocytes from donor O1 (Figure 1G).

      The authors thank the reviewer for reviewing our manuscript and providing insightful comments.

      As suggested, we have included more discussion on the variability among donors.

      “Although we found that GATA4 was generally increased with aging, some young donors also exhibited increased levels of GATA4, which may be associated with increased DNA damage, as discussed above, or other stressors. Therefore, GATA4 should be used together in conjunction with other aging biomarkers, such as the epigenetic clock [72] to precisely define chondrocyte aging. Future work should examine biological versus chronological aging and epigenetic clock-based assessments to explain the variabilities in GATA4 expression among donors.” (Lines 658-663)

      (2) It might also be worth adding additional discussion on the interplay between senescent chondrocytes and the dysfunctional ECM during aging. As noted by the authors, aging is associated with decreased sGAG content and likely degenerative changes in the collagen II network, so the microniche of chondrocytes, and thus cell-matrix crosstalk through the pericellular matrix, is also altered or impaired. 

      Thank you for this comment. We have included more discussion on the interplay of chondrocyte senescence and dysfunctional ECM during aging, with a specific focus on the microniche of chondrocytes.

      “Additionally, a common hallmark of chondrocyte aging is the alteration of ECM, including composition change [2] and stiffening.[57] ECM stiffness can directly affect chondrocyte phenotype and proliferation, and contribute to OA.[58] A recent study by Fu et al. associated matrix stiffening with the promotion of chondrocyte senescence.[59] Furthermore, matrix stiffening has been associated with modulating the TGF-b signaling pathway.[60-62] Future studies should investigate the potential of matrix stiffening and the effect of GATA4 on pericellular matrix proteins such as decorin[63,64], biglycan, collagen VI and XV, as these proteins assist with the regulation of biochemical interactions and assist with the maintenance of the chondrocyte microenvironment.[65] Herein, the TGF-b signaling pathway can further alter the extracellular microenvironment[62], which could promote cellular senescence and subsequently NF-kB pathway activation.” (Lines 600-610)

      (2) If applicable, please also add Y3 and O3 to Figure S1 for visual comparison across individual donors. 

      As suggested, we added Y3 and O3 to the revised Figure S1 for more visual comparisons across individual donors.

      (3) Figure 3C, the molecular weight labels are off. 

      Thanks. We corrected this mistake.

      (4) Line 438 - Please clarify in text that the highest efficiency of siRNA chosen was siRNA2. 

      As suggested, we added the reason for selecting siRNA2.

      “Several GATA4 siRNAs were tested, and the one with the highest efficiency was selected based on RT-qPCR results, which indicated that siRNA2 treatment induced the lowest expression of GATA4 (Supplementary Figure S6).” (Lines 448-450)

      (5) Did the authors test the timeline of sustained knockdown of GATA4 by siRNA?

      We used a 7-day timepoint of chondrogenesis, and RT-qPCR results demonstrated that there was a downregulation of GATA4 expression at this timepoint (Figure 4). In the current in vitro study, we did not examine the efficacy of GATA4 siRNA for longer than 7 days.

      Reviewer #3 (Public review):

      (1) It would be useful to explain why GATA4 was chosen over HIF1a, which was the most differentially expressed. 

      The authors thank the reviewer for reviewing our manuscript and providing insightful comments.

      When we first saw the results, we did consider studying the role of HIF1a in aging because it was the most differentially expressed. When we reviewed the relevant literature, we found that HIF1a was commonly upregulated in aged individuals, which was thought to be linked to hypoxia and increased oxidative stress (PMID: 12470896, PMID: 12573436). Further investigation found studies that examined HIF1a in chondrocytes and used in vivo work to investigate its role in osteoarthritis (PMID: 32214220), indicating that HIF1a plays a protective role during OA by suppressing the activation of the NF-kB pathway. Moreover, work has been conducted assessing the stabilization of HIF1a by regulating mitophagy and using HIF1a as a potential therapeutic target for OA (PMID: 32587244). Since there have been many studies investigating the correlation of HIF1a expression and OA, we felt that it would be more innovative to look at other molecules, such as GATA4. Moreover, as we highlighted in the Introduction and Discussion, through testing in cell types other than chondrocytes, GATA4 was shown to be associated with DNA damage and senescence, which are both aging hallmarks. Given that the roles of GATA4 in chondrocytes had not been previously studied, we chose GATA4 for this study.

      “Of note, Hypoxia-Inducible Factor 1a (HIF1a) was the most differentially expressed gene predicted to regulate chondrocyte aging. The connection between HIF1a and aging has been previously reported.[32] Furthermore, additional studies have investigated HIF1a in association with OA and assessed its use as a therapeutic target.[33,34] Therefore, we decided to focus on GATA4, which was less studied in chondrocytes but highly associated with cellular senescence, an aging hallmark. However, our selection did not dampen the importance of HIF1α and other molecules listed in Figure 1D in chondrocyte aging. They can be further studied in the future using the same strategy employed in the current work.” (Lines 526-533)

      (2) In Figure 5, it would be useful to demonstrate the non-surgical or naive limbs to help contextualize OARSI scores and knee hyperalgesia changes. 

      Thank you for your comment. Based on prior experience, mice in the sham group have OARSI scores ranging from 0 to 0.5. In the current study, we focused on the DMM control and DMM Gata4 virus groups, so we did not include a sham control group. We recognize this as a limitation of this study.

      We measured the naive limbs for knee hyperalgesia before DMM surgery and found that the average threshold was 507 g. We have highlighted the threshold measurement in the figure legend: “507 g was the threshold baseline for non-surgery mice (dashed line).” (Lines 499-500)

      (3) While there appear to be GATA4 small-molecule inhibitors in various stages of development that could be used to assess the effects in age-related OA, those experiments are out of scope for the current study. 

      We agree with this comment that the results are still preliminary, which is why we placed them in the supplementary materials. However, we feel that the result is informative: it supports the potential of GATA4 as a therapeutic target and may inspire the development of more specific inhibitors. Therefore, if the reviewer agrees, we would like to keep the results in the current study.

      In particular, our in vitro study demonstrated the potential of using small-molecule GATA4 inhibitors to enhance the quality of cartilage created by old chondrocytes. We can validate the findings in vivo, as well as develop other GATA4 inhibitors. (Lines 673-675)

      (4) Is GATA4 upregulated in chondrocytes in publicly available databases? 

      Thank you for this question. We have examined the public databases and have found that there is data showing the trend that GATA4 is upregulated in aged or OA chondrocytes in work conducted by Ungethuem et al (PMID: 20858714). In one study by Ramos et al. (PMID: 25054223), we noticed that GATA4 expression levels were the same in both young and old groups, which may be due to the relatively smaller sample size in the young group compared to old group (4 vs 26).

      Work Conducted by Grogan et al. (Unpublished https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE39795)

      Author response image 1.

      Author response image 2.

      Work conducted by Ramos et al. (PMID: 25054223).

      Author response image 3.

      Work conducted by Ungethuem et al (PMID: 20858714).

      (5) In many cases, the figure captions describe the experiment vs. the outcome. It may be more compelling to state the main finding in the figure title, and you might consider changing it from what is stated at present. For example, Figure 2: instead of the impact of overexpression, you may say GATA4 overexpression impairs cartilage formation (as stated in the results).

      Thanks for the suggestion. We have made the following changes to the figure captions as suggested.

      Figure 1: GATA4 is upregulated in aged chondrocytes (Line 373)

      Figure 2: Overexpressing GATA4 impairs the hyaline cartilage formation capacity of young chondrocytes (Lines 408-409)

      Figure 3: GATA4 overexpression activates SMAD1/5  (Line 436)

      Figure 4: Suppressing GATA4 in old chondrocytes promotes cartilage formation and lowers expression of proinflammatory cytokines (Line 467)

      Figure 5: Gata4 overexpression in the knee joints accelerates OA progression in mice. (Line 593)

      (6) It would be useful to provide a little more information about the human tissue donors, if that is available. 

      We have provided more information about the tissue donors in the revised Supplementary Table S1.

      (7) While aging-like changes were observed in young chondrocytes with GATA4 overexpression, it would be interesting to directly evaluate if there is a change in biological versus chronological age in these tissues. Companies like Zymo can provide this biological v chronological age epigenetic clock-based assessments if that is of interest, to say the young chondrocytes are looking "older". 

      Thank you for this information. We agree that it will be important to assess epigenetic changes in GATA4-overexpressing cells. We are contacting the company to learn more about their technology. Meanwhile, we added this to the future work section of the manuscript.

      “Although we found that GATA4 was generally increased with aging, some young donors also exhibited increased levels of GATA4, which may be associated with increased DNA damage, as discussed above, or other stressors. Therefore, GATA4 should be used together in conjunction with other aging biomarkers, such as the epigenetic clock [72] to precisely define chondrocyte aging. Future work should examine biological versus chronological aging and epigenetic clock-based assessments to explain the variabilities in GATA4 expression among donors.”  (Lines 658-663)

      (8) It is not clear the age at which the mice received DMM in the methods, but it is shown in Figure 5. 

      We have added the age at which the mice received the DMM surgery to the methods section.

      “Intraarticular injections were administered to mice between 10-12 weeks of age under general anesthesia to safeguard the well-being of the animals and to minimize procedural discomfort.” (Line 300)

      “One week after viral vector injection, DMM surgery was performed to induce the OA model on mice 11-13 weeks of age.” (Lines 312-313)

      (9) It is not clear which factors were assayed using Luminex, and it would be great to add. 

      Thank you for this comment. We have added a comprehensive list of proteins assessed using Luminex in a new Supplementary Table S6.

      (10) Also interesting, loss of GATA4 seems to prevent diet-induced obesity in mice and promote insulin sensitivity (potentially via GLP-1 secretion). I wonder if there may be a metabolic axis here too? PMID: 21177287. I may have missed parts of the discussion of the role of GATA4 in metabolism, but it might be an interesting addition to the discussion. 

      In the current study, we have not investigated the role of GATA4 in obesity. As suggested, we have included a discussion of GATA4 in metabolism.

      “Furthermore, GATA4 might be associated with metabolic regulation. A study conducted by Patankar et al. investigated how GATA4 regulates obesity. Specifically, they used intestine-specific Gata4 knockout mice to study diet-induced obesity, showing that the knockout mice were resistant to the high-fat diet, and that glucagon-like peptide-1 (GLP-1) release was increased. These findings indicated a decreased risk for the development of insulin resistance in knockout mice.[44] This work was taken a step further in a subsequent publication, in which the same team investigated the dietary lipid-dependent and independent effects on the development of steatosis and fibrosis in Gata4 knockout mice. The results from this work suggested that the knockdown of Gata4 increases GLP-1 release, in turn suppressing the development of hepatic steatosis and fibrosis, ultimately blocking hepatic de novo lipogenesis.[45] These studies are especially interesting with the rise of GLP-1 based therapy for the treatment of OA.[46,47] Thus, the coupling of GATA4-related metabolic dysfunction and OA should be further investigated.” (Lines 542-553)

      (11) Another potential citation: GATA4 regulates angiogenesis and persistence of inflammation in rheumatoid arthritis PMID: 29717129 - around the inflammatory axis potential in OA? since GATA4 was reported in FLS from OA- PMC11183113.

      Thank you. We have included this work/citation in the discussion section.

      “Further studies have shown that GATA4 regulates angiogenesis and inflammation in fibroblast-like synoviocytes in rheumatoid arthritis, indicating that GATA4 is required for the inflammation induced by IL-1b. This study also demonstrated that GATA4 binds to promoter regions on Vascular Endothelial Growth Factor (VEGF)-A and VEGFC to enhance transcription and regulate angiogenesis.[15]”  (Lines 558-562)

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Weaknesses: 

      The main weakness in this paper lies in the authors' reliance on a single model to derive conclusions on the role of local antigen during the acute phase of the response by comparing T cells in model antigen-vaccinia virus (VV-OVA) exposed skin to T cells in contralateral skin exposed to DNFB 5 days after the VV-OVA exposure. In this setting, antigen-independent factors may contribute to the difference in CD8+ T cell number and phenotype at the two sites. For example, it was recently shown that very early memory precursors (formed 2 days after exposure) are more efficient at seeding the epithelial TRM compartment than those recruited to skin at later times (Silva et al, Sci Immunol, 2023). DNFB-treated skin may therefore recruit precursors with reduced TRM potential. In addition, TRM-skewed circulating memory precursors have been identified (Kok et al, JEM, 2020), and perhaps VV-OVA exposed skin more readily recruits this subset compared to DNFB-exposed skin. Therefore, when the DNFB challenge is performed 5 days after vaccinia virus, the DNFB site may already be at a disadvantage in the recruitment of CD8+ T cells that can efficiently form TRM. In addition, CD8+ T cell-extrinsic mechanisms may be at play, such as differences in myeloid cell recruitment and differentiation or local cytokine and chemokine levels in VV-infected and DNFB-treated skin that could account for differences seen in TRM phenotype and function between these two sites. Although the authors do show that providing exogenous peptide antigen at the DNFB-site rescues their phenotype in relation to the VV-OVA site, the potential antigen-independent factors distinguishing these two sites remain unaddressed. 
      In addition, there is a possibility that peptide treatment of DNFB-treated skin initiates a second phase of priming of new circulatory effectors in the local-draining lymph nodes that are then recruited to form TRM at the DNFB-site, and that the effect does not solely rely on TRM precursors at the DNFB-treated skin site at the time of peptide treatment. 

      Thank you for pointing out these potential caveats to our work.  We have considered the possibility that late application of peptide or cell-extrinsic difference could affect the interpretation of our results.  We would like to highlight that in our prior publication on this topic [1], we found that OT-1 responses in mice infected with VV-OVA and VV-N (irrelevant antigen) yielded the same responses as in our VV-OVA/DNFB models.  In addition, in both our prior publication and our current manuscript, application of peptide to DNFB painted sites results in T<sub>RM</sub> with a similar phenotype to those in the VV-OVA site.  Thus, we are confident that it is the presence of cognate antigen in the skin that drives the augmented T<sub>RM</sub> fitness that we observe.

      Secondly, although the authors conclusively demonstrate that TGFBRIII is induced by TCR signals and required for conferring increased fitness to local-antigen-experienced CD8+ TRM compared to local antigen-inexperienced cells, this is done in only one experiment, albeit repeated 3 times. The data suggest that antigen encounter during TRM formation induces sustained TGFBRIII expression that persists during the antigen-independent memory phase. It remains unclear why only the antigen encounter in skin, but not already in the draining lymph nodes, induces sustained TGFBRIII expression. Further characterizing the dynamics of TGFBRIII expression on CD8+ T cells during priming in draining lymph nodes and over the course of TRM formation and persistence may shed more light on this question. Probing the role of this mechanism at other sites of TRM formation would also further strengthen their conclusions and enhance the significance of this finding. 

      This is an intriguing point.  We do not understand why expression of TGFbR3 in T<sub>RM</sub> required antigen encounter in the skin if T<sub>RM</sub> at all sites clearly have encountered antigen during priming in the LN.  We speculate that durable TGFbR3 expression may require antigen encounter in the context of additional cues present in the periphery or only once cells have committed to the T<sub>RM</sub> lineage.  A more detailed characterization of the dynamics of TGFbR3 expression in multiple tissues would be informative and represents a promising future direction for this project.  We note that to robustly perform these experiments a reporter mouse would likely be a requirement.

      Reviewer #2 (Public review): 

      Weaknesses: 

      Overall, the authors' conclusions are well supported, although there are some instances where additional controls, experiments, or clarifications would add rigor. The conclusions regarding skin-localized TCR signaling leading to increased skin CD8+ TRM proliferation in-situ and increased TGFBR3 expression would be strengthened by assessing skin CD8+ TRM proliferation and TGFBR3 expression in models of high versus low avidity topical OVA-peptide exposure.

      Thank you for these helpful suggestions.  We did not attempt these experiments because we were concerned that, given the relatively modest expansion differences observed with the APL, resolving differences in TGFbR3 and BrdU would prove unreliable. However, this is something that we could attempt as we continue working on this project.

      The authors could further increase the novelty of the paper by exploring whether TGFBR3 is regulated at the RNA or protein level. To this end, they could perform analysis of their single-cell RNA sequencing data (Figure 1), comparing Tgfbr3 mRNA in DNFB versus VV-treated skin. 

      As discussed above, a more detailed analysis of TGFbR3 regulation is of great interest.  These experiments would likely require the creation of additional tools (e.g. a reporter mouse) to provide robust data.  However, as suggested, we have re-analyzed our scRNAseq data for expression of Tgfbr3. Pseudobulk analysis of cells isolated from VV or DNFB sites suggests that Tgfbr3 is elevated in antigen-experienced TRM at steady-state (Author response image 1).

      Author response image 1.

      Pseudobulk analysis by average gene expression of Tgfbr3 in cells isolated from either VV or DNFB treated flanks, divided by the average gene expression of Tgfbr3 in naïve CD8 T cells from the same dataset.
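
      The ratio described in this legend can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `pseudobulk_ratio`, the group labels, and all expression values are hypothetical, and the sketch assumes per-cell expression values have already been normalized.

```python
from statistics import mean


def pseudobulk_ratio(expr_by_group, reference="naive"):
    """Return each group's mean expression divided by the reference group's mean.

    expr_by_group maps a group label to a list of per-cell expression values
    for one gene. The reference group itself is omitted from the result.
    """
    ref_mean = mean(expr_by_group[reference])
    if ref_mean == 0:
        raise ValueError("reference group has zero mean expression")
    return {
        group: mean(values) / ref_mean
        for group, values in expr_by_group.items()
        if group != reference
    }


# Hypothetical per-cell Tgfbr3 values, invented for illustration only.
example = {
    "naive": [0.5, 0.5, 1.0, 0.0],
    "VV": [2.0, 1.0, 1.5, 1.5],
    "DNFB": [1.0, 0.5, 0.5, 1.0],
}
ratios = pseudobulk_ratio(example)
print(ratios)
```

      A value above 1 for a group would indicate higher average expression than in the reference population. Averaging across cells discards cell-to-cell variability, so a ratio of this kind supports a trend rather than a statistical comparison.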

      For clarity, when discussing antigen exposure throughout the paper, it would be helpful for the authors to be more precise that they are referring to the antigen in the skin rather than in the draining lymph node. A more explicit summary of some of the lab's previous work focused on CD8+ TRM and the role of TGFb would also help readers better contextualize this work within the existing literature on which it builds. 

      We appreciate this feedback, and we have clarified this in the text.

      For rigor, it would be helpful where possible to pair flow cytometry quantification with the existing imaging data.

      Thank you for these suggestions.  In terms of quantification of number of T<sub>RM</sub> by flow cytometry, we have previously demonstrated as much as a 36-fold decrease in cell count when compared to numbers directly visualized by immunofluorescence [1].  Thus, for enumeration of T<sub>RM</sub> we rely primarily on direct IF visualization and use flow cytometry primarily for phenotyping.

      Additional controls, namely enumerating TRM in the opposite, untreated flank skin of VV-only-treated mice and the treated flank skin of DNFB-only treated mice, would help contextualize the results seen in dually-treated mice in Figure 2.

      Without a source of inflammation (e.g. VV infection or DNFB) we see very few T<sub>RM</sub> in untreated skin.  A representative image is provided (Author response image 2).  A single DNFB stimulation does not recruit any CD8+ T cells to the skin without a prior sensitization [2].

      Author response image 2.

      Representative images of epidermal whole mounts of VV treated flank skin, and an untreated site from the same mouse isolated on day 50 post infection and stained for CD8a.

      In figure legends, we suggest clearly reporting unpaired T tests comparing relevant metrics within VV or DNFB-treated groups (for example, VV-OVA PBS vs VV-OVA FTY720 in Figure 3F).

      Thank you for this suggestion.  The figure legends have been amended.

      Finally, quantifying right and left skin draining lymph node CD8+ T cell numbers would clarify the skin specificity and cell trafficking dynamics of the authors' model. 

      We quantified the numbers of CD8 T cells in left and right skin draining lymph nodes by flow cytometry in mice at day 50 post VV infection and DNFB pull.  We observe similar numbers of cells at both sites (Author response image 3).

      Author response image 3.

      Quantification of total number of CD8+ T cells in left and right inguinal lymph nodes. Each symbol represents paired data from the same individual animal, and this is representative of 3 separate experiments.

      Reviewer #1 (Recommendations for the authors): 

      (1) Figures 1D and S1C demonstrate that 80-90 % of TRM at both VV and DNFB sites express CD103+. In contrast, the sequencing data suggests the TRM at the VV site has much higher Itgae expression. Also, clusters 3 and 4, which express significantly more Itgae than all other clusters, together comprise only ~30% of CD8+ T cells at the VV-infected skin site. How can these discrepancies between transcript and protein expression be explained? 

      Thank you for these excellent comments. T<sub>RM</sub> at both VV and DNFB sites appear to express similarly high levels of CD103 protein in both the OT-I system as we previously published [1] and in a polyclonal system using tetramers.  We attribute the lower penetrance of Itgae expression in the scRNAseq data to a lack of sensitivity, which is common with this modality.  The relative increased expression of Itgae in clusters 3 and 4 is nonetheless interesting and may suggest increased Itgae production/stability.  In the absence of any effect on protein expression, however, we chose not to focus on these mRNA differences.

      (2) For the experiments in Figure 3D, in order to exclude a contribution from circulating memory cells, FTY720 should have been administered during the duration of, not prior to, the initiation of the recall response. The effect of FTY720 wears off quickly, so the current experimental setting likely allows for circulating cells to enter the skin. This concern is mitigated by the results of anti-Thy1.1 mAb treatment, but documenting the experiment as in Figure D will likely be confusing to readers. 

      Thank you for this comment.  We relied on the literature indicating that the half-life of FTY720 in blood is longer than 6 days [3-5].  However, on reviewing this again, there are other reports suggesting a shorter half-life.  Thank you for pointing out this potential caveat.  As mentioned above, we do not think this affects the interpretation of our data, as similar results were obtained with anti-Thy1.1.

      (3) Similar to what is described in the weaknesses section, the data on TGFBRIII expression is lacking. When is TGFBRIII induced? In the LN during primary activation and it is then sustained by a secondary antigen exposure at the peripheral target tissue site? Or is it only induced in the peripheral tissue, and there is interesting biology to uncover in regard to how it is induced by the TCR only after secondary exposure, etc.? 

      Thank you for these comments. As discussed above, a more detailed analysis of TGFbR3 regulation is of great interest.  These experiments would likely require the creation of additional tools (e.g. a reporter mouse) to provide robust data and are part of our future directions.

      (4) As described in the weakness section, there could be TCR-independent differences between the VV-OVA and DNFB sites that lead to phenotypic changes in the TRMs that are formed there, both CD8+ T cell-intrinsic (kinetics; with regard to time after initial priming) and extrinsic (microenvironmental differences due to the nature of the challenge, recruited cell types, cytokines, chemokines, etc.). Since the authors report the use of both VV and VV-ova, we recommend an experimental strategy that controls for this by challenging one site with VV and another with VV-OVA concomitantly, followed by repeating the key experiments reported in this manuscript. 

      As discussed above, we have previously published a very similar experiment using VV-OVA and VV-N infection on opposite flanks [1].

      (5) In Figure 6J please indicate means and provide more of the statistics comparing the groups (such as comparing VV-WT vehicle to VV-KO vehicle etc.), and potentially display on a linear scale as with all of the other figures looking at cells/mm2 to help convince the reader of the conclusions and support the secondary findings mentioned in the text such as "Notably, numbers of Tgfbr3ΔCD8 TRM in cohorts treated with vehicle remained at normal levels indicating that loss of TGFβRIII does not affect TRM epidermal residence in the steady state" despite it looking like there is a decrease when looking at the graph. 

      We appreciate the feedback on the readability of this figure, and so have updated Figure 6J to be on a linear scale and added additional helpful statistics to the figure legend. The difference between Tgfbr3<sup>WT</sup> and Tgfbr3<sup>∆CD8</sup> at steady state is an excellent point, and we agree that there could be a trend towards reduction in the huNGFR+ T<sub>RM</sub> across both groups, even without CWHM12 administration. We did not see statistically significant reductions in steady-state Tgfbr3<sup>∆CD8</sup> T<sub>RM</sub>, but the slight reduction in both VV-OVA and DNFB treated flanks suggests that TGFβRIII may play a role in steady-state maintenance of all T<sub>RM</sub>. Perhaps with more sensitive tools to better visualize TGFβRIII expression, we could identify stepwise upregulation of TGFβRIII depending on TCR signal strength, possibly starting in the lymph node. We have also amended our description of this figure in the text to allow for the possibility that a low amount of TGFβRIII, below the level of detection, could play a role in steady-state maintenance of both local antigen-experienced and bystander T<sub>RM</sub>.

      Minor points: 

      (1) In describing Figure 4B, the term "doublets" for pairs of connected dividing cells is confusing. 

      Thank you for this comment. The term has been revised to “dividing cells” in the text and figure.

      (2) Figure legend 4F: BrdU is not "expressed" . 

      Very true, it has been changed to “incorporation”.

      (3) Do CreERT2 and/or huNGFR expressed by transferred OT-I cells act as foreign antigens in C57BL/6 mice, potentially causing elimination of circulating memory cells? If that were the case, this would not necessarily confound the read-out of TRM persistence studied here, since skin TRM are likely protected from at least antibody-mediated deletion and their numbers are not maintained by recruitment of circulating cells at steady-state. However, it would be useful to be aware of this potential limitation of this and similar models. 

      Thank you for raising this important technical concern.  In our prior work [1] and this work, we monitor the levels of transferred OT-I cells in the blood over time.  We have not observed rejection of huNGFR+ cells.  We also note that others using the same system have not observed rejection either [6].

      (4) In Figure 6J, means or medians should be indicated 

      This has been updated in Figure 6J.

      (5) Using the term "antigen-experienced" to specifically refer to TRM at the VV site could be confusing, since those at the DNFB site are also Ag-experienced (in the LN draining the VV skin site). 

      We agree that it is a challenging term, as all T<sub>RM</sub> are memory cells. That is why in the text we refer to T<sub>RM</sub> isolated from the VV site as “local antigen-experienced T<sub>RM</sub>”, to distinguish them from bystanders that did not experience local antigen.

      (6) The Title essentially restates what was already reported in the authors' prior study. If the data supporting the TGFBRIII-mediated mechanism is studied in more depth, maybe adding this aspect to the title may be useful? 

      Thank you for this suggestion.  I think the current title is probably most suitable for the current manuscript but we are willing to change it should the editors support an alternative title.

      Reviewer #2 (Recommendations for the authors): 

      (1) Definition of bystander CD8+ TRM: The first paragraph of the introduction defines CD8+ TRM. To improve the clarity of this definition, we suggest being explicit that bystander TRM experience cognate antigen in the SDLNs but, in contrast to other TRM, do not experience cognate antigen in the skin. 

      Thank you, we have clarified this in the text.

      (2) Consider softening the language when comparing the efficiency of CD8+ recruitment of the skin between DNFB and VV-treated flanks. For example, substitute "equal efficiency" with "comparable efficiency" since it is difficult to directly compare the extent of inflammation between viral and hapten-based treatments. 

      We have adjusted this terminology throughout the paper.

      (3) Throughout figure legends, we appreciate the indication of the number of experimental repeats performed. We suggest, either through statistics or supplemental figures, demonstrating the degree of variability between experiments to aid readers in understanding the reproducibility of results. 

      Thank you for this suggestion.  In key figures we show data from individual mice across multiple experiments. Thus, inter-experiment variability is captured in our figures.  

      (4) Figure 1: 

      a) Add control mice treated with either vaccinia virus or DNFB and harvest back skin at day 52 to demonstrate baseline levels of polyclonal and B8R tetramer-positive CD8s in the epidermis. These controls would clarify the background CD8+ expansion that might occur in DNFB-treated mice in the absence of vaccinia virus. 

      This point was addressed above.

      b) Figure 1: It would be helpful to see the %Tet+ population specifically in the CD103+ population, recognizing that the majority of the CD8+ from the skin are CD103+. 

      We did look only at CD103+ CD8 T cells from the skin for our tetramer analysis, so this has been clarified in the figure legend.

      c) Provide a UMAP, very similar to 1H, where CD8+ T cells, vaccinia virus, and DNFB-treated flanks are overlaid.

      Thank you for this suggestion.  A UMAP combining aspects of 1G (cell types from the whole ImmgenT dataset) with 1H (our data) results in a figure that is very difficult to interpret.  Thus, we have separated cell types across the entire ImmgenT data set (e.g. CD8+ T cells) and our data into 2 separate panels.

      d) 1D: left flow plot has numbered axis while the right flow plot does not. 

      Thank you, this has been fixed.

      (5) Figure 2: 

      a) In the figure legend, define what is meant by the grey line present in Figures 2C and 2D. 

      This has been updated in the figure legend.

      b) Edit the Y axis of 2C and 2D to specify the TRM signature score. 

      This has been updated in the figure.

      c) Include panel 1D from 1S into Figure 2 to help clarify for the reader what genes are expressed in the 0 - 5 clusters.

      We appreciate the feedback, but we found the heatmap made the figure look too busy, so we feel comfortable keeping it available within supplemental figure 1.

      d) In body of text explicitly discuss that the TRM module used to calculate a signature score was created using virus infection modules (HSV, LCMV and influenza) and thus some of the transcriptional similarity between the authors vaccinia virus treated CD8+ TRM and the TRM module might be due to viral infection rather than TRM status.

      Thank you for this comment.  We have now emphasized this point in the text.

      (6) Figure 3: 

      a) If there are leftover tissue sections, it would be optimal to show specific staining for CD103. We recognize that this data has been previously published by the lab, but it would be ideal to show it once in this paper. 

      Unfortunately, we do not have leftover tissue sections, so we are unable to measure CD103 by I.F. in these experiments.

      b) If you did collect skin draining lymph nodes in the Thy1.1 depletion model, it would be nice to see flow data showing the depletion effects in the skin draining lymph nodes in addition to the blood. 

      Unfortunately, we did not collect the skin draining lymph nodes, and do not have that data for the relevant experiments.

      c) Figure 3 F & G: Perform a T-test comparing vaccinia virus PBS to FTY720 and isotype to anti-Thy1.1 within the same treatment group. Showing no significance with these two comparisons would strengthen the authors' claims. Statistics can be described in legend. 

      We have included this analysis in the figure legend.

      (7) Figure 4: 

      a) It would be helpful to have the CD69+/CD103+ population in this model discussed/defined more. The CD69 expression seen in 4E is lower than the reviewers would've predicted, and it would be interesting to see CD103 expression as well.

      We have found that CD103 is generally a stronger marker in the skin by flow, as CD69 staining is somewhat less robust in the colors we have chosen.  By way of example, we present gating we did upstream in that experiment, gated previously on live CD45+CD3+CD8+ events (Author response image 4).

      Author response image 4.

      Representative flow cytometric plots showing CD69 and CD103 expression in gated live CD45+CD8+CD90.1+ cells isolated from VV-OVA or DNFB treated flanks.

      (8) Figure 5: 

      a) Define APL and its purpose in both the body of text and the figure legend. 

      We have clarified this in the text and the figure legend.

      b) Using in-vivo BrdU, compare proliferation between high avidity N4 and low avidity Y3 OVA-peptide at the primary recall timepoint. 

      We considered this, but due to the lack of sensitivity of the BrdU incorporation and the relatively subtle phenotype of the Y3, we did not think the assay would be sensitive enough to identify differences.

      (9) Figure 6: 

      a) Compare TGFBR3 expression in CD8+ T cells from mice receiving high avidity N4 versus low avidity Y3 OVA-peptide at the primary recall timepoint. 

      This point was discussed above.

      b) Either 1) examine TGFBR3 mRNA expression in VV vs DNFB skin from scRNA-seq dataset or 2) perform a qPCR on epidermal CD8+ T cells from mice receiving high avidity N4 versus low avidity Y3 at the primary recall timepoint. This would help distinguish whether TGFBR3 regulation occurs at the mRNA versus protein level. 

      This point has been discussed above.

      c) Figure 6A: Not required, but it seems like the TGFBR3 gate could be shifted to the right a bit. 

      The gates were set using FMO.

      d) Figure 6C: What comparison is the asterisk indicating significance referring to?

      It refers to Dunnett’s test comparing VV-OVA to DNFB and untreated skin. The figure has been amended to clarify this point.

      e) Figure 6: To increase the rigor of the claim that CWHM12 is creating a TGFb limiting condition, the authors could either 1) perform an ELISA or cell-based assay measuring active TGFb, 2) recapitulate results of 6J using monoclonal antibody against avb6 as done in Hirai et al., 2021, Immunity., or 3) examine Tgfbr3 mRNA expression in your single cell RNAseq data, comparing cluster 0 and cluster 3.

      We are pleased to have the opportunity to show Tgfbr3 mRNA, which is presented above in Author response image 1.

      (10) Material and methods: 

      Specify how the localization of the back skin used for imaging was made consistent between the right and left flanks. 

      We have updated this methodology in the text.

      Literature Cited

      (1) Hirai, T., et al., Competition for Active TGFβ Cytokine Allows for Selective Retention of Antigen-Specific Tissue- Resident Memory T Cells in the Epidermal Niche. Immunity, 2021. 54(1): p. 84-98.e5.

      (2) Manresa, M.C., Animal Models of Contact Dermatitis: 2,4-Dinitrofluorobenzene-Induced Contact Hypersensitivity, in Animal Models of Allergic Disease: Methods and Protocols, K. Nagamoto-Combs, Editor. 2021, Springer US: New York, NY. p. 87-100.

      (3) Müller, H.C., et al., The Sphingosine-1 Phosphate receptor agonist FTY720 dose dependently affected endothelial integrity in vitro and aggravated ventilator-induced lung injury in mice. Pulmonary Pharmacology & Therapeutics, 2011. 24(4): p. 377-385.

      (4) Nofer, J.-R., et al., FTY720, a Synthetic Sphingosine 1 Phosphate Analogue, Inhibits Development of Atherosclerosis in Low-Density Lipoprotein Receptor–Deficient Mice. Circulation, 2007. 115(4): p. 501-508.

      (5) Brinkmann, V., et al., Fingolimod (FTY720): discovery and development of an oral drug to treat multiple sclerosis. Nat Rev Drug Discov, 2010. 9(11): p. 883-97.

      (6) Andrews, L.P., et al., A Cre-driven allele-conditioning line to interrogate CD4<sup>+</sup> conventional T cells. Immunity, 2021. 54(10): p. 2209-2217.e6.

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, Cho et al. present a comprehensive and multidimensional analysis of glutamine metabolism in the regulation of B cell differentiation and function during immune responses. They further demonstrate how glutamine metabolism interacts with glucose uptake and utilization to modulate key intracellular processes. The manuscript is clearly written, and the experimental approaches are informative and well-executed. The authors provide a detailed mechanistic understanding through the use of both in vivo and in vitro models. The conclusions are well supported by the data, and the findings are novel and impactful. I have only a few, mostly minor, concerns related to data presentation and the rationale for certain experimental choices.

      Detailed Comments:

      (1) In Figure 1b, it is unclear whether total B cells or follicular B cells were used in the assay. Additionally, the in vitro class-switch recombination and plasma cell differentiation experiments were conducted without BCR stimulation, which makes the system appear overly artificial and limits physiological relevance. Although the effects of glutamine concentration on the measured parameters are evident, the results cannot be confidently interpreted as true plasma cell generation or IgG1 class switching under these conditions. The authors should moderate these claims or provide stronger justification for the chosen differentiation strategy. Incorporating a parallel assay with anti-BCR stimulation would improve the rigor and interpretability of these findings. 

      We will edit the manuscript to be more explicit that total splenic B cells were used in this set-up figure and the rest of the paper. In addition, we will try to perform new experiments to improve this "set-up figure" (and add old and new data for Supplemental Figure presentation). Specifically, we will increase the range of conditions tested - e.g., styles of stimulating proliferation and differentiation - to foster an increased sense of generality. We plan to compare mitogenic stimulation with anti-CD40 to  anti-IgM and to anti-IgM + anti-CD40, all with BAFF, IL-4, and IL-5, bearing in mind excellent work from Aiba et al, Immunity 2006; 24: 259-268, and similar papers. We also will try to present some representative flow cytometric profiles (presumably in new Supplemental Figure panels).

      To be transparent and add to a more open public discussion (using the virtues of this forum), the senior author and colleagues would caution about whether any in vitro conditions exist that warrant complete confidence. That is the reason for proceeding to immunization experiments in vivo. That is not said to cast doubt on our own in vitro data: there are some experiments (such as those of Fig. 1a-c and associated Supplemental Fig. 1) that only can be done in vitro or are better done that way (e.g., because of rapid uptake of early apoptotic B cells in vivo).

      For instance: Well-respected papers use the CD40LB and NB21.2D9 systems to activate B cells and generate plasma cells. Those appear to be BCR-independent and, unfortunately, we found that they cannot be used with amino acid deprivation or these inhibitors due to effects on the engineered stroma-like cells. In considering BCR engagement, Reth has published salient points about signaling and concentrations of the Ab, the upshot being that this means of activating mitogenesis and plasma cell differentiation (when the B cells are costimulated via CD40 or TLR (4 or 7/8)) is probably more than a bit artificial. Moreover, although Aiba et al, Immunity 2006; 24: 259-268 is a laudable exception, one rarely finds papers using BAFF in the cultures, despite the strong evidence that it is an essential part of the equation of B cell regulation in vivo and a cytokine that modulates BCR signaling.

      (2) In Figure 1c, the DMK alone condition is not presented. This hinders readers' ability to properly assess the glutaminolysis dependency of the cells for the measured readouts. Also, CD138+ in developing PCs goes hand in hand with decreased B220 expression. A representative FACS plot showing the gating strategy for the in vitro PCs should be added as a supplementary figure. Similarly, division number (going all the way to #7) may be tricky to gate and interpret. A representative FACS plot showing the separation of B cells according to their division numbers and a subsequent gating of CD138 or IgG1 in these gates would be ideal for demonstrating the authors' ability to distinguish these populations effectively.

      We agree that the exact placement of division gates by deconvolution in FlowJo is more fraught than might be thought from presentations in many or most papers. For the revision, we will try to add one or several representative FACS plot(s) with old and new data to provide the gating on CTV fluorescence, bearing these points in mind when extending the experiments from ~7 years ago (Fig. 1b, c). With the representative examples of the old data pasted in here, we will aver, however, that using divisions 0-6 and ≥7 was reasonable. 

      Ditto for DMK with normal glutamine. However, in the spirit of eLife transparency lacking in many other journals, this comparison is more fraught than the referee comment would make things seem. The concentration tolerated by cells is highly dependent on the medium and glutamine concentration, and perhaps on rates of glutaminolysis (due to its generation of ammonia). In practice, we find that DMK becomes more toxic to B cells unless glutamine is low or glutaminolysis is restricted. Thus, the concentration of DMK that is tolerated and used in Fig. 1b, c can become toxic to the B cells when using the higher levels of glutamine in typical culture media (2 mM or more), at which point the "normal conditions + DMK" "control" involves the surviving cells in conditions with far greater cell death and less population expansion than the "low glutamine + DMK" condition. Overall, we appreciate the suggestion to show more DMK data and will work to do so for the earlier proliferation data (shown above) and the new experiments.  
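
      As a reference point for the division binning discussed above (divisions 0-6 and ≥7), the following is a minimal sketch of how a division number can be estimated from dye dilution. It assumes that CTV fluorescence halves with each division and that the undivided peak intensity is known; it is not FlowJo's proliferation model, which fits peaks to the whole histogram. The function names and all fluorescence values are hypothetical.

```python
import math


def division_number(ctv, undivided_peak, max_bin=7):
    """Estimate divisions from CTV dilution, assuming the dye halves per division.

    Values brighter than the undivided peak are assigned to division 0, and
    anything at or beyond max_bin divisions is pooled into the last bin.
    """
    if ctv <= 0 or undivided_peak <= 0:
        raise ValueError("fluorescence values must be positive")
    n = round(math.log2(undivided_peak / ctv))
    return max(0, min(n, max_bin))


def bin_label(n, max_bin=7):
    """Format a division number as a gate label, pooling the last bin."""
    return f"≥{max_bin}" if n >= max_bin else str(n)


# Hypothetical CTV intensities, invented for illustration only.
undivided = 12800.0
events = [15000.0, 12800.0, 6400.0, 1600.0, 100.0, 25.0]
labels = [bin_label(division_number(x, undivided)) for x in events]
print(labels)
```

      Because adjacent peaks differ only two-fold in intensity and broaden with each division, the highest bins overlap with one another and with autofluorescence, which is one reason for pooling them into a single ≥7 gate.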

      Author response image 1.

       

      (3) A brief explanation should be provided for the exclusive use of IgG1 as the readout in class-switching assays, given that naïve B cells are capable of switching to multiple isotypes. Clarifying why IgG1 was preferentially selected would aid in the interpretation of the results.

      We will edit the text to be more explicit and harmonize in light of the referee's suggestion that we focus the presentation of serologic data on IgG1 in the immunization experiments.

      [IgG1 provides the strongest signal and hence better signal/noise both in vitro and with the alum-based immunizations that are avatars for the adjuvant used in the majority of protein-based vaccines for humans.]

      (4) The immunization experiments presented in Figures 1 and 2 are well designed, and the data are comprehensively presented. However, to prevent potential misinterpretation, it should be clarified that the observed differences between NP and OVA immunizations cannot be attributed solely to the chemical nature of the antigens - hapten versus protein. A more significant distinction lies in the route of administration (intraperitoneal vs. intranasal) and the resulting anatomical compartment of the immune response (systemic vs. lung-restricted). This context should be explicitly stated to avoid overinterpretation of the comparative findings.

      We agree with the referee and will edit the text accordingly. Certainly, the difference in how the anti-ova response is elicited, compared to the anti-NP response in the same mice or with a slightly different immunization regimen, might be another factor (or the major factor) that could help explain why glutaminolysis was important after ovalbumin inhalations (used because emergence of anti-ova Ab / ASCs is suppressed by the NP hapten after NP-ova immunization) but not needed for the anti-NP response unless Slc2a1 or Mpc2 also was inactivated. Thank you for prompting the addition of this caveat.

      Nevertheless, it seems fair to note that in Figures 1 and 2, the ASCs and Ab are being analyzed for NP and ova in the same mice, albeit with the NP-specific components not being driven by the inhalations of ovalbumin. With that in mind, when one compares the IgG1 anti-NP ASC and Ab to those for IgG1 anti-ovalbumin (ASC in bone marrow; Ab), the ovalbumin-specific response was reduced whereas the anti-NP response was not.

      (5) NP immunization is known to be an inducer of an IgG1-dominant Th2-type immune response in mice. IgG2c is not a major player unless a nanoparticle delivery system is used. However, the authors arbitrarily included IgG2c in their assays in Figures 2 and 3. This may be confusing for the readers. The authors should either justify the IgG2c-mediated analyses or remove them from the main figures. (It can be added as supplemental information with proper justification). 

      We will rearrange the Figure panels to move the IgM and IgG2c data to Supplemental Figures.

      For purposes of public discourse, we note that the data of previous Figure 3(c, g) show a very strong NP-specific IgG2c response that seems to contradict the concept that IgG2c responses necessarily are weak in this setting. We also note the important role of IgG2c (mouse - IgG1 in humans) in controlling or clearing various pathogens as well as in autoimmunity. So from the standpoint of providing a better sense of generality to the loss-of-function effects, we continue to think that these measurements are quite important. That said, the main text has many figure panels and, as the reviewer notes, the class switching and in vitro ASC generation were done with IL-4 / IgG1-promoting conditions. If possible, we will try to assay in vitro class switching with IFN-γ rather than IL-4, but there may not be enough resources (time before lab closure; money).

      [As a collegial aside, we speculate that a greater or lesser IgG2c anti-NP response may arise due to different preparations of NP-carrier obtained from the vendor (Biosearch) having different amounts of TLR (e.g., TLR4) ligand. In any case, the points of presenting the IgG2c (and IgM) data were to push against the limiting boundaries of convention (which risks perpetuating a narrow view of potential outcomes) and make the breadth of results more apparent to readers.]

      (6) Similarly, in affinity maturation analyses, including IgM is somewhat uncommon. I do not see any point in showing high affinity (NP2/NP20) IgMs (Figure 3d), since that data probably does not mean much.

      As noted in the reply immediately preceding this one, we appreciate this suggestion from the reviewer and will move the IgM and IgG2c to Supplemental status.

      Nonetheless, in collegial discourse we disagree a bit with the referee in light of our data as well as of work that (to our minds) leads one to question why inclusion of affinity maturation of IgM is so uncommon - as the referee accurately notes. Of course a defect in the capacity to class-switch is highly deleterious in patients but that is not the same as concluding that recall IgM or its affinity is of little consequence.

      In some of the pioneering work back in the 1980s, Bothwell showed that NP-carrier immunization generated hybridomas producing IgM Ab with extensive SHM (~11% of the 18 lineages; ~1/3 of the IgM hybridomas) [PMID: 8487778]. IgM B cells appear to move into GC, and there is at least a reasonable published basis for the view that there are GC-derived IgM (unswitched) memory B cells (MBC) that would be more likely, upon recall activation, to differentiate into ASCs. [As an example, albeit with the Jenkins lab anti-rPE response, Taylor, Pape, and Jenkins generated quantitative estimates of the numbers of Ag-specific IgM<sup>+</sup> vs switched MBC that were GC-derived (or not) [PMID: 22370719]. While they emphasized that ~90% of IgM<sup>+</sup> MBC appeared to be GC-independent, their data also indicated that ~1/2 of all GC-derived MBC were IgM<sup>+</sup> rather than switched (their Fig. 8, B vs C; also 8E, which includes alum-PE).] And while we immensely respect the referee, we are perhaps less confident that IgM or high-affinity Ag-specific IgM doesn't mean that much, if only because of evidence that localized Ab compete for Ag and may thus influence selective processes [PMCID: PMC2747358; PMID: 15953185; PMID: 23420879; PMID: 27270306].

      (7) Following on my comment for the PC generation in Figure 1 (see above), in Figure 4, a strategy that relies solely on CD40L stimulation is performed. This is highly artificial for the PC generation and needs to be justified, or more physiologically relevant PC generation strategies involving anti-BCR, CD40L, and various cytokines should be shown. 

      In line with our response to point (1), we plan, and will try to self-fund, tests of BCR-stimulated B cells (comparing anti-CD40 to anti-IgM and to anti-IgM + anti-CD40, all with BAFF, IL-4, and IL-5).

      (8) The effects of CB839 and UK5099 on cell viability are not shown. Including viability data under these treatment conditions would be a valuable addition to the supplementary materials, as it would help readers more accurately interpret the functional outcomes observed in the study. 

      We will add to the supplemental figures to present data that provide cues as to relative viability / survival under the experimental conditions used. [FSC X SSC as well as 7AAD or Ghost dye panels; we also hope to generate new data that include further experiments scoring annexin V staining.]

      (9) It is not clear how the RNA seq analysis in Figure 4h was generated. The experimental strategy and the setup need to be better explained.

      The revised manuscript will include more information (at minimum in the Methods, Legend), and we apologize that in this and a few other instances sufficiency of detail was sacrificed on the altar of brevity.

      [Adding a brief synopsis for any reader before the final version of record, given the many months it will take to generate new data, thoroughly revise the manuscript, etc.:

      In three temporally and biologically independent experiments, cultures were harvested 3.5 days after splenic B cells were purified and cultured as in the experiments of Fig. 4a-e. Total cellular RNA prepared from the twelve samples (three replicates for each of four conditions - DMSO vehicle control, CB839, UK5099, and CB839 + UK5099) was analyzed by RNA-seq. The RNA-seq data were initially processed using the pipeline described in the Methods. For panels g & h of Fig 4, DESeq2 was then used to quantify and compare read counts in the three CB839 + UK5099 samples relative to the three independent vehicle controls and to identify all genes for which variances yielded P<0.05. All such genes for which the difference was 'statistically significant' (i.e., P<0.05) were entered into the Immgen tool and thereby mapped to the B lineage subsets shown in the figure panels (i.e., g, h). In (g), these are displayed using one format, whereas (h) uses the 'heatmap' tool in MyGeneSet.]

      Reviewer #2 (Public review): 

      Summary: 

      In this manuscript, the authors investigate the functional requirements for glutamine and glutaminolysis in antibody responses. The authors first demonstrate that the concentrations of glutamine in lymph nodes are substantially lower than in plasma, and that at these levels, glutamine is limiting for plasma cell differentiation in vitro. The authors go on to use genetic mouse models in which B cells are deficient in glutaminase 1 (Gls), the glucose transporter Slc2a1, and/or mitochondrial pyruvate carrier 2 (Mpc2) to test the importance of these pathways in vivo. 

      Interestingly, deficiency of Gls alone showed clear antibody defects when ovalbumin was used as the immunogen, but not the hapten NP. For the latter response, defects in antibody titers and affinity were observed only when both Gls and either Mpc2 or Slc2a1 were deleted. These latter findings form the basis of the synthetic auxotrophy conclusion. The authors go on to test these conclusions further using in vitro differentiations, Seahorse assays, pharmacological inhibitors, and targeted quantification of specific metabolites and amino acids. Finally, the authors document reduced STAT3 and STAT1 phosphorylation in response to IL-21 and interferon (both type 1 and 2), respectively, when both glutaminolysis and mitochondrial pyruvate metabolism are prevented. 

      Strengths:

      (1) The main strength of the manuscript is the overall breadth of experiments performed. Orthogonal experiments are performed using genetic models, pharmacological inhibitors, in vitro assays, and in vivo experiments to support the claims. Multiple antigens are used as test immunogens--this is particularly important given the differing results. 

      (2) B cell metabolism is an area of interest but understudied relative to other cell types in the immune system. 

      (3) The importance of metabolic flexibility and caution when interpreting negative results is made clear from this study.

      Weaknesses:

      (1) All of the in vivo studies were done in the context of boosters at 3 weeks and recall responses 1 week later. This makes specific results difficult to interpret. Primary responses, including germinal centers, are still ongoing at 3 weeks after the initial immunization. Thus, untangling what proportion of the defects are due to problems in the primary vs. memory response is difficult.

      (2) Along these lines, the defects shown in Figure 3h-i may not be due to the authors' interpretation that Gls and Mpc2 are required for efficient plasma cell differentiation from memory B cells. This interpretation would only be correct if the absence of Gls/Mpc2 leads to preferential recruitment of low-affinity memory B cells into secondary plasma cells. The more likely interpretation is that ongoing primary germinal centers are negatively impacted by Gls and Mpc2 deficiency, and this, in turn, leads to reduced affinities of serum antibodies.

      We provisionally plan to edit the wording of the conclusion a bit, adding a possibility we consider unlikely, so as to avoid concluding outright that MBCs bearing switched BCRs are affected once reactivated. We also will perform a new experiment to investigate, but unfortunately time before lab closure has been and remains our enemy, both for performance and for multiple replication of the work presented in Figure 3, panels h & i, and the related Supplemental Data (Supplemental Fig. 3a-j). Unfortunately, it will not be possible to do a memory experiment with recall immunization out at 8 weeks. Despite the grant funding running out and institutional belt-tightening, however, we'll try to perform a new head-to-head comparison at 4 wk post-immunization with and without the boost at three weeks.

      The intriguing concern (points 1 & 2) provides a springboard for consideration of generalizations and simplifications. Germinal center durability is not at all monolithic, and instead is quite variable**. The premise (cognitive bias, perhaps?) in our interpretation is that in our previous work we found few if any GC B cells - NP-APC-binding or otherwise - above the background (non-immunized controls) three weeks after immunization with NP-ovalbumin in alum. Recognizing that their immunizations did not use NP-carrier in alum, we note for the readers and referee that Fig. 1 of the Taylor, Pape, & Jenkins paper considered above [PMID: 22370719] reported 10-fold more Ag-specific MBCs than GC B cells at day 29 post-immunization (the point at which the boost / recall challenge was performed in our Figure 3h, i).

      Viewed from that perspective, the surmise of the comment is that a major contribution to the differences in both all-affinity and high-affinity anti-NP IgG1 shown in Fig. 3i derives from the immunization at 4 wk stimulating GC B cells we cannot find as opposed to memory B cells. However, it is true that in the literature (especially with the experimentally different approach of transferring BCR-transgenic / knock-in versions of an NP-biased BCR) there may be meaningful pools of IgG1 and IgG2c GC B cells. Alternatively, our current reagents for immunizations may have become better at maintaining GC than those in the past - which we will try to test.

      The issue and question also relate to rates of output of plasma cells or rises in the serum concentrations of class-switched Ab. To this point, our prior experiences agree with the long-published data of the Kurosaki lab in Figure 3c of the Aiba et al paper noted above (Immunity, 2006) (and other such time courses). Readers can note that the IgG1 anti-NP response (alum adjuvant, as in our work) hit its plateau at 2 wk, and did not increase further from 2 to 3 wk. In other words, GC are on the decline and Ab production has reached its plateau by the time of the 2nd immunization in Fig. 3h.

      Assuming we understand the comment and line of reasoning correctly, we also lean towards disagreeing with the statement "This interpretation would only be correct if the absence of Gls/Mpc2 leads to preferential recruitment of low-affinity memory B cells into secondary plasma cells." Our evidence shows that both low-affinity as well as high-affinity anti-NP Ab (IgG1) went down as a result of combined gene-inactivation after the peak primary response (Fig. 3i). Recent papers show that affinity maturation is attributable to greater proliferation of plasmablasts with high-affinity BCR. Accordingly, the findings with loss of GLS and MPC function are quite consistent with the interpretation that much of the response after the second immunization draws on MBC differentiation into plasmablasts and then plasma cells, where the proliferative advantage of high-affinity cells is blunted by the impaired metabolism. The provisional plan, however, is to note the alternative, if less likely, interpretation proposed by the review.

      ** In some contexts, of course, especially certain viral infections or vaccination with lipid nanoparticles carrying modified mRNA, germinal centers are far more persistent; also, in humans even the seasonal flu vaccine **

      (3) The gating strategies for germinal centers and memory B cells in Supplemental Figure 2 are problematic, especially given that these data are used to claim only modest and/or statistically insignificant differences in these populations when Gls and Mpc2 are ablated. Neither strategy shows distinct flow cytometric populations, and it does not seem that the quantification focuses on antigen-specific cells.

      We will enhance these aspects of the presentation, using old and hopefully new data, but note for readers that many other papers in the best journals show plots in which the separation of, say, GC-Tfh from overall Tfh is based on a cut-off within what essentially is a continuous spectrum of emission as adjusted or compensated by the cytometer (spectral or conventional).

      Perhaps incorrectly, we omitted presenting data that included the results with NP-APC-staining - in part because within the GC B cell gate the frequencies of NP-binding events (GCB cells) were similar in double-knockout samples and controls. In practice, that would mean that the metabolic requirement applied about equally to NP+ and the total population. We will try to rectify this point in the revision.

      (4) Along these lines, the conclusions in Figure 6a-d may need to be tempered if the analysis was done on polyclonal, rather than antigen-specific cells. Alum induces a heavily type 2-biased response and is not known to induce much of an interferon signature. The authors' observations might be explained by the inclusion of other ongoing GCs unrelated to the immunization. 

      We will make sure the text is clear that the in vitro experiments do not represent GC B cells and that the RNA-seq data were not an Ag (SRBC)-specific subset.

      We also will try to work in a schematic along with expanding the Legends to make it more readily clear that the RNA-seq data (and hence the GSEA) involved immunizations with SRBC (not the alum / NP system which - it may be noted - in these experiments actually generated a robust IgG2c (type 1-driven) response along with the type 2-enhanced IgG1 response).

      Reviewer #3 (Public review): 

      Summary: 

      In their manuscript, the authors investigate how glutaminolysis (GLS) and mitochondrial pyruvate import (MPC2) jointly shape B cell fate and the humoral immune response. Using inducible knockout systems and metabolic inhibitors, they uncover a "synthetic auxotrophy": When GLS activity/glutaminolysis is lost together with either GLUT1-mediated glucose uptake or MPC2, B cells fail to upregulate mitochondrial respiration, IL-21/STAT3 and IFN/STAT1 signaling is impaired, and the plasma cell output and antigen-specific antibody titers drop significantly. This work thus demonstrates the promotion of plasma cell differentiation and cytokine signaling through parallel activation of two metabolic pathways. The dataset is technically comprehensive and conceptually novel, but some aspects leave the in vivo and translational significance uncertain.

      Strengths:

      (1) Conceptual novelty: the study goes beyond single-enzyme deletions to reveal conditional metabolic vulnerabilities and fate-deciding mechanisms in B cells.

      (2) Mechanistic depth: the study uncovers a novel "metabolic bottleneck" that impairs mitochondrial respiration and elevates ROS, and directly ties these changes to cytokine-receptor signaling. This is both mechanistically compelling and potentially clinically relevant.

      (3) Breadth of models and methods: inducible genetics, pharmacology, metabolomics, seahorse assay, ELISpot/ELISA, RNA-seq, two immunization models.

      (4) Potential clinical angle: the synergy of CB839 with UK5099 and/or hydroxychloroquine hints at a druggable pathway targeting autoantibody-driven diseases.

      We agree and thank the referee for the positive comments and this succinct summary of what we view as contributions of the paper.

      Weaknesses: 

      (1) Physiological relevance of "synthetic auxotrophy"

      The manuscript demonstrates that GLS loss is only crippling when glucose influx or mitochondrial pyruvate import is concurrently reduced, which the authors name "synthetic auxotrophy". I think it would help readers to clarify the terminology more and add a concise definition of "synthetic auxotrophy" versus "synthetic lethality" early in the manuscript and justify its relevance for B cells.

      We will edit the Abstract, Introduction, and Discussion to try to do better on this score. Conscious of how expansive the prose and data are even in the original submission, we appear to have taken some shortcuts that we will try to rectify. Thank you for highlighting this need to improve on a key concept!

      That said, we punctiliously & perhaps pedantically encourage readers to be completely accurate, in that under one condition of immunization GLS loss substantially reduced the anti-ovalbumin response (Fig. 1, Fig. 2a-c). And for this provisional response, we will expand a bit on the notion that synthetic auxotrophy represents effects on differentiation that appear to go beyond, and not simply to reflect, selective death, even though decreased population expansion is observed and one cannot exclude some contribution of enhanced death in vivo. Finally, we will note that this comment of the review raises interesting semantic questions about what represents "physiological relevance" but leave it at that.

      While the overall findings, especially the subset specificity and the clinical implications, are generally interesting, the "synthetic auxotrophy" condition feels a little engineered.

      One can readily say that CAR-T cells are 'a little engineered' so it is a matter of balancing this perspective of the referee against the strengths they highlight in points 1, 2, and 4. In any case, we will probably try to expand and be more explicit in the Discussion of the revised manuscript.

      In brief, even were the money not all gone, we would not believe that expanding the heft of this already rather large manuscript and set of data would be appropriate. As matters stand, a basic new insight about metabolic flexibility and its limits leads to evidence of a way to reduce generation of Ab and a novel impairment of STAT transcription factor induction by several cytokine receptors. The vulnerability that could be tested in later work on B cell-dependent autoimmunity includes the capacity to test a compound that already has been to or through FDA phase II in patients together with an FDA-approved standard-of-care agent.

      Put a different way, the point is that a basic curiosity to understand why decreasing glucose influx did not have an even more profound effect than what was observed, combined with curiosity as to why glutaminolysis was dispensable in relatively standard vaccine-like models of immunize / boost, provided a springboard to identification of new vulnerabilities. As above, we appreciate being made aware that this point merits being made more explicit in the Discussion of the edited version.

      Therefore, the findings strongly raise the question of the likelihood of such a "double hit" in vivo and whether there are conditions, disease states, or drug regimens that would realistically generate such a "bottleneck".

      Hence, the authors should document or at least discuss whether GC or inflamed niches naturally show simultaneous downregulation/lack of glutamine and/or pyruvate. The authors should also aim to provide evidence that infections (e.g., influenza), hypoxia, treatments (e.g., rapamycin), or inflammatory diseases like lupus co-limit these pathways. 

      Again, we appreciate some 'licensing' to be more expansive and explicit, and will try to balance editing in such points against undue tedium or tendentiously speculative length in the Discussion. In particular, we will note that a clear, simple implication of the work is to highlight an imperative to test CB839 in lupus patients already on hydroxychloroquine as standard-of-care, and to suggest development of UK5099 (already tested many times in mouse models of cancer) to complement glutaminase inhibition. 

      As backdrop, we note that the failure to advance imaging mass spectrometry to the capacity to quantify relative or absolute (via nano-DESI) concentrations of nutrients in localized interstitia is a critical gap in the entire field. Techniques that sample the interstitial fluid of tumor masses or in our case LN as a work-around have yielded evidence that there can be meaningful limitations of glucose and glutamine, but it needs to be acknowledged that such findings may be very model-specific and, as can be the case with cutting-edge science, are not without controversy. That said, yes, we had found that hypoxia reduced glutamine uptake but given the norms of focused, tidy packages only reported on leucine in an earlier paper [PMID: 27501247; PMCID: PMC5161594].

      It would hence also be beneficial to test the CB839 + UK5099/HCQ combinations in a short, proof-of-concept treatment in vivo, e.g., shortly before and after the booster immunization or in an autoimmune model. Likewise, it may also be insightful to discuss potential effects of existing treatments (especially CB839, HCQ) on human memory B cell or PC pools.

      We certainly agree that the suggestions offered in this comment are important next steps and the right approach to test if the findings reported here translate toward the treatment of autoimmune diseases that involve B cells, interferons, and pathophysiology mediated by auto-Ab. As practical points, performance and replication of such studies would take more time than the year allotted for return of a revised manuscript to eLife and in any case neither funds nor a lab remain to do these important studies. 

      Concrete evidence for our concurrence was embodied in a grant application to NIH that was essential for keeping a lab and doing any such studies. [We note, as a suggestion to others, that an essential component of such studies would be to test the effects of these compounds on B cells from patients and mice with autoimmunity]. Perhaps unfortunately for SLE patients, the review panelists did not agree about the importance of such studies. However, it can be hoped that the patent-holder of CB839 (and perhaps other companies developing glutaminase inhibitors) will see this peer-reviewed pre-print and the public dialogue, and recognize how positive results might open a valuable contribution to mitigation of diseases such as SLE.

      (2) Cell survival versus differentiation phenotype

      Claims that the phenotypes (e.g., reduced PC numbers) are "independent of death" and are not merely the result of artificial cell stress would benefit from Annexin-V/active-caspase 3 analyses of GC B cells and plasmablasts. Please also show viability curves for inhibitor-treated cells.

      This comment leads us to see that the wording on this point may have been overly terse in the interests of brevity, and thereby open to some misunderstanding. Accordingly, we will expand out the text of the Abstract and elsewhere in the manuscript, to be more clear. In addition, we will add in some data on the point, hopefully including some results of new experiments.

      To clarify in this public context, it is not that an increase in death (along with the reported decrease in cell cycling) can be or is excluded - and in fact it likely exists in vitro. The point is that beyond any such increase, and taking into account division number (since there is evidence that PC differentiation and output numbers involve a 'division-counting' mechanism), the frequencies of CD138+ cells and of ASCs among the viable cells are lower, as is the level of Prdm1-encoded mRNA even before the big increase in CD138+ cells in the population. 

      (3) Subset specificity of the metabolic phenotype

      Could the metabolic differences, mitochondrial ROS, and membrane-potential changes shown for activated pan-B cells (Figure 5) also be demonstrated ex vivo for KO mouse-derived GC B cells and plasma cells? This would also be insightful to investigate following NP-immunization (e.g., NP+ GC B cells 10 days after NP-OVA immunization).

      We agree that such data could be nice and add to the comprehensiveness of the work. We will try to scrounge the resources (time; money; human) to test this roughly as indicated. That said, we would note that the frequencies and hence numbers of NP+ GC B cells are so low that even in the flow cytometer we suspect there will not be enough "events" to rely on the results with DCFDA in the tiny sub-sub-subset. It also bears noting that reliable flow cytometric identification of the small NP-specific plasmablast/plasma cell subset amidst the overall population, little of which arose from immunization or after deletion of the floxed segments in B cells, would potentially be misleading.

      (4) Memory B cell gating strategy

      I am not fully convinced that the memory-B-cell gate in Supplementary Figure 2d is appropriate. The legend implies the population is defined simply as CD19+GL7-CD38+ (or CD19+CD38++?), with no further restriction to NP-binding cells. Such a gate could also capture naïve or recently activated B cells. From the descriptions in the figure and the figure legend, it is hard to verify that the events plotted truly represent memory B cells. Please clarify the full gating hierarchy and, ideally, restrict the MBC gate to NP+CD19+GL7-CD38+ B cells (or add additional markers such as CD80 and CD273). Generally, the manuscript would benefit from a more transparent presentation of gating strategies.

      We will further expand the supplemental data displays to include more of the gating and analytic scheme, and hope to be able to have performed new experiments and analyses (including additional markers) that could mitigate the concern noted here. In addition, we will include flow data from the non-immunized control mice that had been analyzed concurrently in the experiments illustrated in this Figure.

      Although it should be noted that the labeling indicated that the gating included the important criterion that cells be IgD- (Supplemental Fig. 2b), which excludes the vast majority of naive B cells, in principle marginal zone (MZ) B cells might fall within this gate. However, the MZ B population is unlikely to explain the differences shown in Supplemental Fig. 2b-d.

      (5) Deletion efficiency - [The] mRNA data show residual GLS/MPC2 transcripts (Supplementary Figure 8). Please quantify deletion efficiency in GC B cells and plasmablasts.

      Even were there resources to do this, the degree of reduction in target mRNA (Gls; Mpc2) renders this question superfluous.

      Are there likely to be some cells with only one, or even neither, allele converted from fl to Δ? Yes, but they would be a minor subset in light of the magnitude of mRNA reduction, in contrast to our published observations with Slc2a1. As to plasmablasts and plasma cells, the pre-existing populations make such an analysis misleading, while the number of such cells recoverable with antigen capture techniques is so low as to make both RNA and genomic DNA analyses questionable.

    1. South Korean esports ecosystem

      Not just esports, think K-pop boybands. And arguably, some of these may surely find themselves in different origins, like sports clubs and training camps (think La Masia from FCB). Also, this is missing an intersectional approach considering how gender and disability fit into this MASCULINE landscape.

    1. On the side, the brand is diversifying its production: a partnership with Loro Piana (before its acquisition by the LVMH group) made it possible to produce elegant windbreakers and trousers - a project in which his wife, Lucretia, was closely involved -, helmets were designed with Salomon, a pair of skis was developed with the carmaker Bentley, and in the same way Zai pulled off a collaboration with the watchmaker Hublot… But Simon Jacomet remains focused on skiing. A matter of conviction.

      Produce other items through partnerships with other luxury brands, notably ones that are made in France and aligned with our values, such as **Maison Bernard Solfin**

    2. Former champions, devotees of off-piste powder, others looking for power underfoot (they will be well served). For those with specific requests, the workshop builds skis one pair at a time. Everyone can send their pair back for refurbishment. The International Ski Federation has just made Zai the official partner of the Vail resort, in the United States.

      • possibility of producing only for certain people with specific requests.

      • form partnerships with resorts as official partners.