  1. Feb 2023
    1. Conversation Hey, JB, I played a pickup game at the Rec today. At first, the older guys laughed and wouldn’t let me in unless I could hit from half-court . . . Of course, I did. All net. I wait for JB to say something, but he just smiles, his eyes all moony. I showed them guys how the Bells ball. I scored fourteen points. They told me I should try out for junior varsity next year ’cause I got hops . . . JB, are you listening? JB nods, his fingers tapping away on the computer, chatting probably with Miss Sweet Tea. I told the big guys about you, too. They said we could come back and run with them anytime. What do you think about that? HELLO—Earth to JB? Even though I know he hears me, the only thing JB is listening to is the sound of his heart bouncing on the court of love.

      Conversation Dad, this girl is making Jordan act weird. He’s here, but he’s not. He’s always smiling. His eyes get all spacey whenever she’s around, and sometimes when she’s not. He wears your cologne. He’s always texting her. He even wore loafers to school. Dad, you gotta do something. Dad does something. He laughs. Filthy, talking to your brother right now would be like pushing water uphill with a rake, son. This isn’t funny, Dad. Say something to him. Please. Filthy, if some girl done locked up JB, he’s going to jail. Now let’s go get some doughnuts.

      Basketball Rule #5 When you stop playing your game you’ve already lost.

      Showoff UP by sixteen with six seconds showing, JB smiles, then STRUTS side steps stutters Spins, and SI NKS a sick SLICK SLIDING SWeeeeeeeeeeT SEVEN-foot shot. What a showoff.

      Out of Control Are you kidding me? Come on. Ref, open your eyes. Ray Charles could have seen that kid walked. CALL THE TRAVELING VIOLATION! You guys are TERRIBLE! Mom wasn’t at the game tonight, which meant that all night Dad was free to yell at the officials, which he did.

      Mom calls me into the kitchen after we get home from beating St. Francis. Normally she wants me to sample the macaroni and cheese to make sure it’s cheesy enough, or the oven-baked fried chicken to make sure it’s not greasy and stuff, but today on the table is some gross-looking orange creamy dip with brown specks in it. A tray of pita-bread triangles is beside it. Maybe Mom is having one of her book club meetings. Sit down, she says. I sit as far away from the dip as possible. Maybe the chicken is in the oven. Where is your brother? she asks. Probably on the phone with that girl. She hands me a pita. No thanks, I say, then stand up to leave, but she gives me a look that tells me she’s not finished with me. Maybe the mac is in the oven. We’ve talked to you two about your grandfather, she says. He was a good man. I’m sorry you never got to meet him, Josh. Me too, he looked cool in his uniforms. That man was way past cool. Dad said he used to curse a lot and talk about the war. Mom’s laugh is short, then she’s serious again. I know we told you Grandpop died after a fall, but the truth is he fell because he had a stroke. He had a heart disease. Too many years of bad eating and not taking care of himself and so— What does this have to do with anything? I ask, even though I think I already know. Well, our family has a history of heart problems, she says, so we’re going to start eating better. Especially Dad. And we’re going to start tonight with some hummus and pita bread. FOR MY VICTORY DINNER? Josh, we’re going to try to lay off the fried foods and Golden Dragon. And when your dad takes you to the recreation center, no Pollard’s or Krispy Kreme afterward, understand? And I understand more than she thinks I do. But is hummus really the answer?

      35–18 is the final score of game six. A local reporter asks JB and I how we got so good. Dad screams from behind us, They learned from Da Man! The crowd of parents and students behind us laughs. On the way home Dad asks if we should stop at Pollard’s. I tell him I’m not hungry, plus I have a lot of homework, even though I skipped lunch today and finished my homework during halftime.

      Too Good Lately, I’ve been feeling like everything in my life is going right: I beat JB in Madden. Our team is undefeated. I scored an A+ on the vocabulary test. Plus, Mom’s away at a conference, which means so is the Assistant Principal. I am a little worried, though, because, as Coach likes to say, you can get used to things going well, but you’re never prepared for something going wrong.

      I’m on Free Throw Number Twenty-Seven We take turns, switching every time we miss. JB has hit forty-one, the last twelve in a row. Filthy, keep up, man, keep up, he says. Dad laughs loud, and says, Filthy, your brother is putting on a free-throw clinic. You better— And suddenly he bowls over, a look of horror on his face, and starts coughing while clutching his chest, only no sound comes. I freeze. JB runs over to him. Dad, you okay? he asks. I still can’t move. There is a stream of sweat on Dad’s face. Maybe he’s overheating, I say. His mouth is curled up like a little tunnel. JB grabs the water hose, turns the faucet on full blast, and sprays Dad. Some of it goes in Dad’s mouth. Then I hear the sound of coughing, and Dad is no longer leaning against the car, now he’s moving toward the hose, and laughing. So is JB. Then Dad grabs the hose and sprays both of us. Now I’m laughing too, but only on the outside.

      He probably just got something stuck in his throat, JB says when I ask him if he thought Dad was sick and shouldn’t we tell Mom what happened. So, when the phone rings, it’s ironic that after saying hello, he throws the phone to me, because, even though his lips are moving, JB is speechless, like he’s got something stuck in his throat.

      i·ron·ic [AY-RON-IK] adjective Having a curious or humorous unexpected sequence of events marked by coincidence. As in: The fact that Vondie hates astronomy and his mom works for NASA is ironic. As in: It’s not ironic that Grandpop died in a hospital and Dad doesn’t like doctors. As in: Isn’t it ironic that showoff JB, with all his swagger, is too shy to talk to Miss Sweet Tea, so he gives me the phone?

      This Is Alexis—May I Please Speak to Jordan? Identical twins are no different from everyone else, except we look and sometimes sound exactly alike.

      Phone Conversation (I Sub for JB) Was that your brother? Yep, that was Josh. I’m JB. I know who you are, silly—I called you. Uh, right. You have any siblings, Alexis? Two sisters. I’m the youngest. And the prettiest. You haven’t seen them. I don’t need to. That’s sweet. Sweet as pomegranate. Okay, that was random. That’s me. Jordan, can I ask you something? Yep. Did you get my text? Uh, yeah. So, what’s your answer? Uh, my answer. I don’t know. Stop being silly, Jordan. I’m not. Then tell me your answer. Are y’all rich? I don’t know. Didn’t your dad play in the NBA? No, he played in Italy. But still, he made a lot of money, right? It’s not like we’re opulent. Who says “opulent”? I do. You never use big words like that at school . . . I have a reputation to uphold. Is he cool? Who? Your dad. Very. So, when are you gonna introduce me? Introduce you? To your parents. I’m waiting for the right moment. Which is when? Uh— So, am I your girlfriend or not? Uh, can you hold on for a second? Sure, she says. Cover the mouthpiece, JB mouths to me. I do, then whisper to him: She wants to know are you her boyfriend. And when are you gonna introduce her to Mom and Dad. What should I tell her, JB? Tell her yeah, I guess, I mean, I don’t know. I gotta pee, JB says, running out of the room, leaving me still in his shoes. Okay, I’m back, Alexis. So, what’s the verdict, Jordan? Do you want to be my girlfriend? Are you asking me to be your girl? Uh, I think so. You think so? Well, I have to go now. Yes. Yes, what? I like you. A lot. I like you, too . . . Precious. So, now I’m Precious? Everyone calls you JB. Then I guess it’s official. Text me later. Good night, Miss Sweet— What did you call me? Uh, good night, my sweetness. Good night, Precious. JB comes running out of the bathroom. What’d she say, Josh? Come on, tell me. She said she likes me a lot, I tell him. You mean she likes me a lot? he asks. Yeah . . . that’s what I meant.

      JB and I eat lunch together every day, taking bites of Mom’s tuna salad on wheat between arguments: Who’s the better dunker, Blake or LeBron? Which is superior, Nike or Converse? Only today I wait at our table in the back for twenty-five minutes, texting Vondie (home sick), eating a fruit cup (alone), before I see JB strut into the cafeteria with Miss Sweet Tea holding his precious hand.

      Boy walks into a room with a girl. They come over. He says, Hey, Filthy McNasty like he’s said forever, but it sounds different this time, and when he snickers, she does too, like it’s some inside joke, and my nickname, some dirty punch line.

      At practice Coach says we need to work on our mental game. If we think we can beat Independence Junior High— the defending champions, the number one seed, the only other undefeated team— then we will. But instead of drills and sprints, we sit on our butts, make weird sounds— Ohmmmmmmmm Ohmmmmmmmm— and meditate. Suddenly I get this vision of JB in a hospital. I quickly open my eyes, turn around, and see him looking dead at me like he’s just seen a ghost.

      Second-Person After practice, you walk home alone. This feels strange to you, because as long as you can remember there has always been a second person. On today’s long, hot mile, you bounce your basketball, but your mind is on something else. Not whether you will make the playoffs. Not homework. Not even what’s for dinner. You wonder what JB and his pink Reebok–wearing girlfriend are doing. You do not want to go to the library. But you go. Because your report on The Giver is due tomorrow. And JB has your copy. But he’s with her. Not here with you. Which is unfair. Because he doesn’t argue with you about who’s the greatest, Michael Jordan or Bill Russell, like he used to. Because JB will not eat lunch with you tomorrow or the next day, or next week. Because you are walking home by yourself and your brother owns the world.

      Third Wheel You walk into the library, glance over at the music section. You look through the magazines. You even sit at a desk and pretend to study. You ask the librarian where you can find The Giver. She says something odd: Did you find your friend? Then she points upstairs. On the second floor, you pass by the computers. Kids checking their Facebook. More kids in line waiting to check their Facebook. In the Biography section you see an old man reading The Tipping Point. You walk down the last aisle, Teen Fiction, and come to the reason you’re here. You remove the book from the shelf. And there, behind the last row of books, you find the “friend” the librarian was talking about. Only she’s not your friend and she’s kissing your brother.

      tip·ping point [TIH-PING POYNT] noun The point when an object shifts from one position into a new, entirely different one. As in: My dad says the tipping point of our country’s economy was housing gamblers and greedy bankers. As in: If we get one C on our report cards, I’m afraid Mom will reach her tipping point and that will be the end of basketball. As in: Today at the library, I went upstairs, walked down an aisle, pulled The Giver off the shelf, and found my tipping point.

      The main reason I can’t sleep is not because of the game tomorrow tonight, is not because the stubble on my head feels like bugs are break dancing on it, is not even because I’m worried about Dad. The main reason I can’t sleep tonight is because Jordan is on the phone with Miss Sweet Tea and between the giggling and the breathing he tells her how much she’s the apple of his eye and that he wants to peel her and get under her skin and give me a break. I’m still hungry and right about now I wish I had an apple of my own.

      Surprised I have it all planned out. When we walk to the game I will talk to JB man to man about how he’s spending way more time with Alexis than with me and Dad. Except when I hear the horn, I look outside my window and it’s raining and JB is jumping into a car with Miss Sweet Tea and her dad, ruining my plan.

      Conversation In the car I ask Dad if going to the doctor will kill him. He tells me he doesn’t trust doctors, that my grandfather did and look where it got him: six feet under at forty-five. But Mom says your dad was really sick, I tell him, and Dad just rolls his eyes, so I try something different. I tell him that just because your teammate gets fouled on a lay-up doesn’t mean you shouldn’t ever drive to the lane again. He looks at me and laughs so loud, we almost don’t hear the flashing blues behind us.

      Game Time: 6:00 p.m. At 5:28 p.m. a cop pulls us over because Dad has a broken taillight. At 5:30 the officer approaches our car and asks Dad for his driver’s license and registration. At 5:32 the team leaves the locker room and pregame warm-ups begin without me. At 5:34 Dad explains to the officer that his license is in his wallet, which is in his jacket at home. At 5:37 Dad says, Look, sir, my name is Chuck Bell, and I’m just trying to get my boy to his basketball game. At 5:47 while Coach leads the Wildcats in team prayer, I pray Dad won’t get arrested. At 5:48 the cop smiles after verifying Dad’s identity on Google, and says, You “Da Man”! At 5:50 Dad autographs a Krispy Kreme napkin for the officer and gets a warning for his broken taillight. At 6:01 we arrive at the game but on my sprint into the gym I slip and fall in the mud.

      This is my second year playing for the Reggie Lewis Wildcats and I’ve started every game until tonight, when Coach tells me to go get cleaned up then find a seat on the bench. When I try to tell him it wasn’t my fault, he doesn’t want to hear about sirens and broken taillights. Josh, better an hour too soon than a minute too late, he says, turning his attention back to JB and the guys on the court, all of whom are pointing and laughing at me.

      Basketball Rule #6 A great team has a good scorer with a teammate who’s on point and ready to assist.

      Josh’s Play-by-Play At the beginning of the second half we’re up twenty-three to twelve. I enter the game for the first time. I’m just happy to be back on the floor. When my brother and I are on the court together this team is unstoppable, unfadeable. And, yes, undefeated. JB brings the ball up the court. Passes the ball to Vondie. He shoots it back to JB. I call for the ball. JB finds me in the corner. I know y’all think it’s time for the pick-and-roll, but I got something else in mind. I get the ball on the left side. JB is setting the pick. Here it comes— I roll to his right. The double-team is on me, leaving JB free. He’s got his hands in the air, looking for the dish from me. Dad likes to say, When Jordan Bell is open you can take his three to the bank, cash it in, ’cause it’s all money. Tonight, I’m going for broke. I see JB’s still wide open. McDonald’s drive-thru open. But I got my own plans. The double-team is still on me like feathers on a bird. Ever seen an eagle soar? So high, so fly. Me and my wings are— and that’s when I remember: MY. WINGS. ARE. GONE. Coach Hawkins is out of his seat. Dad is on his feet, screaming. JB’s screaming. The crowd’s screaming, FILTHY, PASS THE BALL! The shot clock is at 5. I dribble out of the double-team. 4 Everything comes to a head. 3 I see Jordan. 2 You want it that bad? HERE YA GO! 1 . . .

      Before Today, I walk into the gym covered in more dirt than a chimney. When JB screams FILTHY’S McNasty, the whole team laughs. Even Coach. Then I get benched for the entire first half. For being late. Today, I watch as we take a big lead, and JB makes four threes in a row. I hear the crowd cheer for JB, especially Dad and Mom. Then I see JB wink at Miss Sweet Tea after he hits a stupid free throw. Today, I finally get into the game at the start of the second half. JB sets a wicked pick for me just like Coach showed us in practice, And I get double-teamed on the roll just like we expect. Today, I watch JB get open and wave for me to pass. Instead I dribble, trying to get out of the trap, and watch as Coach and Dad scream for me to pass. Today, I plan on passing the ball to JB, but when I hear him say “FILTHY, give me the ball,” I dribble over to my brother and fire a pass so hard, it levels him, the blood from his nose still shooting long after the shotclock buzzer goes off.

    1. Author Response

      Reviewer #1 (Public Review):

      It is now widely accepted that the age of the brain can differ from the person's chronological age and neuroimaging methods are ideally suited to analyze the brain age and associated biomarkers. Preclinical studies of rodent models with appropriate neuroimaging attest that lifestyle-related prevention approaches may help to slow down brain aging, and support the potential of BrainAGE as a predictor of age-related health outcomes. However, there is a paucity of data on this in humans. It is in this context that the present manuscript receives its due attention.

      Comments:

      1) Lifestyle intervention benefits need to be analyzed using robust biomarkers which should be profiled non-invasively in a clinical setting. There is increasing evidence of the role of telomere length in brain aging. Gampawar et al. (2020) have proposed a hypothesis on the effect of telomeres on brain structure and function over the life span and named it the "Telomere Brain Axis". In this context, if the authors could measure telomere length before and after lifestyle intervention, this would give strong biomarker utility and added value for assessing the benefits of lifestyle modification.

      2) Authors should also consider measuring BDNF levels before and after lifestyle intervention.

      Response to comments 1+2: we agree that associating both telomere length and BDNF level with brain age would be interesting and relevant. However, we did not measure these two variables. We would certainly consider adding these in future work. Regarding telomere length, we now include a short discussion of brain age in relation to other bodily ages, such as telomere length (Discussion section):

      “Studying changes in functional brain aging is part of a broader field that examines changes in various biological ages, such as telomere length1, DNA methylation2, and arterial stiffness3. Evaluating changes in these bodily systems over time allows us to capture health and lifestyle-related factors that affect overall aging and may guide the development of targeted interventions to reduce age-related decline. For example, in the CENTRAL cohort, we recently reported that reducing body weight and intrahepatic fat following a lifestyle intervention was related to methylation age attenuation4. In the current work, we used RSFC for brain age estimation, which resulted in a MAE of ~8 years, which was larger than the intervention period. Nevertheless, we found that brain age attenuation was associated with changes in multiple health factors. The precision of an age prediction model based on RSFC is typically lower than a model based on structural brain imaging5. However, a higher model precision may result in a lower sensitivity to detect clinical effects6,7. Better tools for data harmonization among datasets6 and larger training sample size5 may improve the accuracy of such models in the future. We also suggest that examining the dynamics of multiple bodily ages and their interactions would enhance our understanding of the complex aging process8,9.”

      And

      “These findings complement the growing interest in bodily aging indicated, for example, by DNA methylation4 as health biomarkers and interventions that may affect them.”

      Reviewer #2 (Public Review):

      In this study, Levakov et al. investigated brain age based on resting-state functional connectivity (RSFC) in a group of obese participants following an 18-month lifestyle intervention. The study benefits from various sophisticated measurements of overall health, including body MRI and blood biomarkers. Although the data is leveraged from a solid randomized control set-up, the lack of control groups in the current study means that the results cannot be attributed to the lifestyle intervention with certainty. However, the study does show a relationship between general weight loss and RSFC-based brain age estimations over the course of the intervention. While this may represent an important contribution to the literature, the RSFC-based brain age prediction shows low model performance, making it difficult to interpret the validity of the derived estimates and the scale of change. The study would benefit from more rigorous analyses and a more critical discussion of findings. If incorporated, the study contributes to the growing field of literature indicating that weight-reduction in obese subjects may attenuate the detrimental effect of obesity on the brain.

      The following points may be addressed to improve the study:

      Brain age / model performance:

      1) Figure 2: In the test set, the correlation between true and predicted age is 0.244. The fitted slope looks like it would be approximately 0.11 (55-50)/(80-35); change in y divided by change in x. This means that for a chronological age change of 12 months, the brain age changes by 0.11*12 = 1.3 months. I.e., due to the relatively poor model performance, an 80-year-old participant in the plot (fig 2) has a predicted age of ~55. Hence, although the age prediction step can generate a summary score for all the RSFC data, it can be difficult to interpret the meaning of these brain age estimates and the 'expected change' since the scale is in years.

      2) In Figure 2 it could also help to add the x = y line to get a better overview of the prediction variance. The estimates are likely clustered around the mean/median age of the training dataset, and age is overestimated in younger subjects and underestimated in older subjects (usually referred to as "age bias"). It is important to inspect the data points here to understand what the estimates represent, i.e., is variation in RSFC potentially lost by wrapping the data in this summary measure, since the age prediction is not particularly accurate, and should age bias in the predictions be accounted for by adjusting the test data for the bias observed in the training data?
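      The arithmetic in comment 1 can be checked in a few lines. This is a minimal sketch that takes the approximate values the reviewer reads off Figure 2 as given (predicted age rising from about 50 to about 55 across chronological ages 35 to 80). The variable names are illustrative and nothing here uses the study data.

```python
# Approximate values read off Figure 2 by the reviewer (not the study data).
age_lo, age_hi = 35.0, 80.0      # chronological age range on the x-axis
pred_lo, pred_hi = 50.0, 55.0    # fitted predicted age at each end of that range

# Slope of the fitted line: change in y divided by change in x.
slope = (pred_hi - pred_lo) / (age_hi - age_lo)

# Expected change in predicted brain age for 12 months of chronological ageing.
expected_change_months = slope * 12

# Predicted age that the fitted line assigns to an 80-year-old participant.
pred_at_80 = pred_lo + slope * (80.0 - age_lo)

print(f"slope = {slope:.2f}")
print(f"expected change per 12 months = {expected_change_months:.1f} months")
print(f"predicted age at 80 = {pred_at_80:.0f}")
```

      With these inputs the sketch gives a slope of about 0.11, an expected change of about 1.3 months per year, and a predicted age of 55 at age 80, matching the figures quoted in the comment.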

      Response to comments 1+2: we agree with the reviewer that due to the relatively moderate correlation between the predicted and observed age, a large change in the observed age corresponds to a small change in the predicted age. We now state this limitation in Results section 2.1:

      “Despite being significant and reproducible, we note that the correlations between the observed and predicted age were relatively moderate.”

      And discuss this point in the Discussion section:

      “In the current work, we used RSFC for brain age estimation, which resulted in a MAE of ~8 years, which was larger than the intervention period. Nevertheless, we found that brain age attenuation was associated with changes in multiple health factors. The precision of an age prediction model based on RSFC is typically lower than a model based on structural brain imaging5. However, a higher model precision may result in a lower sensitivity to detect clinical effects6,7. Better tools for data harmonization among datasets6 and larger training sample size5 may improve the accuracy of such models in the future.”

      Moreover, we now add the x=y line to Fig. 2, so the readers can better assess the prediction variance as suggested by the reviewer:

      We prefer not to use different scales (year/month) on the x and y axes, to avoid misleading the readers, but the lists of observed and predicted ages are available as SI files with a precision of two decimal places (~3 days).

      We note that despite the moderate prediction accuracy, we replicated these results in three separate cohorts.

      Regarding the effect of “age bias” (also known as “regression attenuation” or “regression dilution” 10), we are aware of this phenomenon and agree that it must be accounted for. In fact, the “age bias” is one of the reasons we chose to use the difference between the expected and observed ages as the primary outcome of the study, as this measure already takes this bias into account. To demonstrate this effect we now compute brain age attenuation in two ways: 1. As described and used in the current study (Methods 4.9); and 2. By regressing out the effect of age on the predicted brain age at both times separately, then subtracting the adjusted predicted age at T18 from the adjusted predicted age at T0. The second method is the standard method to account for age bias as described in a previous work 11. Below is a scatter plot of both measures across all participants:

      The x-axis represents the first method, used in the current study, and the y-axis represents the second method, described in Smith et al., (2019). Across all subjects, we found a nearly perfect 1:1 correspondence between the two methods (r=.998, p<0.001; MAE=0.45), as the two are mathematically identical. The small gap between the two is because the brain age attenuation model also takes into account the difference in the exact time that passed between the two scans for each participant (mean=21.36m, std = 1.68m).

      We now note this in Methods section 4.9:

      “We note that the result of computing the difference between the bias-corrected brain age gap at both times was nearly identical to the brain age attenuation measure (r=.99, p<0.001; MAE=0.45). The difference between the two is because the brain age attenuation model takes into account the difference in the exact time that passed between the two scans for each participant (mean=21.36m, std = 1.68m).”
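      The age-bias correction described above (regressing chronological age out of the predicted age at each time point, then differencing) can be sketched on simulated data. This is an illustration under stated assumptions, not the authors' analysis: the Methods 4.9 attenuation model is not reproduced in this response, so the sketch compares the bias-corrected change with a simple change in the raw brain age gap instead. All numbers are simulated; only the sample size and the scan interval (mean 21.36 months, SD 1.68 months) are taken from the text, and the function name `bias_corrected_gap` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 102  # sample size quoted in the review; every value below is simulated

# Simulated chronological ages (years) and elapsed time between the two scans.
age_t0 = rng.uniform(35, 80, n)
elapsed = rng.normal(21.36, 1.68, n) / 12.0   # months converted to years
age_t18 = age_t0 + elapsed

# Simulated predicted ages with a shallow slope (assumed 0.11), mimicking age bias.
subject_offset = rng.normal(0, 6, n)           # stable per-person deviation
pred_t0 = 46 + 0.11 * age_t0 + subject_offset + rng.normal(0, 4, n)
pred_t18 = 46 + 0.11 * age_t18 + subject_offset + rng.normal(0, 4, n)


def bias_corrected_gap(pred, age):
    """Residual of predicted age after regressing out chronological age."""
    slope, intercept = np.polyfit(age, pred, 1)
    return pred - (slope * age + intercept)


# The raw brain age gap is strongly tied to age: this is the "age bias".
raw_gap_t0 = pred_t0 - age_t0
raw_gap_t18 = pred_t18 - age_t18
bias_r = np.corrcoef(raw_gap_t0, age_t0)[0, 1]

# Bias-corrected gaps at each time point, then T0 minus T18 as in the text.
gap_t0 = bias_corrected_gap(pred_t0, age_t0)
gap_t18 = bias_corrected_gap(pred_t18, age_t18)
corrected_change = gap_t0 - gap_t18

# Change in the raw gap, with the same sign convention, for comparison.
raw_change = raw_gap_t0 - raw_gap_t18

residual_r = np.corrcoef(gap_t0, age_t0)[0, 1]
agreement_r = np.corrcoef(raw_change, corrected_change)[0, 1]
print(f"raw gap vs age: r = {bias_r:.2f}")
print(f"corrected gap vs age: r = {residual_r:.1e}")
print(f"raw change vs corrected change: r = {agreement_r:.3f}")
```

      In this simulation the two change scores agree closely because the interval between scans is nearly the same for every participant, so the age-dependent terms largely cancel when the two time points are differenced. That is consistent with, but does not reproduce, the correspondence the authors report.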

      3) In Figure 3, some of the changes observed between time points are very large. For example, one subject with a chronological age of 62 shows a ten-year increase in brain age over 18 months. This change is twice as large as the full range of age variation in the brain age estimates (average brain age increases from 50 to 55 across the full chronological age span). This makes it difficult to interpret RSFC change in units of brain age. E.g., is it reasonable that a person's brain ages by ten years, either up or down, in 18 months? The colour scale goes from -12 years to 14 years, so some of the observed changes are 14 / 1.5 = 9 times larger than the actual time from baseline to follow-up.

      We agree that our model precision was relatively low, especially compared to the period of the intervention, as also stated by reviewer #1. We now discuss this issue in light of the studies pointed out by the reviewer (Discussion section):

      “In the current work, we used RSFC for brain age estimation, which resulted in a MAE of ~8 years, which was larger than the intervention period. Nevertheless, we found that brain age attenuation was associated with changes in multiple health factors. The precision of an age prediction model based on RSFC is typically lower than a model based on structural brain imaging5. However, a higher model precision may result in a lower sensitivity to detect clinical effects6,7. Better tools for data harmonization among datasets6 and larger training sample size5 may improve the accuracy of such models in the future.”

      Again, we note that despite the moderate prediction accuracy, we replicated these results in three separate cohorts and found that both the correlation and the MAE between the predicted and observed age were significant in all of them.
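      For readers less familiar with the two performance measures discussed here, the following sketch shows how a Pearson correlation and a mean absolute error (MAE) between observed and predicted age are computed. The four age pairs are toy values chosen for illustration, not data from the study, and `prediction_metrics` is a hypothetical helper name.

```python
import numpy as np


def prediction_metrics(observed, predicted):
    """Return (Pearson r, mean absolute error) for paired age vectors."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    r = np.corrcoef(observed, predicted)[0, 1]
    mae = np.mean(np.abs(predicted - observed))
    return r, mae


# Toy values for illustration only (years); not taken from the study.
observed_age = [40.0, 50.0, 60.0, 70.0]
predicted_age = [48.0, 52.0, 51.0, 57.0]

r, mae = prediction_metrics(observed_age, predicted_age)
print(f"r = {r:.3f}, MAE = {mae:.1f} years")
```

      The toy example illustrates that the two measures capture different things: predictions can track the ordering of ages reasonably well (high r) while still being several years off on average (large MAE).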

      RSFC for age prediction:

      1) Several studies show better age prediction accuracy with structural MRI features compared to RSFC. If the focus of the study is to use an accurate estimate of brain ageing rather than specifically looking at changes in RSFC, adding structural MRI data could be helpful.

      We focused on brain structural changes in a previous work, and the focus of the current work was assessing age-related functional connectivity alterations. We now added a few sentences in the Introduction section that would hopefully better motivate our choice:

      “We previously found that weight loss, glycemic control, lowering of blood pressure, and increment in polyphenols-rich food were associated with an attenuation in brain atrophy 12. Obesity is also manifested in age-related changes in the brain’s functional organization as assessed with resting-state functional connectivity (RSFC). These changes are dynamic13 and can be observed in short time scales14 and are thus of relevance when studying lifestyle intervention.”

      2) If changes in RSFC are the main focus, using brain age adds a complicated layer that is not necessarily helpful. It could be easier to simply assess RSFC change from baseline to follow up, and correlate potential changes with changes in e.g., BMI.

      We are specifically interested in age-related changes as we described a-priori in the registration of the study: https://clinicaltrials.gov/ct2/show/NCT03020186

      Moreover, age-related changes in RSFC are complex, multivariate and dependent upon the choice of theoretical network measures. We think that a data-driven brain age prediction approach might better capture these multifaceted changes and their relation to aging. We now state this in the Introduction section:

      “Studies have linked obesity with decreased connectivity within the default mode network15,16 and increased connectivity with the lateral orbitofrontal cortex17, which are also seen in normal aging18,19. Longitudinal trials have reported changes in these connectivity patterns following weight reduction20,21, indicating that they can be altered. However, findings regarding functional changes are less consistent than those related to anatomical changes due to the multiple measures22 and scales23 used to quantify RSFC. Hence, focusing on a single measure, the functional brain age, may better capture these complex, multivariate changes and their relation to aging.”

      The lack of control groups

      1) If no control group data is available, it is important to clarify this in the manuscript, and evaluate which conclusions can and cannot be drawn based on the data and study design.

      We agree that this point should be made more clear, and we now state this in the limitation section of the Discussion:

      “We also note that the lack of a no-intervention control group limits our ability to directly relate our findings to the intervention. Hence, we can only relate brain age attenuation to the observed changes in health biomarkers.”

      Also, following the comments of Reviewers #2 and #3, we refer to the weight loss following 18 months of lifestyle intervention instead of to the intervention itself. This is now made clear in the title, abstract, and the main text.

      Reviewer #3 (Public Review):

      The authors report on an interesting study that addresses the effects of a physical and dietary intervention on accelerated/decelerated brain ageing in obese individuals. More specifically, the authors examined potential associations between reductions in Body-Mass-Index (BMI) and a decrease in relative brain-predicted age after an 18-months period in N = 102 individuals. Brain age models were based on resting-state functional connectivity data. In addition to change in BMI, the authors also tested for associations between change in relative brain age and change in waist circumference, six liver markers, three glycemic markers, four lipid markers, and four MRI fat deposition measures. Moreover, change in self-reported consumption of food, stratified by categories such as 'processed food' and 'sweets and beverages', was tested for an association with change in relative brain age. Their analysis revealed no evidence for a general reduction in relative brain age in the tested sample. However, changes in BMI, as well as changes in several liver, glycemic, lipid, and fat-deposition markers showed significant covariation with changes in relative brain age. Three markers remained significant after additionally controlling for BMI, indicating an incremental contribution of these markers to change in relative brain age. Further associations were found for variables of subjective food consumption. The authors conclude that lifestyle interventions may have beneficial effects on brain aging.

      Overall, the writing is concise and straightforward, and the language and style are appropriate. A strength of the study is the longitudinal design that allows for addressing individual accelerations or decelerations in brain aging. Research on biological aging parameters has often been limited to cross-sectional analyses so inferences about intra-individual variation have frequently been drawn from inter-individual variation. The presented study allows, in fact, investigating within-person differences. Moreover, I very much appreciate that the authors seek to publish their code and materials online, although the respective GitHub project page did not appear to be set to 'public' at the time (error 404). Another strength of the study is that brain age models have been trained and validated in external samples. One further strength of this study is that it is based on a registered trial, which allows for the evaluation of the aims and motivation of the investigators and provides further insights into the primary and secondary outcomes measures (see the clinical trial identification code).

      One weakness of the study is that no comparison between the active control group and the two experimental groups has been carried out, which would have enabled causal inferences on the potential effects of different types of interventions on changes in relative brain age. In this regard, it should also be noted that all groups underwent a lifestyle intervention. Hence, from an experimenter's perspective, it is problematic to conclude that lifestyle interventions may modulate brain age, given the lack of a control group without lifestyle intervention. This issue is fueled by the study title, which suggests a strong focus on the effects of lifestyle intervention. Technically, however, this study rather constitutes an investigation of the effects of successful weight loss/body fat reduction on brain age among participants who have taken part in a lifestyle intervention. In keeping with this, the provided information on the main effect of time on brain age is scarce, essentially limited to a sign test comparing the proportions of participants with an increase vs. decrease in relative brain age. Interestingly, this analysis did not suggest that the proportion of participants who benefit from the intervention (regarding brain age) significantly exceeds the proportion of participants who do not benefit. So strictly speaking, the data rather indicate that it's not the lifestyle intervention per se that contributes to changes in brain age, but successful weight loss/body fat reduction. In sum, I feel that the authors' claims on the effects of the intervention cannot be well substantiated given the lack of a control group without lifestyle intervention.
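      For readers unfamiliar with the sign test mentioned in this comment, a minimal sketch of a two-sided exact sign test follows. It assumes that participants with no change are dropped before counting and that the null hypothesis is an equal chance of increase and decrease. The function name is hypothetical, and the counts in the usage line are made up for illustration; they are not the study's data.

```python
from math import comb


def sign_test_p_value(n_decrease: int, n_increase: int) -> float:
    """Two-sided exact sign test under Binomial(n, 0.5).

    Ties (participants with no change) are assumed to be excluded already.
    """
    n = n_decrease + n_increase
    if n == 0:
        raise ValueError("need at least one non-tied observation")
    k = min(n_decrease, n_increase)
    # Probability of observing k or fewer in the smaller group by chance.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail and cap at 1 for the two-sided p-value.
    return min(1.0, 2 * tail)


# Hypothetical counts: 55 participants with a decrease, 47 with an increase.
print(sign_test_p_value(55, 47))
```

      The test uses only the direction of each participant's change, not its size, which is why the reviewer describes the information it provides on the main effect of time as scarce.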

      We agree that this point, also raised by reviewer #2, should be made clear, and we now state this in the limitation section of the Discussion:

      “We also note that the lack of a no-intervention control group limits our ability to directly relate our findings to the intervention. Hence, we can only relate brain age attenuation to the observed changes in health biomarkers.”

      Also, following reviewers #2 and #3, we refer to the weight loss following 18 months of lifestyle intervention instead of to the intervention itself. This is now explicitly mentioned in the title, abstract, and within the text:

      Title: “The effect of weight loss following 18 months of lifestyle intervention on brain age assessed with resting-state functional connectivity”

      Abstract: “…, we tested the effect of weight loss following 18 months of lifestyle intervention on predicted brain age, based on MRI-assessed resting-state functional connectivity (RSFC).”

      Another major weakness is that no rationale is provided for why the authors use functional connectivity data instead of structural scans for their age estimation models. This becomes even more evident in view of the relatively low prediction accuracies achieved in both the validation and test sets. My understanding of the literature is that the vast majority of studies in this field employ brain age models that were trained on structural MRI data, and these models have achieved considerably higher prediction accuracies. Along with the missing rationale, I feel that the low model performances require some more elaboration in the discussion section. To be clear, low prediction accuracies may be seen as a study result and, as such, they should not be considered as a quality criterion of the study. Nevertheless, the choice of functional MRI data and the relevance of the achieved model performances for subsequent association analysis need to be addressed more thoroughly.

      We agree that age estimation from structural compared to functional imaging yields a higher prediction accuracy. In a previous publication using the same dataset12, we demonstrated that weight loss was associated with an attenuation in brain atrophy, as we describe in the introduction:

      “We previously found that weight loss, glycemic control and lowering of blood pressure, as well as increment in polyphenols rich food, were associated with an attenuation in brain atrophy 12.”

      Here we were specifically interested in age-related functional alterations that are associated with successful weight reduction. Compared with structural brain changes, the effect of aging on functional connectivity is more complex and multifaceted. Hence, we decided to utilize a data-driven, prediction-based approach for assessing age-related changes in functional connectivity by predicting participants’ functional brain age. We now describe this rationale in the introduction section:

      “Studies have linked obesity with decreased connectivity within the default mode network15,16 and increased connectivity with the lateral orbitofrontal cortex17, which are also seen in normal aging18,19. Longitudinal trials have reported changes in these connectivity patterns following weight reduction20,21, indicating that they can be altered. However, findings regarding functional changes are less consistent than those related to anatomical changes due to the multiple measures22 and scales23 used to quantify RSFC. Hence, focusing on a single measure, the functional brain age, may better capture these complex changes and their relation to aging.”

      We address the point regarding the low model performance in response to reviewer #2, comment #2.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      This manuscript describes studies that indicate roles for the ALK and LTK receptors in neuronal polarity, cortical patterning and behavior in mice. I really liked the study and overall think that it deserves publication in a high-ranking journal. It reports important and novel results and benefits from a comprehensive analysis at multiple levels, including cell biological, biochemical and behavior. The points raised below are suggestions for consideration at the discretion of the authors.

      We thank the reviewer for the positive and enthusiastic comments on our study and especially for noting that it is appropriate for publication in a high-ranking journal. We greatly appreciate the valuable suggestions, the majority of which we have incorporated into the revised manuscript.

      1. The term "DKO" appears in the Introduction without explanation. I assume this means double KO mice lacking both receptors from birth. It should be indicated here, just in case.

      We have added text at the first appearance of DKO (i.e., in the Results section) to indicate that this refers to double knockout mice that lack both Ltk and Alk from birth.

      2. The last paragraph of the Introduction is redundant with the Abstract. This is a stylistic question, which is up to the authors. Nevertheless, as a suggestion, they could take the opportunity here to explain the rationale of the study and why they did what they did.

      We have made some modifications to provide an indication of the rationale for the studies.

      3. Is "single cell in situ mRNA analysis" standard in situ hybridization or something else? Why is it called "single-cell"? It could be misleading.

      This was a typographical error and has been corrected to single molecule in situ.

      4. In Fig. S1B, could the authors please include expression patterns of LTK in adult brain? It'd seem that is the most relevant place to look given the analysis that follows in the paper.

      We have replaced the previous panels with new plots (now Fig. S1G) showing the relative expression of Ltk, Alk and their ligand, Alkal2, in embryos (E15.5), newborns (P0), at postnatal day 2 (P2) and day 7 (P7), and in adults, in both the cortex and whole brain. The results confirm that Alk and Ltk are both expressed in the cortex and brain but in varying patterns, with Alk expression decreasing with age and Ltk increasing, particularly in the cortex. In contrast, Alkal2 expression is relatively constant throughout.

      Related comments #5, #7, #8 and #9.

      5. I have an issue in general in the first part of the manuscript with regards to the labeling of cortical layers. How were CP, IZ and SVZ/VZ defined? Specific markers should be used to identify their actual boundaries. Guesswork from the DAPI pattern (if that is what was used) is not really appropriate.

      7. In Fig. 1F, again, how were the boundaries between the cortical areas (dotted lines) determined? This is particularly important for the mutant sections....

      8. In Fig. S3C-F, the all-critical quantification of Ctip2 cells at P2 seems to be missing in this figure. It would be important to provide this in light of the comments above. Again, the same problem with the layer boundaries is clear here.....

      9. In Figure 2A and B, % positive cells is plotted but we are not told what is the reference (100%) level. ... Also, the idea of drawing a little rectangle in the IZ and CP and counting only there is flawed. ...Finally, again, we are not told how the boundaries of the different cortex areas were established. ...

      Response to related comments #5, #7, #8 and #9.

      As exemplified in the related comments above, the reviewer indicated that they “have an issue in general in the first part of the manuscript with regards to the labeling of cortical layers.”

      We thank the reviewer for this insightful comment. Development of the mouse cortex follows a stereotypical pattern; thus, we used a combination of DAPI (i.e., nuclear density is characteristic of some layers) and layer-specific markers (Satb2, Ctip2, Pax6, Sox2, Tbr2) to label the cortical layers. While this is generally acceptable for wild-type mice, we agree with the reviewer’s comment that this may not be appropriate in mutant mice. Accordingly, we have now taken a more unbiased approach and repeated all of the quantitation after creating equally sized bins that span the entire cortical length, and have plotted the quantitation by bin location. The general location of layers in WT mice has been marked on the images for reference. Our conclusion that there are defects in early patterning that are resolving by ~P7 is unchanged.

      With this re-quantitation, some of the previous reviewer comments within #5, #7, #8 and #9 no longer apply (e.g., a missing plot, box placement being subjective) and so have not been responded to. With regard to the question of what the reference (i.e., 100%) is for the plots showing the y-axis as % positive, this was determined based on the total number of DAPI+ cells counted in each region. This information has been added to the legends and methods along with details of the new quantitation method.
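      The bin-based quantitation described in this response can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: each DAPI+ nucleus is assumed to have a position along the cortical axis and a flag for marker positivity, the reference (100%) is taken to be the DAPI+ count within each bin, and the function name, input format and default bin count are hypothetical.

```python
def percent_positive_by_bin(cells, cortex_length, n_bins=10):
    """Percent of marker-positive cells in equally sized bins.

    cells: iterable of (position, is_positive) pairs, one per DAPI+ nucleus,
           with 0 <= position <= cortex_length along the cortical axis.
    Returns one value per bin; None marks a bin with no DAPI+ cells.
    """
    totals = [0] * n_bins
    positives = [0] * n_bins
    bin_width = cortex_length / n_bins
    for position, is_positive in cells:
        # Clamp so a cell exactly at cortex_length falls in the last bin.
        idx = min(int(position / bin_width), n_bins - 1)
        totals[idx] += 1
        if is_positive:
            positives[idx] += 1
    return [100.0 * p / t if t else None for p, t in zip(positives, totals)]
```

      Because the bins are defined by position alone, the same procedure applies to wild-type and mutant sections without deciding where layer boundaries lie, which is the source of bias raised in the reviewer's comments.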

      6. Comparing Fig. 1 and Fig. S2, there would seem to be little or no additive nor synergistic effects of the double mutation, as the phenotype in the DKO appears to be completely attributable to the Ltk KO. What does this mean? Providing the expression patterns of the two receptors at the ages used here (i.e., P2 and P7) would also be helpful.

      The relative contribution of Alk or Ltk, in comparison to the DKO, varies as a function of age (E15.5, P2, P7) and generally correlates with their level of expression, as per the Reviewer’s suggestion. For example, at E15.5, a reduction in the number of Sox2+ or Tbr2+ cells is observed for either Alk or Ltk knockouts alone, with a more prominent reduction in the case of Alk alone, and with the DKOs showing the greatest reduction. In contrast, when examining Ctip2 levels at P2, the loss of Ltk alone yields a stronger effect. In agreement with these observations, analysis of mRNA expression levels shows that Alk levels are highest in the embryonic cortex and brain and steadily decline until adulthood, while Ltk expression increases, with maximal levels occurring postnatally. As indicated in our reply to comment #4, we have now added plots showing the relative level of expression of Alk and Ltk at various ages from embryos to adults (Fig. S1G).

      10. At the end of page 8, it is concluded that Alk/Ltk promote neuronal migration. Is this a cell-autonomous effect? Given the very sparse expression of these receptors (Fig S1), cell-autonomy (which is being implied by the authors) is not at all clear. Is the migration of Alk+ cells affected in the Ltk mutant? Vice-versa?

      In our analysis of mRNA expression using RNAscope, we originally included a widefield image that depicts the entire cortex, in which it is difficult to see expression at the cell level. We now also provide a magnified image of the E15.5 SVZ/VZ that shows that most cells do express the receptors (Fig. S1B). Thus, the results are consistent with the idea that the defect in migration is a cell-autonomous effect.

      11. In Fig. S4A, as every cell in these panels bears probe signal, it'd be important to present a negative control, perhaps from KO cultures or wild type cells lacking receptor expression in the same field as expressing cells. At a 75%, 1 in 4 cells in any field should be receptor-negative.

      As requested, we now provide images with a wider field of view that includes negative cells.

      12. Figure S4B is difficult to interpret in the absence of Tau and MAP2 markers, as GFP does not discriminate between axons and dendrites.

      In the original submission we quantitated Tau-1 and MAP2 co-stained neurons in many experiments to demonstrate that Ltk/Alk act on axons, but in some cases, we used Tuj1 to more easily visualize and quantitate neurites. Nevertheless, as requested by the reviewers, in the revised manuscript we have repeated most of the experiments that used Tuj1 or phalloidin staining and replaced those results with experiments using Tau-1 and MAP2 antibodies, including Fig. 5B-D and Fig. 6A-D and G as well as Fig. S4B. The new data are consistent with our results using Tuj1 staining and further support our conclusion that Ltk/Alk act via Igf-1r to regulate neuronal polarity.

      In general, the authors are recommended to show more than one cell per condition in their figures. Readers need to be convinced that these are robust phenotypes easily observed on many cells in the same field.

      Due to space constraints, we included only a single representative image for each condition and then provided quantitation to support our conclusions. We have numerous images for all of the presented data and could provide a collage for all panels if considered appropriate. In the meantime, we have added additional images for several experiments in the Main Figures (Fig. 5A-D, Fig. 6A, C) and in Suppl. Figure S4A, B, C where sufficient space was readily available.

      13. In Fig. S4C and D, do the KO neurons become bipolar? I don't see examples of multipolar neurons in the images provided.

      Upon siRNA-mediated knockdown of Ltk and/or Alk, we observe that about 50% of the neurons are bipolar (i.e., display the typical wild-type single-axon phenotype) while roughly 40% display the multiple-axon phenotype. With the exception of the control (siCTL), the images provided were selected to show neurons with multiple axons. However, in some of the images, the arrowheads pointing to the axons were inadvertently omitted. These have now been added.

      14. Is there a way to quantify the effects shown in Fig. 3E?

      We attempted to quantitate the number and direction of neurites in the brain sections but because this is a dense tissue, even with Golgi staining, we found it impossible to trace individual neurites back to the cell body and thus were unable to quantitate the effects. As an alternative, we have provided additional images (Fig. S3B) from distinct mice to support our observations of aberrant horizontal neurites in the adult cortex.

      15. The DKO display a dramatically different behavior phenotype compared to single KOs. How can this result be explained given that DKOs are indistinguishable from single KOs in all other parameters studied?

      The reviewer is correct that the single KO mice do not manifest noticeable behavioural defects except when older and challenged with the most demanding task, the Puzzle box, which measures complex executive functions. We speculate that alternative cortical re-wiring in the single knockouts is sufficient to maintain normal circuitry, but that such compensation is not possible when both Ltk and Alk receptors are deleted. It is also possible that Ltk/Alk-regulated signalling events besides Igf-1r/PI3K could contribute to the behavioural defects observed in the DKO mice, such as the ALK-LIMK-cofilin pathway, which regulates synaptic scaling, mentioned by the reviewer (Zhou et al., Cell Rep. 2021). Nevertheless, the strong phenotype of the DKOs confirms that Ltk/Alk are important for proper brain function; thus, our preference is to retain the behavioural data in the manuscript but to discuss that alternative Ltk/Alk pathways could contribute to the phenotype (which we have now incorporated into the text).

      16. At the end of the behavior section, the authors attribute the phenotypes observed to defects in neuronal polarization. Given that polarization was only studied in vitro, it may be premature to conclude that neurons fail to polarize in vivo in the absence of direct evidence showing this.

      We agree and have modified the text to remove this inaccurate assertion.

      17. Regarding P-AKT studies, it would be interesting to assess the effects of the ALK/LTK ligands (e.g., from conditioned medium) on the levels of P-AKT in WT neurons.

      We agree that this would be interesting and we had attempted this experiment, but found that treatment of WT cortical neurons with medium conditioned with the ALKAL2 ligand did not change the levels of pAKT under our experimental conditions (namely 20-30 min treatment with ACM). Because the result is negative, it is difficult to draw a firm conclusion, but if true, it is possible that other pathways might be involved when WT cortical neurons are stimulated with ligand.

      18. In the mid part of page 14, the sentence "Treatment of WT cortical neurons with AG1024 at a dose (1 μM) at which only IGF-1R but not InsR was inhibited restored the single axon phenotype in DKO neurons" is confusing. Treatment performed in WT neurons but assessed in DKO neurons? This must be a typo.

      Thank you for pointing out this typo. It has been corrected.

      19. For completion, it would be informative to test whether IGF-1 antagonizes the effects of ALK and LTK ligands in axon formation.

      As suggested, we performed the requested experiment (with 3 independent repeats). In brief, four hours post-plating, neurons were treated with control or ALKAL2-conditioned media and Igf-1 was added after 1 hour. Neurons were fixed at 36 hours and stained for MAP2 and Tau-1, and axons (Tau-1+) were quantitated. Consistent with our previous findings, Igf-1 promotes the formation of multiple axons while ligand inhibits axon formation. In the ligand-treated neurons, addition of Igf-1 did not result in a statistically significant change in the number of axons. These findings are consistent with our model that activation of Ltk/Alk promotes a decrease in cell-surface Igf-1r. These data have been added to the manuscript (Fig. 7J).

      20. The quality of the blot provided to illustrate levels of activated Igf-1r in Fig. 7A is clearly suboptimal. It is not apparent from that blot that phosphorylation of Igf1r is increased in the mutant neurons as the band intensities are indistinguishable. Was this performed in cortex extracts or cultured neurons? Is it affected by treatment with ALK/LTK ligands?

      We apologize for a labelling error that has caused confusion for both reviewers. We have replaced the blots and corrected the labels. We have noted in the legend that the experiments were performed using cultured cortical neurons.

      21. Given the physical interaction between ALK/LTK and IGF-R1, these receptors are presumably co-internalized upon ligand treatment, correct? Does treatment with IGF1 induce internalization of ALK or LTK?

      This is a very interesting question. Unfortunately, due to the lack of suitable antibodies for the mouse versions of Alk or Ltk, we are not able to perform these experiments in cortical neurons with endogenous receptor expression. However, our co-immunoprecipitation experiments and in vitro kinase assays indicate that only versions of LTK and/or ALK with active kinase domains can interact with IGF-1R and that the activated LTK/ALK receptors then phosphorylate IGF-1R and trigger IGF-1R internalization (Fig. 7 and Fig. 8 model). Thus, we would expect that treatment with IGF-1 in the absence of LTK/ALK activation will not affect LTK/ALK internalization but will trigger IGF-1R endocytosis.

      22. The last paragraph in the Results section may be more appropriate for Discussion to avoid repetition. But it is of course up to the authors to decide on stylistic issues.

      We prefer to include a summary of the experimental findings and the model figure at the end of the results.

      23. There is a discussion of possible redundancies between ALK and LTK in the Discussion section which appears to contradict itself. It is first stated (end of p. 18) that the two receptors are not redundant but both required for function. But in p. 19, the significant behavioral phenotypes observed in DKO mice, but not in single KO mice, are attributed to redundancy and compensation between the receptors. This needs some clarification. It's difficult to understand how there can be redundancy for behavior but not for structure or function.

      We have clarified in the discussion that both receptors are required in the context of neuronal polarity and migration, whereas in the case of behaviour, compensatory mechanisms in neural circuitry, or perhaps non-redundant Igf-1r-independent pathways, can compensate for the loss of a single receptor but not of both, so that a strong phenotype appears only in DKO mice.

      Reviewer #1 (Significance):

      see above

      Reviewer #2 (Evidence, reproducibility and clarity):

      Christova et al. analyzed single and double knockout mice for Alk and Ltk to investigate their function in the nervous system and describe defects in cortical development and behavioral deficits. The defects in the formation of cortical layers suggest a delay in radial migration. In culture, 40% of cortical neurons from knockout embryos extend multiple axons. The mechanism responsible for this phenotype is explored in some detail. The authors conclude that Alk and Ltk function non-redundantly to regulate the Igf-1 receptor (Igf-1r). Inactivation of Alk or Ltk increases surface expression and activity of Igf-1r, which induces the formation of multiple axons. The authors propose that Alk and Ltk interact with Igf-1r and promote its endocytosis after activation by their ligand Alkal2, thereby preventing the formation of additional axons. However, the defects in neurogenesis, migration and behavior may have a different cause and should not be attributed only to Igf-1r.

      We would like to thank the reviewer for all the insightful comments and suggestions which we feel have strengthened our study.

      We appreciate the reviewer’s acknowledgement that we have shown that Igf-1r is involved in Alk/Ltk-mediated regulation of axon outgrowth. To provide evidence that Igf-1r is also important for Ltk/Alk-regulated migration in vivo, we explored the effect of the Igf-1r inhibitor PPP on the migration of neurons in WT and DKO mice by BrdU labelling. Excitingly, this analysis revealed that PPP administration resulted in a partial rescue of the migration defect in Ltk/Alk DKO mice, with BrdU+ neurons being localized to the most superficial layers in P2 mice (Fig. 6F). Thus, these data are consistent with our model that loss of Ltk/Alk can disrupt both neuronal polarity and migration via Igf-1r. We do agree with the reviewer that we have not directly shown that the behavioural defects can be attributed to Igf-1r and it is certainly possible that other pathways or mechanisms may be involved in the complex phenotype. We have updated the manuscript and now discuss the potential involvement of other pathways in the discussion.

      Major comments

      1) The role of Alk/Ltk in suppressing the formation of multiple axons is demonstrated by culturing neurons from knockout mice, suppression with siRNAs and treatment with inhibitors. These experiments consistently show that about 40% of cultured neurons extend more than one axon when Alk, Ltk or both are inactivated. Single and double knockout mice are largely normal with the exception of a delay in the formation of distinct cortical layers. The phenotypes of the knockout lines indicate a function in cortical development but Alk and Ltk are not "indispensable" as suggested (p. 18).

      We will modify the wording to remove the statement that Alk and Ltk are “indispensable” for cortical patterning and rather will indicate that the receptors ‘contribute’ to the timing of cortical patterning.

      The morphology of cortical neurons was analyzed by Golgi staining. A few potential axons (Fig. 3E) were identified only by an absence of dendritic spines and their aberrant trajectory. These results indicate that there are ectopic extensions in the cortex but do not demonstrate that neurons extend multiple axons also in vivo. It has to be confirmed that these extensions are positive for axon-specific markers and that several axons originate from one soma to demonstrate a multiple axon phenotype in vivo. A quantification of the number of neurons with multiple axons would be required to conclude that this phenotype occurs at a similar frequency in vivo.

      As indicated in response to reviewer #1, we attempted to quantitate the Golgi-stained images but found it impossible to trace individual neurites to the cell body and thus could not unambiguously identify and quantitate axons. Accordingly, and as suggested by the reviewer, we have modified our conclusion to simply state that there are aberrant extensions in the cortex in vivo. Although we were unable to perform quantitation, to further support our conclusions, we have provided additional Golgi-stained images of WT and DKO mice from an independent experiment (Fig. S3B).

      2) According to the model presented in Fig. 7, Alkal2 activates Alk and Ltk, which stimulate the endocytosis of Igf-1r and thereby prevents the formation of additional axons. A quantification of Igf-1r surface levels by the biotinylation of surface proteins and Western blot shows an increase in knockout neurons. The authors suggest that Alk/Ltk activation stimulates Igf-1r endocytosis but do not demonstrate this directly. An increase in surface expression could also result from a stimulation of exocytosis or recycling.

      We showed that ligand-induced activation of Ltk/Alk in WT neurons resulted in a loss of biotin-labelled cell-surface Igf-1r, which is strongly indicative of increased internalization and cannot be explained by exocytosis. However, the reviewer is correct that we cannot exclude the possibility that changes in exocytosis or recycling might also occur and that, in the unstimulated DKO neurons, the increase in surface expression of Igf-1r could also result from a stimulation of exocytosis or recycling. Indeed, several papers (Laurino et al, 2005, PMID: 16046480; Oksdath et al, 2017, PMID: 27699600; Quiroga et al, 2018, PMID: 29090510) have reported that exocytosis-mediated transport of IGF-1R and activation of the IGF-1R/PI3K pathway are essential for the regulation of membrane expansion during axon formation. Accordingly, we have modified the discussion text to incorporate this possibility.

      3) The localization of Alk, Ltk and Alkal2 was determined by in situ hybridization. The signals are weak and it is not clear if they are specific because a negative control is missing. An analysis by immunofluorescence staining would be more informative.

      RNAscope is designed so that a single molecule of RNA is visualized as a punctate signal dot with high specificity. In lower magnification images, such as those we showed to provide an overall view of expression in the cortex, it is difficult to discern the individual ‘dots’, particularly for genes with low expression, giving the impression that the signal is weak. However, at high magnification (63X) the signals are readily visible (as seen in a new panel in Fig. S1B). We also neglected to mention that positive probes with all 3 labels (POLR2A: Channel C1, PPIB: Channel C2, UBC: Channel C3) as well as a negative probe (Bacterial dap gene) supplied by the manufacturer were used on our samples to validate specificity. We have corrected the oversight and have now added this information to the methods section.

      Regarding immunofluorescence, we have rigorously tested numerous commercially-available antibodies and have undertaken repeated attempts to produce our own antibodies that recognize mouse Ltk or Alk, and are appropriate for immunofluorescence, but have had no success. The high specificity enabled by the RNAscope technology is thus currently the most reliable way we can examine expression, with the added advantage that we can simultaneously assess expression of both receptors and the ligand in an individual cell within a section.

      Alk appears to be expressed mainly in the ventricular zone (VZ) while Ltk shows a low expression in the SVZ and the cortical plate (CP). This expression pattern is not consistent with a function in regulating axon formation in multipolar neurons, which extend axons in the lower intermediate zone (IZ) (Namba et al., Neuron 2014) and not in the VZ or SVZ (p. 18).

      It is well described that multipolar neurons can be found in the SVZ, while bipolar neurons are preferentially in the IZ. Neurons expressing Ltk, Alk and their ligand, Alkal2, can be found in both compartments (albeit levels appear higher in the SVZ); thus, we feel our results are consistent with a role for the receptors in regulating neuronal polarization.

      It is also essential to analyze the subcellular localization of Alk and Ltk at least in cultured neurons. Ltk has been reported as an ER-resident protein that regulates the export from the ER (Centonze et al., 2019), which would not be consistent with the model.

      Unfortunately, the lack of antibodies with mouse reactivity prevents us from analyzing the subcellular localization of Alk and Ltk in cultured neurons. As mentioned by the reviewer, LTK has been reported as an ER-resident protein (in cancer cells) and similarly, many other receptor tyrosine kinases, including IGF1R, have been reported to be localized to diverse intracellular compartments such as the Golgi, nucleus or mitochondria (reviewed in Rieger and O’Connor, 2021, Front Endocrinol, PMID: 33584548). However, since extracellular ligands for LTK and ALK are known, we feel it is a reasonable expectation that they will have a role as cell-surface receptors. Understanding the functions of RTK receptors and the interplay between the various compartments would nevertheless be an interesting area for future research.

      4) The results convincingly show that an increased activity of Igf-1r is responsible for the formation of additional axons by cultured knockout neurons. The model in Fig. 7 explains how Alk/Ltk suppress the formation of multiple axons in culture but a key question remains to be addressed: why does Igf-1r remain active in the future axon? Are Alk/Ltk restricted to or selectively activated in dendrites? It is important to determine if Alk and Ltk are absent from the future axon before or after neuronal polarity is established.

      We thank the reviewer for acknowledging that we have provided convincing data that increased activity of Igf-1r is responsible for the formation of multiple axons. Addressing why Igf-1r remains active in the future axon and if and how Ltk/Alk are selectively activated in dendrites and axons are all excellent questions, which we plan to pursue in future work, particularly when antibodies for Alk and Ltk become available.

      Which cells produce Alkal2 in neuronal cultures and in vivo? These points can be easily addressed and should be investigated.

      We have confirmed that Alkal2 is expressed in the isolated cortical neurons, consistent with our demonstration that siRNA-mediated abrogation of Alkal2 expression in cultured neurons regulates polarity and that ligand levels do not change in Ltk/Alk double-knockout mice (Fig. S1G and S6A). Whether non-neuronal cell types also express Alkal2 would be an interesting future direction.

      Why does an increase of Igf-1r surface expression in knockout neurons result in a stimulation of Igf-1r autophosphorylation? Neurons are cultured in a defined medium without Igf-1 and increased surface levels by themselves should not lead to an increased activity.

      We have not mechanistically determined why/how Igf-1r displays enhanced autophosphorylation in DKO neurons. Thus, we can only speculate about possibilities. Perhaps low levels of Igf-1 are present in the cortical cell extracts or are produced by the cortical neurons; there may be compensatory mechanisms engaged when Ltk/Alk are lost to ensure neuronal survival; or perhaps the increase in cell-surface Igf-1r promotes ligand-independent activation of the receptors.

      The results presented in this manuscript are consistent with a role of Igf-1r in the formation of multiple axons in the absence of Alk/Ltk. However, inhibition of Igf-1r by various means does not prevent axon formation in controls. Igf-1 has been implicated in axon formation (Sosa et al., 2006) but a knockout of Igf-1r does not result in a loss of axons but a reduction of axon length in cultured neurons (Jin et al., PLoS One 2019). Axon-specific markers are used only for some experiments but not in Figs. 3D, 5B-D and 6 where the neuronal marker Tuj1 does not allow the unambiguous identification of axons. Staining with an axonal marker and a quantification of axon length are required to distinguish between a block in axon formation and a reduction in axon growth in Figs. 3A, 5 and 6.

      In the original submission we quantitated Tau-1 and MAP2 co-stained neurons in many experiments to demonstrate that Ltk/Alk act on axons, but in some cases we used Tuj1 to more easily visualize and quantitate neurites. Nevertheless, as requested by the reviewers, in the revised manuscript we have repeated and replaced most of the results obtained with Tuj1 or phalloidin staining with experiments using Tau-1 and MAP2 antibodies, including Fig. 5B-D and Fig. 6A-D and G, as well as Fig. S4B (requested by reviewer #1). The new data are consistent with our results using Tuj1 staining and further support our conclusions that Ltk/Alk act via Igf1-r to regulate neuronal polarity. With regard to Fig. 3D, we have been experiencing ongoing technical issues in generating human stem cell-derived cortical neurons and have been unable to undertake Tau1/MAP2 staining of the human cortical neurons. Given that the point being made is minor, we have removed this panel from the paper.

      With regard to the comment that inhibition of Igf1-r did not prevent basal axon formation: in our prior quantitation of WT neurons in which Igf1-r was inhibited using either siIgf1-r or PPP, we noticed a trend towards an increase in the number of neurons with no axons, but this was not statistically significant. Upon repeating the experiments and re-quantitating with Tau-1/MAP2 co-staining, we do see a statistically significant increase in the number of WT neurons without axons. This is in agreement with several prior studies (including one cited by the reviewer) indicating that Igf1-r is important for neuronal polarity (Sosa, 2006, PMID: 16845384; Nieto Guil, 2017, PMID: 28794445). The text has been modified accordingly.

      5) The analysis with layer-specific markers and BrdU labeling reveals defects in the formation of cortical layers that suggest a delay in neuronal migration. The number of Sox2+ and Tbr2+ cells is lower in knockout neurons, indicating a possible reduction in the number of proliferating progenitors and a defect in neurogenesis (Fig. 1). The number of neurons positive for layer-specific markers or BrdU was quantified as the percent of DAPI-positive cells. This does not allow distinguishing between a change in the distribution and a reduction in the number of neurons due to defects in neurogenesis. It would be more informative to quantify the total number of Ctip+, Satb2+ or BrdU+ cells in the VZ, SVZ, IZ and CP.

      In the in vivo BrdU labelling experiment, we did not co-stain sections with DAPI. However, in the immunofluorescence analysis in mice of the same ages, we did determine the total number of cells (i.e., by DAPI), which is shown in the plots in Fig. 1A and Fig. S2A/B. These results show that there are a similar number of cells in WT and mutant SVZ/VZ, consistent with the notion that there is a change in distribution rather than a reduction in the number of neurons due to defective neurogenesis. We neglected to mention this important point in the results and have now modified the text accordingly.

      6) The deficits observed in behavioral tests do not correlate with the defects in neuronal development. While the single knockouts show defects in cortical development only the double knockout displays behavioral deficits. The behavioral phenotype could be completely independent of Igf-1r. Alk has been implicated in regulating retrograde transport (Fellows et al., EMBO Rep. 2020) and synaptic scaling (Zhou et al., Cell Rep. 2021). Since there is no clear correlation between structural and behavioral changes these data are not obviously linked to the other results.

      The reviewer is correct that the single KO mice do not manifest noticeable behavioural defects except when older and challenged with the most demanding task, the Puzzle box, which measures complex executive functions. We speculate that alternative cortical re-wiring in the single knockouts is sufficient to maintain normal circuitry, whereas the loss cannot be compensated when both Ltk and Alk receptors are deleted. However, we do agree that Ltk/Alk-regulated signalling events besides Igf-1r/PI3K could contribute to the behavioural defects observed in the DKO mice, such as the ALK-LIMK-cofilin pathway, which regulates synaptic scaling as cited by the reviewer (Zhou et al., Cell Rep. 2021). Nevertheless, the strong phenotype of the DKOs confirms that Ltk/Alk are important for proper brain function; thus, our preference is to retain the behavioural data in the manuscript but to discuss that alternative Ltk/Alk pathways could contribute to the phenotype (which we have now incorporated into the text).

      It should be noted that the study by Fellows et al. in EMBO Rep. 2020 shows that Igf1-r, not ALK, regulates retrograde transport, so we have not included this study in the updated text.

      Minor comments

      1) Fig. 3 shows defects in the corpus callosum where axons are restricted to the upper half in the wild type but not the knockout. These results could indicate a guidance defect but do not show a "failure in axon migration through the corpus callosum" (p. 17). It is also not demonstrated "that the aberrant axon tracts may be the result of effects on neuronal morphology" (p. 19). Without additional experiments to trace axonal projections e.g. by DiI labeling it is not possible to determine the actual cause for the observation shown in Fig. 3F.

      We agree with the reviewer and have modified the concluding sentence so that the defects are described without attributing their cause to effects on neuronal morphology.

      2) Active kinases from SignalChem are used for the in vitro kinase assays. The increased phosphorylation of Igf-1r could also result from a stimulation of auto-phosphorylation and not a direct phosphorylation by Ltk. Previous results indicate that phosphorylation of Y1250/1251 leads to increased internalization and degradation (Rieger et al., Sci. Signal. 2020), which would be an alternative explanation for how Alk/Ltk regulate surface expression. Antibodies that are specific for Igf-1r phosphorylation at Y1135/1136 or Y1250/1251 could address this possibility (Rieger et al., Sci. Signal. 2020).

      It is rather surprising that for Igf-1r, which is such a well-studied receptor, the mechanisms that regulate trafficking, exocytosis, recycling, etc. are so poorly understood and that this topic is currently an active area of investigation. The focus of our study was on understanding the role of Ltk/Alk in the brain, and as part of this effort we demonstrated that Ltk/Alk can control neuronal polarity through Igf-1r phosphorylation. We believe that shedding light on the detailed mechanism of how enhanced Igf-1r phosphorylation induced by Ltk/Alk activation regulates Igf-1r trafficking is an exciting project for future work, but we feel that thoroughly investigating this question is beyond the scope of the current study. We have, nevertheless, highlighted these points with additional references in the discussion.

      3) The specificity of the siRNAs has to be verified in neurons by rescue experiments and the suppression of the targeted proteins confirmed by immunofluorescence staining.

      We agree that rescue experiments are the gold standard, and we attempted to do this. However, we found that nucleofection of both siRNAs and cDNAs encoding either EGFP alone or Ltk/Alk was highly toxic to neurons, with few surviving the treatment. As an alternative, we used a pool of siRNAs to minimize off-target effects and used genetic KOs or chemical inhibitors to verify the observations.

      4) The position of molecular weight markers is missing for most Western blots.

      We added the position of molecular weight markers for all the western blots in the revised manuscript.

      5) It is not indicated which conditions show a significant difference in Fig. 6.

      We thank the reviewer for pointing this out. We added the significant differences to all figures, including Fig. 6.

      6) Why does the Western blot in Fig. 7A show a double band with the anti-phospho-Igf-1r antibody in the knockout? Which of the bands was used for the quantification?

      We apologize for a labelling error that has caused confusion for both reviewers. We have replaced the blots and corrected the labels.

      7) Details of the plasmids used and information (catalog number) for recombinant GST-Ltk and His-Igf-1r should be included in Materials and Methods.

      The additional information and catalog numbers have been added to the Materials and Methods.

      Reviewer #2 (Significance):

      The receptor tyrosine kinase Alk has been studied mainly for its involvement in several types of cancer but the physiological functions of Alk and its close relative Ltk remain poorly understood. The regulation of Igf-1r is an interesting and important result to understand the physiological function of Alk and Ltk. However, several points have to be addressed before the manuscript would be suitable for publication.

      We thank the reviewer for indicating that this is an interesting and important study. We trust that the additional data and clarifications provided have addressed the reviewer's concerns.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript describes studies that indicate roles for the ALK and LTK receptors in neuronal polarity, cortical patterning and behavior in mice. I really liked the study and overall think that it deserves publication in a high-ranking journal. It reports important and novel results and benefits from a comprehensive analysis at multiple levels, including cell biological, biochemical and behavior. The points raised below are suggestions for consideration at the discretion of the authors.

      1. The term "DKO" appears in the Introduction without explanation. I assume this means double KO mice lacking both receptors from birth. It should be indicated here, just in case.
      2. The last paragraph of the Introduction is redundant with the Abstract. This is a stylistic question, which is up to the authors. Nevertheless, as a suggestion, they could take the opportunity here to explain the rationale of the study and why they did what they did.
      3. Is "single cell in situ mRNA analysis" standard in situ hybridization or something else? Why is it called "single-cell"? It could be misleading.
      4. In Fig. S1B, could the authors please include expression patterns of LTK in adult brain? It'd seem that is the most relevant place to look given the analysis that follows in the paper.
      5. I have an issue in general in the first part of the manuscript with regards to the labeling of cortical layers. How were CP, IZ and SVZ/VZ defined? Specific markers should be used to identify their actual boundaries. Guesswork from the DAPI pattern (if that is what was used) is not really appropriate.
      6. Comparing Fig. 1 and Fig. S2, there would seem to be little or no additive or synergistic effect of the double mutation, as the phenotype in the DKO appears to be completely attributable to the Ltk KO. What does this mean? Providing the expression patterns of the two receptors at the ages used here (i.e., P2 and P7) would also be helpful.
      7. In Fig. 1F, again, how were the boundaries between the cortical areas (dotted lines) determined? This is particularly important for the mutant sections, as apparent cortical thickness would easily be affected by the plane of the section. Simply assuming that the CP is of equal thickness to the one in the WT may be incorrect. I feel the authors cannot just place dotted lines in the figure without explaining the criteria that were used to determine their location. Also, there is a significant (many fold) increase in Ctip2 cells in the IZb of the mutant (1F) that is not explained in the text. The quantification of Ctip2 cells in the CP and IZa of the mutant is missing in the histogram. It should be indicated, even if very low. Again, the key point here is the criteria used for the boundaries between areas. Maybe what is marked as IZa in the mutant is still part of the CP, in which case the number of Ctip2 cells would be increased there, not decreased, as claimed in the text.
      8. In Fig. S3C-F, the all-critical quantification of Ctip2 cells at P2 seems to be missing. It would be important to provide this in light of the comments above. Again, the same problem with the layer boundaries is clear here. The Ltk KO would have normal levels of Ctip2 cells if the CP thickness were to be larger (due to e.g., the plane of the section not being perfectly perpendicular to the brain surface).
      9. In Figure 2A and B, % positive cells is plotted but we are not told what is the reference (100%) level. Was it the total number of cells in the entire cortex (including SVZ and VZ)? That cannot be the case, since CP+IZ in WT alone reaches almost 100%. What is 100% here please? Also, the idea of drawing a little rectangle in the IZ and CP and counting only there is flawed. The values would change drastically depending on where the rectangle is placed. They need to count the whole field of view, as it was done in the previous figures. Finally, again, we are not told how the boundaries of the different cortex areas were established. As explained earlier, distance from the surface (or from the bottom) of the cortex would be greatly affected by the plane of the section. This problem will need a more satisfying solution for the data to be interpreted in the way it has been done.
      10. At the end of page 8, it is concluded that Alk/Ltk promote neuronal migration. Is this a cell-autonomous effect? Given the very sparse expression of these receptors (Fig S1), cell-autonomy (which is being implied by the authors) is not at all clear. Is the migration of Alk+ cells affected in the Ltk mutant? Vice-versa?
      11. In Fig. S4A, as every cell in these panels bears probe signal, it'd be important to present a negative control, perhaps from KO cultures or wild type cells lacking receptor expression in the same field as expressing cells. At 75%, 1 in 4 cells in any field should be receptor-negative.
      12. Figure S4B is difficult to interpret in the absence of Tau and MAP2 markers, as GFP does not discriminate between axons and dendrites. In general, the authors are recommended to show more than one cell per condition in their figures. Readers need to be convinced that these are robust phenotypes easily observed on many cells in the same field.
      13. In Fig. S4C and D, do the KO neurons become bipolar? I don't see examples of multipolar neurons in the images provided.
      14. Is there a way to quantify the effects shown in Fig. 3E?
      15. The DKO display a dramatically different behavior phenotype compared to single KOs. How can this result be explained given that DKOs are indistinguishable from single KOs in all other parameters studied?
      16. At the end of the behavior section, the authors attribute the phenotypes observed to defects in neuronal polarization. Given that polarization was only studied in vitro, it may be premature to conclude that neurons fail to polarize in vivo in the absence of direct evidence showing this.
      17. Regarding P-AKT studies, it would be interesting to assess the effects of the ALK/LTK ligands (e.g., from conditioned medium) on the levels of P-AKT in WT neurons.
      18. In the mid part of page 14, the sentence "Treatment of WT cortical neurons with AG1024 at a dose (1 μM) at which only IGF-1R but not InsR was inhibited restored the single axon phenotype in DKO neurons" is confusing. Treatment performed in WT neurons but assessed in DKO neurons? This must be a typo.
      19. For completeness, it would be informative to test whether IGF-1 antagonizes the effects of ALK and LTK ligands in axon formation.
      20. The quality of the blot provided to illustrate levels of activated Igf-1r in Fig. 7A is clearly suboptimal. It is not apparent from that blot that phosphorylation of Igf1r is increased in the mutant neurons as the band intensities are indistinguishable. Was this performed in cortex extracts or cultured neurons? Is it affected by treatment with ALK/LTK ligands?
      21. Given the physical interaction between ALK/LTK and IGF-1R, these receptors are presumably co-internalized upon ligand treatment; is that the case? Does treatment with IGF1 induce internalization of ALK or LTK?
      22. The last paragraph in the Results section may be more appropriate for Discussion to avoid repetition. But it is of course up to the authors to decide on stylistic issues.
      23. There is a discussion of possible redundancies between ALK and LTK in the Discussion section which appears to contradict itself. It is first stated (end of p. 18) that the two receptors are not redundant but both required for function. But on p. 19, the significant behavioral phenotypes observed in DKO mice, but not in single KO mice, are attributed to redundancy and compensation between the receptors. This needs some clarification. It's difficult to understand how there can be redundancy for behavior but not for structure or function.

      Significance

      see above

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      The manuscript from Li et al. describes the authors' attempt to redirect the exocytic Rab Sec4 to endocytic vesicles by fusing the GEF-domain of Sec2 to the CUE domain of the endosomal GEF Vps9, which binds to ubiquitin. The authors show that the localization of the Sec2GEF-GFP-CUE construct is slightly shifted from polarized towards non-polarized sites. Sec2GEF-GFP-CUE positive structures acquire Sec4 and Sec4 effectors like exocytic vesicles but are less motile and show delayed plasma membrane fusion. Expression of Sec2GEF-GFP-CUE was enhanced if expressed in a subset of secretory and endocytic mutants and caused delayed Mup1 uptake from the plasma membrane. As Vps9, Sec2GEF-GFP-CUE accumulated on Class E compartments in vps4Δ strains.

      The authors ask here whether vesicular identity is largely predetermined by the correct localization of the specific GEFs of small GTPases and thus localization of the Rab. Although this is an interesting hypothesis, the authors observed that endocytic traffic was not reversed by relocating Sec4 to these vesicles. This seems to be due to the strong affinity of the Sec2 GEF-domain for Sec4 but probably also due to the rather weak relocalization via the CUE domain. Thus, only a portion of Sec2 was displaced from its native site. Since the efficiency of this rewiring was not defined, it remains unclear whether the observed mild effects indeed speak against the assumed dominant role of the GEFs and small GTPases in shaping organelle identity or whether they are rather due to an inefficient relocalization.

      Our data demonstrate a dramatic relocalization of Sec2-GEF-GFP-CUE relative to Sec2-GEF-GFP. In the case of Sec2-GEF-GFP or Sec2-GEF-GFP-CUE M419D the cytoplasmic pool is predominant and only 30% of cells exhibit a detectable concentration, while in the case of Sec2-GEF-GFP-CUE 80% of cells show bright puncta and there is little or no detectable cytoplasmic pool (Fig 1A). Clearly the CUE domain can function as a localization domain that relies upon ubiquitin binding. Furthermore, half of the Sec2-GEF-GFP-CUE puncta colocalize with Vps9 (Fig S1). The high cytoplasmic background of Vps9 could mask additional colocalization; therefore, we reexamined colocalization in a vps4Δ mutant in which the Vps9 cytoplasmic pool is reduced due to increased association with the expanded Class E late endosomes. In this situation we observe about 80% colocalization with Vps9 as well as substantial colocalization with Ypt51 and Vps8 (Fig 2). We now also show significant colocalization with PI(3)P (Fig S3D). Thus, our data demonstrate that addition of the CUE domain does indeed relocalize Sec2GEF to endocytic membranes. The Sec2 GEF activity then leads to the recruitment of Sec4 and Sec4 effectors, including Myo2, which in turn leads to their delivery to polarized sites. We now show by EM that the bright Sec2-GEF-GFP-CUE puncta correlate with clusters of 80 nm vesicles (Fig 5B). Our data argue that these are hybrid compartments carrying both endocytic and exocytic markers. We have restructured our paper to help clarify and emphasize this key point.

      Specific comments:

      1. The authors state decidedly that the recruitment of Vps9 occurs in a ubiquitin-dependent manner via the CUE-domain. While the CUE-domain is the only known and a likely localization determinant of Vps9, it was not a strong localization determinant. Apart from being present in some puncta, Vps9 localized strongly to the cytosol (Paulsel et al. 2013, Nagano et al. 2019). Shideler et al. also showed that ubiquitin-binding is not required for Vps9 function in vivo, which indicates that other localizing mechanisms may play a role, e.g. positive feedback of GEF-domain-Rab5 interactions, which might be initiated by the other Rab5-GEF Muk1, or, as suggested, transport from the Golgi (Nagano et al. 2019). These observations indicate that the CUE-domain is a rather weak recruitment domain, which was not discussed in this manuscript. The localization of the Sec2GEF-GFP-control to the polarized sites in 30% of the cells furthermore suggests that the Sec2GEF-GFP-CUE construct used retains some native localization via the GEF-domain. Since the relocation efficiency of Sec2GEF-GFP-CUE was not defined, the obtained phenotypic effects allow for only vague conclusions. Although the mild endo- and exocytosis defects as well as the accumulation of Sec2GEF-GFP-CUE at Class E compartments indicate that the CUE-domain indeed conferred some relocation to endosomes, this was not shown for the sec2Δ strain, e.g. by looking at colocalizations with endocytic versus exocytic markers and comparing their relative abundance at the Sec2GEF-GFP-CUE-positive structures. While some of the Sec2GEF-GFP-CUE-positive structures colocalized with Mup1 in the Mup1-uptake assay, it would be important to clarify how many endosomal properties are retained and how many exocytic properties are gained by these chimeric vesicles, e.g. by looking for the presence of specific phosphoinositides, or Rab5 and Rab5 effectors. A competition between endosomal and the acquired exocytic factors could also be another possible explanation for the immobility of the Sec2GEF-GFP-CUE structures and less efficient recruitment of Sec4 effectors in addition to the proposed lack of PI4P.

      As summarized above, we observed dramatic relocalization of Sec2GEF that was strongly dependent upon the ability of the CUE domain to bind to ubiquitin. We also observed colocalization with Ypt51 and Vps8 as well as transient colocalization with internalized Mup1. We now also show significant colocalization with PI(3)P (Fig S3D). Full length Vps9 is probably subject to additional levels of regulation, perhaps autoinhibitory in nature, however our construct contains only the CUE domain which can clearly function as an efficient localization domain on its own. The high cytoplasmic pool of Vps9 reflects the rapid turnover of its ubiquitin binding sites, since it is efficiently recruited to membranes in vps4Δ cells. The relocalized Sec2GEF domain was quite effective in recruiting Sec4 as well as most known Sec4 effectors. The recruitment of Myo2 leads to localization to sites of polarized growth. All of our studies were done in a sec2Δ background except for the analysis of dominant growth effects, as now explicitly stated at the beginning of the Results section.

      1. While the colocalization of the Sec2GEF-GFP-CUE-signal with Sec4 indicates that this GEF-construct is generally active, it remains unclear whether the activity of the tagged constructs differs from that of the wild type Sec2 protein. This could be analyzed in vitro via a MANT-GDP GEF-activity assay (Nordmann et al., 2010). Again, it remains unclear how much of the Sec2GEF-Sec4 colocalization represents the retained native localization versus synthetic localization at chimeric endo-exocytic vesicles.

      The structure and nucleotide exchange mechanism of the Sec2 GEF domain have been thoroughly analyzed in prior studies and are well understood. There is no reason to think that the constructs we generated here would alter the exchange activity as the fusions are far removed from the Sec4 binding site and our analysis here confirms that they are active in vivo. We do not feel that there would be much to be gained by doing in vitro exchange assays and it would entail a great deal of work.

      1. The authors mention that tagging with GFP increases the stability of the expressed constructs. However, it remains unclear whether this is also the case for the other tags (NeonGreen, mCherry) used in the other experiments. Are the constructs expressed at similar levels?

      We have compared the levels of the various tagged constructs and they appear to be similar (Fig S5A).

      1. In Figure 5: The incomplete colocalization of Sec2GEF-GFP-CUE with Vps9 is explained by the short-timed accessibility of ubiquitin moieties. Apart from the likely retained native localization or weak CUE-domain-function, this observation could also be due to competition between Vps9 and Sec2GEF-GFP-CUE for the available ubiquitin target structures.

      As previously shown, Vps9 normally displays a prominent cytoplasmic pool. Deletion of Vps4 leads to recruitment of this pool to expanded endosomes through an increase in the lifetime of the ubiquitin binding sites. The high cytoplasmic background in VPS4 cells could obscure some colocalization with Sec2GEF-GFP-CUE and indeed we observe increased colocalization in vps4Δ cells in which the cytoplasmic pool of Vps9 has been recruited to endosomes. Expression of Sec2GEF-GFP-CUE does not appear to significantly alter the localization of Vps9.

      Minor remarks:

      1. Fig. 3C does not contain the arrowheads indicated in the legend, making it harder to interpret.

      These have been added.

      1. The image chosen for Sec2-GFP in Fig. 4B suggests less colocalization between Sec2-GFP and Sec8 than between Sec2GEF-GFP-CUE and Sec8. They rather look next to each other.

      The images initially chosen were not representative. We have replaced them with better images from the same experiment.

      1. Figure 5: While resolution limits are possibly reached regarding endosomes, it might be interesting to check by thin section electron microscopy whether and how class E compartment formation is affected by Sec2GEF-GFP-CUE expression.

      We have now done EM using permanganate fixation of both VPS4 and vps4Δ cells (Fig 5B and below). In both backgrounds Sec2GEF-GFP-CUE expression leads to the formation of clusters of 80 nm vesicles that appear to correlate with the fluorescent puncta visible by light microscopy. The vps4Δ cells have in addition curved linear membrane structures that represent class E endosomes (see images at end of this file). The class E endosomes appear similar in cells expressing Sec2GEF-GFP-CUE, Sec2-GFP or Sec2. We did not observe any obvious spatial relationship between the class E structures and the vesicle clusters.

      1. Discussion: "Furthermore, delivery of Mup1-GFP to the vacuole was slowed in Sec2GEF-GFP-CUE cells..." - The authors studied "the clearance of Mup1-GFP from the plasma membrane" and not vacuolar delivery. They did not show much vacuolar localization.

      We now include quantitation of Mup1-GFP at both the plasma membrane and vacuole (Fig 6 and Fig S8). This shows a reduced rate of depletion from the plasma membrane and a delayed appearance in the vacuole.

      Literature:
      Nagano, M., Toshima, J. Y., Siekhaus, D. E., & Toshima, J. (2019): Rab5-mediated endosome formation is regulated at the trans-Golgi network. Nature Communications Biology, 2(1), 1-12.
      Nordmann, M., Cabrera, M., Perz, A., Bröcker, C., Ostrowicz, C., Engelbrecht-Vandré, S., & Ungermann, C. (2010): The Mon1-Ccz1 complex is the GEF of the late endosomal Rab7 homolog Ypt7. Current Biology, 20(18), 1654-1659.
      Paulsel, A. L., Merz, A. J., & Nickerson, D. P. (2013): Vps9 family protein Muk1 is the second Rab5 guanosine nucleotide exchange factor in budding yeast. Journal of Biological Chemistry, 288(25), 18162-18171.
      Shideler, T., Nickerson, D. P., Merz, A. J., & Odorizzi, G. (2015): Ubiquitin binding by the CUE domain promotes endosomal localization of the Rab5 GEF Vps9. Molecular Biology of the Cell, 26(7), 1345-1356.

      Reviewer #1 (Significance):

      • see above
      • has some deficits in interpretation as the Rab relocalization was not complete and thus conclusions are limiting

      Reviewer #2 (Evidence, reproducibility and clarity):

      This paper tries to address a fundamental question in cell biology, namely, what machinery is sufficient to tell a vesicle where to go and what to do when it gets there. Several groups have shown that localization of some Rab/Ypt GEFs to an orthogonal compartment can lead to redirecting a Rab/Ypt to that membrane, where they can bind their partners abnormally. This story tries to explore what happens next.

      Here, Novick and colleagues took a part of the SEC2 GEF for secretory vesicle SEC4 Rab/Ypt and anchored it to endocytic structures to ask whether that was enough to relocalize those structures and drive inappropriate trafficking events. A challenge and advantage in the study is the fact that not all of the GEF relocalized. That enables the cells to survive, as SEC4p is needed for cell growth and membrane delivery, but this incomplete relocalization complicates phenotypic analysis: some SEC4 is on secretory vesicles and some is relocalized, apparently to endocytic structures. Another challenge is that the two compartments both show "polarized" distributions, so it is hard to know what compartment the reader is looking at in a given figure. This makes the story very challenging to digest for a non-yeast expert trying to understand the conclusions.

      The authors show that the CUE domain can serve to partially localize SEC2GEF-GFP-CUE and this function relies on its ability to interact with ubiquitin. The localization is distinct from that of full length Sec2, nonetheless "many structures bearing Sec2GEF-GFP-CUE localize close to the normal sites of cell surface growth despite their abnormal appearance". The authors conclude that SEC4p and its effectors were recruited to these puncta with variable efficiency and the puncta were static; normal secretion was not blocked. This is not really a surprise as some SEC4p is still directed to secretory granules and cells do not show a vesicle accumulation phenotype by EM. Missing seems to be a clear-cut visual assay for exocytosis of secretory granules or endocytic structures despite attempts to include live cell imaging.

      We now show that the bright Sec2GEF-GFP-CUE puncta correspond to clusters of 80 nm vesicles (Fig 5B). Our FRAP analysis demonstrates that Sec2GEF-GFP-CUE is able to enter into pre-existing, bleached puncta (Fig 1E). One interpretation is that the vesicle cluster remains static, while individual vesicles enter and exit the cluster.

      The authors showed that SEC2-GFP-CUE structures fail to acquire Sro7 and do not seem to be able to assemble a complex with the tSNARE SEC9. Is this because Sro7 is being retained on the remaining secretory vesicles that also have SEC4 and other effectors that may be recruited to those structures by coordinate recognition?

      We demonstrate that at least half of the Sec2GEF-GFP-CUE puncta colocalize with Vps9 and this becomes even more evident in a vps4Δ mutant (Fig 2A). There is also substantial colocalization with the Rab5 homolog Ypt51, the endocytic marker Vps8 and PI(3)P (Fig 2 and Fig S3D). Nearly all of these puncta also colocalize with Sec4 and most of its downstream effectors. Thus, it seems that we have generated a hybrid compartment, as we intended. The surprise is how well the cells can cope with this situation. One possible explanation is offered in the Discussion: In yeast the TGN is thought to play the role of the early endosome and may be the site of Vps9 membrane recruitment. Thus Sec2GEF-GFP-CUE might be initially recruited to the TGN and the hybrid vesicles formed from this compartment might function to bring secretory cargo from the TGN to the cell surface just like normal secretory vesicles, with the caveat that the presence of endocytic machinery is somewhat inhibitory to Sro7 function, slowing fusion.

      There seem to be no issues with data as presented; a diagram of the SEC2-GFP-CUE would help the reader as would use of terms "secretory vesicle" and "endocytic vesicle" and how they were always distinguished rather than "polarized structure" which cannot distinguish these compartments.

      We have tried to be careful in our use of terms. We refer to the Sec2-GFP-CUE puncta using the unbiased terms “structures” or “puncta” until we show EM demonstrating that these puncta represent clusters of 80 nm vesicles.

      CROSS-CONSULTATION COMMENTS<br /> The two assessments come to the same conclusion--I agree that better definition of the precise phenotypes could be valuable but the limitation of incomplete relocalization will be hard to overcome in the absence of enormous effort.

      Reviewer #2 (Significance):

      This story represents a valiant effort and presents clean data, but the impact and significance of the findings are limited due to the difficult phenotypic starting points (SEC4 in two places) and the lack of powerful exo- or endocytosis assays and better compartment-specific markers.

      The work will be of interest to yeast cell biologists studying the secretory and endocytic pathways. My expertise is mammalian cell biology of the secretory and endocytic pathways.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      This manuscript by Gouignard et al. reports that the matrix metalloproteinase MMP28 regulates neural crest EMT and migration by transcriptional control rather than matrix remodeling. The manuscript is clearly written and provides sufficient evidence and control experiments to demonstrate that MMP28 can translocate into the nucleus of non-producing cells and that nuclear localization and catalytic activity are essential for the activity of MMP28 to regulate gene transcription. ChIP-PCR analysis also suggests that MMP28 can bind to the proximal promoters of Twist and others. However, since weak binding is also detected between MMP14 and the promoters, more direct evidence that such binding can indeed promote Twist expression would be appreciated.

      Thank you for this comment. First, to represent the data from our ChIP assays, we normalized all intensities to the GFP condition such that all levels are expressed as fold change relative to GFP, and we performed statistical comparisons. This shows that the enrichment of promoter regions by MMP28 and by MMP14 is not equivalent.

      Second, to substantiate our previous ChIP data, we performed a new set of ChIP experiments with three independent chromatin immunoprecipitations (biological replicates), using primers targeting three new domains in the proximal promoter of Twist, primers against two domains in the proximal promoter of E-cadherin, and one domain 1 kb away from the transcription start site of E-cadherin. We found that pull down with MMP28 significantly enriches the three tested domains within the proximal promoter of Twist but not those of the E-cadherin promoter, compared to GFP pull down. These data were added to Figure 7.

      However, we do not propose that MMP28 might act as a transcription factor and be able to promote Twist expression on its own. We apologize if some of the initial descriptions of our data were too blunt and might have misled the reviewers. First, the protein sequence of MMP28, like those of all other MMPs, does not contain any typical DNA binding sites. In addition, ectopic overexpression of MMP28 is not sufficient to promote ectopic Twist expression (as shown in Supplementary Figure 4) whereas, by contrast, Twist is able to promote ectopic expression of Cadherin-11 (see new Supplementary Figure 11). This indicates that MMP28 has an effect on Twist expression in the context of neural crest only and is not capable of activating Twist expression by itself.

      Also, it should be added that enrichments of promoter domains by MMP28 pull-down are very modest in comparison to enrichments obtained with Twist pull-downs. Therefore, a more plausible role for MMP28 is to be part of a regulatory cascade with other factors involved in regulating the expression of the target genes important for EMT. Other MMPs such as MMP14 and MMP3 have been shown to interact with chromatin with some transcriptional downstream effects, but multiple domains of these proteins seem to equally mediate such interactions. None of the data published in these studies rules out a relay via cofactors. We extensively modified the text describing our data and provided additional context.

      Identifying the putative partners and their functional relationship with MMP28 is a project on its own and beyond the scope of this study.

      While the nuclear translocation and transcription regulation activity of MMP28 is clearly the focus of the study, there are some minor issues that should be further clarified in the functional studies in the earlier part of the manuscript.

      First, the effect of the splicing MO is somewhat unexpected. I would think that the splicing MO would lead to the retention of intron one and therefore premature termination or frameshift of the protein product, but RT-PCR or RT-qPCR suggest that there is no retention of intron 1, but a reduction in the full-length transcript, exon 1, or exon 7-8. Why is that?

      Thank you for this comment. This is presumably due to nonsense mediated RNA decay. We have not explored the biochemistry of MMP28 RNA following injection with MOspl. Splicing MOs can have multiple effects. As explained on the GeneTools website splicing MOs disturb the normal processing of pre-mRNA and cells have various ways to deal with this and there are multiple possible outcomes. The PCR with E1-I1 suggests that intron 1 is not retained. Therefore, a putative concern would be that MOspl led to exon-skipping and to the generation of a truncated form of MMP28. However, we have checked that it is not the case. The fact that the PCR using E7-E8 primers indicates a reduction as well suggests an overall degradation of the mRNA for MMP28. Importantly, the effect of MOspl can be rescued using MMP28 mRNA indicating that the knockdown is specific.

      Second, the effects of the splicing MO and ATG MO on NC explant spreading seem to be somewhat different, with ATG MO strongly repressing explant spreading, cell protrusion, and cell dispersion, while splicing MO does not affect cell dispersion but affects the formation of cell protrusions. Does this reflect a different severity of the phenotype, or does the product of the splicing MO display some activity?

      Thank you for this comment. However, we think that there may be some confusion. Data in Fig2 (MOatg) and Fig3 (MOspl) both show a decrease of neural crest migration in vivo (Figure 2a-b) and of neural crest dispersion ex vivo (Fig2c, Fig3i-k). Over the course of the project we have never observed a difference in penetrance or intensity of the phenotypes between the two MOs.

      Also, the switch between ATG MO and splicing MO is a bit confusing, maybe it is better to keep splicing MO only in the main text and move results involving ATG MO to supplementary studies.

      The reason is purely historical. We had an effect with MOatg that can be rescued but there is no available anti-Xenopus MMP28 to assess its efficiency. So we turned to MOspl to have an internal control of efficiency by PCR. This provides an independent knockdown method reinforcing the findings. Both MOs have been controlled for specificity by rescue with MMP28 and display similar effect on NC migration/dispersion. We see no harm in keeping both in the main figures but if the reviewer feels strongly about this we could perform the suggested redistribution of data between main and supp figures.

      Lastly, in Figure 3C and 3J, it says that the distance of migration or explant areas were normalized to CMO, while normalization against the contralateral uninjected side, or explant area at time 0 makes more sense.

      Thank you for this comment, as it allows us to better explain these quantifications. Regarding in vivo measurements (Figure 3c), the ratio between injected and non-injected sides is indeed calculated in all conditions, and the ratios are then normalized to CMO. We have now clarified this point in all instances throughout the figures.
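
      As an illustration only, the two-step in vivo normalization described above (injected/uninjected ratio per embryo, then scaling to the CMO condition) can be sketched as follows. The function names, the use of the mean CMO ratio as the reference, and the example distances are assumptions made for this sketch; they are not taken from the study.

```python
from statistics import mean


def migration_ratio(injected_distance, uninjected_distance):
    """Ratio of migration distance on the injected side to the contralateral side."""
    if uninjected_distance <= 0:
        raise ValueError("uninjected-side distance must be positive")
    return injected_distance / uninjected_distance


def normalize_to_control(condition_ratios, control_ratios):
    """Scale each ratio by the mean ratio of the control (CMO) embryos.

    Using the mean of the control group as the reference is an assumption
    of this sketch.
    """
    control_mean = mean(control_ratios)
    return [ratio / control_mean for ratio in condition_ratios]


# Hypothetical distances in arbitrary units (injected side, uninjected side).
cmo_ratios = [migration_ratio(i, u) for i, u in [(98, 100), (102, 100), (100, 100)]]
mo_ratios = [migration_ratio(i, u) for i, u in [(60, 100), (50, 100)]]

normalized = normalize_to_control(mo_ratios, cmo_ratios)
print(normalized)
```

      With this scheme a value of 1 corresponds to control-like migration, and values below 1 indicate reduced migration on the injected side relative to CMO.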

      Regarding ex vivo measurements (Figure 3j), NC explants are placed onto fibronectin and left to adhere for 1 hour before time-lapse imaging starts. NC cells extracted from MMP28 morphant embryos are not as efficient at adhering and spreading as control NC cells. Therefore, normalizing to t0 would erase that initial difference between control and MMP28 conditions. By normalizing to CMO at t_final we can visualize the initial defect of adhesion and spreading as well as the overall defects since CMO at t_final represents the 100% dispersion possible over the time course of the movie.
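
      The difference between the two normalization choices can be made concrete with a small sketch. The area values below are hypothetical and the function names are ours; the point is only that scaling to CMO at t_final preserves the initial spreading defect, whereas scaling each explant to its own t0 removes it.

```python
def normalize_to_control_final(areas_by_condition, control="CMO"):
    """Express every area as a percentage of the control area at the last time point."""
    control_final = areas_by_condition[control][-1]
    if control_final <= 0:
        raise ValueError("control area at the final time point must be positive")
    return {
        condition: [100.0 * area / control_final for area in series]
        for condition, series in areas_by_condition.items()
    }


def normalize_to_own_t0(areas_by_condition):
    """Express every area as a fold change over the same explant's area at t0."""
    return {
        condition: [area / series[0] for area in series]
        for condition, series in areas_by_condition.items()
    }


# Hypothetical explant areas (arbitrary units) at t0, an intermediate time, and t_final.
areas = {"CMO": [200.0, 400.0, 800.0], "MMP28-MO": [100.0, 150.0, 300.0]}

to_cmo_final = normalize_to_control_final(areas)
to_t0 = normalize_to_own_t0(areas)

print(to_cmo_final)  # the morphant starts lower than the control
print(to_t0)         # both conditions start at 1.0, hiding the initial defect
```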

      Referee Cross-commenting

      I agree with comments from both Reviewers 2 and 3, especially that whether MMP28 regulates placode development (through Six1 expression) should be addressed.

      Reviewer #1 (Significance):

      This work provides novel insights of how a metalloprotease that is normally considered to function extracellularly can transfer into the nucleus of neighboring cells and regulate transcription. This would be of interest to researchers studying EMT, cell migration, and the functions of extracellular proteins in general. My expertise is in neural crest EMT and migration, and cytoskeletal regulation of cell behavioral changes. I do not have enough background on biochemical analysis.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary:

      In this study, Gouignard et al. beautifully use the Xenopus neural crest as a model system to examine the role of the matrix metalloproteinase MMP28 during EMT. The authors show that mmp28 is expressed by the placodes adjacent to the neural crest. Using in vivo and in vitro perturbation experiments, they show that the catalytic function of MMP28 is necessary for the expression of several neural crest markers, as well as neural crest migration and adhesion. Next, the authors use grafting, confocal imaging, and biochemistry to convincingly demonstrate that MMP28 is translocated into the nucleus of neural crest cells from the adjacent placodes. Finally, nuclear localization of MMP28-GFP is necessary to rescue twist and sox10 expression in MMP28 morphants, and ChIP-PCR experiments suggest direct interactions between MMPs and the proximal promoters of several neural crest genes. These results have significant implications on the field of EMT and highlight an underappreciated role for MMPs as direct regulators of gene expression.

      Major comments:

      Overall, the experiments presented in this study are thoroughly controlled and the results are clearly quantitated and rigorously analyzed. Most claims are well supported by multiple lines of experimental evidence; however, there are a few experiments or observations that this reviewer thinks should be reconsidered for more clarity and accuracy.

      1. Supplementary Figure 1 shows the effect of MMP28-MOspl on additional ectodermal markers and shows that there is a significant loss of six1 expression from the placodal domain following MMP28 knockdown. The authors note this as a "slight reduction" on line 95, but since this shows a larger reduction in gene expression than some of the neural crest markers (snai2, sox8, foxd3), this reviewer thinks these results warrant a more significant discussion in this study.

      Thank you for this comment. We apologize for the poor choice of word regarding the description of the effect on Six1 expression. We corrected the associated paragraph.

      Although we do observe a reduction of Six1 expression upon MMP28 knockdown, this cannot explain the observed downregulation of some neural crest genes in our MMP28 experiments. There are noticeable differences between the effects of Six1 loss of function that have been reported in the literature and the MMP28 knockdown phenotypes we describe. As suggested by the reviewer, we added a paragraph in the discussion.

      Does MMP28 localize to the nucleus of placodal cells as it does with neural crest? If so, is it through interaction with the six1 proximal promoter? If MMP28 does not localize to the nucleus, that would suggest that MMP28 functions through a different mechanism in epithelial cells, distinct from its role in EMT. These questions could be addressed by analysis of the placode cells in the images in Figure 5 and use of primers against the six1 proximal promoter on any remaining samples from the ChIP experiment.

      Thank you for this comment. To address whether nuclear entry is specific to the neural crest-placodes interaction, we performed new grafts:

      • 1/ We replaced neural crest cells from embryos expressing MMP28-GFP with placodal cells injected with Rhodamine-dextran. This generates grafted embryos with control placodes next to placodes overexpressing MMP28-GFP. There, we can analyze entry of MMP28-GFP into placodal cells that do not overexpress it. We detected MMP28 in the cytoplasm and in the nucleus of these placodal cells. However, the rate of nuclear entry was lower than in NC cells.

      • 2/ To assess the importance of the cell type producing MMP28, we grafted NC cells injected with Rhodamine-dextran next to caudal ectoderm expressing MMP28-GFP. MMP28 was detected in cytoplasm and the nucleus of the NC cells but with a lower efficiency than when NC are grafted next to placodes expressing MMP28-GFP.

      • 3/ We made animal cap sandwiches with animal caps injected with Rhodamine-dextran and animal caps expressing MMP28-GFP. In this case MMP28-GFP is detected in the cytoplasm but fails to reach the nucleus.

      Collectively, these data indicate that placodes can import MMP28 produced by placodes and that NC can import MMP28 produced by cells other than placodes. However, in both cases the rate of nuclear entry was lower than in the NC-placode situation. Finally, the animal cap sandwiches indicate that entry into the cells does not predict entry into the nucleus. All these data were added to Supp Figure 7. Statistical comparisons of the proportion of cells with cytoplasmic and nuclear MMP28-GFP in all grafts were added to Figure 5.

      The Six1 promoter analysis suggested is beyond the scope of this study as our focus is primarily on the role of MMP28 in neural crest development.

      1. In Figure 2c, the authors rescue MMP28-MOatg with injection of MMP28wt mRNA. Does the MOatg bind to the exogenous mRNA? If so, this may just reflect titration of the MOatg. If this is the case, this experiment should be repeated with MOspl instead of MOatg.

      Thank you for this comment. MOatg is designed upstream of the ATG and thus the binding site is not included in the expression construct. We added this important technical information in the methods. Of note, we already have the suggested equivalent of Fig2C with the MOspl on figure 3.

      1. Is there a missing data point in Figure 2d corresponding to the upper bounds of the whisker in the 6 hour time point for the MMP28-MOatg dataset?

      Thank you for pointing this out. The top data point was indeed missing from the graph, and we apologize for this oversight. We have now updated the figure with the correct graph.

      1. The authors present ChIP-PCR results in Figure 7 as the major evidence to support the mechanism of nuclear MMP28 in regulating neural crest EMT through physical interaction with target gene promoters. However, the experimental design and presentation in Figure 7 are somewhat unconventional, and as such, difficult to interpret. First, instead of displaying the band brightness across the gel, the authors should normalize their bands to their negative GFP control, thus allowing for interpretation as a "fold enrichment over GFP control". It would be most clear to present these results in the form of a plot similar to Shimizu-Hirota et al., 2012, Figure 6D. Using qPCR instead of gel-based quantitation would further increase reproducibility by removing any bias in image analysis.

      Thank you for this comment. For each band the value of the adjacent local background was subtracted. We have now normalized to GFP to provide graphs showing the fold change to GFP enrichment as requested.
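
      For readers unfamiliar with this kind of gel-based quantification, the two steps mentioned above (local background subtraction, then expression as fold change over the GFP pull-down) can be sketched as below. This is a minimal illustration, not the analysis pipeline used for Figure 7: the band intensities are hypothetical, the function names are ours, and clamping negative background-corrected values to zero is an assumption of the sketch.

```python
def background_corrected(band_intensity, local_background):
    """Subtract the adjacent local background from a band intensity.

    Negative results are clamped to zero (an assumption of this sketch).
    """
    return max(band_intensity - local_background, 0.0)


def fold_over_control(bands, control="GFP"):
    """Return background-corrected signals expressed as fold change over the control.

    `bands` maps a pull-down name to a (band intensity, local background) pair.
    """
    corrected = {
        name: background_corrected(intensity, background)
        for name, (intensity, background) in bands.items()
    }
    control_signal = corrected[control]
    if control_signal <= 0:
        raise ValueError("control signal must be above background to compute fold change")
    return {name: signal / control_signal for name, signal in corrected.items()}


# Hypothetical intensities for one promoter domain (arbitrary units).
bands = {
    "GFP": (120.0, 100.0),
    "MMP28-GFP": (160.0, 100.0),
    "Twist": (300.0, 100.0),
}

fold = fold_over_control(bands)
print(fold)
```

      By construction the GFP pull-down has a value of 1, so enrichment by any other pull-down reads directly as a multiple of the negative control.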

      However, we would like to point out that we do not propose that MMP28 might act as a transcription factor and be able to promote Twist expression on its own. First, the protein sequence of MMP28 does not contain any typical DNA binding sites, as is the case for all other MMPs. In addition, ectopic overexpression of MMP28 is not sufficient to promote ectopic Twist expression (see sup figure 4), in contrast to Twist, which can ectopically induce Cadherin-11, for instance (see sup figure 11). Further, enrichments of promoter domains by MMP28 pull downs are very modest in comparison with the enrichments obtained with Twist pull downs.

      A more plausible role for MMP28 is that it is recruited via an interaction with other factors involved in regulating the expression of the target genes related to EMT. Identifying the partners and their functional relationship with MMP28 is a project on its own, and beyond the scope of this study.

      Second, a proximal promoter sequence represents only ~250 bp upstream from the transcriptional start site. What is the rationale for testing multiple loci up to 3 kb upstream?

      Thank you for pointing this out. The use of the term “proximal” was indeed misleading; we have now corrected this part in the text. Regulatory sequences can be located anywhere, so we initially took a broader approach to test for interactions. Following this reviewer’s comment, we removed the data points corresponding to the very distal sites. In addition, we performed three new independent ChIP-PCR assays with primers in the proximal portion of the Twist and E-cadherin promoters and found enrichment in ChIP with MMP28-GFP compared to GFP for Twist but not for E-cadherin (whose expression was not affected by MMP28 knockdown). These data were added to Figure 7.

      It is surprising to see that most of these proteins do not show significant enrichment to a particular locus across this ~3 kb territory, while this reviewer would expect to see enrichment close to the TSS that quickly is lost as you move further upstream. Can you explain why MMP28, MMP14, and often Twist, show similar enrichment across this long genomic region?

      Thank you for this comment. Our initial choice of representation did not allow profiles to be compared properly. Fold enrichment over GFP, as suggested by this reviewer, now shows that Twist, MMP28 and MMP14 do not display the same pattern of enrichment across the various loci and that MMP28 pull downs lead to significant enrichment of some of the domains tested in the Cad11 and Twist promoters.

      Third, the authors should include additional genomic loci to act as negative controls. For example, E-cadherin was unaffected by MMP28-MOspl, thus there may be no physical interaction between the E-cadherin locus and MMP28. It would be ideal to display results from at least one neural crest-related and one non-neural crest-related gene. Finally, this experiment requires statistical analyses to increase confidence in these interactions.

      Thank you for this comment. We tested binding to E-cadherin promoter for GFP and MMP28-GFP and found no enrichment with MMP28. We also performed statistics as requested. These data were added to Figure 7.

      Minor comments:

      1. The authors should expand their abstract to more explicitly describe the experiments and results presented within this study.

      Done

      1. In the introduction, line 57 is unclear. "MMP28 is the latest member..." Is this chronologically? Evolutionarily? After this, the authors' statement that the roles of MMP28 are "poorly described" (lines 59-60) seems to contradict their next sentences, which cite several studies that document the roles of MMP28 in diverse systems.

      Thank you for this comment. The term “poorly described” was meant with respect to other MMPs with more extensive literature. We have now rephrased this part. Regarding the “latest member” we meant the last to be identified. We have now rephrased this part.

      1. To increase clarity, the authors should define which cell types are labeled by in situ hybridization for sox10 and foxi4.1 in Figure 1e.

      Thank you, we performed the requested clarifications and expanded the change to add the cell types labelled by the other genes used on the figure (see figure legend).

      1. The PCR analysis for mmp28 splicing shown in Figure 1g is very clear and well demonstrates the efficacy of the MMP28-MOspl. However, the authors should note in the figure legend what the "ODC" row represents as this is unclear.

      We added the definition of ODC in the figure legends and in the methods.

      1. On line 118 the authors first reference "MOatg" but should explicitly define this reagent and its mechanism of action for clarity.

      We performed the requested clarification.

      Referee Cross-commenting

      As with Reviewer #1, I was surprised that the RT-PCR analysis presented in support of the splicing MO lacked retention of intron one. I reasoned this might be due to reduced transcript abundance through a mechanism such as nonsense-mediated decay, but I agree that this data raises questions that the authors should address.

      Thank you for this comment. Indeed, this is presumably due to nonsense mediated RNA decay. We have not explored the biochemistry of MMP28 RNA following injection with MOspl. Splicing MOs can have multiple effects. As explained on the GeneTools website splicing MOs disturb the normal processing of pre-mRNA and cells have various ways to deal with this and there are multiple possible outcomes. The PCR with E1-I1 suggests that intron 1 is not retained. Therefore, a putative concern would be that MOspl led to exon-skipping and to the generation of a truncated form of MMP28. However, we have checked that it is not the case. The fact that the PCR using E7-E8 primers indicates a reduction as well suggests an overall degradation of the mRNA for MMP28. Importantly, the effect of MOspl can be rescued using MMP28 mRNA indicating that the knockdown is specific.

      I also agree with the other comments from Reviewers 1 and 3.

      Reviewer #2 (Significance):

      This study by Gouignard et al. provides compelling evidence for the role of MMP28 during neural crest EMT. As neural crest cells share similar EMT and migration mechanisms with cancer progression, they represent a powerful system in which to study these biological processes in vivo. Previous work on MMP function has focused primarily on extracellular matrix remodeling and the effect on cell migration, with less attention given to the role of MMPs during EMT. More recent reports in other systems have begun to elucidate a role for MMP translocation into the nucleus, indicating a surprising and novel mechanism for these proteins. This work would be of particular interest to audiences interested in cancer, cell, and developmental biology, as it highlights the importance of the non-canonical function of metalloproteinases during EMT and migration.

      Reviewer #3 (Evidence, reproducibility and clarity):

      Summary

      This study by Gouignard and colleagues explores the mechanisms involving the matrix-metalloprotease MMP28 in the epithelial-to-mesenchymal transition (EMT) of neural crest cells. Interestingly and provocatively, they focus not only on the extracellular functions of this protease but also on the roles of MMP28 in the nucleus. This non-conventional sub-cellular localization is shared with other MMPs, but its significance remains poorly understood. Here, the authors show that the nuclear function of MMP28 impacts the expression of key EMT regulators in neural crest cells in vivo.

      Using Xenopus laevis as a powerful animal model to explore early development, the authors show that mmp28 expression is found in the ectodermal placodal tissue adjacent to the neural crest prior to and after EMT.<br /> In the first part of the study, the authors show that MMP28 depletion affects a subset of neural crest marker gene expression (snai2, twi, sox10) but not others (sox9, snai1), suggesting a specific role on a subset of the genes important for neural crest EMT. The MMP28 depletion phenotype is rescued by coinjecting MMP28 MO and MMP28 mRNA, provided that the catalytic activity of the encoded protein is maintained. Next, epistasis (rescue) experiments show that Twist1 can compensate for MMP28 depletion.<br /> The second part of the study elegantly shows that MMP28 produced by host adjacent tissues can translocate into the nucleus of neural crest cells grafted from a donor embryo (devoid of MMP28-GFP expression). It also shows that MMP28 nuclear localization as well as its catalytic activity are both required for activating the neural crest genes twist1 and sox10; and that MMP28 is found bound on the chromatin of twist1, cad11 and sox10.<br /> Altogether, these experiments strongly support a model for the nuclear role of MMP28 in the activation (or maintenance) of key genes of the EMT program in vertebrate neural crest cells.

      Major comments

      The key conclusions are:

      Conclusion 1: MMP28, expressed and secreted by placodes, is important for complete neural crest patterning prior to EMT, including activation of twist1 and EMT effector cadherin 11 genes. MMP28 is important for neural crest EMT and migration in vivo and in explant assay in vitro.

      However, this conclusion omits a potential indirect effect of interfering with placode formation itself, as indicated by the strong decrease in six1 expression in morphant embryos. The effect of MMP28MO on the expression of six1 is as strong as for the neural crest markers snai2 and twi, for example. Line 95, "slight reduction" should be modified.

      Thank you for this comment. We have now modified the associated text.

      What this may mean for placodal development itself, as well as for indirect effects on neural crest cells, needs to be discussed.

      Following this comment, we added a paragraph in the discussion about Six1.

      Conclusion 2: Gain of Twist 1 (but not Cadherin 11) rescues MMP28 morphant phenotype, allowing EMT to occur and restoring several parameters of cell migration in vivo and in explant assay

      Conclusion 3: When secreted from adjacent cells, MMP28 is translocated into the nucleus of neural crest cells and displays a nuclear function important for the activation of twist1 expression.

      Both conclusions 2 and 3 are supported by multiple elegant and convincing experimental data. These conclusions do not depend on mmp28 exclusive expression by the placodal ectoderm, and would still be important if there was a minor expression in the neural crest cells themselves (and thus an autocrine effect).

      Additional experiments to strengthen the conclusions<br /> Related to Conclusion 1:

      • line 102-106: In the rescue experiment, is six1 expression rescued too?

      Thank you for this comment. As detailed in the newly added discussion paragraph about the effects of Six1 loss of function that have been described in the literature, it is very unlikely that our NC phenotypes stem from the observed reduction of Six1 expression.

      Nonetheless, following this comment we checked for Six1 expression in the placodal domain following MMP28 knockdown and rescue condition. In the rescue condition, only 25% of the embryos had recovered Six1 expression in placodes while 75% of the embryos recovered Sox10 expression in neural crest cells. These data further confirm that rescue of placodal genes is not a pre-requisite for the rescue of neural crest genes and were added in Supp Figure 5.

      Although MMP28 is likely to have a role in placodes as well, the expansion of Sox2 and Pax3 expression domain and the loss of Eya1 expression typically associated with Six1 knockdown did not occur in MMP28 knockdown. Our story being focused on neural crest cells, we did not investigate further how the MMP28-dependent effect on Six1 might impair placode development.

      • Figure 2g: qPCR analysis suggests that mmp28 is expressed in the neural crest explants themselves, levels being lowered by the MO injection. The levels of this potential expression in the neural crest itself should be compared to the levels in the placodal ectoderm. How do the authors exclude an effect of the MO within the neural crest tissue, independently of roles from the placodal tissue?

      Thank you for this comment. There is a very small subpopulation of NC cells called the medial crest that expresses MMP28. They lie along a thin line at the edge of the neural folds. We previously described this in Gouignard et al Phil Trans Royal Soc B 2020. It is useful for us as an internal control for MO efficiency but the expression in placodes is much stronger and involves many more cells. However, this expression called our attention at the onset of the project and we performed some experiments to assess whether some of the observed effects were due to a NC-autonomous effect, as suggested by this reviewer. To test for this we performed targeted injections of the MO such that the medial crest would receive the MO but not the placodes. Targeting the medial crest with MMP28-MO had no effect on Sox10 expression. These data were added to new supp Figure 1.

      The cost and time for these additional experiments is limited (about 3 weeks), and uses reagents already available to the authors.

      Data and Methods are described with details including all necessary information to replicate the study. Replication is carefully done and statistical analysis seems convincing.

      Minor comments

      Experimental suggestions to further strengthen the conclusions.

      Related to Conclusion 1: - Figure 1e, frontal histological sections would help distinguish between placodal tissue and neural crest mesenchyme.

      Thank you for this comment. We previously published a detailed expression pattern with such sections (Gouignard et al Phil Trans Royal Soc B, 2020). We rephrased the text to better refer to this previous publication.

      Related to Conclusion 2: - Figure 3: in explants co-injected with twist1 mRNA, is cad11 expression restored? Could this indicate if cad11 is (or is not) part of the program controlled by Twist1 (as suggested by the last main figure)?

      Thank you for this comment. We checked for Cadherin-11 expression in control MO, MMP28-MOspl and MOspl+Twist mRNA and Twist is indeed capable of inducing Cadherin-11 and even leads to ectopic activation of Cad11 on the injected side. These data were added to new Supp Figure 11.

      Related to Conclusion 3: is MMP28 translocation seen in any cell context? Could the authors repeat experiments in Figure 6a with animal cap ectoderm? And with sandwich animal cap ectoderm, one expressing MMP28-GFP versions (wt, deltaSPNLS) and the other Rhodamine Dextran only? This would allow to generalize the mechanism or on the contrary to show a neural crest specificity.

      Thank you for this comment. Following this suggestion and comments from the other reviewers, we performed new grafting experiments.

      • 1/ we replaced neural crest cells from embryos expressing MMP28-GFP by placodal cells injected with Rhodamine-dextran. This generates grafted embryos with control placodes next to placodes overexpressing MMP28-GFP. There, we can analyze entry of MMP28-GFP in placodal cells that do not overexpress it. We detected MMP28 in the cytoplasm and in the nucleus of these placodal cells. However, the rate of nuclear entry was lower than in NC cells.
      • 2/ To assess the importance of the cell type producing MMP28 we grafted NC cells injected with Rhodamine-dextran next to caudal ectoderm expressing MMP28-GFP. MMP28 was detected in cytoplasm and the nucleus of the NC cells but with a lower efficiency than when NC are grafted next to placodes expressing MMP28-GFP.
      • 3/ We made animal cap sandwiches with animal caps injected with Rhodamine-dextran and animal caps expressing MMP28-GFP. In this case MMP28-GFP is detected in the cytoplasm but fails to reach the nucleus. These data indicate that placodes can import MMP28 produced by placodes and that NC can import MMP28 produced by other cells than placodes. However, in both cases the rate of nuclear entry was lower than in the NC-placode situation. Finally, the animal cap sandwiches indicate that entry into the cells does not predict entry into the nucleus. All these data were added to new Supp Figure 7, and quantifications of the import of MMP28-GFP into the cytoplasm and the nucleus in all conditions were added to Figure 5.

      In supplementary figure 4a, the grey (RDx) is not visible in the zoom in images.

      As the grey channel interferes with visualizing the green channel, we only show the grey channel on the first low magnification image so that the position of grafted cells can be seen. We found it better to omit it from the zoomed in images to avoid masking the GFP signal.

      In figure 7a,b MMP14 is green, GFP is grey (mentioned wrongly in line 276)

      Thank you for pointing this out. We have extensively modified Figure 7 and such issues are now resolved.

      Bibliographical references are accurate. Clarity of the text and figures is excellent, except maybe Figure 7, where a qPCR analysis would be easier to visualize, especially with low-level or fuzzy bands on the gel.

      Thank you. We have now modified Figure 7, including normalization to GFP to show fold-change enrichment and have added new data from three independent ChIP assays for proximal Twist and E-cadherin promoters that we hope further substantiate our initial observations.

      Reviewer #3 (Significance):

      Place of the work in the field's context:

      In cancer, the MMP proteins are widely described in multiple tumor contexts and promote cell invasion. In development, several studies have focused on their functions in the extracellular space. The nuclear localization of MMP family proteins has been described previously but remained poorly understood so far. This work is thus a pioneer study aiming to understand MMP28 nuclear function.

      Advance:

      This study makes a significant advance in the field, by unraveling the importance of the MMP28 activity in the cell nucleus for the expression of key EMT regulators. Moreover, the study suggests that extracellular MMP28 secreted by adjacent cells or tissues can be internalized and transported to the nucleus of cells located several cell diameters away. This study thus supports a novel facet of MMP protein activity, complementary to their previously described role on the extracellular matrix, and further favoring cell invasion, in development and potentially in cancer too.

      The target audience goes without doubt beyond developmental biologists (the primary interest) and also includes cell and cancer biologists, and any biologist interested by MMPs or cell invasion mechanisms in vivo.

      My field of expertise is developmental biology focused on neural and neural crest early development, mainly using animal models in vivo and some cell culture experiments. I also focus on some aspects of cancer cell migration.

    1. For years inventions have extended man's physical powers rather than the powers of his mind.
    2. warfare
      • Comment
      • Observation
        • it is known that warfare is a significant source of technological innovation
        • this can be explained by evolutionary biology
        • our instinct for survival is strongest in (inter-species) conflict
        • such is the deep irony of human progress
        • now, in the Anthropocene, humanity is waging another war for survival, caused by our war against nature
      • we can characterize this war as a war against past ignorance
    3. “Consider a future device …  in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.”
      • The explorations of a system that could
      • record our learning trail in life
        • personal individual synthesis
        • new knowledge gained by social learning:
          • from direct, synchronous, real-time interaction with another live human being
          • from indirect, asynchronous, non-real-time interaction with cultural artefacts produced by another
      • Bush famously named this the "memex"
    4. Even if utterly new recording procedures do not appear, these present ones are certainly in the process of modification and extension.

      Inventions are not made in a vacuum, requiring cultural influences

    1. Author Response

      Reviewer #1 (Public Review):

      Using health insurance claims data (from 8M subjects), a retrospective propensity score matched cohort study was performed (450K in both groups) to quantify associations between bisphosphonate (BP) use and COVID-19 related outcomes (COVID-19 diagnosis, testing and COVID-19 hospitalization). The observation periods were 1-1-2019 till 2-29-2020 for BP use and from 3-1-2020 till 6-30-2020 for the COVID endpoints. In primary and sensitivity analyses BP use was consistently associated with lower odds for COVID-19, testing and COVID-19 hospitalization.

      The major strength of this study is the size of the study population, allowing a propensity-based matched-cohort study with 450K in both groups, with a sizeable number of COVID-19 related endpoints. Health insurance claims data were used with the intrinsic risk of some misclassification for exposure. In addition there probably is misclassification of endpoints as testing for COVID-19 was limited during the study period. Furthermore, the retrospective nature of the study includes the risk of residual confounding, which has been addressed - to some extent - by sensitivity analyses.

      In all analyses there is a consistent finding that BP exposure is associated with reduced odds for COVID-19 related outcomes. The effect size is large, with high precision.

      The authors extensively discuss the (many) potential limitations inherent to the study design and conclude that these findings warrant confirmation, preferably in intervention studies. If confirmed BP use could be a powerful adjunct in the prevention of infection and hospitalization due to COVID-19.

      We thank the reviewer for this overall very positive feedback. We appreciate the reviewer's comments regarding the potential risks associated with misclassification of exposure and other potential limitations, which we have sought to address in a number of sensitivity analyses and are also addressing in the discussion of our paper. In addition, as noted by the reviewer, the observed effect size of BP use on COVID-19 related outcomes is large, with high precision, which we feel is a strong argument to explore this class of drugs in further prospective studies.

      Reviewer #2 (Public Review):

      The authors performed a retrospective cohort study using claims data to assess the causal relationship between bisphosphonate (BP) use and COVID-19 outcomes. They used propensity score matching to adjust for measured confounders. This is an interesting study and the authors performed several sensitivity analyses to assess the robustness of their findings. The authors are properly cautious in the interpretation of their results and justly call for randomized controlled trials to confirm a causal relationship. However, there are some methodological limitations that are not properly addressed yet.

      Strengths of the paper include:

      (A) Availability of a large dataset.

      (B) Using propensity score matching to adjust for confounding.

      (C) Sensitivity analyses to challenge key assumptions (although not all of them add value in my opinion, see specific comments)

      (D) Cautious interpretation of results, the authors are aware of the limitations of the study design.

      Limitations of the paper are:

      (A) This is an observational study using register data. Therefore, the study is prone to residual confounding and information bias. The authors are well aware of that.

      (B) The authors adjusted for the Charlson comorbidity index (CCI) whereas they had individual comorbidity data available and a dataset large enough to adjust for each comorbidity separately.

      (C) The primary analysis violates the positivity assumption (a substantial part of the population had no indication for bisphosphonates; see specific comments). I feel that one of the sensitivity analyses 1 or 2 would be more suited for a primary analysis.

      (D) Some of the other sensitivity analyses have underlying assumptions that are not discussed and do not necessarily hold (see specific comments).

      In its current form the limitations hinder a good interpretation of the results and, therefore, in my opinion do not support the conclusion of the paper.

      The finding of a substantial risk reduction of (severe) COVID-19 in bisphosphonate users compared to non-users in this observational study may be of interest to other researchers considering setting up randomized controlled trials for evaluation of repurposed drugs for prevention of (severe) COVID-19.

      We thank the reviewer for the insightful comments and questions related to our manuscript. Our response to the concerns regarding limitations of our study is as follows:

      (A) We agree that there is likely residual confounding and information bias due to use of US health insurance claims datasets which do not include information on certain potentially relevant variables. Nonetheless, given the large effect size and precision of our analysis, we feel that our findings support our main conclusion that additional prospective trials appear warranted to further explore whether BPs might confer a measure of protection against severe respiratory infections, including COVID-19. We have added a sentence on the second page of our Discussion (line 859-860) to emphasize this point: "Specifically, there is the potential that key patient characteristics impacting outcomes could not be derived from claims data."

      (B) The progression of this study mirrors the real-world performance of the analysis where we initially used the CCI in matching to control for comorbidity burden on a broader scale. This was our a priori approach. After observing large effect sizes, we performed more stringent matching for sensitivity analyses 1 and 2. Irrespective of the matching strategy chosen, effect sizes remained similar for all outcome parameters. Therefore, we elected to include both the primary analysis and the sensitivity analyses with more stringent matching in order to more transparently show what was done in entirety during our analyses, as we feel it displays all of the efforts taken to identify sources of unmeasured confounding which could have impacted our results.

      (C) We agree that the positivity assumption is a key factor to consider when building comparable treatment cohorts. We also agree that it is important to separately perform the analysis for either all patients with an indication for use of BPs and for other anti-osteoporosis medications, as we have done in our analysis of the Osteo-Dx-Rx cohort and Bone-Rx cohort, respectively. However, we did not have sufficient data, a priori, to determine whether BP users would be more similar in their risk of COVID-19 outcomes to non-users or to other users of anti-resorptive medications. In addition, we believe that this specific limitation does not negate our findings in the primary analysis for the following reasons: (1) ‘Type of Outcome’: the outcomes in this study are related to infectious disease and are not direct clinical outcomes of any known treatment benefits of BPs. The clinical benefits being assessed - impact of BP use on COVID-19-related outcomes - were essentially unknown at the time of the study data; this fact mitigates the impact of any violation of the positivity assumption; and (2) ‘Clinical Population’: after propensity score matching, both the BP user and the BP non-user group in the primary analysis mainly consisted of older females (90.1% female, 97.2% age>50), which is the main population with clinical indications for BP use. According to NCHS Data Brief No. 93 (April 2012) released by the CDC, ~75% and 95% of US women between 60-69 and 70-79 suffer from either low bone mass or osteoporosis, respectively, and essentially all women (and 70% of men) above age 80 suffer from these conditions, which often go undiagnosed (https://www.cdc.gov/nchs/data/databriefs/db93.pdf). Women aged 60 and older make up ~75% of our study population (Table 1). 
Although bone density measurements are not available for non-BP users in the matched primary cohort, there is a high probability that the incidence of osteoporosis and/or low bone mass in these patients was similar to the national average. This justifies the assumption that BP therapy was indicated for most non-BP users in the matched primary cohort. Arguably, for these patients the positivity assumption was not violated.

      (D) We will discuss in detail below the specific issues raised by the reviewer regarding our sensitivity analyses. In general we acknowledge that individual analytical and/or matching approaches may each have their own limitations, but the analyses performed herein were done to test in a systematic fashion the different critical threats to the validity of our initial results in the primary cohort analysis, which were based on a priori-defined methods and yielded a large and robust effect size. Thus, the individual sensitivity analyses should be considered in the greater context of the entire project.

      Specific comments (in order of manuscript):

      Methods:

      Line 158: it is unclear how the authors dealt with patients who died during the follow-up period. The wording suggests they were excluded which would be inappropriate.

      When this study was executed, we were unable to link the patient-level US insurance claims data with patient-level mortality data due to HIPAA concerns. Therefore, line 158 (now 177) defines continuous insurance coverage during the observation period as a verifiable eligibility criterion we used for patient inclusion. It was necessary to disqualify individuals who discontinued insurance coverage for a variety of reasons, e.g. due to loss or change of coverage, relocation etc., but our approach also eliminated patients who died. Appendix 3 (line 2449ff) describes methods we employed post hoc to assess how censoring due to death could have impacted our analyses. We discuss our conclusions from this post hoc analysis in the main text (lines 1053-1058) as follows: "An additional limitation is potential censoring of patients who died during the observation period, resulting in truncated insurance eligibility and exclusion based on the continuous insurance eligibility requirement. However, modelling the impact of censoring by using death rates observed in BP users and non-users in the first six months of 2020 and attributing all deaths as COVID-19-related did not significantly alter the decreased odds of COVID-19 diagnosis in BP users (see Appendix 3)."

      Why did the authors use CCI for propensity matching rather than the individual comorbid conditions? I presume using separate variables will improve the comparability of the cohorts. The authors discuss imbalances in comorbidities as a limitation but should rather have avoided this.

      CCI was the a priori approach defined at the study outset and was chosen due to the widespread use and understanding of this score. The general CCI score was originally planned for matching in order to have the largest possible study population since we did not know how many patients would meet all criteria as well as have an event of interest. After realizing we had adequate sample size to power matching using stricter criteria, we proceeded to perform subsequent sensitivity analyses on more stringently matched cohorts (sensitivity analysis 2).

      Line 301-10: it seems unnecessary to me to adjust for the given covariates while these were already used for propensity score matching (except comorbidities, but see previous comment). The manuscript doesn't give a rationale for why the authors chose this 'double correction'.

      The following language was added to the methods section (lines 325-327): “Demographic characteristics used in the matching procedure were also included in the final outcome regressions to control for the impact of those characteristics on outcomes modelled.”

      The following language was added to the Discussion section regarding the potential limitations of our study (lines 1078-1085): “Another limitation in the current study is related to a potential ‘double correction’ of patient characteristics that were included in both the propensity score matching procedure as well as the outcome regression modelling, which could lead to overfitting of the regression models and an overestimation of the measured treatment effect. Covariates were included in the regression models since these characteristics could have differential impacts on the outcomes themselves, and our results show that the adjusted ORs were in fact larger (showing a decreased effect size) when compared to the unadjusted ORs, which show the difference in effect sizes of the matched populations alone.”
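
      For readers less familiar with the quantities compared here, the following is a minimal sketch of how an unadjusted odds ratio and a Wald-type 95% confidence interval can be computed from a 2x2 table of a matched cohort. The function name and all counts are hypothetical illustrations, not values from the study, and the study's actual regression-adjusted estimates are not reproduced.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with event      b: exposed without event
    c: unexposed with event    d: unexposed without event
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the calculation.
or_hat, ci_low, ci_high = odds_ratio_with_ci(100, 900, 300, 700)
print(f"OR = {or_hat:.3f}, 95% CI {ci_low:.3f} to {ci_high:.3f}")
```

      An adjusted OR closer to 1 than the unadjusted one, as the authors report, corresponds to a smaller estimated effect after covariate adjustment.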

      In causal research a very important assumption is the 'positivity assumption', which means that none of the individuals has a probability of zero or one to be exposed. Including everyone would therefore not be appropriate. My suggestion is to include either all patients with an indication (based on diagnosis) or all that use an anti-osteoporosis (AOP) drug (or one as the primary and the other as the sensitivity analysis) instead of using these cohorts as sensitivity analyses. The choice should in my opinion be based on two aspects: whether it is likely that other AOP drugs have an effect on the COVID-19 outcomes and whether BP users are deemed to be more similar (in their risk of COVID-19 outcomes) to non-users or to other AOP drug users. Or alternatively, the authors might have discussed the positivity assumption and argue why this is not applicable to their primary analysis.

      The following text has been added to the Discussion section addressing potential limitations of our study (lines 987-1009): "Another potential limitation of this study relates to the positivity assumption, which when building comparable treatment cohorts is violated when the comparator population does not have an indication for the exposure being modelled 56. This limitation is present in the primary cohort comparisons between BP users and BP non-users, as well as in the sensitivity analyses involving other preventive medications. This limitation, however, is mitigated by the fact that the outcomes in this study are related to infectious disease and are not direct clinical outcomes of known treatment benefits of BPs. The fact that the clinical benefits being assessed – the impact of BPs on COVID-related outcomes – was essentially unknown clinically at the time of the study data minimizes the impact of violation of the positivity assumption. Furthermore, our sensitivity analyses involving the “Bone-Rx” and “Osteo-Dx-Rx” cohorts did not suffer this potential violation, and the results from those analyses support those from the primary analysis cohort comparisons. Moreover, we note that the propensity score matched BP users and BP non-users in the primary analysis cohort mainly consisted of older females. According to the CDC, ~75% and 95% of US women between 60-69 and 70-79 suffer from either low bone mass or osteoporosis, respectively (https://www.cdc.gov/nchs/data/databriefs/db93.pdf). Essentially all women (and 70% of men) above age 80 suffer from these conditions, which often go undiagnosed. Women aged 60 and older represent ~75% of our study population (Table 1). 
Although bone density measurements are not available for non-BP users in the matched primary cohort, there is a high probability that the incidence of osteoporosis and/or low bone mass in these patients was similar to the national average. Thus, BP therapy would have been indicated for most non-BP users in the matched primary cohort, and arguably, for these patients the positivity assumption was not violated."

      Sensitivity Analysis 3: Association of BP-use with Exploratory Negative Control Outcomes: what is the implicit assumption in this analysis? I think the assumption here is that any residual confounding would be of the same magnitude for these outcomes. But that depends on the strength of the association between the confounder and the outcome, which need not be the same. Here, risk avoiding behavior (social distancing) is the most obvious unmeasured confounder, which may not have a strong effect on other health outcomes. Also it is unclear to me why acute cholecystitis and acute pancreatitis-related inpatient/emergency-room visits were selected as negative controls. Do the authors have convincing evidence that BPs have no effect on these outcomes? Yet, if the authors believe that this is indeed a valid approach to measure residual confounding, I think the authors might have taken a step further and presented ORs for BP → COVID-19 outcomes that are corrected for the unmeasured confounding (e.g. if OR BP → COVID-19 is ~ 0.2 and OR BP → acute cholecystitis is ~ 0.5, then 'corrected' OR of BP → COVID-19 would be ~ 0.4).
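
      The reviewer's arithmetic can be written out as a short sketch. It assumes, as the reviewer notes is debatable, that residual confounding acts multiplicatively on the odds ratio and with the same magnitude for the outcome of interest and the negative control outcome; the function name is hypothetical and the inputs are the reviewer's illustrative values, not study results.

```python
def negative_control_corrected_or(or_outcome, or_negative_control):
    """Divide the observed OR by the negative-control OR.

    Assumption (hypothetical, for illustration): the negative-control OR
    measures pure bias, and that bias is of equal multiplicative size for
    the outcome of interest. On the log-odds scale this is a subtraction.
    """
    if or_outcome <= 0 or or_negative_control <= 0:
        raise ValueError("odds ratios must be positive")
    return or_outcome / or_negative_control


# Reviewer's illustrative values: OR(BP -> COVID-19) ~ 0.2,
# OR(BP -> acute cholecystitis) ~ 0.5.
corrected = negative_control_corrected_or(0.2, 0.5)
print(round(corrected, 2))  # 0.4
```

      If the confounder is more weakly associated with the negative control outcome than with COVID-19, this correction would be too small, which is the concern raised in the comment above.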

      We appreciate the reviewer’s thoughtful comments regarding the differential strength of the association between unmeasured confounders and outcome. We had initially selected acute cholecystitis and pancreatitis-related inpatient and emergency room visits as negative controls because we deemed them to be emergent clinical scenarios that should not be impacted by risk avoiding behavior. However, upon further search, we identified several publications that suggest a potential impact of osteoporosis and/or BPs on gallbladder diseases (https://doi.org/10.1186/s12876-014-0192-z; http://dx.doi.org/10.1136/annrheumdis-2017-eular.3900), thus calling the validity of our strategy into question. We therefore agree that the designation of negative control outcomes is problematic and adds relatively little to the overall story. Therefore, we have removed these analyses from the revised manuscript.

      Sensitivity Analysis 4: Association of BP-use with Exploratory Positive Control Outcomes: this doesn't help me be convinced of the lack of bias. If previous researchers suffered from residual confounding, the same type of mechanisms apply here. (It might still be valuable to replicate the previous findings, but not as a sensitivity analysis of the current study).

      We agree that the same residual confounding in previous research papers could be present in our study. Nonetheless, it was important to assess whether our analysis would be potentially subject to additional (or different) confounding due to the nature of insurance claims data as compared to the previous electronic record-based studies. Therefore, it was relevant to see if previous findings of an association between BP use and upper respiratory infections are observable in our cohort.

      The second goal of sensitivity analysis #4 (now #3) was to see whether associations could be found on different sets of respiratory infection-based conditions, both during the time of the pandemic/study period as well as during the pre-pandemic time, i.e. before medical care in the US was significantly impacted by the pandemic. In light of these considerations, we feel that sensitivity analysis 4 adds value by showing consistency in our core findings.

      Sensitivity Analysis 5: Association of Other Preventive Drugs with COVID-19-Related Outcomes: Same here as for sensitivity analysis 3: the assumption that the association of unmeasured confounders with other drugs is equally strong as for BPs. Authors should explicitly state the assumptions of the sensitivity analyses and argue why they are reasonable.

      The following sentence was added to the Discussion section (lines 1019-1020): “These analyses were based on the assumption that the association of unmeasured confounders with other drugs is comparable in magnitude and quality as for BPs.”

      Results: The data are clearly presented. The C-statistic / ROC-AUC of the propensity model is missing.

      Unfortunately, a significant amount of time has passed since execution of our original analysis of the Komodo dataset by our co-authors at Cerner Enviza. To date, our ability to perform follow-up studies with the Komodo dataset (which is exclusively housed on Komodo's secure servers) has become limited because business arrangements between these companies have been terminated, and the pertinent statistical software is no longer active. This issue prevents us from obtaining the original C-statistic and ROC-AUC information; however, we were able to extract the actual propensity scores themselves for the base cohort matching (BP-users versus non-users). The table below illustrates that the distribution of propensity scores for the base cohort match ranged from <0.01 to a max of 0.49, with 81.4% of patients having a propensity score of 10-49%, and 52.9% of patients having a propensity score of 20-49%. This distribution is unlikely to reflect patients who had a propensity score of either all 0 or all 1.
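
      The kind of summary described above can be sketched as follows: given a list of propensity scores, compute the share falling in a range and count scores close to 0 or 1, which is where positivity would be in doubt. The scores, the function names and the 0.01 threshold are hypothetical placeholders; the study's actual scores are not available here.

```python
def share_in_range(scores, low, high):
    """Fraction of propensity scores s with low <= s <= high."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(low <= s <= high for s in scores) / len(scores)


def count_extreme_scores(scores, eps=0.01):
    """Count scores below eps or above 1 - eps (possible positivity problems)."""
    near_zero = sum(s < eps for s in scores)
    near_one = sum(s > 1 - eps for s in scores)
    return near_zero, near_one


# Hypothetical propensity scores, for illustration only.
scores = [0.005, 0.05, 0.12, 0.18, 0.22, 0.30, 0.35, 0.41, 0.45, 0.49]

print(share_in_range(scores, 0.10, 0.49))  # share with a score of 10-49%
print(share_in_range(scores, 0.20, 0.49))  # share with a score of 20-49%
print(count_extreme_scores(scores))        # (near 0, near 1)
```

      A summary of this kind describes overlap only; it does not replace a discrimination measure such as the C-statistic requested by the reviewer.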

      Discussion:

      When discussing other studies the authors reduce these results to 'did' or 'did not find an association'. Although commonly practiced, it doesn't justify the statistical uncertainty of both positive and negative findings. Instead I encourage the authors to include effect estimates and confidence intervals. This is particularly relevant for studies that are inconclusive (i.e. lower bound of confidence interval not excluding a clinically relevant reduction while upper bound not excluding a NULL-effect).

      We appreciate the reviewer’s suggestion and have added this information on p.21/22 in the Discussion.

      Line 1145 "These retrospective findings strongly suggest that BPs should be considered for prophylactic and/or therapeutic use in individuals at risk of SARS-CoV-2 infection." I agree for prophylactic use but do not see how the study results suggest anything for therapeutic use.

      We have removed “and/or therapeutic use” from this sentence (line 1088-1090).

      The authors should discuss the acceptability of using BPs as preventive treatment (long-term use in persons without osteoporosis or other indication for BPs). This is not my expertise but I reckon there will be little experience with long-term inhibiting osteoblasts in people with healthy bones. The authors should also discuss what prospective study design would be suitable and what sample size would be needed to demonstrate a reasonable reduction. (Say 50% accounting for some residual confounding being present in the current study.)

      Although BPs are also used in pediatric populations and in patients without osteoporosis (for example, patients with malignancy), we do recognize the lack of long-term safety data in use of BPs as preventative treatments. We tried to partially address this concern in our sub-stratified analysis of COVID-19 related outcomes and time of exposure to BP. Reassuringly, we observed that patients newly prescribed alendronic acid in February 2020 also had decreased odds of COVID-19 related outcomes (Figure 3B), suggesting that the duration of BP treatment may not need to be long-term. This was further discussed in the last paragraph of our Discussion where we state that "BP use at the time of infection may not be necessary for protection against COVID-19. Rather, our results suggest that prophylactic BP therapy may be sufficient to achieve a potentially rapid and sustained immune modulation resulting in profound mitigation of the incidence and/or severity of infections by SARS-CoV-2."

      We agree that a future prospective study on the effect of BPs on COVID-19 related outcomes will require careful consideration of the study design, sample size, statistical power etc. However, we feel that a detailed discussion of these considerations is beyond the scope of the present study.

      The authors should discuss the fact that confounders were based on registry data which is prone to misclassification. This can result in residual confounding.

      Some potential sources of misclassification have been discussed on lines 932-948. In addition, the following language was added (lines 970-985): "Additionally, limitations may be present due to misclassification bias of study outcomes due to the specific procedure/diagnostic codes used as well as the potential for residual confounding occurring for patient characteristics related to study outcomes that are unable to be operationalized in claims data, which would impact all cohort comparisons. For SARS-CoV-2 testing, procedure codes were limited to those testing for active infection, and therefore observations could be missed if they were captured via antibody testing (CPT 86318, 86328). These codes were excluded a priori due to the focus on the symptomatic COVID-19 population. Furthermore, for the COVID-19 diagnosis and hospitalization outcomes, all events were identified using the ICD-10 code for lab-confirmed COVID-19 (U07.1), and therefore events with an associated diagnosis code for suspected COVID-19 (U07.2) were not included. This was done to have a more stringent algorithm when identifying COVID-19-related events, and any impact of events identified using U07.2 is considered minimal, as previous studies of the early COVID-19 outbreak have found that U07.1 alone has a positive predictive value of 94% (ref. 55), and for this study U07.1 captured 99.2%, 99.0%, and 97.5% of all COVID-19 patient-diagnoses for the primary, “Bone-Rx”, and “Osteo-Dx-Rx” cohorts, respectively."

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Response to reviewers' comments

      We thank the reviewers for their constructive evaluation of our manuscript. In the following point-by-point response, we explain how we will implement the suggested modifications.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The formation of meiotic double-stranded DNA breaks is the starting point of meiotic recombination. DNA breaks are made by the topoisomerase-like SPO11, which interacts with a number of regulatory factors including REC114, MEI4 and IHO1. Despite the key role this process has in the continuation, and genetic variation, of eukaryotic life, there is very little known about how this process is regulated. Laroussi et al. make use of biochemical, biophysical and structural biological approaches to extensively characterise the REC114-MEI4-IHO1 complex.

      This is an outstanding biochemical paper. The experiments are well planned and beautifully executed. The protein purifications used are very clean, and the figures well presented. Importantly, Laroussi et al. describe, and carefully characterise through point mutational analysis, the direct physical interaction between IHO1 and REC114-MEI4. This is an interaction that has, at least in yeast, previously been suggested to be driven by liquid-liquid separation. The careful and convincing work presented here represents an important paradigm-shift for the field.

      I am fully supportive of publication of this excellent and important study.

      We thank the reviewer for his/her positive comments, appreciation of the importance of our study and suggested modifications.

      Major comments:

      Point 1:

      My only major concern is regarding Figure 4, and specifically the AF2 model of the coiled-coil tetramer of IHO1. Given the ease with which MSAs of coiled-coils can become "contaminated" with non-orthologous sequences, I would urge some caution with this model. This is especially since the yeast ortholog of IHO1, Mer2, has been previously reported to be an anti-parallel tetramer (albeit, not very well supported by the data). The authors have several choices here. 1) They could simply reduce the visibility of the IHO1 tetramer model, and indicate caution in the parallel tetramer model. 2) They could consider using a structure prediction algorithm that doesn't use an MSA (e.g. ESMFold). 3) They could try to obtain experimental evidence for a parallel coiled-coil tetramer, e.g. through EM, SAXS or FRET approaches. I would like to make it crystal clear, however, that I would be *very* supportive of approach 1) or 2). An experimental approach is *not* necessary.

      Assuming the authors don't take a wet-lab approach, this shouldn't take more than a couple of weeks.

      This is a very good suggestion. We are aware of the previously reported anti-parallel architecture of the yeast IHO1 ortholog Mer2 (Claeys Bouuaert et al., Nature 2021). It should be noted that a recent preprint from the Claeys Bouuaert lab (bioRxiv, https://doi.org/10.1101/2022.12.16.520760) presents a high-confidence model of a parallel tetrameric coiled-coil for yeast Mer2 (and for the human ortholog), apparently consistent with their previous XL-MS results (Claeys Bouuaert et al., Nature 2021).

      To clarify this issue we will follow the suggestions of Reviewer 1 and 2.

      1. As also suggested by Reviewer 2, we will produce a tethered dimer of IHO1(125-260), connected by a short linker, and determine its MW by SEC-MALLS (and SAXS).
      2. In the meantime, we followed the suggestion of Reviewer 1 and modelled IHO1(130-281) with ESMFold, another recent AI-based program, which does not use multiple sequence alignments. Remarkably, the predicted structure is very similar to the one predicted by AlphaFold and likewise shows a parallel arrangement of IHO1. This model will be included as a supplementary figure.
      3. We will also point out in the text that these models, despite being very convincing, remain models.

        Minor comments:

      Point 2:

      The observation that REC114 and MEI4 can also form a 4:2 complex is very interesting and potentially important. Did the authors also try to model this higher order complex in AF2?

      Yes, we did this in the hope of identifying residues whose mutation could limit the fast exchange between the 2:1 and 4:2 states. Unfortunately, AlphaFold models no convincing additional contacts. The corresponding PAE plot will be included as a supplementary figure.

      Point 3:

      Similarly to above, what does the prediction of the full-length REC114:MEI4 2:1 complex look like? Presumably the predicted interaction regions align well with experimental data, but it would be interesting to see (and easy to run).

      The AlphaFold model of the full-length (FL) REC114:MEI4 (2:1) complex will be included as a supplementary figure. It is consistent with the model comprising only the interacting regions. No additional convincing contacts are predicted.

      Point 4:

      Did the authors carry out SEC-MALS experiments on any IHO1 fragment lacking the coiled-coil domain? It was previously reported for Mer2 that the C-terminal region can form dimers, for example (OPTIONAL).

      We can easily do that. We have the N- and C- terminal regions lacking the coiled-coil expressed as MBP fusions and they will be analysed by SEC-MALLS.

      Point 5:

      Given that full-length REC114 is used for the IHO1 interaction studies, do the authors have any data as to the stoichiometry of the REC114(FL)-MEI4(1-127) complex? (OPTIONAL)

      We have repeatedly analysed the REC114-MEI4-IHO1 complex sample by SEC-MALLS and native mass spectrometry, but in both cases the sample is too complex to be interpreted. This is likely due to the fast exchange between the REC114-MEI4 2:1 and 4:2 complexes and the low binding affinity of IHO1 for REC114.

      Point 6:

      Did the authors try AF2 modelling of the REC114-IHO1 interaction using orthologs from other species?

      Yes, but not extensively. We will repeat this modelling.

      **Referees cross commenting**

      I will add cross-comments to the comments of Reviewer #2

      Firstly, the comments made by Reviewer #2 are technically correct. Reviewer #2 points out that the oligomerization states that the authors report could, in part, be artifactual, arising from the His-tag purification method. This is indeed correct. However, given that none of the oligomerization states reported are per se unusual, given what is already known (including pre-prints from the Keeney and Claeys Bouuaert laboratories), I think the authors could forego this step.

      Secondly, the use of an experimental structural method, such as SAXS, would certainly add value to the paper. Also Reviewer #2 is correct in pointing out the availability of the ESRF beamlines to the authors. However, while SAXS is a useful method, I personally consider the use of mutants to validate the interactions, an even stronger piece of evidence that the AlphaFold2 interactions are correct. I must disagree somewhat with Reviewer #2 with their argument that SAXS would validate the fold. Certainly if one of the AF2 predicted structures is radically wrong, then SAXS would produce scattering data, and a subsequent distance distribution plot that is incompatible with the AF2 model. However, a partly correct AF2 model, of roughly the right shape, might still fit into a SAXS envelope.

      Reviewer #2 shares my concern on the parallel coiled-coil of IHO1, and their suggested solution is very elegant.

      In my view, given the time constraints imposed by the partially competing work from the Keeney and Claeys Bouuaert laboratories (recently on bioRxiv), I would support the authors if they chose the quickest route to publication.

      Reviewer #1 (Significance (Required)):

      General assessment: The strengths of the paper are as follows:

      1) Quality of experiments - The proteins used have been properly purified (SEC) and properly handled. The experiments are carefully carried out and controlled.

      2) Detail - The authors carry out the considerable effort of characterising protein interactions through separation-of-function mutants. This adds to the quality of the paper, and renders the AF2 models as convincing as experimentally determined structures.

      3) Conceptual advances - The most important conceptual advance is the direct binding of the N-term of IHO1 to REC114. That this is the same region as used by both TOPOVIBL and ANKRD31 points to a complex regulation.

      4) Integrity - the authors have taken great care not to "oversell" the results. The data are presented, and analysed, without hyperbole.

      Limitations - The only limitation of the paper is the lack of in vivo experiments to test their findings. However given the time and effort required, and the pressing need to publish this exciting study, this is not a problem.

      Advance: The paper provides advances from a number of directions, both conceptual and mechanistic. Mechanistically, the description of the REC114-MEI4 complex is important, and in particular the observation that it can also form a higher order 4:2 structure. Likewise, while IHO1 was inferred to be a tetramer (based on work on Mer2), this was never proven formally. Most important is the physical linkage between IHO1 and REC114, and that this is an interaction that is incompatible with TOPOVIBL and ANKRD31.

      Audience:

      Given the central role of meiotic recombination in eukaryotic life, any studies that shed additional light on the regulation of meiosis are suitable for a broad audience. However, this subject paper will be more specifically of interest to the meiosis community. The elegant methodology will also be of interest to structural biologists and protein biochemists.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This manuscript addresses the structure of the REC114-MEI4-IHO1 complex, which controls the essential process of programmed DSB induction by SPO11/TOPOVIBL in meiosis.

      The manuscript carefully combines biochemistry, biophysics and modelling in an integrative manner to report the architecture of the full REC114-MEI4-IHO1 complex that is not itself amenable to direct structure elucidation such as by X-ray crystallography. These are important results that will be of interest to the recombination and meiosis fields. The data are generally convincing and interpretations appear correct, so the manuscript is certainly suitable for publication. I have included some suggestions below that I believe would strengthen the manuscript and enhance our confidence in the findings. Whilst the manuscript is publishable in its current format, I believe the suggestions given below would make it into a much stronger paper.

      We thank the reviewer for his/her positive comments on our study and the suggestions below.

      I have two general suggestions:

      Point 1:

      Analyses have been performed on fusion proteins (His, His-MBP etc). We have previously observed that bulky tags such as MBP can interfere with oligomeric state through steric hindrance, and that His-tags can mediate formation of larger oligomers, seemingly through coordination of metals leached from IMAC purification. This latter point has also been observed by others:

      https://www.sciencedirect.com/science/article/pii/S1047847722000946.

      Where possible, I would repeat SEC-MALS experiments using untagged proteins, or at least following incubation with EDTA to mitigate the potential for His-mediated oligomerization.

      We agree with the reviewer’s comment that expression tags can have an unexpected impact on protein behaviour.

      1. For the REC114-MEI4 complex, the stoichiometry was assessed by several techniques. Figure 1f,g shows analytical ultracentrifugation performed on the minimal REC114(226-254)-MEI4(1-43) complex, which contains no fusion tag, showing that this stoichiometry is independent of fusion tags. We will nevertheless repeat the SEC-MALLS on REC114-MEI4(1-127) after removing the His-tag of MEI4, as suggested.
      2. For the REC114 dimer, we cannot remove the His-MBP tag since the short REC114(226-254) fragment is not stable without MBP. The dimerization of Rec114 was already reported in (Claeys Bouuaert et al., Nature 2021). The dimerization is sensitive to specific point mutations within REC114. We will, however, repeat the SEC-MALLS experiment following incubation with EDTA to mitigate the potential for His-mediated oligomerization.
      3. The presented SEC-MALLS on IHO1 fragments (Figure 4b) was done on proteins without fusion tags. Reviewers 1 and 2 also agreed that additional repeats of the experiments without fusion tags are not necessary.

      The authors have relied upon mutagenesis to validate AlphaFold2 models. Their results are convincing but only confirm the contacts involved in the structures rather than the specific fold per se. Their findings would be greatly strengthened by collecting SEC-SAXS data and fitting models directly to the scattering data. This is extremely sensitive, so a close fit provides the best possible evidence of accuracy of the model. SAXS is affected by unstructured regions and tags, so would have to be performed using structural cores of untagged proteins rather than full-length constructs. Given the local availability of world-class SAXS beamlines at the ESRF, which is next door to the leading author's institute, it seems that the collection of SAXS data would be practical for them.

      The use of SAXS is discussed in the specific points below. We will attempt SEC-SAXS on the REC114-MEI4 complex. SAXS cannot be done on REC114(226-254) because this fragment is unstable without MBP. We will also perform SAXS on the IHO1 tetramer.

      My specific comments are below:

      Point 2:

      Figure 1d

      The SEC-MALS shows multiple species, with 2:1 and 4:2 representing a minority of total species present. What are the larger oligomers? Could these be an artefactual consequence of the His-tags (as described above)?

      This SEC-MALLS will be repeated without the His-tag on MEI4.

      Point 3:

      Figure 1f,g

      The AUC changes over concentration and pH are intriguing - have they tried MALS in these conditions? This would be much more informative as it would reveal the range of species present. Low concentrations could be analysed by peak position even if scattering is insufficient to provide interpretable MW fits. I would advise doing this without the His-tag or after adding EDTA (as described above).

      We will perform this experiment as suggested.

      Point 4:

      Figure 2

      I would like to see the models validated by SAXS using minimum core untagged constructs. This could be used to test the validity of the 2:1 model directly, and to model the 4:2 complex by multiphase analysis and/or docking together of 2:1 complexes.

      The hydrophobic LALALAII region of MEI4 is interesting and the mutagenesis data do agree with the model. However, it is important to point out that any decent model would place this hydrophobic helix in the core of the complex, and so would be disrupted by mutagenesis. Hence, the mutagenesis results confirm that the hydrophobic helix is critical for the interaction, but do not confirm that the specific AlphaFold model is more valid than any other model in which the helix is similarly in a core position.

      We will attempt to perform the SEC-SAXS measurements. The challenge here will be obtaining a sample that is monodisperse in solution, as required for SAXS, because we showed that there is fast exchange between the 2:1 and 4:2 oligomeric states. The AUC data indicate that the sample has a predominantly 2:1 stoichiometry at 0.2 mg/ml, pH 4.5 and 500 mM NaCl. Given the small size of the complex, the signal at 0.2 mg/ml is likely to be noisy.

      Point 5:

      Figure 3

      This would also benefit from SAXS validation of the structural core. The mutagenesis here provides convincing evidence in favour of the structure as specific hydrophobics either disrupt or have no effect, exactly as predicted. Hence, their data strongly support the dimer model. As this provides the framework for the 2:1 complex, these data make me far more confident of the previous 2:1 model in figure 2. I am wondering whether it would be better to present these data first such that the reader can see the 2:1 model being built upon these strong foundations?

      We agree with this suggestion and will present the REC114 dimerization data before the REC114-MEI4 complex. However, REC114(226-254) is not stable without the MBP tag, so it is not suitable for SAXS analysis.

      Point 6:

      Figure 4

      The MALS data convincingly show formation of a tetramer. How do we know that it is parallel? The truncation supports this but coiled-coils do sometimes form alternative structures when truncated (e.g. anti-parallel can become parallel when sequence is removed), and AlphaFold seems to have a tendency of predicting parallel coiled-coils even when the true structure is anti-parallel (informal observation by us and others). A simple test would be to make a tethered dimer of 110-240, with a short flexible linker between two copies of the same sequence - if parallel it should form a tetramer of double the length, if anti-parallel it should form a dimer of the same length - determinable by MALS (and SAXS).

      To address this point we will perform this experiment as suggested by Reviewer 2. We will produce a tethered dimer of IHO1 110-240, connected by a short linker and determine its MW by MALS (and possibly SAXS). We also performed ESMfold modelling (Reviewer 1, Point 1), resulting in the same model. As the IHO1 tetramer is likely suitable for SAXS analysis, we will also perform SAXS on it.
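
The arithmetic behind the tethered-dimer test can be sketched as follows. This is a minimal illustration, assuming (as the reviewer describes) that a parallel arrangement yields four tethered chains per assembly and an anti-parallel arrangement yields two. The function names and the masses in the usage note are hypothetical and are not taken from the manuscript.

```python
# Hypothetical helper names; the stoichiometries follow the reviewer's reasoning.
CHAINS_PER_ASSEMBLY = {"parallel": 4, "antiparallel": 2}


def expected_tethered_mass(monomer_kda, linker_kda, orientation):
    """Expected molar mass (kDa) of the assembly formed by a tethered dimer.

    A tethered chain is two copies of the fragment joined by one linker.
    """
    chain_kda = 2 * monomer_kda + linker_kda
    return CHAINS_PER_ASSEMBLY[orientation] * chain_kda


def closest_orientation(measured_kda, monomer_kda, linker_kda):
    """Return the orientation whose expected mass is nearest the measured mass."""
    expected = {
        orientation: expected_tethered_mass(monomer_kda, linker_kda, orientation)
        for orientation in CHAINS_PER_ASSEMBLY
    }
    return min(expected, key=lambda o: abs(expected[o] - measured_kda))
```

With a hypothetical 15 kDa fragment and a 1 kDa linker, `expected_tethered_mass(15.0, 1.0, "parallel")` gives 124.0 kDa and the anti-parallel case gives 62.0 kDa, a two-fold difference in molar mass between the two outcomes.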

      Point 7:

      Figures 5/6

      The interaction is clear albeit low affinity (but within the biologically interesting range). It would be nice to obtain MALS (using Superose 6) data showing the stoichiometry of the complex - are the data too noisy to be interpretable owing to dissociation? The AlphaFold model and mutagenesis data strongly support the conclusion that the IHO1 N-term binds to the PH domain, as presented.

      We have repeatedly analysed the REC114-MEI4-IHO1 complex sample by SEC-MALLS (on Superose 6) and native mass spectrometry, but in both cases the sample is too complex to be interpreted. This is likely due to the fast exchange between REC114-MEI4 2:1 and 4:2 complexes and low binding affinity of IHO1 for REC114.

      **Referees cross commenting**

      Just to clarify a couple of points regarding consultation comments from reviewer 1:

      The suggestion regarding tags was mostly directed to the cases in which MALS data are noisy, with multiple oligomeric species (such as figure 1d). In these cases, I wondered whether the large MW species may be artefactual and could be resolved by removal of the tags. In cases where oligomers agree with those reported by other labs, I agree that there's no need to explore these further.

      In terms of SAXS, I agree that fitting models into envelopes will not distinguish between similar folds. However, fitting models directly to raw scattering data is extremely sensitive and I have never seen good fits with low chi2 values for incorrect models (even when very similar in overall shape to the correct structure).
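
The fit quality the reviewer refers to can be illustrated with a reduced chi-square between a model curve and measured intensities. This is a minimal sketch, assuming a single linear scale factor fitted by weighted least squares. Dedicated SAXS fitting programs typically adjust further parameters, which are not modelled here, and the function name is hypothetical.

```python
def reduced_chi2(i_exp, sigma, i_model):
    """Reduced chi-square of a model curve against measured intensities.

    i_exp, sigma and i_model are equal-length sequences sampled at the same
    q values. One scale factor is fitted, so the degrees of freedom are N - 1.
    """
    weights = [1.0 / s ** 2 for s in sigma]
    # Weighted least-squares solution for the scale factor c in I_exp ~ c * I_model.
    numerator = sum(w * e * m for w, e, m in zip(weights, i_exp, i_model))
    denominator = sum(w * m * m for w, m in zip(weights, i_model))
    scale = numerator / denominator
    chi2 = sum(((e - scale * m) / s) ** 2 for e, s, m in zip(i_exp, sigma, i_model))
    return chi2 / (len(i_exp) - 1)
```

A model that reproduces the data up to a scale factor gives a value near zero in this noise-free toy setting, whereas a curve with the wrong shape gives a large value. With real data and realistic errors, values close to 1 are usually read as a good fit.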

      Reviewer #2 (Significance (Required)):

      The manuscript carefully combines biochemistry, biophysics and modelling in an integrative manner to report the architecture of the full REC114-MEI4-IHO1 complex that is not itself amenable to direct structure elucidation such as by X-ray crystallography. These are important results that will be of interest to the recombination and meiosis fields. The data are generally convincing and interpretations appear correct, so the manuscript is certainly suitable for publication. I have included some suggestions below that I believe would strengthen the manuscript and enhance our confidence in the findings. Whilst the manuscript is publishable in its current format, I believe the suggestions given below would make it into a much stronger paper.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Laroussi et al. used AlphaFold models to predict the assembly of the REC114-MEI4-IHO1 complex, and verified the structure using different biochemical experiments. Both the AlphaFold predictions and the experimental data are convincing for the overall protein complex assembly. Importantly, they identified a motif on IHO1 that shares the same binding site on REC114 with TOPOVIBL and ANKRD31, suggesting that REC114 acts as a regulatory base coordinating different binding partners during meiosis progression. Overall, I believe this is a nice biochemistry paper, but it lacks insights into the biology (I believe in vivo data are beyond the scope of this paper); at the least, more discussion of these findings is needed.

      We thank the reviewer for the supportive comments on our manuscript and its evaluation. We agree with the reviewer that including in vivo data, which might provide further biological insights, would be useful. However, there is currently no good cellular model for meiotic recombination in mouse, and thus our structure-based mutations will need to be tested in transgenic mice. Such data will take a long time to obtain and would delay the publication of these in vitro results, which already provide novel insight into the REC114-MEI4-IHO1 complex architecture. We will, nevertheless, as suggested, strengthen the discussion of the biological implications of our findings.

      Some minor points:

      Point 1:

      Any data showing MEI4 forms a dimer on its own?

      As mentioned in the manuscript, full-length MEI4 is difficult to produce in bacteria or insect cells. Thus, we worked with the N-terminal fragment, which in the absence of REC114 is not very stable. We will perform SEC-MALLS to assess its oligomeric state. AlphaFold suggests a dimeric arrangement of MEI4, but only with low confidence.

      Point 2:

      In Fig2 and Sup Fig4, HisMBP-MEI4, you see more MBP than the fusion protein, especially more obvious in the mutants. What's your explanation?

      The N-terminus of MEI4 is well produced when co-expressed with REC114. For the pull-down experiments in Figure 2, we expressed it as a His-MBP fusion in the absence of REC114. In this situation, there is degradation between MBP and MEI4. We find this very often for proteins that are not very stable, which is the case for MEI4 without REC114. This is the best way we could produce at least some MEI4 in the absence of REC114. The MBP protein could probably be removed by other chromatography techniques, but we think that, for the purpose of the pull-down, its presence does not interfere with REC114-MEI4 binding.

      Point 3:

      TOPOVIBL and ANKRD31, I am curious if you have looked at the conserved residues on these motifs.

      We show strong conservation of IHO1 among vertebrates (Fig. 6c). We will further analyse the sequence conservation in more distant species.

      Point 4:

      Reference needed when stating that IHO1 was not interacting with REC114 in previous biochemical assay in the discussion part.

      This will be corrected.

      Point 5:

      Also, have you run AlphaFold that gives multiple models? Sometimes, with short motifs, 1 or 2 models of several models give good confidence for the interaction.

      Running an in-house AlphaFold installation to produce 25 models did not reveal better models.

      Reviewer #3 (Significance (Required)):

      Most of the interactions between REC114 and MEI4 or IHO1 had been established before with Y2H or other biochemical assays. This paper used AlphaFold and verified the findings with biochemical experiments, which helps to establish the exact motifs/residues involved in the interaction. For example, the MEI4-REC114 interfaces are novel and, more interestingly, IHO1 shares the same interface with ANKRD31 and TOPOVIBL. Thus, this finding of REC114-MEI4-IHO1 complex assembly would be interesting to people in different working areas. I would like to see more studies on the coordination of IHO1 with ANKRD31 or TOPOVIBL in the future.

    1. Author Response

      Reviewer #2 (Public Review):

      1) The main limitation of this study is that the results are primarily descriptive in nature, and thus, do not provide mechanistic insight into how Ryr1 disease mutations lead to the muscle-specific changes observed in the EDL, soleus and EOM proteomes.

      An intrinsic feature of high-throughput proteomic analysis technology is the generation of lists of differentially expressed proteins (DEPs) in different muscles from WT and mutated mice. Although the definition of mechanistic insights related to changes in dozens of proteins is very interesting, it is a difficult task to accomplish and goes beyond the goal of the high-throughput proteomic analysis presented here. Nevertheless, the analysis of DEPs may indeed provide arguments to speculate on the pathogenesis of the phenotype linked to recessive RyR1 mutations. In the unrevised manuscript, we pointed out that the fiber type I predominance observed in congenital myopathies linked to recessive Ryr1 mutations is consistent with the high expression level of heat shock proteins in slow twitch muscles. However, as suggested by Reviewer 3, we have removed "vague statements" concerning major insights into pathophysiological mechanisms from the text of the revised manuscript, since we are aware that the mechanistic information, if any, that we can extract from the data set cannot go beyond the intrinsic limitations of high-throughput proteomic technology.

      b) Results comparing fast twitch (EDL) and slow twitch (soleus) muscles from WT mice confirmed several known differences between the two muscle types. Similar analyses between EOM/EDL and EOM/soleus muscles from WT mice were not conducted.

      We agree with the point raised by the Reviewer. In the revised manuscript we have changed Figure 2. The new Figure 2 shows the analysis of differentially expressed proteins in EDL, soleus and EOMs from WT mice. We have also added 2 new Tables (new Supplementary Table 2 and 3) and have inserted our findings in the revised Results section (page 7, lines 157-176, pages 8 and 9).

      c) While a reactome pathway analysis for proteins changes observed in EDL is shown in Supplemental Figure 1, the authors do not fully discuss the nature of the proteins and corresponding pathways impacted in the other two muscle groups analyzed.

      We have now included in the revised manuscript a new Figure 2 which includes the Reactome pathway analysis comparing EDL with soleus, EDL with EOM and soleus with EOM (panels C, F and I, respectively). We have also inserted into the revised manuscript a brief description of the pathways showing the greatest changes in protein content (page 7, lines 156-175, pages 8 and 9). We agree that the data showing changes in protein content between the 3 muscle groups of the WT mice are important also because they validate the results of the proteomic approach. Indeed, the present results confirm that many proteins, including MyHCIIb, calsequestrin 1, SERCA1, parvalbumin etc., are more abundantly expressed in fast twitch EDL muscles compared to soleus. Similarly, our results confirm that EOMs are enriched in MyHC-EO as well as cardiac isoforms of ECC proteins. This point has been clarified in the revised version of the manuscript (page 8, lines 198-213; page 9, lines 214-228). Nevertheless, we would like to point out that the main focus of our study is to compare the changes of protein content induced by the presence of recessive RyR1 mutations.

      Reviewer #3 (Public Review):

      a) it would be useful to determine whether changes in protein levels correlated with changes in mRNA levels …….

      We performed qPCR analysis of Stac3 and Cacna1s in EDL, soleus and EOM from WT mice (see Figure 1 below). The expression of transcripts encoding Cacna1s and Stac3 is approximately 9-fold higher in EDL compared to soleus. The fold change of Stac3 and Cacna1s transcripts in EDL muscles is larger than the differences we observed by mass spectrometry at the protein level between EDL and soleus. Indeed, we found that the content of the Stac3 protein in EDL is 3-fold higher compared to that in soleus. Although there is no apparent linear correlation between mRNA and protein levels, we believe that a few plausible conclusions can be drawn, namely: (i) the expression level of both transcripts and proteins is higher in EDL compared to EOM and soleus muscles, respectively, (ii) the expression level of transcripts encoding Stac3 correlates with that of transcripts encoding Cacna1s and confirms the proteomic data. In addition, the level of Stac3 transcript does not change between WT and dHT, confirming our proteomic data, which show that Stac3 protein content in muscles from dHT is similar to that found in WT littermates. Altogether, these results support the concept that the differences in Stac3 content between EDL and soleus occur at both the protein and transcript levels, namely a high Stac3 mRNA level correlates with higher protein content (EDL) and a low mRNA level correlates with low Stac3 protein content in soleus muscles (see Figure 1 below).

      Figure 2: qPCR of Cacna1s and Stac3 in muscles from WT mice. The expression levels of the transcripts encoding Cacna1s and Stac3 are the highest in EDL muscles and the lowest in soleus muscles (top panels). There are no significant changes in their relative expression levels in dHT vs WT. Each symbol represents the value from a single mouse. * p=0.028, Mann Whitney test. qPCR was performed as described in Elbaz et al., 2019 (Hum Mol Genet 28, 2987-2999).
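
For readers unfamiliar with how a transcript fold change such as the one quoted above is commonly derived, the sketch below uses the comparative Ct (2^-ΔΔCt) calculation. It is an illustration only: the response does not state which quantification method was used, the calculation assumes roughly 100% amplification efficiency, and the function name and all Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression of a target transcript in a sample versus a control.

    Each Ct is normalised to a reference gene measured in the same tissue,
    then the sample is compared with the control (comparative Ct method).
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

With hypothetical values of Ct 22 (target) and 18 (reference) in one muscle, and Ct 25 and 18 in the comparison muscle, `fold_change_ddct(22, 18, 25, 18)` returns 8.0, that is, an 8-fold higher relative transcript level in the first muscle.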

      ….and whether or not the protein present was functional, and whether Stac3 was in fact stoichiometrically depleted in relation to Cacna1s.

      We thought about this point but think that there are no plausible arguments to believe that Stac3 is not functional, one simple reason being that our WT mice do not have a phenotype which would be associated with the absence of Stac3 (Reinholt et al., PLoS One 8, e62760 2013, Nelson et al. Proc. Natl. Acad. Sci. USA 110:11881 2013).

      b) In the abstract, the authors stated that skeletal muscle is responsible for voluntary movement. It is also responsible for non-voluntary. The abstract needs to be refocused on the mutation and on what we learn from this study. Please avoid vague statements like "we provide important insights to the pathophysiological mechanisms..." mainly when the study is descriptive and not mechanistic.

      The abstract of the revised manuscript has been rewritten. In particular, we removed statements referring to important “pathophysiological mechanistic insight”.

      c) The author should bring up the mutation name, location and phenotype early in the introduction.

      In the revised manuscript we provide the information requested by the Reviewer (page 2 lines 36-38 and page 4, lines 98-102).

      d) This reviewer also suggests that the authors refocus the introduction on the mutation location in the 3D RyR1 structure (available cryo-EM structure), if there is any nearby ligand binding site, protomers junction or any other known interacting protein partners. This will help the reader to understand how this mutation could be important for the channel's function

      The residue Ala4329 is present inside the TMx (Auxiliary transmembrane helices) domain which spans from residue 4322 to 4370 and interposes structurally (des Georges A et al. 2016 Cell 167,145-57; Chen W, et al. 2020 EMBO Rep. 21, e49891). Although the structural resolution of the region has been improved (des Georges et al, 2016), parts of the domain still remain with no defined atomic coordinates, especially the region encompassing a.a. E4253 – F4540. Because of such undefined atomic coordinates of the region E4253-F4540, we are not able to determine the real orientation and the disposition of the amino acids in this region, including the A4329 residue. As reference, structure PDB: 5TAL of des Georges et al, 2016 was analyzed with UCSF Chimera (production version 1.16) (Pettersen et al. J. Comput. Chem. 25: 1605-1612. doi: 10.1002/jcc.20084).

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript describes a relatively novel approach to discovering combinations of herbal medications that may help modulate immune responses, and in turn help treat diseases such as cancer. The authors use breast plasma cell mastitis as a disease in which they present results from a non-blinded clinical trial with modest results. The main shortcomings are a lack of rigor around standardizing the control group given steroids versus the treatment group given the combinations of herbal medications. There needs to be a detailed statistical analysis of the comparison in tumor size, stage, invasiveness, etc. as well as consideration of confounding disease states (autoimmune disease, prior cancers, diabetes, etc.). While the results are interesting in that the use of herbal medications is often overlooked in Western medicine, the manuscript needs greater detail in the clinical comparison in order to provide convincing evidence for an effect.

      Many thanks for your very kind words about our work. We are excited to hear that you think our manuscript is relatively novel with considerable translational impact to the field of herbal medications. We are grateful for your valuable time and efforts you have spent to provide your very insightful comments, which are of great help for our revision.

      Reviewer #2 (Public Review):

      The work is rather interesting and novel because for the first time, the authors employed knowledge graph, a cutting-edge technique in the domain of artificial intelligence, to identify a novel herbal drug combination for the treatment of PCM. The results of the clinical trial study clearly demonstrated that the drug combination is effective to ameliorate the symptoms of PCM patients and improve the general health status of the patients. Overall, the strategy of this manuscript may provide a paradigm for the design of drug combination towards many other human disorders.

      We are truly grateful for your very kind words about our work. It is very encouraging to know that you think our work is novel and of significance for the field. We sincerely appreciate the valuable time and kind efforts that you have spent on the thorough review of our manuscript.

      Reviewer #3 (Public Review):

      The major merit of the manuscript is that the authors introduced the concept of knowledge graph into the domain of herbal drugs or TCM. Namely, the authors designed a knowledge graph towards systematic immunity or immunotherapy based on massive data mining techniques. The authors successfully identified an herbal drug combination for PCM with the help of a scoring system. Moreover, the authors conducted a clinical trial study and the clinical data showed that the herbal drug combination holds great promise as an effective treatment for PCM. The weakness of the manuscript is that some details for the herbal drug combination and the clinical trial study are missing.

      Many thanks for your very kind words about our work. We are excited to hear that you think our work is relatively novel and holds great promise as an effective remedy for PCM. We are truly thankful for your valuable time and efforts you have spent to provide your very insightful comments, which are of great help for our revision.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors developed a new concept: Skeletal age, which is chronological age + years lost due to suffering a low-energy fracture. There seem to be conceptual problems with this concept: It is not known if the years lost are lost due to the fracture or co-morbidities.

      The Reviewer raises an important point, and we are happy to discuss it as follows. While it is not possible to show a causal relationship between a fragility fracture and excess mortality, it has been shown repeatedly that a fracture is associated with an increased risk of premature mortality after accounting for comorbidities and frailty. Indeed, we and others have found that comorbidities contribute little to the increased risk10,11. Moreover, in a previous study using the ‘relative survival analysis’ technique12, we have shown that hip and proximal fractures were associated with reduced life expectancy after accounting for time-related changes in background mortality in the population, suggesting that hip and proximal fractures are an independent clinical risk factor for mortality.

      In this study, we used a multivariable Cox’s proportional hazards model to adjust for confounding effects of age and severity of comorbidities, and our result clearly indicated that a fracture is associated with years of life lost. Moreover, comorbidities were considered a factor in an individual's risk profile for estimating skeletal age. As a result, skeletal age reflects the common real-world scenario that the combination of comorbidities and proximal or lower leg fractures compounded post-fracture excess mortality, much greater than each alone13.

      Technically, there are two steps to individualise skeletal age for each individual with a specific risk profile. First, we used the statistical approach recommended for the individualisation of survival time prediction using statistical models14 to individualise specific mortality risk for each participant with a specific risk profile. Specifically, we calculated the prognostic risk index as a single-number summary of the combined effects of his/her specific risk profile of a specific fracture site and the severity of comorbidity. His/her individualised fracture-mortality association was then computed as the difference between his/her prognostic index and the mean prognostic index of “typical” people in the general population. In the second step, we used the Gompertz law of mortality and the Danish national lifetable data to transform the individualised association into life expectancy loss as a result of a fracture15.

      We have modified part of the description of the methodology as follows:

      “For the second aim, we determined skeletal age for each individual based on the individual’s specific risk profile. First, we calculated the prognostic risk index as a single-number summary of the combined effects of his/her specific fracture site and the severity of comorbidity51. The prognostic index is a linear combination of the risk factors with weights derived from the regression coefficients. The individualised fracture-mortality association for an individual with a specific risk profile is then the difference between the individual's prognostic index and the mean prognostic index of 'typical' people in the general population51. In the second step, we used the Gompertz law of mortality and the Danish national lifetable data to transform the excess mortality into life expectancy loss as a result of a fracture49.”.
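      The two steps described above can be sketched in code. The following is a minimal illustration and not the authors' implementation: the coefficient values, the Gompertz parameters `a` and `b`, and all function names are hypothetical, chosen only to show how a prognostic index could be converted into a hazard ratio and then into a loss of life expectancy under a Gompertz hazard. The study itself relies on Danish national lifetable data, which this sketch does not use.

```python
import math

# Hypothetical log-hazard-ratio coefficients (NOT the study's estimates).
COEFFS = {"hip_fracture": math.log(2.0), "comorbidity_score": 0.25}


def prognostic_index(profile, coeffs=COEFFS):
    """Linear combination of risk factors weighted by regression coefficients."""
    return sum(coeffs[name] * profile.get(name, 0.0) for name in coeffs)


def remaining_life_expectancy(age, a, b, hazard_ratio=1.0,
                              horizon=80.0, steps=16000):
    """Remaining life expectancy under a Gompertz hazard
    h(t) = hazard_ratio * a * exp(b * t).

    Integrates the conditional survival curve
    S(s) = exp(-hazard_ratio * (a / b) * exp(b * age) * (exp(b * s) - 1))
    over s in [0, horizon] with the trapezoidal rule.
    """
    scale = hazard_ratio * (a / b) * math.exp(b * age)
    ds = horizon / steps
    total = 0.0
    for i in range(steps + 1):
        surv = math.exp(-scale * math.expm1(b * i * ds))
        total += surv * (0.5 if i in (0, steps) else 1.0)
    return total * ds


def years_of_life_lost(age, individual_pi, mean_pi, a, b):
    """Life expectancy loss implied by the gap between an individual's
    prognostic index and the population mean prognostic index."""
    hazard_ratio = math.exp(individual_pi - mean_pi)
    baseline = remaining_life_expectancy(age, a, b)
    exposed = remaining_life_expectancy(age, a, b, hazard_ratio)
    return baseline - exposed


# Hypothetical Gompertz parameters for illustration only.
A, B = 3e-5, 0.1
pi = prognostic_index({"hip_fracture": 1.0, "comorbidity_score": 0.0})
loss = years_of_life_lost(60.0, pi, 0.0, A, B)
print(f"illustrative skeletal age: {60.0 + loss:.1f}")
```

      Under a Gompertz hazard, multiplying the hazard by a constant HR is equivalent to shifting age by ln(HR)/b years, which is why a relative hazard can be re-expressed on an age scale.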

      In addition, with the possible exception of zoledronate after hip fracture, we have no evidence that this increased risk of mortality can be changed with interventions.

      We agree that there is a lack of strong evidence from randomised controlled trials supporting the benefit of anti-resorptive therapy on post-fracture survival. As mentioned above, the mention of zoledronic acid was simply for illustrating the use of skeletal age to convey a treatment benefit. We have decided to remove the section related to the benefit of pharmacological treatment on post-fracture mortality.

      Furthermore, it is not clear why the authors think that patients and doctors will better understand the implications of older "skeletal age", on future fracture risk and the need for prevention, for example, the 10-year risk of MOF? Knowing that my bones are older than me, could make a patient feel even more fragile and afraid of being physically active. The treatment will reduce the risk of future fractures, but this study provides no information about the effect on mortality of preventing the subsequent fracture or the risk of mortality associated with recurrent fractures.

      The risk of fracture is typically conveyed to patients and the public in terms of absolute risk metric (e.g., probability) or relative risk metrics (e.g., risk ratio). However, patients and doctors often struggle to comprehend probabilistic statements such as 'Your risk of death over the next 10 years is 5% if you have suffered from a bone fracture'. The underappreciation of post-fracture mortality's gravity has caused patients to be hesitant towards treatment and prevention, contributing to the current crisis of osteoporosis treatment.

      We consider that skeletal age will make doctor-patient risk communication more intuitive and probably more effective. For example, for the same 2-fold increased mortality risk of hip fracture, telling a 60-year-old man with a hip fracture that his skeletal age would be 66 years, equivalent to a 6-year loss of life, is much more intuitive. The patient might thus be more likely to accept the recommended pharmacological treatment, ultimately improving health outcomes. However, we do not yet have RCT evidence for the effectiveness of skeletal age, and this will be a focus of our future research. We would like to point out that there is RCT evidence that effective age (such as 'Heart Age', 'Lung Age') could improve the uptake of preventive actions. For example, Lopez-Gonzalez et al16 found that informing patients of their heart age improved their cardiovascular risk more than informing them of their Framingham probabilistic risk score.

      Introduction:

      The statement that treatment reduces the risk of dying, needs modification as the majority of clinical trials have not demonstrated reduced mortality with treatment.

      We have modified the statement as follows: “In randomised controlled trials, treating high-risk individuals with bisphosphonates or denosumab reduces the risk of fracture4, though whether the reduction translates into reduced mortality risk remains contentious5, 6.”

      It is not clear how the skeletal age captures the risk of a future fracture. The other difference between the idea of "skeletal age" and for example "heart age" is that there are treatments available for heart disease that reduce the risk of mortality, as mentioned above this has not been shown consistently in clinical trials in osteoporosis.

      We take the Reviewer's point, but we would like to point out that there are at least two RCTs on zoledronic acid showing that treating patients with a fragility fracture reduces their risk of mortality17,18.

      Because the risk profile that is associated with a post-fracture mortality is also associated with the risk of fracture, skeletal age can be seen as a measure of the decline of the skeleton due to a fracture or exposure to risk factors that raise the risk of fracture. Thus, a 60-year-old with a skeletal age of 66 is in the same risk category as a 66-year-old with 'favourable risk factors' or at least the ones that are potentially modifiable. Hence, an older skeletal age means a greater risk of fracture.

      Neither the “Skeletal Age” nor the “Heart Age”16,19,20 has the treatment intervention incorporated into its calculator. We have added details to explain how the assessment of skeletal age would provide the conceptual risk of both fracture and post-fracture mortality as follows:

      “Unlike the current fracture risk assessment tools17 which estimate the probability of fracture over a period of time using probability-based metrics, such as relative risk and absolute risk, skeletal age quantifies the consequence of a fracture using a natural frequency metric. A natural frequency metric has been consistently shown to be easier and more friendly to doctors and patients than the probability-based metrics9 11 30. It is not straightforward to appreciate the importance of the two-fold increased risk of death (i.e., relative risk = 2.0) without knowing the background risk (i.e., 2 folds of 1% would remarkably differ from 2 folds of 10%). By contrast, for the same 2-fold mortality risk of hip fracture, telling a 60-year-old man with a hip fracture that his skeletal age would be 66 years, equivalent to a 6-year loss of life, is more intuitive. The skeletal age can also be interpreted as the individual being in the same risk category as a 66-year-old with 'favorable risk factors' or at least the ones that are potentially modifiable. Hence, an older skeletal age means a greater risk of fracture.”.
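      The point about background risk in the passage above can be made concrete with a short calculation. This is only an arithmetic sketch: the function name is hypothetical, and the 1% and 10% baselines are the illustrative values from the text rather than study estimates.

```python
def absolute_risk(baseline_risk, relative_risk):
    """Absolute risk implied by a relative risk applied to a baseline risk,
    capped at 1.0 because a probability cannot exceed 100%."""
    return min(1.0, baseline_risk * relative_risk)


# The same relative risk of 2.0 applied to two different background risks.
for baseline in (0.01, 0.10):
    exposed = absolute_risk(baseline, 2.0)
    increase_points = (exposed - baseline) * 100
    print(f"baseline {baseline:.0%}: exposed {exposed:.0%}, "
          f"increase {increase_points:.0f} percentage points")
```

      The identical relative risk corresponds to a one-point absolute increase in the first case and a ten-point increase in the second, which is why a relative risk alone is hard to interpret.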

      Discussion:

      The prevalent comorbidities; cardiovascular diseases, cancer, and diabetes, suggest that fracture patients die from their comorbidities and not their fractures.

      Please refer to the above response for more detail. Briefly, the multivariable Cox’s proportional hazards regression adjusted for the confounding effect of age and the severity of comorbidities, indicating the association between fracture and mortality was independent of aging and comorbidity severity. On the other hand, skeletal age is a measure of excess mortality related to either fracture or co-morbidities or both.

      The discussion should be more balanced as there is a number of clinical trials demonstrating reductions in vertebral and non-vertebral fractures without effect on mortality. There may be specific effects of zoledronate on mortality, but that has not been shown for the vast majority of treatments.

      Please refer to the above response for more detail. Specifically, as the study primarily aimed at introducing skeletal age as a new metric for risk communication, we have decided to omit the paragraph discussing the potential benefit of zoledronic acid on post-fracture mortality risk in order to maintain the clarity and focus of the study.

      It is not correct that FRAX does not take mortality into account? It does not tell you specifically how high the risk of dying and how high the risk of a fracture is but integrates the two. "Skeletal age" does not provide either information, it just tells you that your skeleton is older than your chronological age - most patients and doctors will not associate that with an increased risk of dying - only of frailty.

      Although it is commonly believed that FRAX accounts for competing risk of death, it does not provide the risk of post-fracture mortality. Indeed, none of the current fracture risk assessment tools was designed to provide post-fracture mortality risk5. Skeletal age fills the gap by providing the excess mortality following a fracture for an individual with specific risk profile.

      The statement that zoledronate reduces the "skeletal age" by 3 years, has not been demonstrated and it is not clear how this can be demonstrated by the analysis reported here. As the reduced mortality has only been shown for the Horizon RFT, this cannot be inferred for other treatments and other fracture types. The information provided by the "skeletal age" is only that the fracture you already had took x years of your remaining lifetime. With the exception of perhaps zoledronate after hip fracture, we have no indication from clinical trials that the treatment of osteoporosis will change this.

      The current study was not designed to examine the effectiveness of an intervention. The statement related to the survival benefit of zoledronate is used to illustrate how skeletal age is used to convey the treatment benefit in real-world doctor-patient risk communication. Given the hazard ratio of 0.72 for zoledronate-mortality association17, a patient might find the statement “Zoledronic acid treatment helps a patient with a hip fracture gain (back) 3 years of life” much easier to understand and probably more persuasive than the traditional statement of “Zoledronic acid treatment reduced the risk of death by 28%”.
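      The two ways of phrasing the same hazard ratio can be compared numerically. The sketch below is an assumption-laden illustration, not the calculation used in the study: it treats the effect as an 'effective age' shift under a Gompertz hazard, and it calibrates the Gompertz slope `b` from the illustrative statement that a 2-fold hazard corresponds to about 6 years. Both function names and the calibration are hypothetical.

```python
import math


def relative_risk_reduction(hazard_ratio):
    """Traditional phrasing: proportional reduction in the hazard."""
    return 1.0 - hazard_ratio


def gompertz_age_shift(hazard_ratio, b):
    """Years of ageing equivalent to a hazard ratio when the hazard grows
    as exp(b * age); negative values mean years 'gained back'."""
    return math.log(hazard_ratio) / b


# Hypothetical calibration: assume a hazard ratio of 2.0 maps to 6 years.
b = math.log(2.0) / 6.0

print(f"risk reduction: {relative_risk_reduction(0.72):.0%}")
print(f"age shift: {gompertz_age_shift(0.72, b):.1f} years")
```

      Under this calibration a hazard ratio of 0.72 maps to an age shift of a little under three years, in line with the rounded figure quoted in the response, although the authors' own estimate may rest on a different calculation.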

      Reviewer #2 (Public Review):

      The paper of Tran et al. introduces the concept of 'skeletal age' as a means of conveying the combined risk of fracture and fracture-associated mortality for an individual. Skeletal age is defined as the sum of chronological age and the number of years of life lost associated with a fracture. Using the very comprehensive Danish national registry and employing Cox's proportional hazards model they estimated the hazard of mortality associated with a fracture. Skeletal age was estimated for each age and fracture site stratified by gender. The authors propose to replace the fracture probability with skeletal age for individualized fracture risk assessment.

      Strengths of the study lie in the novelty of the concept of 'skeletal age' as an informative metric to internalize the combined risks of fracture and mortality, the very large and well-described Danish National Hospital Discharge Registry, the sophisticated statistical analysis and the clear messages presented in the manuscript. The limitations of the study are acknowledged by the authors.

      We appreciate your positive remark that captures the essence of our work.

      References:

      1. Lujic S, Simpson JM, Zwar N, Hosseinzadeh H, Jorm L. Multimorbidity in Australia: Comparing estimates derived using administrative data sources and survey data. PloS one 2017; 12(8): e0183817.
      2. Andersen TF, Madsen M, Jorgensen J, Mellemkjoer L, Olsen JH. The Danish National Hospital Register. A valuable source of data for modern health sciences. Dan Med Bull 1999; 46(3): 263-8.
      3. Vestergaard P, Mosekilde L. Fracture risk in patients with celiac Disease, Crohn's disease, and ulcerative colitis: a nationwide follow-up study of 16,416 patients in Denmark. Am J Epidemiol 2002; 156(1): 1-10.
      4. Hundrup YA, Hoidrup S, Obel EB, Rasmussen NK. The validity of self-reported fractures among Danish female nurses: comparison with fractures registered in the Danish National Hospital Register. Scand J Public Health 2004; 32(2): 136-43.
      5. Beaudoin C, Moore L, Gagne M, et al. Performance of predictive tools to identify individuals at risk of non-traumatic fracture: a systematic review, meta-analysis, and meta-regression. Osteoporos Int 2019; 30(4): 721-40.
      6. Spiegelhalter D. How old are you, really? Communicating chronic risk through 'effective age' of your body and organs. BMC Med Inform Decis Mak 2016; 16: 104.
      7. Vestergaard P, Rejnmark L, Mosekilde L. Osteoporosis is markedly underdiagnosed: a nationwide study from Denmark. Osteoporos Int 2005; 16(2): 134-41.
      8. Roerholt C, Eiken P, Abrahamsen B. Initiation of anti-osteoporotic therapy in patients with recent fractures: a nationwide analysis of prescription rates and persistence. Osteoporos Int 2009; 20(2): 299-307.
      9. Cummings SR, Lui LY, Eastell R, Allen IE. Association Between Drug Treatments for Patients With Osteoporosis and Overall Mortality Rates: A Meta-analysis. JAMA Int Med 2019; 179(11): 1491-500.
      10. Chen W, Simpson JM, March LM, et al. Comorbidities Only Account for a Small Proportion of Excess Mortality After Fracture: A Record Linkage Study of Individual Fracture Types. J Bone Miner Res 2018; 33(5):795-802
      11. Vestergaard P, Rejnmark L, Mosekilde L. Increased mortality in patients with a hip fracture-effect of pre-morbid conditions and post-fracture complications. Osteoporos Int 2007; 18(12): 1583-93.
      12. Tran T, Bliuc D, Hansen L, et al. Persistence of Excess Mortality Following Individual Nonhip Fractures: A Relative Survival Analysis. J Clin Endocrinol Metab 2018; 103(9): 3205-14.
      13. Tran T, Bliuc D, Ho-Le T, et al. Association of Multimorbidity and Excess Mortality After Fractures Among Danish Adults. JAMA Netw Open 2022; 5(10): e2235856.
      14. Henderson R, Keiding N. Individual survival time prediction using statistical models. J Med Ethics 2005; 31(12): 703-6.
      15. Kulinskaya E, Gitsels LA, Bakbergenuly I, Wright N. Calculation of changes in life expectancy based on proportional hazards model of an intervention. Insur Math Econ 2020; 93: 27-35.
      16. Lopez-Gonzalez AA, Aguilo A, Frontera M, et al. Effectiveness of the Heart Age tool for improving modifiable cardiovascular risk factors in a Southern European population: a randomized trial. Eur J Prev Cardiol 2015; 22(3): 389-96.
      17. Lyles KW, Colon-Emeric CS, Magaziner JS, et al. Zoledronic acid and clinical fractures and mortality after hip fracture. N Engl J Med 2007; 357(18): 1799-809.
      18. Reid IR, Horne AM, Mihov B, et al. Fracture Prevention with Zoledronate in Older Women with Osteopenia. N Engl J Med 2018; 379(25): 2407-16.
      19. Bonner C, Batcup C, Cornell S, et al. Interventions Using Heart Age for Cardiovascular Disease Risk Communication: Systematic Review of Psychological, Behavioral, and Clinical Effects. JMIR Cardio 2021; 5(2): e31056.
      20. Svendsen K, Jacobs DR, Morch-Reiersen LT, et al. Evaluating the use of the heart age tool in community pharmacies: a 4-week cluster-randomized controlled trial. Eur J Public Health 2020; 30(6): 1139-45.
      21. Suissa S. Immortal time bias in pharmaco-epidemiology. Am J Epidemiol 2008; 167(4): 492-9.
    2. Reviewer #1 (Public Review):

      The authors developed a new concept: Skeletal age, which is chronological age + years lost due to suffering a low-energy fracture.

      There seem to be conceptual problems with this concept: It is not known if the years lost are lost due to the fracture or co-morbidities. In addition, with the possible exception of zoledronate after hip fracture, we have no evidence that this increased risk of mortality can be changed with interventions. Furthermore, it is not clear why the authors think that patients and doctors will better understand the implications of older "skeletal age", on future fracture risk and the need for prevention, for example, the 10-year risk of MOF? Knowing that my bones are older than me, could make a patient feel even more fragile and afraid of being physically active. The treatment will reduce the risk of future fractures, but this study provides no information about the effect on mortality of preventing the subsequent fracture or the risk of mortality associated with recurrent fractures.

      Introduction:

      The statement that treatment reduces the risk of dying, needs modification as the majority of clinical trials have not demonstrated reduced mortality with treatment.

      It is not clear how the skeletal age captures the risk of a future fracture. The other difference between the idea of "skeletal age" and for example "heart age" is that there are treatments available for heart disease that reduce the risk of mortality, as mentioned above this has not been shown consistently in clinical trials in osteoporosis.

      Discussion:

      The prevalent comorbidities; cardiovascular diseases, cancer, and diabetes, suggest that fracture patients die from their comorbidities and not their fractures.

      The discussion should be more balanced as there is a number of clinical trials demonstrating reductions in vertebral and non-vertebral fractures without effect on mortality. There may be specific effects of zoledronate on mortality, but that has not been shown for the vast majority of treatments.

      It is not correct that FRAX does not take mortality into account? It does not tell you specifically how high the risk of dying and how high the risk of a fracture is but integrates the two. "Skeletal age" does not provide either information, it just tells you that your skeleton is older than your chronological age - most patients and doctors will not associate that with an increased risk of dying - only of frailty.

      The statement that zoledronate reduces the "skeletal age" by 3 years, has not been demonstrated and it is not clear how this can be demonstrated by the analysis reported here. As the reduced mortality has only been shown for the Horizon RFT, this cannot be inferred for other treatments and other fracture types.

      The information provided by the "skeletal age" is only that the fracture you already had took x years of your remaining lifetime. With the exception of perhaps zoledronate after hip fracture, we have no indication from clinical trials that the treatment of osteoporosis will change this.

    1. Author Response

      Reviewer #1 (Public Review):

      The article "Identification of a weight loss-associated causal eQTL in MTIF3 and the effects of MTIF3 deficiency on human adipocyte function" explored the functional roles of MTIF3 during adipocyte differentiation. In persons living with obesity, genetic variation at the MTIF3 locus associates with body mass index and responses to weight loss interventions. MTIF3 regulates mitochondrial protein expression and gene knockouts cause cardiomyopathy in mice. This paper provides insight into the impacts of MTIF3 knockout on adipocyte differentiation and the expression effects of the eQTL on MTIF3 levels. The authors implement a CRISPR/Cas9 gene editing approach coupled with an in vitro platform to detect influences of MTIF3 on adipocyte glucose metabolism and gene expression. This method may serve as a platform to explore knockouts in human cell lines, so it may allow the discovery of new gene x environment influences on in vitro outcomes related to differentiation, growth, and metabolism.

      The conclusions of this paper are mostly well supported by data, but some experimental conditions and data analyses need to be clarified and extended.

      1) The authors use CRISPR/Cas9 to generate the rs1885988 variant in the human white adipocyte cell line and performed a comprehensive validation analysis of gene editing (Figure 1). qPCR analysis showed reduced MTIF3 expression during human adipocyte differentiation (Figure 1E, F). To expand the importance of the rs1885988 variant, the authors should have provided target gene measurements to verify the canonical differentiation profile (e.g., FABP4, ADIPOQ) and help readers understand the overall impact of gene editing at the MTIF3 locus.

      Thank you for your suggestions. As you requested, we have quantified several adipocyte differentiation markers in the allele-edited cells after 12 days of adipogenic differentiation. The data (Figure 1-figure supplement 1) shows no significant difference between cells with the different genotypes. We have added more information about this in lines 100-101, and also in another context in lines 105-116.

      Notably, the intra-group variation of the marker gene expression is large (Figure 1-figure supplement 1), which makes it difficult to clearly state how much the allele editing, as opposed to random variation resulting from single cell cloning, contributes to the differentiation outcome. However, if we also consider MTIF3 knockout cells (that do not need to be single-cell cloned), their differentiation marker expression also appears unaffected (Figure 3-figure supplement 1). Taken together then, it is unlikely the allele editing with the consequent effect on MTIF3 expression affects adipogenic differentiation in our experiments. We mention the absence of effect of MTIF3 knockout on differentiation in the paragraph starting on line 137.

      2) The direct mechanistic influences of MTIF3 on adipocyte function remain unclear. MTIF3 regulates the translation initiation of mitochondrial protein synthesis. Western blots of OXPHOS proteins do not per se underscore supercomplex formation, which is also a process mediated by MTIF3. Blue native gel electrophoresis may prove a better method to establish the effects of MTIF3 loss-of-function on supercomplex formation.

      As suggested, we have run blue native gel electrophoresis to detect the formation of OXPHOS respiration complexes. In the revised manuscript (lines: 158-168 and Figure 4 E,F), we show that MTIF3 knockout indeed interferes with complex formation, with lower abundance of complexes V/III2+IV1, III2/IV2 and IV1. Additionally, although the blot signal for complex I+III2+IVn is diffuse, it appears higher in scrambled control cells than in MTIF3 knockout cells. Interestingly, complex II content is slightly higher in MTIF3 knockouts, which may result from a compensatory regulation mechanism, as none of the subunits of complex II is encoded by mitochondrial DNA. We also found several faster-migrating bands (“undefined bands” in the figure) in the MTIF3 knockout samples, although it is hard to determine whether those are single-chain proteins, or degradation or mistranslation products. Overall though, the native gel blots show impaired OXPHOS complex assembly in MTIF3 knockout samples.

      In addition, we performed western blots for other mitochondrial proteins, including COX II (subunit of OXPHOS complex IV), ND2 (subunit of OXPHOS complex I), ATP8 (subunit of OXPHOS complex V), and CYTB (subunit of OXPHOS complex III). The data (Figure 4 A,B) show decreased ND2 and COX II, a trend towards decreased CYTB, and unaffected ATP8 content in MTIF3 knockout adipocytes.

      The methods (paragraph starting at line 479), results (paragraph starting at line 145), and discussion (lines: 261-263, 274-277) were incorporated in the revised manuscript.

      3) Based on the findings, the authors argue that MTIF3 knockout alters the function of adipocytes. However, many of the experiments show fairly small effect sizes (Figure 5A, Figure 6A). How does the MTIF3 knockout explicitly perform functions related to body weight regulation? Gene editing in vivo would have helped to substantiate the authors' conclusions.

      In the paper we are looking at the consequences of MTIF3 deficiency in one cell type, over a short time, in vitro. The outcome of body weight regulation, e.g. during weight loss, would result from long-term effects of MTIF3-altered metabolism in more than one tissue. We envisage that small changes in energy metabolism, not only in fat but also in e.g. muscle, would make a substantial difference over time in vivo (this we cannot capture in in vitro models). We have added this discussion to lines 294-311.

      As for in vivo genomic editing, the alleles of interest are specific to the human genome. Ideally, a genotype-based recall study in humans would be appropriate, but due to time and resource limitations, we are not able to conduct such a study at the moment (although we certainly hope to perform such a study in the future). As for modeling the MTIF3 deficiency in mice, the MTIF3 knockout mice are not viable [1], but other options (e.g. overexpression, tissue-specific knockouts) are certainly possible and tempting to investigate. This, however, would require considerable additional work which we could only perform in a future project.

      4) In several instances, the authors refer to 'feeding' cells with glucose (line 206, line 171). Feeding experiments often imply complex nutrient interventions in animal models and people, which cannot be easily recapitulated in cell culture. The in vitro experiments simply alter levels of glucose and more precise language would state the specific challenges accurately.

      In the revised manuscript, we have replaced “feeding” with the exact glucose concentration, or with “glucose concentration” where appropriate (paragraph starting at line 215, and lines 577-578, 597, 873-879).

      Reviewer #2 (Public Review):

      Huang Mi, et al. investigated the role of MTIF3, the mitochondrial translation initiation factor 3, in the function of adipocytes. They first detected the expression of the obesity-related MTIF3 variants based on the GTEx database and found two variants lead to an increase in MTIF3 expression. Then they knockout MTIF3 in differentiated hWAs adipocytes and characterized the mitochondrial function. They found loss of MTIF3 decrease mitochondrial respiration and fatty acid oxidation. They further treated cells with low glucose medium to mimic weight loss intervention and found MTIF3 knockout adipocytes lose fewer triglycerides than control adipocytes. This paper provides new information about MTIF3 in adipocytes and the potential functional role of MTIF3 in mitochondrial function.

      1) The authors provided sufficient data to show those two genetic variants increase MTIF3 expression. Their CRISPR/Cas9 knockin cell line is also convincing. But they didn't show if the genetic variants affect adipogenesis. Adipogenesis is an important process for weight gain and fat deposition. In lines 103-107, the authors mentioned that the "allele-edited cells have some problem in differentiated state, e.g. triglyceride or mitochondrial content", so they used an inducible Cas9 system. However, the issue of differentiated allele-edited cells may be the functional effect of MTIF3 genetic variants, such as interrupting adipogenesis, decreasing triglyceride, or affecting mitochondrial number. The authors should provide that information.

      Thank you for all your suggestions. We think we were not clear regarding this issue. We did not mean that the allele-edited cells have a problem in the differentiated state, which could then indeed be (as you point out) due to the functional effect of MTIF3 genetic variants. The problem relates to the process of single-cell cloning itself, which inherently introduces random variation. As a consequence, the data on adipogenic differentiation in allele-edited cells have relatively high intra-group variation. We have added more clarifying text in lines 104-116.

      To provide the data on this, per your request, in the revised manuscript we include the results for the rs67785913-edited cells in Figure 1-figure supplement 1. As shown, we observed no differences in the expression of adipogenic markers (ADIPOQ, PPARG, CEBPA, SREBF1 and FABP4) or in mitochondrial content between the two rs67785913 genotypes. Since the intra-group variation is often high, it is hard to conclude how much the rs67785913 eQTL affects the quantified variables. Much of the variation could instead be ascribed to the effects of single cell cloning.

      The cloning per se introduces random variation, but is required to obtain homozygous allele-edited cells. Because of this dilemma, and to clarify how much MTIF3 expression can actually influence adipogenic differentiation, we have, during the revision, also used the hWAs-iCas9 cells to generate MTIF3 knockouts at the preadipocyte stage and then tested their differentiation capacity. As we show in Figure 3-figure supplement 1, we found no apparent differences in adipogenic marker gene expression between scrambled control and MTIF3 knockout cells (we mention that in lines 137-144). Taken together, our results may indicate that the rs67785913 genotype, through affecting MTIF3 expression, is unlikely to regulate adipogenic differentiation.

      2) In Figure 4, the author mentioned that MTIF3 knockout does not affect the expression of adipogenic differentiation markers. They need to provide more evidence to prove their point. Oil-red O staining is a clearer way to quantify adipocyte differentiation in cell culture. In addition, in Fig. 4B western blot, the author should include MTIF3 as a control to show the knockout efficiency. It is not clear the meaning of plus and minus in that panel. The author should also compare the total triglyceride levels in MTIF3 knockout cells and control cells.

      We have now included Oil-red O staining results and total triglyceride levels (Figure 3 F,G), which show no apparent differences between scrambled control and MTIF3 knockout cells (method: lines 427-431; results: lines 137-144). We also added the MTIF3 blots to figure 4A as a control, showing high and consistent MTIF3 knockout efficiency in independent experiments. In the original manuscript, the plus and minus referred to control and knockout, respectively. To clarify that, we have changed the expression to SC and KO in the revised manuscript.

      With regard to Oil-red O vs. quantification of adipogenic markers, we actually prefer the latter method, as it gives more accurate and less variable results than Oil-red O (at least in the cell line we use). We have, however, performed Oil-red O staining as well to address your question.

      3) MTIF3 is a translation initiation factor in mitochondria and is involved in the protein synthesis of mitochondrial DNA-encoding genes. The authors should check protein levels rather than the mRNA levels of mitochondrial DNA-encoding genes (Fig. 6E). It's interesting to see the increase of mRNA levels of ND1 and ND2, which might be feedback of lower translation. Since ND1 and ND2 are in OXPHOS complex I, the expression levels of complex I in MTIF3 KO cells would be worth checking. Additionally, the author should also check the mitochondria copy number.

      As suggested, we have detected several mitochondrially encoded proteins, each a subunit of a mitochondrial OXPHOS complex. As shown in figure 4A, ND2 (subunit of OXPHOS complex I) and COX II (subunit of OXPHOS complex IV) expression were significantly reduced, CYTB (subunit of OXPHOS complex III) expression tended to decrease, and ATP8 expression was not affected in the MTIF3 knockout adipocytes. We also detected the formation of the OXPHOS respiration complexes in extracted mitochondrial proteins and found that MTIF3 perturbation affects mitochondrial complex assembly. The detailed methods (lines: 479-490), results (lines: 145-169) and discussion (lines: 260-262, 274-277) were incorporated in the revised manuscript.

      We have also added the mitochondrial copy number data (Figure 3A), showing that MTIF3 knockout cells have lower mitochondrial content (methods: lines 491-500; results: 156-157).

      4) MTIF3 knockout adipocytes retain more triglycerides under glucose restriction is interesting. It may link to the previous result of lower fatty acid oxidation in MTIF3 knockout adipocytes. However, the authors then showed there is no difference in lipolysis. The author should discuss those results in the manuscript. The authors could also check lipolysis in glucose restriction conditions. It's also necessary to include the triglyceride levels of KO cell lines at full medium.

      We have now examined glycerol release under glucose restriction, and found no differences between control and MTIF3 knockouts (Figure 6-figure supplement 1). Interestingly, in 1 mM glucose, both genotypes released less glycerol than at 25 mM glucose, which has been observed before in the SGBS cell line [2]. According to your suggestion, we have added the total triglyceride content at the 25 mM glucose condition (Figure 6C), which also was not different between control and MTIF3 knockout cells. We speculate that the higher retention of triglycerides in the knockouts could be due to higher re-esterification of lipolytically released fatty acids, since, as we observed, fatty acid oxidation is impaired in the knockouts. In the revised manuscript, we added that to the discussion (lines: 289-293).

      References

      1. Rudler, D.L., et al., Fidelity of translation initiation is required for coordinated respiratory complex assembly. Sci Adv, 2019. 5(12): p. eaay2118.
      2. Renes, J., et al., Calorie restriction-induced changes in the secretome of human adipocytes, comparison with resveratrol-induced secretome effects. Biochim Biophys Acta, 2014. 1844(9): p. 1511-22.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We thank the reviewers for their comments. We are encouraged that the reviewers found our research an “important study that addresses the interplay between two major Rho-type small GTPases involved in cell division” and “of interest to those interested in the cell biology of mitotic exit”. We agree with the comments raised by the reviewers and have provided new data as per their recommendations. We have also made changes to the text and format of the paper. We feel that with these changes the manuscript is stronger and we thank the reviewers for their suggestions. Below we provide a detailed response to the reviewers’ comments.

      Reviewer #1

      *This manuscript focuses on the role of Cdc42 in Rho1 activation during fission yeast cytokinesis. The primary finding is that active Cdc42 and its downstream effector Pak1 prevents accumulation of active Rho1 and the synthesis of cell wall material, at early stages of cytokinesis and despite the local recruitment of the Rho1 GEF Rgf1. The data supporting these conclusions are reasonably sound. *

      *Additional experiments are presented to suggest that Cdc42 and Pak1 negatively regulate Rgf1, this conclusion is not as strongly supported (though it may be true) *

      *These study relies on a newly described probe for active Rho1. However this probe is not sufficiently well validated. *

      *Overall the manuscript was not assembled with sufficient care and rigor, these deficits could be readily corrected. *

      The major point of the paper is that Cdc42 and Pak1 negatively regulate Rho1 activation. However, during late cytokinesis, active Cdc42 and active Rho1 co-exist at the division site. Thus, Cdc42 activation induces a delay in Rho1 activation, but how this delay is overcome is not investigated or even discussed. Indeed, while the delay is shown the transience of this inhibition is not explicitly mentioned. At a minimum, the authors should highlight this point for the readers.

      We are encouraged by the fact that the reviewer found our data “reasonably sound”. We agree that this manuscript does not provide the molecular details of how Cdc42 inhibits Rho1 activation. Our genetic data suggests that this is likely mediated by multiple pathways possibly involving the regulation of the Rho regulators Rgf1, Rgf2 and Rga5. We are currently investigating the molecular details of this regulation and hope to report it in another manuscript.

      Our data shows that while Cdc42 inhibits Rho1, the SIN pathway is essential for Rho1 activation regardless of the presence of Cdc42. While Cdc42 is activated at the division site as the ring completes assembly, the SIN pathway is activated immediately prior to ring constriction similar to that of Rho1 activation. It is possible that once the SIN is activated at the division site, it overcomes Cdc42-mediated Rho1 inhibition. We have highlighted this in the discussion section of this manuscript and are currently investigating the molecular details of this regulation.

      *Specific points 1 - RBD probe This probe is central to this manuscript. However, there is insufficient validation of its target. Figure 1 shows the localization and its independence of Rho2. The authors should provide direct evidence that it recognizes Rho1 (for example using a repressible promotor or an anchor away approach).

      *

      We thank the reviewers for their comments on the RBD probe. We have now provided validation for the RBD-probe. We have used rho1 temperature-sensitive and switch-off mutants to show loss of RBD-probe localization in these mutants. These data are provided in the revised manuscript in Fig. 1 and Supplementary Fig. S1.

      At various places in the manuscript the authors refer to this probe as "Rho-probe", RBD-probe, RBD, RBD-(mNG or tdTomato). On page 11 the authors state, "As per our observations, we refer to the Rho-probe signal at the division site as active Rho1 from here onwards." Yet, in the very next paragraph they refer to the localization of the "Rho probe". * This is also an issue with the figures. For example, in figures 4B,C ; 5B,C; 6B; 7B,C the figures are titled either "Rho1 activation at division site", "Rho1-probe at division site"; "Rho1-probe appearance at division site" ; "Rho1-probe in non-constricting rings". *

      We agree that using multiple terms to describe the probe is confusing. We have restricted the terms to either “RBD-mNG” or “RBD-tdTomato” when reporting the data and use “Rho-probe” for descriptive purposes.

      In fig 3, RBD-nNG is quantified in a graph entitled "localizaton [sic] of Rho1-GEFs at division site"

      We thank our reviewers for identifying this error in our labeling of the graph in Fig. 3E. This figure now reads “Localization of Rgf1, Rgf3, active Rho1 at the division site”

      In all figures but two, 5c and 7c, the authors quantify Rho1 activation by the presence or absence of the probe, rather than a quantitative measure or the extent of recruitment of the probe. This could be analyzed my quantitatively.

      We appreciate this comment and provide this response in order to clarify our reasoning for presenting these data. We quantified the intensities of RBD-mNG or RBD-tdTomato wherever relevant to the question we are addressing in each experiment.

      Where we look at Rho1 activation at the division site with respect to SPB distances, we are reporting the differences in the timing of Rho1 activation with respect to mitotic progression. However, in Figures 5c and 7c, and now also Fig. 1 of the revised manuscript, we quantified the intensities of the probe, as this indicates the changes in overall active Rho1 levels under our experimental conditions. For the earlier experiments in which we do not report intensity measurements for the active Rho probe, we have added text stating that we did not observe any differences in intensity levels.

      *2 - Regulation of Rgf1 by Cdc42 and Pak1. The results shown in figure 8 show that "early Rho1 activation in gef1 mutants is not Rgf3-dependent". Figure 9 establishes "loss of rgf1 prevents premature Rho1 activation in gef1Δ cells restoring it to normal in late anaphase (Fig.9A, B)." This finding indicates that Rgf1, but not Rgf3, is required for Rho1 premature activation. This finding doesn't rule out the possibility that Cdc42 and Pak1 might be required to turn off RhoGAPs to allow active Rho1 to accumulate. This analysis concludes with this unclear and ungrammatical sentence, "While we were unable to assess the Rho-probe in the rgf1Δ rgf2Δ double mutants due to its lethality [sic; is the Rho probe lethal?], our observations suggest that apart from Rgf1 early Rho1 activation in gef1Δ cells is either due to activation of Rgf2 or due to inhibition of Rga5." *

      We thank you for your insight and agree with these remarks. We could not investigate Rho1 activation in rgf1Δ rgf2Δ double mutants since the double mutants are inviable. We have re-worded the sentence to reflect our findings appropriately.

      *The conclusion that this regulation is due to control of Rgf1 should be toned down. E.g. from the abstract: "We provide functional and genetic evidence which indicates that Pak1 regulates Rho1 activation likely via the regulation of its GEF Rgf1." *

      We have now removed this statement from the abstract. We have also clarified in the discussion that the molecular details of how Cdc42 inhibits Rho1 are not known and need to be investigated. While our data suggest that the regulators Rgf1 and Rga5 may be involved in the process, the details are unclear and we are currently investigating this regulation.

      *SECTION B - Significance ======================== This manuscript ties together several recent papers from the author's lab on the control of Cdc42 activation during cytokinesis and older papers on the role of Rho1 in Bgs1 activation. It provides missing information into the temporal regulation of septum assembly.

      The authors make a point of the similarities of fission yeast cytokinesis to animal cell cytokinesis. Indeed the second sentence reads, "The fission yeast model system divides via an actomyosin-based contractile ring, which is assembled in the medial region of the cell, as in animal cells (Balasubramanian et al., 2004; Pollard, 2010).". However, the authors fail to point out the many differences between yeast and animal cell cytokinesis until the last paragraph of the discussion. If the authors want to include the similarities in the introduction, they should also include the differences. For example, ring assembly is independent of Rho1 activation in fission yeast, but dependent on RhoA activation in animal cells. *

      We thank the reviewer for pointing out this deficiency in our writing. We have now amended the introduction to highlight the differences between Rho1 activity in fission yeast and animal cells during cytokinesis. We have added the following text to the Introduction section.

      “The animal Rho1 homolog RhoA is required for ring formation and is essential for cytokinesis (Basant and Glotzer, 2018). While in yeast, Rho1 is essential for septum formation, the current literature suggests that it is dispensable for ring formation (Onishi et al., 2013; Yoshida, 2009). In fission yeast where both the actomyosin ring and the septum have important roles in the proper coordination of cytokinesis, Rho1 has no reported roles in ring formation but is essential for septation (Balasubramanian et al., 2004).”

      *This work will be of interest to biologists working on yeast cell division. To a lesser extent it will be of interest to biologists interested in cytokinesis and coordination of distinct GTPase pathways.

      Additional points*

      1 - The text is overly wordy and needs extensive revision. Many of the experiments could be explained more clearly and with somewhat less genetic jargon. The introduction has quite a bit of extraneous information and lacks relevant facts, such as the function of Bgs1, which is central to the results.

      We have now modified the text to remove unnecessary genetic jargon. We have also provided additional text to describe the role of Bgs1 in the Introduction.

      2 - page 4 "GEFs promote GTP binding, thus keeping the GTPase active while the GAPs increase GTP hydrolysis, thus promoting GTPase inactivation." GEFs promote GTP binding, but they do not keep the GTPase active (an inhibitor of a RhoGAP would do that), they activate the GTPases.

      We thank the reviewers for highlighting this error. We have corrected this sentence, which now reads “GEFs promote GTP exchange to activate the GTPase, while the GAPs increase GTP hydrolysis to promote GTPase inactivation.”

      *3 - The current literature on animal cell cytokinesis indicates little direct role in cytokinesis, rather than the author's statement, "In larger eukaryotes, the role of Cdc42 activation has been reported mostly in meiotic division events such as polar body extrusion in oocytes, but not much is known about its role in cytokinesis in somatic cell division (Drechsel et al., 1997; Na and Zernicka-Goetz, 2006)." See for example, PMID 10898977, 10871280 which indicate Cdc42 does not play a major role during cytokinesis in at least a few systems where it has been analyzed. *

      We thank our reviewer for this observation and agree that this statement can be expanded to further explain the role of Cdc42 in animal cytokinesis. The paragraph has been re-written as follows-

      Pg5 - “In animal cells, the direct role of Cdc42 in cytokinesis remains indefinite. In Xenopus embryos and mouse fibroblasts for example, constitutively active Cdc42 impairs cytokinesis completion (Drechsel et al., 1997). However, in other cases such as in mouse embryonic stem cells, Cdc42 was only critical for development but not cytokinesis (Chen et al., 2000). RNA interference in animal cells demonstrates that while RhoA is required for cytokinesis, Cdc42 is not required for this process (Jantsch-Plunger et al., 2000). Cdc42 also promotes spindle positioning and polar body extrusion in mouse oocytes, but it is not known whether its localization at these spindles affects RhoA (Na and Zernicka-Goetz, 2006). Thus, the role of Cdc42 in the cytokinetic process may be cell-type specific, and these data highlight the importance of more investigation to elucidate Cdc42 regulation in dividing cells (Jordan and Canman, 2012).”

      Reviewer #2

      *In many fungal cells, including fission yeast, the deposition of a new cell wall (a septum) between daughter cells is essential for cytokinesis. Cell wall synthases are trafficked to and activated at the division site, and dysregulated trafficking and/or synthase activation can lead to cytokinetic defects. In this study, the authors use fluorescent probes for Cdc42 and Rho1 activity and live-cell imaging to investigate the timing and regulation of Rho1 activity in fission yeast, and specifically, the role of Cdc42 in regulating Rho1. Summary of the proposed model: Gef1 -> active Cdc42 -> Pak1 --| Rgf1 -> active Rho1 -> septum formation

      Major comments

      (1) As far as I can gather from the authors' description in the manuscript and quick literature search, this will be the first publication in S. pombe utilizing the HR1-C2 domain of Pkc2p as fluorescent probe for active Rho1 (RBD-mNG). While a comparable domain of S. cerevisiae Pkc1p (not "pck2" as referenced by the authors in Page 25) has been used for similar purposes, given the importance of this probe and the precedent it sets in the S. pombe literature, it is imperative that proper tests are performed to validate that its localization reflects activity of Rho1 and nothing else (such as membrane binding of the C2 domain or transcriptional regulation of the pkc2 promoter). Such tests should also be independent of the hypotheses central to the current study (i.e., effects of Gef1, Pak1, Rgf1/2 on the timing of RBD-mNG localization). Can the authors provide data to address this point? Examples include, but not limited to, rho1 mutants, expression of constitutively active Rho1, or temporary expression of dominant-negative Rho1.*

      We agree with the reviewer and now provide data to show loss of the localization of the Rho-probe RBD-mNG in rho1 mutants. Using temperature-sensitive and switch-off mutants we show that under mutant conditions the RBD-mNG localization is lost at the division site and also from the cell ends. This provides strong evidence that the probe detects active Rho1 in the cells.

      *(2) Related, M&M does not provide sufficient details about the amino-acid positions corresponding to the "RBD" domain of Pkc2, thus precluding readers from reproducing the experiments. This needs to be clarified. *

      We now provide in the materials and methods the details of how this probe was generated including the base pairs of the budding yeast PKC1 and the fission yeast pck2 promoter.

      (3) In Figure 1B, RBD1-mNG localizes clearly to the medial region of rho2∆ cells when the Rlc1-tdTomato ring has not formed. Does this mean that Rho2 has a major role in forming the contractile ring that is independent of Rho1 activation? On this other hand, however, data in Fig. S2BC suggest that RBD-mNG does not localize to the medial region in rho2∆ cells until Rlc1-tdTomato ring forms (the timing of which seems normal). This discrepancy needs to be addressed.

      In response to the issue raised here, we do not see active Rho1 at the division site of cells without rings. However, after cytokinesis, while cells are in septation, although the ring has disappeared, active Rho1 lingers at the division site. The cell shown in the panel is a septated cell after ring constriction completes. We have included DIC panels of these cells to show that active Rho1 lingers in septating cells.

      *(4) Given the nature of RBD-mNG localization, it seems unavoidable to have some level of arbitrariness in measuring the onset of its localization at the division site. It would be advisable for the authors to be specific in M&M about how they defined the onset of localization, i.e., whether it was based on universal threshold in signal intensity, ratio, etc. or on manual curation (ideally double-blind).

      *

      We have updated the methods to state that the “onset of localization” was determined by double-blind visual observation.

      Minor comments (1) Throughout the manuscript, there are quite a few places where inconsistencies in genetic nomenclature can cause confusion to readers. Below are some examples. Figs. 6B, 7B, 10B: pak1(-ts), shk1, and orb2-34 (including faint labels under category marks in 6B). Fig. 9B (gef1+ rgf1∆) vs 9C (rgf1∆). Wild-type alleles are implicit in some figures, while explicit in others.

      We have corrected these inconsistencies.

      *(2) The first hypothesis (Fig. 1C) is that the AMR might regulate Rho1 activation. The ring is disrupted with LatA, but Rho remains active. They cite this as evidence that the AMR does not activate Rho1, but were the cells treated before or after the rings formed? If before, then the experiment demonstrates what the authors claim, but if after, it only shows that the AMR is not essential to maintain Rho activity. *

      We agree with the reviewer that this is an important distinction. We have modified this statement to “These results indicate that while at the division site the actin cytoskeleton is not required for maintaining Rho1 activation, it is necessary at the growth sites of interphase cells.”

      *(3) Page 8: "Time-lapse imaging of cells simultaneously expressing CRIB-3xGFP and RBD-tdTomato [...] while Rho1 is activated ~20 minutes after SPB duplication (Fig. 2B)." This appears to refer to Fig. 2C. *

      We thank the reviewers for catching this error in the text. We have now corrected it, showing the time-lapse as Fig. 2C and an image of cells simultaneously expressing CRIB and RBD as Fig. 2B.

      *(4) Page 9: "[...] Rgf1 and Rgf3 localize as early as the time of ring assembly at an average SPB distance of 4-5 µm (Fig. 3D)." This sentence is confusing. How was the average calculated over the earliest ring assembly in non-time-lapse data? Fig. 3DE show distances between SPBs as short as 2.5 µm, not 4-5 µm, and average of ~8 µm for all cells at different stages of mitosis. This confusion needs to be clarified. *

      We thank the reviewer for observing this mistake in our writing and interpretation. We agree that the text does not reflect the accurate interpretations of the data collected and have now fixed these errors. The current sentence reading “In an asynchronous population of cells, we find that Rgf1 and Rgf3 localize as early as the time of ring assembly at an average SPB distance of 4-5µm.” has now been replaced with the description shown below-

      “Using the distance between SPBs of anaphase cells as a proxy for timing of cytokinesis, we find that in most anaphase cells, Rgf1-GFP and Rgf3-eGFP were localized at the division site at very early stages in anaphase (Fig. 3D, E). This can be observed by the short distance between the SPBs of ~2µm (Fig. 3D). We also measured the distance at which active Rho1 appeared at the division site, and find that at a distance between SPBs of ~10µm, active Rho1 was present at the division site in ~50% of the population of control cells (Fig. 3E).”
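
      The kind of summary described in the rewritten passage (the fraction of cells showing the probe at the division site, as a function of SPB distance) can be sketched as follows. This is only an illustration and not the authors' analysis pipeline: the function name, the 2 µm bin width, and the toy values are all hypothetical and are not data from the manuscript.

```python
from collections import defaultdict


def fraction_with_probe(cells, bin_width=2.0):
    """Return {bin_start_um: fraction of cells with probe at the division site}.

    cells: iterable of (spb_distance_um, probe_present) pairs, one per cell.
    bin_width: width of each SPB-distance bin in micrometres (assumed value).
    """
    counts = defaultdict(lambda: [0, 0])  # bin start -> [cells with probe, total cells]
    for distance, present in cells:
        bin_start = int(distance // bin_width) * bin_width
        counts[bin_start][0] += int(present)
        counts[bin_start][1] += 1
    return {b: hits / total for b, (hits, total) in sorted(counts.items())}


# Hypothetical toy values for illustration only (not measurements from the study).
toy_cells = [
    (2.1, False), (3.0, False),   # early anaphase, short SPB distance
    (8.4, False), (9.5, True),    # intermediate SPB distance
    (10.2, True), (11.0, True),   # late anaphase, long SPB distance
]

print(fraction_with_probe(toy_cells))
```

      Binning by SPB distance makes explicit that a statement such as "present in ~50% of cells at ~10µm" is a population fraction within a distance window, not a per-cell intensity measurement.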

      *(5) Fig. 5. Both the intensity and onset of RBD-mNG localization were affected by cdc42g12v expression. These two may form a causative relationship: reduced overall RBD signal may cause failed detection of early RBD localization. Can the authors compare cells with similar mean RBD-mNG signal intensities (Fig. 5B) and confirm that the timing of appearance at the division site is still delayed in gef1+ cdc42g12v relative to gef1+ empty? *

      We thank the reviewers for pointing this out and appreciate the opportunity to further clarify our observations. While there is a clear decrease in Rho-probe intensity at the division site of cells expressing cdc42G12V, we did see some variation in the extent of the decrease, likely due to variation in the expression levels of cdc42G12V. To provide a more accurate analysis of our observation we have shown the changes in the timing and intensities of Rho-probe localization. However, due to the noisy nature of the data we cannot compare the intensities in individual cells at specific spindle pole body distances between cells. As observed, cdc42G12V significantly reduces Rho1 activity globally, not just at the division site. To cherry-pick cdc42G12V cells with similar active Rho1 intensity to assess the time of Rho1 activation may lead to subconscious data manipulation and will not address how early Rho1 activation is regulated.

      *Reviewer #3

      Onwubiko et al., present a clear and well written manuscript detailing the mechanistic understanding of how Rho1 is activated in a timely manner to ensure cytokinesis occurs in a scheduled manner at the end of telophase. Using fission yeast as a model system, and with the development of a novel Rho1 biosensor, they implicate a series of GTPases, exchange factors, GTPase activating proteins and kinases acting downstream of Cdc42 in the timely activation of Rho1. Specifically, they find that Cdc42 prevents premature Rho1 activation in early anaphase in a manner requiring the kinase Pak1. They observe that the Rho1 activators Rgf1 and Rgf3 localise to the division site in early anaphase, but Rho1 doesn't get activated until late anaphase, suggesting that control mechanisms ensure that these GEFs are held inactive, or that RhoGAP activity is able to balance this activation in early anaphase. This suppression of Rho1 activity in early anaphase requires Cdc42 and Pak1 and implicate (by omission) Rgf1, rather than Rgf3, is the relevant GEF.

      I liked this manuscript, it was clearly written the experimental progression was logical and the data were easy to interpret from the figures. The conclusions were precise, believable and not overstated. The manuscript provides novel observations and through good use of a series of rescues/mutants, illuminates a pathway that is held in check by Cdc42 to ensure timely Rho1 activation. The novel Rho1 probe is exciting and shows well differently regulated pools of active Rho1 at the division site and the growing tips. I thought the co-imaging/measurement of ring placement and SBP duplication allowed a really clear understanding of the kinetics during this rapid phase of the cell cycle. A critique of the study is that the the mechanism by which Cdc42 controls Pak1, and by which Pak1 controls Rgf1/Rgf2 is left unclear. I guess there could always be a molecular expansion of these points (e.g., how does Cdc42 control Pak1; how does Pak1 control Rgf1; how is Rgf activity restricted when localised), but I think that would only enhance, rather than change, the level of detail of the paper's message. I think the paper's current conclusions stand on their own, the data is clear and believable, the experiments are well performed. There are a number of observations in the paper that are left open for future studies, and I think this is a positive (e.g., any separable role of Rgf1/Rgf2 and how Rga5 integrates into this pathway. As such, I am tempted to recommend accept with only minor amendments as outlined below.

      1. P8 P15: should the call-out be to Fig 2C, rather than 2B?

      We thank the reviewer for highlighting this error in our text. We have now fixed it.

      P14 L17: should it be 'gef1+ rgf3-', not 'gef1+, rgf3+'?

      We have fixed this error and further clarified the terms for easy understanding.

      Structure-wise, I thought the section on Rga5 didn't really fit well on P16; it seemed sandwiched between two sections on GEFs. Is there a more appropriate place to put these data, perhaps between the paragraph breaks on P17? Related to these data, the conclusion on P16 suggests 'other' regulators of RhoGAP activity act to repress Rho1 function. Would 'additional' regulators of RhoGAP activity be more appropriate, as there is some function contributed by Rga5?

      We have now moved this section to the end of the section on Rho1 regulators after we discuss the Rho1 GEFs. We have also modified the text to clarify that multiple regulators are likely involved in the regulation of Cdc42-mediated Rho1 inhibition.

      In Fig 10b, you haven't defined orb2-34. Is it the rgf1-delete?

      The mutant orb2-34 is a temperature sensitive allele of the pak1 kinase. To avoid confusion, we have replaced the allele name with pak1-ts in figure 10 and in the text.

      • I find the sentence at the top of P18: 'Rho1 activation in pak1+ rgf1+....at 25ºC and 35.5ºC occurred at longer and similar SPB distances' quite hard to interpret. Could you perhaps expand it to make your message clearer?

      We thank the reviewer for pointing this out. These statements have now been rewritten for clarity. The sentence beginning 'Rho1 activation in pak1+ rgf1+....at 25ºC and 35.5ºC' has been changed, and now reads as follows:

      “The timing of RBD-mNG localization at the division site occurs late in cytokinesis during late anaphase as depicted by longer SPB distances in pak1+ rgf1+, pak1-ts rgf1+, and pak1+ rgf1Δ cells at 25ºC (Fig.10B). As previously shown, RBD-mNG localizes to the division site in early anaphase in pak1-ts rgf1+ cells at the restrictive temperature (35.5ºC, Fig. 7A, B). In agreement with our reasoning, early RBD-mNG localization in pak1-ts mutants at 35.5ºC was rescued in the absence of rgf1 (Fig. 10A, B).”

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      This manuscript focuses on the role of Cdc42 in Rho1 activation during fission yeast cytokinesis. The primary finding is that active Cdc42 and its downstream effector Pak1 prevent accumulation of active Rho1 and the synthesis of cell wall material at early stages of cytokinesis, despite the local recruitment of the Rho1 GEF Rgf1. The data supporting these conclusions are reasonably sound.

      Additional experiments are presented to suggest that Cdc42 and Pak1 negatively regulate Rgf1; this conclusion is not as strongly supported (though it may be true).

      This study relies on a newly described probe for active Rho1. However, this probe is not sufficiently well validated.

      Overall, the manuscript was not assembled with sufficient care and rigor; these deficits could be readily corrected.

      The major point of the paper is that Cdc42 and Pak1 negatively regulate Rho1 activation. However, during late cytokinesis, active Cdc42 and active Rho1 co-exist at the division site. Thus, Cdc42 activation induces a delay in Rho1 activation, but how this delay is overcome is not investigated or even discussed. Indeed, while the delay is shown, the transience of this inhibition is not explicitly mentioned. At a minimum, the authors should highlight this point for the readers.

      Specific points

        • RBD probe. This probe is central to this manuscript. However, there is insufficient validation of its target. Figure 1 shows the localization and its independence of Rho2. The authors should provide direct evidence that it recognizes Rho1 (for example, using a repressible promoter or an anchor-away approach).

      At various places in the manuscript the authors refer to this probe as "Rho-probe", "RBD-probe", "RBD", or "RBD-(mNG or tdTomato)". On page 11 the authors state, "As per our observations, we refer to the Rho-probe signal at the division site as active Rho1 from here onwards." Yet, in the very next paragraph they refer to the localization of the "Rho probe". This is also an issue with the figures. For example, in figures 4B,C; 5B,C; 6B; 7B,C the panels are titled variously "Rho1 activation at division site", "Rho1-probe at division site", "Rho1-probe appearance at division site", and "Rho1-probe in non-constricting rings". In fig 3, RBD-mNG is quantified in a graph entitled "localizaton [sic] of Rho1-GEFs at division site".

      In all figures but two, 5c and 7c, the authors quantify Rho1 activation by the presence or absence of the probe, rather than by a quantitative measure of the extent of recruitment of the probe. This could be analyzed more quantitatively.

        • Regulation of Rgf1 by Cdc42 and Pak1. The results shown in figure 8 show that "early Rho1 activation in gef1 mutants is not Rgf3-dependent". Figure 9 establishes that "loss of rgf1 prevents premature Rho1 activation in gef1Δ cells restoring it to normal in late anaphase (Fig.9A, B)." This finding indicates that Rgf1, but not Rgf3, is required for premature Rho1 activation. This finding doesn't rule out the possibility that Cdc42 and Pak1 might be required to turn off RhoGAPs to allow active Rho1 to accumulate. This analysis concludes with this unclear and ungrammatical sentence, "While we were unable to assess the Rho-probe in the rgf1Δ rgf2Δ double mutants due to its lethality [sic; is the Rho probe lethal?], our observations suggest that apart from Rgf1 early Rho1 activation in gef1Δ cells is either due to activation of Rgf2 or due to inhibition of Rga5." The conclusion that this regulation is due to control of Rgf1 should be toned down. E.g. from the abstract: "We provide functional and genetic evidence which indicates that Pak1 regulates Rho1 activation likely via the regulation of its GEF Rgf1."

      Referees cross-commenting

      I think reviews are appropriate and speak for themselves.

      Significance

      This manuscript ties together several recent papers from the authors' lab on the control of Cdc42 activation during cytokinesis and older papers on the role of Rho1 in Bgs1 activation. It provides missing information on the temporal regulation of septum assembly.

      The authors make a point of the similarities of fission yeast cytokinesis to animal cell cytokinesis. Indeed the second sentence reads, "The fission yeast model system divides via an actomyosin-based contractile ring, which is assembled in the medial region of the cell, as in animal cells (Balasubramanian et al., 2004; Pollard, 2010).". However, the authors fail to point out the many differences between yeast and animal cell cytokinesis until the last paragraph of the discussion. If the authors want to include the similarities in the introduction, they should also include the differences. For example, ring assembly is independent of Rho1 activation in fission yeast, but dependent on RhoA activation in animal cells.

      This work will be of interest to biologists working on yeast cell division. To a lesser extent it will be of interest to biologists interested in cytokinesis and coordination of distinct GTPase pathways.

      Additional points

        • The text is overly wordy and needs extensive revision. Many of the experiments could be explained more clearly and with somewhat less genetic jargon. The introduction has quite a bit of extraneous information and lacks relevant facts, such as the function of Bgs1, which is central to the results.
        • page 4 "GEFs promote GTP binding, thus keeping the GTPase active while the GAPs increase GTP hydrolysis, thus promoting GTPase inactivation." GEFs promote GTP binding, but they do not keep the GTPase active (an inhibitor of a RhoGAP would do that); they activate the GTPases.
        • The current literature on animal cell cytokinesis indicates that Cdc42 has little direct role in cytokinesis, in contrast to the authors' statement, "In larger eukaryotes, the role of Cdc42 activation has been reported mostly in meiotic division events such as polar body extrusion in oocytes, but not much is known about its role in cytokinesis in somatic cell division (Drechsel et al., 1997; Na and Zernicka-Goetz, 2006)." See, for example, PMID 10898977 and 10871280, which indicate that Cdc42 does not play a major role during cytokinesis in at least a few systems where it has been analyzed.
    1. Background

      This work has been peer reviewed in GigaScience ( see https://doi.org/10.1093/gigascience/giac097 ), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer name: Giulia De Riso

      In this study, a workflow is presented to generate classification models from DNA methylation data. Methods to deal with harmonization and missing data imputation are presented and the benefit of adopting them for classification tasks is tested on case-control datasets of schizophrenia and Parkinson disease. The authors support this workflow with source code. Although mostly based on already known methodologies, the present study may help orient studies aimed at building and applying DNA methylation based models. However, some major concerns can be raised:

      Majors: At several points in the manuscript, the authors refer to their approach as a pipeline. A pipeline should be composed of sequential modules, in which the output of one module becomes the input of the next. Although the modules are clearly distinguishable, their organization in the pipeline is less straightforward (also considering that modules can be adopted both to build a model and to use it on new data). The authors could draw a scheme of the pipeline, or adopt a different term to refer to the presented approach. From the model performance perspective, the ML models perform poorly for schizophrenia. The authors point to intrinsic characteristics of the disease as a possible reason for this. However, this point should be discussed further in the Discussion section.

      Besides this, the impact of the smaller number of samples included in the training set and the higher proportion of imputed features compared to Parkinson disease on the classification accuracy should be discussed. In addition, since the authors provided the code, is there a way to select samples to include in training/test sets based on random choice (classical 70-30% splitting) instead of source dataset? "For machine learning models, we used only those CpG sites that have the same distribution of methylation levels in different datasets in the control group (methylation levels in the case group typically have greater variability because of disease heterogeneity).": is this filtering performed only on the datasets included in the training set, or also on the test set? It seems the former, but the authors should clearly state this point. Accuracy with weighted averaging should be defined with a formula in the methods section. Regarding the ML models, the authors chose different types of decision-tree ensembles, along with a deep learning one. They should contextualize this choice (why different models from the same family?).
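      The random 70-30% splitting mentioned above could look roughly like the following minimal sketch. It is not taken from the authors' code: the function name, the `labels` list and the seed are hypothetical, and the split is stratified by class so that cases and controls keep the same proportions in both sets.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), sampling test_fraction of each class at random."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical toy cohort: 10 cases and 10 controls pooled across source datasets.
labels = ["case"] * 10 + ["control"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

      Splitting on pooled samples in this way ignores the source dataset, so it answers a different question from a leave-dataset-out design: it measures performance within the pooled distribution rather than generalization to an unseen dataset.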

      In addition, ML models built on DNA methylation are often based on elastic net or Support-Vector Machines, which are not accounted for in this work. The authors should comment on this aspect in limitations, and state whether the code they provided for their approach could be customized to adopt different models from the ones they presented.

      Regarding the Imputation Method column in Table 2, the meaning is not clear. Are the different imputation methods described in the Imputation of missing values section paired with the ML models presented in Table 2? If yes, some of the methods (like KNN) are missing. In the harmonization section, models for case-control classification are trained on different numbers and sets of CpGs. To assess the effect of harmonization alone, the number of CpGs should instead be fixed. This is especially critical for schizophrenia, where the number of features for the non-harmonized data is 35,145 whereas the one for harmonized data is 110,137. Dimensionality reduction section: are the models from imputed and non-imputed data trained only on harmonized data? And how is the set of 50,911 CpG sites for Parkinson and 110,137 CpG sites for schizophrenia selected?

      Imputation of missing values section: it is not clear on which CpGs and on which samples imputation is performed. Also, it is not clear whether the imputation has been tested on the best-performing model.

      Minors: Page 1, line 2: "DNA methylation is associated with epigenetic modification". DNA methylation is an epigenetic mark itself. Do the authors mean histone marks?

      Page 1, from line 7: "DNA methylation consists of binding a methyl group to cytosine in the cytosine-guanine dinucleotides (CpG sites). Hypermethylation of CpG sites near the gene promoter is known to repress transcription, while hypermethylation in the gene body appears to have an opposite, also less pronounced effect.": references should be added

      Page 2, from line 2 : "Current epigenome-wide association studies (EWAS) test DNAm associations with human phenotypes, health conditions and diseases.": references should be added

      Page 3: "In most cases, an increase in dimensionality does not provide significant benefits, since lower dimensionality data may contain more relevant information". This point could be presented in a reverse way (higher dimensionality data may contain redundant information), introducing the collinearity issue. In addition, this issue could be introduced before the missing values and imputation section.

      Page 3: references for "Modern machine-learning-based artificial intelligence systems are powerful and promising tools" could be more specific for the field of epigenetics and DNA methylation.

    1. Abstract

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giac094 ), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows: Reviewer name: George Taiaroa

      The authors provide a potentially useful dataset relating to transcripts from cultured SARS-CoV-2 material in a commonly used cell line (Vero). Relevant sequence data are publicly available and descriptions on the preparation of these data are for the most part detailed and adequate, although this is lacking at times.

      Although the authors state that this dataset overcomes the limitations of available transcriptomic datasets, I do not believe this to be an accurate statement; based on comparable published work in this cell line, transcriptional activity is expected to peak at approximately one day post infection (Chang et al. 2021, Transcriptional and epi-transcriptional dynamics of SARS-CoV-2 during cellular infection), with the 96 hour period of infection described likely representing overlapping cellular infections of different stages.

      Secondly, many in the field have moved to use more appropriate cell lines in place of the Vero African Monkey kidney cell line, to better reflect changes in transcription during the course of infection in human and/or lung epithelial cells (See Finkel et al. 2020, The coding capacity of SARS-CoV-2). Lastly, the study would ideally be performed with a publicly available SARS-CoV-2 strain, as has been the case for earlier studies of this nature to allow for reproducibility and extension of the work presented by others.

      That said, the data are publicly available and could be of use.

      Primary comments

      I think that a statement detailing the ethics approval for this work would be essential, given materials used were collected posthumously from a patient. Similarly, were these studies performed under appropriate containment, given classifications of SARS-CoV-2 at the time of the study? I do not know what the authors mean in reference to a 'mixed time point sample' for the one direct RNA sample in this study; could this please be clarified?

      Secondary comments

      I believe the authors may over-simplify discontinuous extension of minus strands in saying that 'The gRNA and the sgRNAs have common 3'-termini since the RdRP synthesizes the positive sense RNAs from this end of the genome'. Each of the 5' and 3' sequences of gRNAs/sgRNAs is shared through this process of replication.

      'Infections are typically carried out using fresh, rapidly growing cells, and fresh cultures are also used as mock-infected cells.However, gene expression profiles may undergo alterations non-infected cells during the propagation therefore, we cannot decide whether the transcriptional changes in infected are due to the effect of the virus or to the time factor of culturing. This phenomenon is practically never tested in the experiments.' I do not follow what these sentences are referring to.

      'Altogether, we generated almost 64 million long-reads, from which more than 1.8 million reads mapped to the SARS-CoV-2 and almost 48 million to the host reference genome, respectively (Table 1). The obtained read count resulted in a very high coverage across the viral genome (Figure 1). Detailed data on the read counts, quality of reads including read lengths (Figure 2), insertions, deletions, as well as mismatches are summarized Supplementary Tables.' Could this perhaps be more appropriately placed in the data analysis section, rather than background?

    1. Author Response

      Reviewer #1 (Public Review):

      The authors conducted a thorough analysis of the correlation between height and measures of cognitive abilities (what are essentially IQ test components) across four cohorts of children and adolescents in the UK measured between 1957 and 2018. The authors find the strength of the association between height and cognitive measures declined over this time frame--for example, among 10- and 11-year-olds born in 1958, height explained roughly 3% of the variation in verbal reasoning scores; this dropped to approximately 0.6% among those born in 2001. These associations were further attenuated after accounting for proxy measures of social class.
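      To put the variance-explained figures above on the correlation scale used elsewhere in these reviews, here is a minimal sketch. It assumes the quoted percentages are R² values from a linear model with height as the only predictor, in which case the absolute correlation is the square root of R²; the function name is illustrative and does not come from the paper.

```python
import math

def correlation_from_r_squared(r_squared: float) -> float:
    """Absolute correlation implied by R^2 in a single-predictor linear model."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must lie in [0, 1]")
    return math.sqrt(r_squared)

# Figures quoted in the review: roughly 3% (born 1958) and 0.6% (born 2001).
for cohort, r_squared in [("1958", 0.03), ("2001", 0.006)]:
    print(cohort, round(correlation_from_r_squared(r_squared), 3))
```

      Under that assumption the implied correlations are about 0.17 and 0.08, so a five-fold drop in variance explained corresponds to a little more than a halving of the correlation.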

      The authors' analyses were performed carefully and their observations regarding declining height / cognitive measure associations are likely to be robust if we interpret their results with an important caveat: these results reflect measurements aimed at assessing cognition rather than cognition itself. The importance of this distinction is evidenced by the changing correlation structure of the cognitive measures over time. For example, age 11 verbal / math scores were correlated at >= 0.75 at the first two time points but dropped to 0.33 at the most recent time point. Similar patterns are present for the other cognitive measures and time points. The authors conclude that such changes are unlikely to impact their primary findings, but I'm less certain. For example, one interpretation of this finding is that older cognitive measures were simply worse at indexing distinct cognitive domains and instead reflected a combination of cognitive ability together with non-specific factors relating to opportunity, health, class, etc. Further, height was historically a stronger proxy for class and economic status than it is today (e.g., by capturing adequate nutritional intake, risk for childhood disease, etc.). Together, then, previously high height / cognitive measure correlations might reflect the fact that both phenotypes previously indexed socio-economic factors to a greater extent than they might today (which is still non-negligible).

      We agree, it is possible that our results could in principle be explained by changes to the measures. We have provided further analysis to attempt to inform the likelihood of this suggestion and have expanded our discussion of this issue (Discussion, explanation of findings section; copied below).

      First, we conducted additional sensitivity analyses repeating our main analysis using cognition measures in which the number of response options was set to be the same for each test (the lowest common denominator across all cohorts). This was tested using two separate approaches: 1) by reducing the number of categories to the same number in each cohort; and 2) by picking a random sample of question items for each category. Our main findings were unchanged, as described in the “Additional and sensitivity analyses” section, Figs S20-S21.

      Regarding the suggestion that “high height / cognitive measure correlations might reflect the fact that both phenotypes previously indexed socio-economic factors to a greater extent than they might today” – we sought to account for this by adjustment for measured indicators of socioeconomic position, and found the trend remained after adjustment (Fig 1 panel 2). However, as in other observational studies, we cannot fully rule out the possibility of residual confounding (Discussion, Explanation of findings paragraph 2).

      “The multi-purpose and multidisciplinary cohorts used cognition tests which differed slightly in each cohort. It is therefore possible that differences in testing could have either: 1) entirely generated the pattern of results we observed, such that if identical tests were used the association between cognition and height would otherwise have been identical in each cohort, in contrast to previous findings which reported using identical tests20; or 2) biased our results, such that if identical tests were used the decline in association between cognition and height would have been less marked than we reported. While we cannot directly falsify this alternative hypothesis given our reliance on historical data sources, a number of lines of reasoning suggest that the first scenario is unlikely. First, our results were similar when using 4 different cognitive tests (spanning mathematical and verbal reasoning); any bias which generated the results we observed should be similarly present across all 4 tests. Other things being equal, one would expect that more discriminatory tests (i.e., those with a greater number of responses) would have higher accuracy and thus better index cognition. Our results were similar when the youngest cohort had similar numbers of unique scores in cognitive tests compared with the oldest cohort (Verbal @ 11 years: n=41 in 1946c, n=40 in 2001c) and fewer unique scores (Maths @ 7/11: n=51 in 1946c, n=21 in 2001c). Our results were also similar in sensitivity analyses in which the number of response options was set to be the same in each cohort. Higher random measurement error in the independent variable (cognition) would lead to weakened observed associations with the outcome (height),52 yet we do not a priori anticipate that such error was higher in younger cohorts across all tests in such a manner that would have led to the correlation we observed. Ensuring comparability of exposure is a major challenge across such large timespans. Reassuringly, our results are consistent with those from a previous study which reported consistent tests being used (from 1939-1967).20 However, even seemingly identical tests require modification across time (e.g., for verbal reasoning/vocabulary there is typically a need to adapt question items due to societal and cultural changes over time in vocabulary and numerical use); further, changes to education such as increases in testing may have led to greater preparedness and familiarity with testing than in the past, even where identical tests are used.

      Interestingly, we observed a marked reduction in the correlation between cognitive tests across time (e.g., between verbal and maths scores). This trend has been reported in previous studies53 54 and warrants future investigation; it is consistent with evidence that IQ gains across time seemingly differ by cognitive domain,45 potentially capturing differences across time in cognitive skill use and development in the population. Previous studies using three (1958-2001c) of the included cohorts have also reported changing associations between cognition (verbal test scores at 10/11 years) and other traits: a declining negative association with birth weight19 and a change in direction of association with maternal age (from negative to positive);55 each finding has plausible explanations based on changes across time in relevant societal phenomena (improved medical conditions19 and changes in parental characteristics,55 respectively), yet also cannot conclusively falsify the notion that differences in the tests used influence the results obtained. In this paper, we used multiple tests and sensitivity analyses to attempt to address this.”
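      The measurement-error argument in the quoted passage can be illustrated with the classical attenuation relation, in which the observed correlation equals the true correlation multiplied by the square root of the product of the two measures' reliabilities. The sketch below assumes that textbook model; the function name, the true correlation of 0.2 and the reliability values are hypothetical and are not estimates from the cohorts.

```python
import math

def attenuated_correlation(true_r: float, reliability_x: float,
                           reliability_y: float = 1.0) -> float:
    """Observed correlation under classical (random) measurement error."""
    for reliability in (reliability_x, reliability_y):
        if not 0.0 < reliability <= 1.0:
            raise ValueError("reliabilities must lie in (0, 1]")
    return true_r * math.sqrt(reliability_x * reliability_y)

# Hypothetical: the same true correlation observed with increasingly noisy tests.
for reliability in (1.0, 0.8, 0.6):
    print(reliability, round(attenuated_correlation(0.2, reliability), 3))
```

      Under this model, a less reliable cognitive test on its own lowers the observed height correlation, which is why the comparability of test reliability across cohorts matters for interpreting the trend.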

      Additionally, their findings add an interesting data point to a collection of recent results suggesting that the relationship between cognitive and anthropometric measures is complex and difficult to interpret. For example, studies using genetic markers to examine shared genetic bases have virtually all relied on methods assuming mating is random, which is not the case empirically. Howe et al. (doi.org/10.1038/s41588-022-01062-7) recently reported that the ostensible genetic correlation of -.32 between years of education and BMI attenuates to -.05 when using direct-effect estimates, which should theoretically be immune to the effects of non-random mating and other confounding variables. Likewise, Keller et al. (doi.org/10.1371/journal.pgen.1003451) and Border et al. (doi.org/10.1101/2022.03.21.485215) used very different approaches to arrive at the same conclusion that ~50% of the nominal genetic correlation between IQ and height could be attributed to bivariate assortative mating rather than shared causal biological factors. Given that assortative mating on both IQ measures and height involves many other traits (not just two as assumed in such bivariate models), the true extent to which height / IQ correlations reflect causal factors is plausibly even lower than these estimates suggest. For these reasons, I do not entirely agree with the authors' review of previous findings in the introduction, where they write "recent studies have suggested that links between higher cognition and taller height can be largely explained by genetic factors", though it is certainly true that this claim has been made.

      We have revised our introduction to better reflect the complexity of previous findings and to note that this claim.

      Reviewer #2 (Public Review):

      The authors use birth cohorts with extensive cognitive assessments and height measurements along with data on parental height and socioeconomic status. The authors estimate that the correlation between height and cognitive ability has approximately halved in the last 60 years.

      Quantile regression results suggest that this is due to a stronger association between low cognitive ability and short stature in older cohorts, potentially due to environmental factors that cause both and that have been removed by improvements in the environment in the last 60 years.

      While this is a plausible hypothesis, the evidence presented in the manuscript is unable to rule out alternative hypotheses, such as changes in assortative mating.

      The results in the manuscript will be of interest to researchers investigating how genetics and environment lead to correlations between cognitive and physical/health traits, and to researchers interested in the relationship between social and health inequalities.

      While my sense of the evidence presented is that there is fairly solid statistical evidence for a trend where the correlation between cognitive ability and height declines over time, there is no formal quantification of this trend nor measurement of the uncertainty in the trend.

      We now include additional statistical tests to compare estimates in each cohort (Fig S6). We have opted to include this in the supplemental material given the large number of tests already included.
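      One standard way to compare correlations from two independent cohorts is Fisher's z-transformation. The sketch below illustrates that general technique and is not necessarily the test the authors report in Fig S6; the function names, correlations and sample sizes are hypothetical.

```python
import math

def fisher_z_difference(r1: float, n1: int, r2: float, n2: int) -> float:
    """z statistic for H0: two independent Pearson correlations are equal."""
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each sample needs more than 3 observations")
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher transformation
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error

def two_sided_p(z: float) -> float:
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical inputs: r = 0.17 in an older cohort, r = 0.08 in a recent one.
z = fisher_z_difference(0.17, 10_000, 0.08, 10_000)
print(round(z, 2), two_sided_p(z))
```

      A per-cohort confidence interval for each correlation can be obtained from the same transformation, which would also address the request for a measure of uncertainty in the trend.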

      Similarly, the quantile regression plots in Figure 2 appear to show a trend across the height deciles for the two oldest cohorts, but no quantification of how strong this is nor what uncertainty exists is calculated. Furthermore, if the apparent trend in the quantile regression plots is true, wouldn't this imply a non-linear association between height and cognitive ability for the older cohorts? Can this be seen in the scatterplots or in a non-linear regression?

      We included 95% confidence intervals in our quantile regression analyses, which provide an indication of uncertainty. We believe that, given the substantial number of analyses (across 4 historical cohorts and 4 cognition tests; 23 supplemental results), further work would be best placed to undertake additional statistical exploration of both quantile regression and non-linear associations. We would be happy to reconsider this if requested.

      I think the authors could have done more with their data to investigate the contribution of assortative mating to the observed trend. Looking at Figure S4, it looks like the correlation between mother's education and father's height in the 2001 cohort is substantially lower than for previous cohorts. While cognitive ability may not be available for parents, one could look at, for example, father's education and mother's height across the cohorts and see if there is a downward trend in correlation.

      We now include in Figure S5 a cross-cohort investigation of the correlation between parental height and maternal education. We find that the correlation is similar across 1946c, 1958c, and 1970c, yet is weaker in 2001c (Fig S5). We comment on this in the paper (see revised discussion, explanation of findings section). Interpretation of these results is complicated by measurement error in parental education (typically reported for both parents by mothers). Further, interpretation may be further complicated by reductions in the socioeconomic patterning of height across time (see https://www.thelancet.com/journals/lanpub/article/PIIS2468-2667(18)30045-8/fulltext). Future work which focuses on assortative mating could investigate these issues.

      Reviewer #3 (Public Review):

      A difficulty with the paper is the different cognitive tests used in the different cohorts; the authors address this at some length in the discussion. However, I am afraid that this matter makes the results hard or impossible to interpret along the lines of their research question. One would need to know that, if these cognitive tests were administered in a single cohort at one time, they would have the same correlation with height.

      Please see our responses to Reviewer 1 and our revised Discussion. We are reliant upon imperfect historical data to make inferences on long-run trends, in the absence of ideal data for this paper (eg, the same tests used in all cohorts born in 1946, 1958, 1970 and millennium; though even in this instance some changes would be required (eg, to the words chosen in verbal reasoning tasks; see Discussion, explanation of findings section)).

      I judge that the main limitation of the method is the fact that different cognitive tests are used in the different cohorts. The tests in themselves are valid tests of cognitive functions. However, given that the focus of the study is on the change in correlations across time, then it is a worry that the tests are different; that is, the authors have the burden of proving to us that, if the environmental/social changes had NOT been operative across time, then the height-cognitive test correlations would be the same. What can the authors do to prove to us that if, say, all of these different-cohort verbal tests had been given to a single cohort on a single occasion, then they would have the same correlations with height? The same goes for the mathematics based tests. I note the tests' somewhat different distributions in Figure 1, but that is not the only thing that could lead to different correlations with, say, height. I am aware that all cognitive tests tend to correlate positively and that they all have loadings on general intelligence; however, different tests will not necessarily have the same correlations with outside variables (e.g. height). This will depend on things such as their content, their reliability/internal consistency etc.

      In the Results the authors state: "Cognitive test scores were strongly-moderately positively correlated with each other, with the size of the correlation weakening across time." That's true, but perhaps also a major concern for this study. One possible reason for the decline in verbal-maths test correlations across cohorts (old to recent) is that the nature of these tests has changed across time, either/both in terms of content (what capabilities are assessed) or something such as reliability/internal consistency/ceiling-or-floor effects (how well the capabilities are assessed). That is, given that the height-cognitive test correlations show a similarly declining pattern of correlations over cohorts, it could be that the differing content of the tests is partly or wholly responsible. I raise that as a possibility only, and I appreciate that it might be correct, as the authors prefer, that there is an inherent lowering of intelligence-height correlations over time, but I do not think that one can rule out, with the present study's design, that it might have been due to the change in tests. For example, a reading-math correlation of 0.74 in 1946 lowered to a correlation of 0.32 in 2001, in the face of different tests. To show that this is not due to the different tests being used would require more information. If this is a true result, it is big news.

      Please see our responses to Reviewer 1. This includes additional analysis and an expanded discussion of this possible cause of bias. We hope our manuscript now provides further evidence and discussion to inform the likelihood of this possibility.

      I have a suggestion: if the authors wish to rule out the possibility that the lowering intelligence-height correlations across cohorts are due to different cognitive tests being used, they should take all the cognitive tests used here and apply them cross-sectionally to single-year-born samples (of 11- and 16-year olds) that have also been measured for height. If the cognitive tests all correlate at the same level with height within each of these two samples (they needn't do so across the 11- and 16-year olds), then one could proceed more safely with between-cohorts (1946, 1958, 1970, 2001) comparisons of the correlations.

      We thank the reviewer for this suggestion. However we are unsure that we understood the suggested analysis or whether it was tractable given our data—the cohorts we used were born in either 1946, 1958, 1970, or around 2000. We do not have cross-sectional samples of 11 and 16 year olds at the same time.
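
      The reviewer's proposed check, giving every test to one sample and comparing each test's correlation with height, can be sketched as follows. This is a minimal illustration on simulated data, not an analysis from the paper: the test names, sample size, effect size and noise levels are all assumptions chosen only to show that a noisier test will tend to correlate less with height even when both tests measure the same underlying ability.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def height_correlations(height, tests):
    """Correlate each test's scores with height within a single sample."""
    return {name: pearson(height, scores) for name, scores in tests.items()}


# Simulated single-cohort sample (all values are illustrative assumptions).
rng = random.Random(0)
height = [rng.gauss(0, 1) for _ in range(2000)]
ability = [0.2 * h + rng.gauss(0, 1) for h in height]

# Two hypothetical tests of the same ability with different measurement noise.
tests = {
    "test_a": [a + rng.gauss(0, 0.5) for a in ability],  # more reliable
    "test_b": [a + rng.gauss(0, 2.0) for a in ability],  # less reliable
}

corrs = height_correlations(height, tests)
for name, r in corrs.items():
    print(f"{name}: r with height = {r:.3f}")
```

      If the tests used in different cohorts behaved like `test_a` and `test_b` here, a between-cohort difference in height correlations could arise from the tests alone, which is the reviewer's concern.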

    1. We further evaluated the pipeline with a genome containing simulated HGT regions. Since our HGT identification pipeline has two main steps, sequence composition-based filtering step and genome comparison step. The evaluation was done for the two steps (Figure S3, Table S1). While top 1% fragments were input to the pipeline, 20.6% correct results would be identified after sequence composition-based filtering and 14.3% correct results identified after genome comparison. When the percentage of fragments input was up to 50%, 83.4% and 77.7% correct results were identified after two steps respectively. It can be seen that the precision of prediction was higher than 60% for all cases. This indicated that we may have underestimated the number of HGTs (low recall rate) but majority of the identified HGTs were highly reliable.

      This paragraph was a bit confusing to follow but I think I got the gist of it after a few passes through! I'm curious if you thought about controlling for natural variation in 4mer frequency throughout the genome, as some other methods have found that this helps reduce off target predictions (reviewed in https://doi.org/10.1371/journal.pcbi.1004095). It may not be necessary since you do a second step after the initial screen, but I was just curious if that was something you thought about putting in place, and if so, why you decided against it
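
      Because the quoted paragraph mixes "correct results identified" (a recall-like quantity) with "precision of prediction", a small sketch of how the two differ may help when reading it. The region identifiers and counts below are invented for illustration and do not come from the preprint's Figure S3 or Table S1.

```python
def precision_recall(predicted, true):
    """Return (precision, recall) for a set of predicted regions.

    precision = fraction of predicted regions that are truly HGT
    recall    = fraction of true HGT regions that were predicted
    """
    predicted, true = set(predicted), set(true)
    true_positives = len(predicted & true)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(true) if true else 0.0
    return precision, recall


# Hypothetical simulated genome: five planted HGT regions, three calls made.
simulated_hgt = {"hgt_01", "hgt_02", "hgt_03", "hgt_04", "hgt_05"}
called = {"hgt_01", "hgt_02", "native_17"}

p, r = precision_recall(called, simulated_hgt)
print(f"precision={p:.2f} recall={r:.2f}")
```

      In this toy case most calls are correct while most planted regions are missed, which is the pattern the authors describe as reliable predictions with a low recall rate.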

  2. drive.google.com drive.google.com
    1. Role Card

      <span style="color: blue;">Policy Consideration: Role Cards</span>

      For more information about the Role Cards system see the annotation on page 7.

      As an alternative example, Civic Square structures their role relationships as follows:

      We think about relationships within the ecology of our team as first, second and third order connections. We are moving towards becoming teams of teams, so your everyday may involve a smaller focused team; week-to-week some others, and further month-to-month connections with the wider team. In this particular role we foresee you working with these key people initially:

      (a) Primary Connections [Primary connections]<br/> (b) Secondary Connections [Secondary connections]

      The job description is co-developed over an initial period, which looks to define responsibilities individually and collectively, with the understanding that it is not a static process, and they hope to revisit this together regularly.

    1. Author Response

      Reviewer #2 (Public Review):

      The manuscript "Optimal Cancer Evasion in a Dynamic Immune Microenvironment Generates Diverse Post-Escape Tumor Antigenicity Profiles" by George and Levine describes TEAL - a mathematical model for the dynamics of cancer evolution in response to immune recognition. The authors consider a process in which tumor cells from one clone are characterized by a set of neoantigens that may be recognized by the immune system with a certain probability. In response to the recognition, the tumor may adapt to evade immune recognition, by effective removal of recognizable neoantigens. The authors characterize the statistics of this adaptive process, considering, in particular, the evasion probability parameter, and a possibility of an adaptive strategy when this parameter is optimized in each step of the evolution. The dynamics of the latter process are solved with a dynamic programming approach. In the optimal case, the model captures the tradeoff between a cancer population's need for adaptability in hostile immune microenvironments and the cost of such adaptability to that population. Additionally, immune recognition of neoantigens is incorporated. These two factors, antitumor vs pro-tumor IME as quantified by the Beta penalty term, and the level of immune recognition as quantified by the rate q, form the basis of a characterization of tumors as 'hot' or 'cold'.

      I think this framework is a valuable attempt to formally characterize the processes and conditions that result in immunologically hot vs cold tumors. The model and the analytical work are sound and potentially interesting to a major audience. However, certain points require clarification for evaluation of the relevance of the model:

      1) Tumor clonality

      My main concern is about the lack of representation of the evolutionary process in the model and that the heterogeneity of the tumor is just glossed over.

      The single mention of the problem occurs in Section 2, p2: "Our focus is on a clonal population, recognizing that subclonal TAA distributions in this model may be studied by considering independent processes in parallel for each clone."

      I don't think this assumption resolves the impact of tumor heterogeneity on the immune evasion process. Furthermore, I would claim that the process depicted in Fig 1A is very rare and that cancers rarely lose recognizable neoantigens - typically it would be realized via subclonal evolution, with an already present cancer clone without the neoantigens picking up. Similarly, the adaptation of a tumor clone is an evolutionary process - supposedly the subclones that manage to escape recognition via genetic or epigenetic changes are the ones that persist. It is not clear what the authors assume about the heterogeneity of the adapting/adapted population between different generations, n->(n+1). Is the implicit assumption that the n+1 generation is again clonal, i.e. that the fitness advantage of the resulting subclone was such that the remaining clones were eliminated? Or does the model just focus on the fittest subclone? A discussion on whether these considerations are relevant to the result would clarify the relevance of the result.

      We thank the reviewer for these helpful clarifying points. Empirical evidence in lung cancer exists for genomic changes manifesting as lost neoantigens in treatment-resistant clones (Anagnostou et al. Cancer Discovery 2017), and those lost antigens were also shown to generate functional immune responses. Similar results for melanoma have also been shown (Verdegaal et al. Nature 2016), with loss of neoantigens associated with reactivity in TILs. Recent observations (Jaeger et al. Clinical Cancer Research 2020) even show that mutated peptides may be hidden by protein stabilization, in addition to reduced expression patterns. We do, however, wish to clarify that our model implicitly treats antigen loss and the progression of a subpopulation currently adapted to evade immune targeting as equivalent, whether that progression occurs by direct pruning of the fittest subclone or by stochastic emergence and subsequent growth of a new one lacking the targeted antigens.

      Because we studied, for foundational understanding, the case where a single clonal signature was tracked in time, we under-explained the implementation of such a model in more complicated cases. As mentioned previously, the next most complicated scenario involves a heterogeneous population of cancer cells with disjoint neoantigen profiles. In this case, a parallel process can be studied wherein the effects of recognition in one environment are decoupled from the other (relevant to, for example, spatially distinct sub-populations). This description however misses the case where such disparate populations evolve to express shared antigens, or the case where there are both clonal and subclonal antigen targets. Here, our model can still be applied in parallel to study distinct clones but requires additional structure. Namely, in this case we would need to incorporate non-trivial coupling between the possible recognition/selection against certain antigens shared across clones. For example, control of a population with clonal antigens {a,b} but with unique subclones having either antigens {w,x} or {y,z} could be considered by studying the process in parallel, and control in the next periods would require recognition/selection against either 1) at least one of {w,x} and at least one of {y,z}, or 2) at least one of {a,b}. In this more general framework, the arrival of new subclones with distinct features from the parent clone in question could also be incorporated and studied across time periods. This strategy of subdividing more complicated evolutionary structures has now been further elaborated on in the Methods section, and we have expounded these points in the discussion (see additions given under Editor Comment 2).
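
      The control condition in the {a,b} / {w,x} / {y,z} example can be written as a short predicate. The sketch below is only a reading of that one sentence: the function name and the representation of clones as Python sets are assumptions, and it does not implement the TEAL model itself.

```python
def population_controlled(recognized, clonal, subclones):
    """Return True if the recognized antigens suffice to control all subclones.

    Control holds if (2) at least one clonal antigen is recognized, or
    (1) every subclone has at least one of its private antigens recognized.
    """
    recognized = set(recognized)
    if recognized & set(clonal):
        return True  # a shared (clonal) antigen hits every cell
    # Otherwise each subclone must be hit separately; no subclones means no control.
    return bool(subclones) and all(recognized & set(sub) for sub in subclones)


clonal = {"a", "b"}
subclones = [{"w", "x"}, {"y", "z"}]

print(population_controlled({"a"}, clonal, subclones))       # clonal antigen recognized
print(population_controlled({"w", "y"}, clonal, subclones))  # one antigen per subclone
print(population_controlled({"w", "x"}, clonal, subclones))  # second subclone escapes
```

      Writing the condition this way makes the coupling explicit: recognition of a clonal antigen acts on both subclones at once, whereas subclonal antigens must be covered clone by clone.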

      2) Time scales

      Section 2, p2: "We assume henceforth that the recognition-evasion pair consists of the T cell repertoire of the adaptive immune system and a cancer cell population, recognizable by a minimal collection of s_n TAAs present on the surface of cancer cells in sufficient abundance for recognition to occur over some time interval n.".

      How do the results depend on the duration of interval n? The duration should be long enough to allow for recognition and, up to some limiting duration, proportional to the TAA recognition probability q. However, it should not be so long that the state of the system can change significantly. A clarification on this point is needed.

      We agree with the reviewer that these points should be elaborated upon when discussing the time interval. Very briefly, we opted for a discrete-time model tracking a cancer population under selective immune pressure. In order for 𝒒 to represent the total recognition probability of an immune system against a particular TAA, the time interval 𝚫𝒏 in question is a coarse-grained feature representing the time between the earliest chance that the adaptive immune system may identify a cancer clone and the latest point after which such a recognition event would no longer be able to prevent cancer escape. This time period may vary substantially across cancer subtypes and depends on the cancer per-cell division rate, for example (George, Levine. Can Res 2020). As the reviewer pointed out, in implementing such a model there is an asymmetric risk to considering 𝚫𝒏 too large, as the future state of the system may not be well-reflected by the simple loss and addition of new TAAs. On the other hand, considering small time intervals 𝚫𝒏, while possible, would require the incorporation of additional intermediate states ending in neither cancer elimination nor cancer escape.

      We have clarified the points that the reviewer has brought up by adding them to the discussion section: In this discrete-time evolutionary model, the intertemporal period considered represents the time period between the earliest moment that the adaptive immune system may identify a cancer clone and the latest point after which such a recognition event would no longer be able to prevent cancer escape (George, Levine. Can Res 2020). This effectively gives 𝒒 a probabilistic representation for the total rate of opportunity to recognize a given TAA during cancer progression. Implementing this model in cancer subtype-specific contexts thus requires a consideration of the per-cell division rates, for example.
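
      To make the role of 𝒒 over one interval concrete, the following sketch computes the chance that at least one of the s_n TAAs is recognized during that interval. It assumes that each TAA is recognized independently with the same probability q, which is an assumption made here for illustration and is not stated in the response above.

```python
def prob_any_recognized(q, s_n):
    """Probability that at least one of s_n TAAs is recognized in one interval.

    Assumes each TAA is recognized independently with probability q
    (an illustrative assumption, not taken from the response text).
    """
    return 1.0 - (1.0 - q) ** s_n


for s_n in (1, 2, 5, 10):
    print(s_n, prob_any_recognized(0.1, s_n))
```

      Under this assumption, lengthening the interval would enter only through a larger q, which is consistent with the authors treating the interval as a coarse-grained quantity.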

      Reviewer #3 (Public Review):

      Cancer cell populations co-evolve under the pressure exerted by the recognition of tumor-associated antigens by the adaptive immune system. Here, George and Levine analyze how cancers could dynamically adapt the rate of tumor-associated antigen loss to optimize their probability of escape. This is an interesting hypothesis that if confirmed experimentally could potentially inform treatments. The authors analyze mathematically how such optimally adapting tumors gain and lose tumor-associated antigens over time. By simplifying the complex interplay of immune recognition and tumor evolution in a toy model, the authors are able to study questions of practical interest analytically or through stochastic simulations. They show how different model parameters relating to the tumor microenvironment and immune surveillance lead to different dynamics of tumor immunogenicity, and more immunologically hot or cold tumors.

      Simple models are important because they allow an exhaustive study of dynamical regimes for different parameters, such as has been done elegantly in this study. However, in this quest for simplification, the authors have not considered biological features that are likely to be of importance for understanding the process of cancer immune co-evolution in generality: tumor heterogeneity and immune recognition that only stochastically results in cancer elimination. In this sense, this paper might be seen as the opening act in a series of more sophisticated models, and the authors discuss avenues towards such further developments.

      We share the reviewer’s credence in foundational modeling for comprehensive predictions on available dynamical behavior for the important problem at hand. The reviewer also correctly points out that future model refinement will be needed to further develop the foundational model developed in this work. In an attempt to illustrate one of the more reasonable generalizations, which is to include nontrivial sub-clonal heterogeneity in tumor antigens, we now describe how one would go about enhancing the existing model to address this, which has been added to the Methods and Discussion sections (see additions given under Editor Comment 2).

    2. Reviewer #2 (Public Review):

      The manuscript "Optimal Cancer Evasion in a Dynamic Immune Microenvironment Generates Diverse Post-Escape Tumor Antigenicity Profiles" by George and Levine describes TEAL - a mathematical model for the dynamics of cancer evolution in response to immune recognition. The authors consider a process in which tumor cells from one clone are characterized by a set of neoantigens that may be recognized by the immune system with a certain probability. In response to the recognition, the tumor may adapt to evade immune recognition, by effective removal of recognizable neoantigens. The authors characterize the statistics of this adaptive process, considering, in particular, the evasion probability parameter, and a possibility of an adaptive strategy when this parameter is optimized in each step of the evolution. The dynamics of the latter process are solved with a dynamic programming approach. In the optimal case, the model captures the tradeoff between a cancer population's need for adaptability in hostile immune microenvironments and the cost of such adaptability to that population. Additionally, immune recognition of neoantigens is incorporated. These two factors, anti-tumor vs pro-tumor IME as quantified by the Beta penalty term, and the level of immune recognition as quantified by the rate q, form the basis of a characterization of tumors as 'hot' or 'cold'.

      I think this framework is a valuable attempt to formally characterize the processes and conditions that result in immunologically hot vs cold tumors. The model and the analytical work are sound and potentially interesting to a major audience. However, certain points require clarification for evaluation of the relevance of the model:

      1) Tumor clonality

      My main concern is about the lack of representation of the evolutionary process in the model and that the heterogeneity of the tumor is just glossed over.

      The single mention of the problem occurs in Section 2, p2: "Our focus is on a clonal population, recognizing that subclonal TAA distributions in this model may be studied by considering independent processes in parallel for each clone."

      I don't think this assumption resolves the impact of tumor heterogeneity on the immune evasion process. Furthermore, I would claim that the process depicted in Fig 1A is very rare and that cancers rarely lose recognizable neoantigens - typically it would be realized via subclonal evolution, with an already present cancer clone without the neoantigens picking up. Similarly, the adaptation of a tumor clone is an evolutionary process - supposedly the subclones that manage to escape recognition via genetic or epigenetic changes are the ones that persist. It is not clear what the authors assume about the heterogeneity of the adapting/adapted population between different generations, n->(n+1). Is the implicit assumption that the n+1 generation is again clonal, i.e. that the fitness advantage of the resulting subclone was such that the remaining clones were eliminated? Or does the model just focus on the fittest subclone? A discussion on whether these considerations are relevant to the result would clarify the relevance of the result.

      2) Time scales

      Section 2, p2: "We assume henceforth that the recognition-evasion pair consists of the T cell repertoire of the adaptive immune system and a cancer cell population, recognizable by a minimal collection of s_n TAAs present on the surface of cancer cells in sufficient abundance for recognition to occur over some time interval n.".

      How do the results depend on the duration of interval n? The duration should be long enough to allow for recognition and, up to some limiting duration, proportional to the TAA recognition probability q. However, it should not be so long that the state of the system can change significantly. A clarification on this point is needed.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      1. General Statements We thank the reviewers for their thoughtful comments and suggestions, which have improved the manuscript. We are particularly gratified by their positive comments about the significance of the findings. Our point-by-point responses to the reviewer comments and suggestions are summarized below. Line numbers have been added to the revised manuscript to make it easier to locate the changes.

      Point-by-point description of the revisions

      Reviewer #1

      *1) In the study there is a lack of consideration of other targets. In and of itself this is not a problem, but once the authors identified the T130A mutation as being a key for protection it would have been good to Sanger sequence the other T. gondii myosins - a quick alignment of the TgMyo's A, C, H (class XIV), along with D and E suggests that the motif is highly conserved. This raises the currently unexplored (and exciting!) prospect of a pan-myosin inhibitor, and that there might have been mutations at an equivalent position in the other four KNX002 resistant clones - for example, MyoC has been proposed to provide some level of functional redundancy in the absence of MyoA. *

      Because the goal of this work was to evaluate the druggability of TgMyoA, we specifically designed our experiments to identify resistance-conferring mutations in TgMyoA or its light chains, as described on lines 366-372. This strategy yielded the TgMyoA T130A mutation, which enabled us to rigorously determine that inhibiting TgMyoA, and TgMyoA only, was sufficient to slow the progression of disease in vivo. Because we took this targeted approach, our results did not address: (a) the basis of resistance in the 4 resistant clones that did not contain a mutation in TgMyoA or its light chains and (b) whether KNX-002 inhibits any of the other ten parasite myosins.

      The most informative way to address (a) would be to do whole genome sequencing on each of the mutants, since resistance might have nothing to do with the other parasite myosins or their light chains. Any potential resistance-conferring mutations identified would need to be regenerated in a non-mutagenized background and functionally characterized, as we have done here for the T130A mutation, to be certain that this particular mutation was responsible for resistance. The most direct way to address (b) would be to individually express each of the other ten parasite myosins together with its specific associated light chains and the myosin co-chaperone protein TgUNC, purify the motor and determine the effect of the compound on motor activity (as we have done for TgMyoA; Fig. 1). These are both major undertakings that are beyond the goals and scope of the current manuscript. Critically, the absence of such data does not impact the conclusions of our phenotypic studies, which used the CRISPR-engineered T130A parasite line.

      We nevertheless agree with the reviewer that these are both interesting questions that should be studied further, and we now discuss them on lines 416-422 and 451-453.

      2) The fact that T130 is not thought to be the binding site of KNX002 is only introduced quite late on - this also relates to the next point - is the binding pocket conserved??? It is intriguing that the residue and proximal amino acid environment are highly conserved with vertebrates, but that KNX002 does not have an effect on their activity in their screen assay. It would be useful to know if the differences in the structures of the myosins can provide an explanation for this - along the same lines, given that the crystal structure of TgMyoA is available (PMID: 30348763), it would be useful for the authors to provide a molecular model for the binding of the inhibitor to the proposed point of engagement.

      These are excellent questions that we unfortunately cannot yet answer. Docking simulations of KNX-002 to the published structure of TgMyoA in its pre-powerstroke state have thus far not yielded any promising results. The site of KNX-002 binding to P. falciparum MyoA was determined by X-ray crystallography (ref. 54); however, the PfMyoA in that study was in the post-rigor state and the coordinates of the co-crystal structure have not yet been made available in the PDB database for homology modeling.

      The lack of effect of KNX-002 on the vertebrate muscle myosins may not be surprising. Although the 3D structures of myosins are rather conserved, their primary structures are quite different, which likely contributes to the different effects of KNX-002 on the different myosins. TgMyoA and PfMyoA are more similar to each other than they are to the vertebrate muscle myosins, which may enable the specific targeting of MyoA in apicomplexan parasites (lines 308-310).

      *If this is an allosteric site it is possible that the mutation functions indirectly upon binding of KNX002 to the orthosteric site, but this would be useful to help the reader to understand this (and there are bioinformatic prediction tools that will score allostery - which would be interesting to include). This is explained somewhat in the discussion - but this should be introduced much earlier to clarify. What is known about allosteric regulation of Myo function? Is this a known site of regulation? *

      Allosteric modulation of human cardiac myosin by small molecules such as omecamtiv is well established, and allosteric effects of the T130A mutation are certainly a possibility. As molecular motors, myosins depend on a complex and highly interconnected network of allosteric interactions to perform their function (for example, see ref. 59). This complexity, combined with the fact that the data on the PfMyoA-KNX-002 structure have not yet been released, makes it very difficult to generate any sort of model that would support meaningful conclusions. A statement to this effect has been added to the Discussion (lines 375-382), and the likelihood that T130 is not part of the actual binding site for the compound is now mentioned at the very beginning of the paragraph that discusses the potential mechanism of action of the T130A mutation (lines 375-376).

      *3) Introduction: 'Nearly one third of the world's population is infected with the apicomplexan parasite' - given that these data are extrapolated from serology, this should be reworded - it's fairer to say that they 'are or have been infected...' *

      Done – line 65.

      *4) Page 4: figure 1A - can you provide some explanation for incomplete inhibition in the screen - there seems to be a residual amount (15-20%) of activity that is not inhibited. *

      The compound inhibits 80-90% of the motor’s activity at 40 μM. We did not test higher concentrations in the ATPase assay; presumably we would see incrementally more inhibition as we increase compound concentration further, but the concentrations used enabled us to construct a reproducible IC50 curve without adding potentially confounding amounts of DMSO (carrier) to the assay.

      *5) The authors demonstrate a general effect on growth over 7 days. It would be good to use a replication assay (e.g. parasites/vacuole over a single lytic cycle) to confirm that KNX002 does not affect cellular division. This would further strengthen the argument that the phenotypic effect is primarily via impacts on motility. *

      A figure showing the lack of effect of KNX-002 on replication has now been added (new Supplemental Figure 2) and a paragraph describing these data and their implications added to the Results section (lines 137-143).

      *6) Page 5: 'selected for parasites resistant to KNX-002 by growth in 40 μM KNX-002.' - could the authors add text to explain why that concentration was chosen. *

      40 μM is close to the compound’s IC90 of 37.6 μM and, although we tried a number of different compound concentrations and selection schemes, 40 μM yielded parasites with the greatest shift in IC50. We now include this rationale on lines 606-607, as well as a new figure showing the shift in IC50 curves for all 5 resistant lines (Suppl. Figure 8).

      *7) Page 6: 'suggesting that the effects of the T130A mutation on motor function are due to more subtle structural changes' - it's fair to say that there are not gross structural changes based on the data presented, but that does not mean it is therefore a 'more subtle structural change' - surely the mutation could prevent KNX002 binding without affecting TgMyoA structure? *

      Based on the residues within P. falciparum MyoA that participate in binding to KNX-002 (ref. 54), it is unlikely that T130 of TgMyoA participates directly in compound binding. Mutation of T130 to alanine therefore seems most likely to impact compound binding through a change in protein structure, discussed more fully now on lines 375-382.

      *8) Page 6: 'the proportion of filaments moving' - in the figure it's referred to as the 'fraction of filaments', which makes more sense for the data presented. Please correct to 'fraction' throughout the manuscript (discussion, page 9 - possibly other instances!). Along the same lines, in figure 6 it would be good to change the y axes on the '% moving' to be 'fraction moving' and change the numbers - this would make it easier for the reader to understand the index values presented in the lower panels - if you do the calculation with the % values presented the numbers don't make sense (as fractions they do). The axes for motility also go up to 125% - please correct - based on the data presented there is no need for this to be above 100% (or 1 - see above). *

      We thank the reviewer for this suggestion; “percent moving” and “proportion moving” have been changed throughout the manuscript and figures to “fraction moving”. The y-axis labels on the motility and IC50 curves have also been modified as suggested.

      *9) Page 7: 'tested whether KNX-002 (20mg/kg, administered intraperitoneally on the day of infection and two days later' - please provide some rationale for the concentration used. *

      A preliminary dose tolerance study was conducted prior to the infection experiments, with doses ranging from 5-20 mg/kg. The study showed that two doses of 20 mg/kg, administered two days apart, resulted in minor hepatotoxicity without signs of pain or distress. 20 mg/kg was therefore considered the maximal tolerated dose. This rationale is now included on lines 715-721.

      *10) Page 10: 'the T130A mutation is likely to have long range structural impact that could alter the KNX-002 binding pocket' - this is particularly interesting, and should be addressed with a model - do the authors think that the T130 region could be a conserved site of allosteric regulation? This would be good to expand upon in the discussion - mutation of an allosteric site as a mechanism of resistance is unusual, and typically described as being unlikely - and used as justification for the targeted drugging of allosteric sites. *

      See response to comment #2 above and the new text on lines 375-382.

      Reviewer #2

      *1) Considering (i) the moderate effect of KNX-002 on the acute infection process in CBA mice that received tachyzoites intraperitoneally, (ii) the fact that the drug application cannot be envisaged outside of the context of reactivation of cystogenic strains (in particular with respect to cerebral toxoplasmosis as emphasized in the introductory section), which implies the drug would have to be delivered and active in the brain parenchyma, a condition not analyzed here, it would be appropriate to modify the current title. It would be more relevant to highlight the solid body of data on the identification and functional characterization of the compound and derivatives in vitro and in the host mouse model. Apart from the title, the discussion should also recontextualize the in vivo assays and the information these assays bring on the slight delay of the "mortality" of some but not all mice. *

      We agree that the major clinical application of any new anti-Toxoplasma chemotherapy would be treatment of a reactivated infection, particularly in the brain (although there could also be a role for treatment of pregnant women), and that the data we present with this compound do not speak directly to clinical efficacy in this context. That said, reactivation leads to an active infection whose pathogenesis requires TgMyoA-dependent motility, invasion and egress, like the active infections analyzed here. The KNX-002 scaffold would likely need to be modified to enable it to cross the blood-brain barrier and access parasites in the brain, but that would be a normal step in any campaign to develop new drugs for toxoplasmosis (which is well beyond the scope of this study; see response to comment #11).

      Given these considerations, we gave much thought to how to accurately describe the results from the animal experiments – and we therefore appreciate the reviewer’s comment. For the title, we arrived at the word “druggable”, because it has the very specific meaning described on lines 100-101: a protein whose activity is amenable to inhibition by small molecules. In our experiments with mice infected with wild-type parasites, nine of the ten compound-treated animals survived longer than the untreated controls, and 40% of the treated mice were still alive at the end of the experiment. Nevertheless, we stayed away from terms like “therapeutic” or “treatment”, for exactly the reasons the reviewer raises. We believe that the current title is an accurate summary of what we found, since we have indeed shown that MyoA is amenable to inhibition by a small molecule in a well-established animal model of infection (CBA mice infected intraperitoneally). Showing for the first time that MyoA is druggable in vivo provides the rationale for identifying more potent compounds that can access the brain and serve as bona fide leads for drug development.

      To the reviewer’s point, we also reviewed all sections of the text where we described the animal experiments, and in the revised manuscript we replaced all instances in the text of “ameliorate disease”, “prevent disease” and “decrease the susceptibility of mice to a lethal infection” with the more circumspect phrases “alter disease progression” or “slow disease progression” (lines 46, 56, 110, 297, 315, Figure 9 legend). We also changed the title of the Results section describing these data from “KNX-002 treatment decreases the susceptibility of mice to lethal infection with T. gondii” to “KNX-002 treatment slows disease progression in mice infected with a lethal dose of T. gondii” (line 284).

      *2) Motility analysis: This comment concerns the Figure 7. It seems to the reviewers that the major hypothesis to test in data presented in panel B is that the wild type and the T130A mutant tachyzoite respond differently under similar drug conditions rather than the two populations without drug. These statistics could be added easily, hence it would validate that the proportion of motile mutant parasites is not affected by the drug when compared to vehicle. *

      These statistical comparisons have now been added to revised figure 7, as suggested. Since this comparison was between different parasite lines, it required the use of unpaired t-tests (vs. the paired t-tests used for different compound treatments of the same parasite line). We have therefore revised all 3D motility figures (Figures 4 and 7, Suppl. Figures 7 and 12) and their legends to clearly indicate which samples are being compared to which and whether paired or unpaired statistical tests are being used.

      However, the statistics shown panel C rather suggest that the drug does impact on the speed of the moving parasites, including when these carry the "resistance" T130 A mutation. It is not clear what we can gain in terms of messages with the motility index except to "slightly reverse" the analysis on panel B and to favor a no-effect of KNX-002 on the mutant parasite motile skills, on which the author might give more explanation. When comparing these quantitative tests with the panel presented above (panel A) it seems that the mutant parasite is still impacted by the MyoA inhibitor. Although there is no doubt for the reviewers that the T130A mutant emerging from the selected T. gondii resistant clones is a valuable probe for assessing drug selectivity: indeed the assays validate KNX-002 as a direct TgMyoA ATPase inhibitor, it might be good to rephrase some sentences and to have a harmonized definition of the parasite motility index throughout the text (Figure 7 legend, result and discussion sections).

      The reviewer is correct that there is a decrease in the speed of compound-treated T130A parasites, as the p-values on Figure 7C indicate. This is why we state in the text that “the mutant parasites retain some sensitivity to the compound” (line 263). We were careful throughout the manuscript to refer to the resistance provided by the mutation as “partial”, or to describe it as a “reduced sensitivity”. Partial resistance is still sufficient to establish compound specificity, as noted by the reviewer in this comment.

      We present the motility index not to try to “reverse” the effect of the compound on the mutant’s speed, but because the compound has two simultaneous effects on motility -- a decrease in the fraction moving and a decrease in speed of those that do move. Combining these two effects into one value (while still showing each component individually, as we have) enables comparison to the analogous actin filament motility index from the in vitro motility assays, and provides a more complete picture of the impact of compound treatment on parasite motility. This is a similar approach to that used in studies of e.g., phagocytosis, where the widely reported “phagocytic index” corresponds to the fraction of cells that have internalized at least one particle multiplied by the average number of beads internalized. The motility index of the mutant parasites is significantly less impacted by KNX-002 than the motility index of wild-type parasites (Figure 7D).
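The way the two effects combine can be illustrated with a minimal sketch. It assumes the motility index is the simple product of the fraction of parasites moving and their mean speed, by analogy with the phagocytic index described above; the function names and all numbers are hypothetical and are not taken from the manuscript.

```python
def motility_index(fraction_moving, mean_speed):
    """Combine the two effects into one value (assumed to be a simple product)."""
    return fraction_moving * mean_speed


def relative_index(treated_index, control_index):
    """Express a treated sample's index relative to its vehicle control."""
    return treated_index / control_index


# Hypothetical illustrative values only (not data from the study).
control = motility_index(fraction_moving=0.40, mean_speed=1.0)
treated = motility_index(fraction_moving=0.20, mean_speed=0.8)

# Halving the fraction moving and cutting speed by 20% compound in the index.
print(f"Relative motility index: {relative_index(treated, control):.2f}")
```

Under this assumed definition, a modest drop in speed and a modest drop in the fraction moving multiply, which is why the index can show a larger overall effect than either component alone.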

      We have further clarified the definition and rationale for using the parasite motility index throughout, as suggested (lines 233-235, 264-267, 345-348).

      *This reviewer's concern was accentuated by the comparison between the actin filament sliding index and the parasite motility index which appears as such far stretch; Aside from the "far stretched claims" easy to re-address in a revised version, the readers have appreciated the writing quality and most figure illustration. The discussion nicely synthetizes the whole dataset, including those related to the 4 T. gondii clones that resisted to KNX-002 but not through mutations targeting any of the myosinA chains. *

      We have added additional text to the discussion listing possible reasons for the differential effects of the mutation on the filament and parasite motility indices (lines 403-406).

      *4) Abstract: the concept of "ameliorate disease" in this framework is odd and the objective of the work can be rephrased in a simple way (see below) *

      See response to Comment #1; “ameliorate” has been replaced with “alter disease progression” (line 46).

      *5) Introduction section: we think that the references on the impairment of invasiveness for the KoMyoA should be included (Bichet et al., BMC Biology 2016) as it has provided proof of an alternative and suboptimal mode of entry in many different cell types, thereby arguing that in absence of MyoA function, parasite invasiveness is not fully abolished and this without considering any MyoC-driven MyoA compensation. *

      We thank the reviewer for catching this oversight; the Bichet citation has been added (line 93).

      6) Introduction, third paragraph: in the sentence "Because the parasite can compensate for the loss or reduced expression of proteins important to its life cycle [29-31], small-molecule inhibitors of TgMyoA would serve as valuable complementary tools for determining how different aspects of motor function contribute to parasite motility and the role played by TgMyoA in parasite dissemination and virulence". We definitively agree with this view but saying that, we think it would be worth evaluating (or simply discussing) the potency of the KNX-002 against MyoC, which compensatory contribution has been debated and remains questionable (at least to the reviewers) with respect to cell invasiveness restoration (related to the comment above).

      We have included a discussion of a potential compensatory role for MyoC and the value of determining in future studies whether KNX-002 (or its more potent downstream analogs) inhibits any of the other parasite myosins (lines 419-423). Whether or not MyoC can functionally compensate for a lack of MyoA – we agree this is a controversial question – it is important to note (as we do on lines 440-442) that “T. gondii engineered to express low levels of TgMyoA … are completely avirulent [28], arguing that sufficiently strong inhibition of TgMyoA is likely, on its own, to be therapeutically useful”.

      *7) If we are correct, the screen and the characterization study have been performed with two different products (CK2140597 and KNX-002 the compound library and the re-synthetized one, respectively). Could we make sure that the two have the same potency? *

      The source of compound used in each of the assays is now explicitly described on lines 481-490. Commercially obtained compound yielded an IC50 in growth assays of 16.2 and 14.9 μM (Figures 2 and 5, respectively), and compound synthesized by us yielded an IC50 value of 19.7 μM (Figure 3). The 95% confidence intervals of these three independent IC50 determinations with two different sources of compound overlap (lines 484-486).

      8) We understood how the authors came to the conclusion that the KNX-002 impact on growth of the parasite and they stated "growth in culture" in the subsection title but then refers to parasite growth. Therefore, it looks a bit confusing for the reader since intracellular growth per se is probably not modified but this feature was not looked at it in this study (we would expect no impact based on published data on MyoA- genetically deficient tachyzoites, except if the drug impacts host cell metabolism for instances). Instead, it is the overall expansion of the parasite population that is analyzed here and clearly shown to be impacted. This decrease in population expansion on a cell monolayer likely results from impaired MyoA-dependent egress and invasiveness upon chemical inactivation of MyoA. Accordingly, it appears difficult to understand what is an IC50 for the "overall" growth in the context of this study. The authors should rephrase for better accuracy when necessary, including in the graph Fig2 legend axis.

      While assays that measure parasite expansion in culture are by convention called “growth” assays (e.g., see Gubbels et al, High-Throughput Growth Assay for Toxoplasma gondii Using Yellow Fluorescent Protein AAC 47 (2003) 309, the paper on which our assay was based), we take the reviewer’s point that a reader may incorrectly ascribe the inhibition to some other aspect of the lytic cycle (e.g., intracellular replication), rather than a myosin-dependent motility-based process. We have therefore now: (a) more clearly defined the growth assay as one that measures parasite expansion in culture (lines 132-138); (b) described the myosin-dependent and -independent steps of the lytic cycle (lines 137-140); and (c) added a new figure (new Suppl. Figure 2, lines 140-143) showing that the compound has no effect on intracellular replication.

      *9) The authors should clarify for the reader (i) why they use in some case myofibrils and other muscle F actin when measuring the Myosin ATPase activity, (ii) what does mean XX% calcium activation and (iii) why using 75% in these assays which is 3 times higher from the original assays. (iv) Why they did not include non muscle actins in their study since Myosins also extensively work on non muscle actins. *

      (i) For both striated and smooth muscle myosins, the assays used here are well established and have identified compounds that have translated into animal models of disease. To assay the activity of myosins from striated muscle types, particularly to determine compound selectivity, myofibril assays are preferred as they recapitulate more of the biology as a more "native", membrane-free preparation and respond cooperatively to calcium activation. For cardiac, fast and slow skeletal muscle it is possible to derive high quality myofibril preparations that can be activated by calcium. A reference describing the value of using myofibrils in assays of striated muscle myosin ATPase activity has been added (ref. 71, line 517).

      Smooth muscle, a non-striated tissue, is regulated differently and calcium exerts an effect not through binding to troponin as in striated muscle but through G-protein signaling, with phosphorylation as an end result, making the contraction slower and also much slower to reverse - in line with the physiological role of the muscle. The only way to reliably reconstitute smooth muscle ATPase activity has been through purification and reconstitution of a more reductionist system. The SMM S1 needs to be crosslinked to the actin to achieve high enough local concentrations to generate robust ATPase activity. A reference describing the use of this assay to identify small molecule inhibitors of SMM is now included (ref. 73, line 522).

      (ii, iii) Striated muscle myofibrils are responsive to calcium, as muscle contraction is mediated in vivo through calcium release from the sarcoplasmic reticulum. Titrating calcium can activate the myofibril ATPase activity up to the plateau (100%) and provide optimal signal to noise and sensitivity for the particular activity being assayed. For counterscreening to determine selectivity, we adjusted the assay conditions to a high basal ATPase activity (75% calcium) to provide high sensitivity for detecting inhibition. A sentence explaining this rationale has been added on lines 519-520.

      (iv) We used skeletal muscle actin in all of our in vitro assays since we have shown skeletal muscle actin to be a good substrate for TgMyoA (ref 33, cited on line 536) and skeletal muscle actin can be purified in larger quantities than native actin from parasites or functional recombinant protein from insect cells. Others have also shown that the closely related MyoA from P. falciparum moves skeletal muscle actin at the same speeds as recombinant P. falciparum actin (Bookwalter et al [2017] JBC 292:19290).

      *10) The protocol of image analysis of the 3D motility assay was increased to 80 seconds for the test of KNX-002 selectivity using wild type and mutant parasites (Fig 7) when compared to the test of KNX-002 concentration effect on wild type tachyzoites (60 sec in the result section, in Fig 4 legend and in the Methods' section). Is there any specific reason? *

      The data for Figure 4 were captured earlier in the project than those of Figure 7 and Suppl. Figure 7. In the intervening time we upgraded our Nikon Elements software from v.3.20 to v.5.11 (as already described on lines 583 and 588). With the upgrade to v.5.11, we also began using Nikon’s Illumination Sequence (IS) module, a graphical user interface that provides greater time resolution through a more efficient approach to building the z-stacks and saving the data. With the addition of v.5.11 and the IS module we were able to capture twice as many image volumes in 80 sec as we could in 60 sec using v.3.20, and that became our standard operating procedure. Other than the improved time resolution, the 60 s and 80 s assays give indistinguishable relative results. We have now clarified in the methods (lines 588-589) that we used the IS module to acquire the data in Figure 7 and Suppl. Figure 7.

      *11) In the mouse infection experimental design (Method section), it seems that they were no biological replicates in the case of the drug-treated (parasites + mice) which is not the case for the comparison of virulence between MyoA wild type and T130 mutants. If true, and considering what the authors wish to emphasize as a main message, it is fairly complicated to convincingly conclude about the KNX-002 effectiveness in vivo. Maybe the authors could explain their limitations. *

      Since we did not know how the compound and parasites would interact in mice – and in keeping with animal welfare standards – we decided that rather than doing multiple replicates with smaller numbers of infected mice we would do a single experiment with a large enough number of mice per treatment condition to ensure that if any animals died unexpectedly or had to be euthanized prematurely we would still have sufficient numbers for robust statistical comparison. Single experiments with ten treated and ten untreated mice are a generally accepted approach in early studies of drug effectiveness (e.g., Ferreira et al Parasite 2002, 9:261; Rutaganira et al, J. Med. Chem. 2017, 60: 9976; Zhang et al IJP Drugs and Drug Resistance 2019, 9:27), and power analysis shows that if mortality is 100% in untreated mice and 50% in treated mice, 10 mice per group will provide an 80% probability of detecting the difference with a p value < […]

      *We are also not sure why the compound has been injected only twice, at the time of parasite injection and two days after whereas the mice succumbed after 8 to 9 days even without MyoA inhibitors. Although quite difficult to measure, do the authors have any knowledge (based on the chemistry for example) of the compound stability and lipophilicity in blood and tissues? Because the IC50 on free tachyzoites appears significantly higher (5.3 uM, Fig4) than the in vitro molecular assay, when assessed in motility tests, and is increased for intracellular growth (Fig 8), it is somehow expected that the current compound would not work that great in vivo. Did the author try to provide the inhibitor intravenously every day? *

      IP injection is a standard method of administration for early drug treatment studies, and two considerations contributed to our decision to inject on days 0 and 2 post-infection: (a) the preliminary dose-tolerance studies, which were done with two IP doses of compound two days apart, showed evidence of mild toxicity so we were hesitant to inject more frequently, inject IV, or use more compound/injection; (b) we expect the compound to work primarily on egressing and extracellular parasites, and since the parasite’s lytic cycle takes approximately 48 hours, this two-day injection schedule was chosen to maximize exposure of the extracellular parasites to freshly injected compound early in establishment of the infection. This rationale has now been added to the Methods section (lines 715-721).

      In terms of doing systematic studies of dosing, stability, PK/PD, drug partitioning etc., it is important to restate that the primary goal of this work was to test whether inhibiting TgMyoA activity in vivo alters the course of infection. The data reported in the manuscript demonstrated this to be the case. As we state on lines 454-457, “While KNX-002 provided the means to rigorously test the druggability of TgMyoA, it caused weight loss and histological evidence of liver damage in the treated infected mice. Before further animal work, it will therefore be necessary to develop more potent and less toxic analogs that retain specificity for parasite myosin.” Our colleagues at Kainomyx have in fact initiated a drug development campaign based on the KNX-002 scaffold and have already identified a derivative, named KNX-115, that is 20-fold more potent against recombinant P. falciparum MyoA (described on lines 356-361). Given Kainomyx’s ongoing efforts we do not believe it makes sense to do any further animal experiments at this time with KNX-002. It will be more informative and ethical to undertake, e.g., dosing, PK and PD studies with the more potent and less toxic derivatives that emerge from the Kainomyx drug development program, once these compounds become publicly available. This does not diminish the importance of the proof-of-principle experiments reported here, which as the reviewer stated, “provide a strong rationale for developing new therapeutic strategies based on targeting MyoA”; rather, it makes it hard to justify doing additional animal studies with a compound that we know will soon be replaced with more potent and less toxic derivatives.
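The power analysis cited in the response on replicates (10 mice per group, 100% mortality untreated versus 50% treated) can be sketched as an exact calculation. The statistical test and significance threshold the authors used are not recoverable from the text here, so this sketch assumes a one-sided Fisher's exact test on end-of-experiment mortality at alpha = 0.05; the function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k deaths among n mice, each dying with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def fisher_one_sided_p(dead_a, dead_b, n):
    """One-sided Fisher exact p-value for two groups of n mice each.

    Tests whether group A (untreated) has more deaths than group B (treated),
    conditioning on the total number of deaths.
    """
    total_dead = dead_a + dead_b
    denom = comb(2 * n, total_dead)
    # Sum hypergeometric probabilities of tables at least as extreme as observed.
    return sum(
        comb(n, k) * comb(n, total_dead - k) / denom
        for k in range(dead_a, min(n, total_dead) + 1)
    )


def power(n, p_untreated, p_treated, alpha=0.05):
    """Exact probability of rejecting, summed over every possible outcome."""
    total = 0.0
    for dead_a in range(n + 1):
        for dead_b in range(n + 1):
            prob = binom_pmf(dead_a, n, p_untreated) * binom_pmf(dead_b, n, p_treated)
            if prob > 0 and fisher_one_sided_p(dead_a, dead_b, n) < alpha:
                total += prob
    return total


# Assumed test and alpha; group size and mortality rates are from the response.
estimated_power = power(n=10, p_untreated=1.0, p_treated=0.5)
print(f"Estimated power: {estimated_power:.3f}")
```

Under these assumptions the sketch gives a power of about 0.83, in the neighborhood of the 80% quoted; a two-sided test or a survival-based test such as the log-rank test would give a different value.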

      12) Figure 4: 2D and 3D Motility- the authors should comment on the fact that in 2D conditions with 10 uM of KNX-002, circular trajectories (one complete circle so at least 2 parasite lengths but sometimes more) largely dominate over others, whereas in absence of KNX-002 these circular trajectories are barely detectable and helical trajectories predominate. What could that mean as regard to the MyoA functional contribution to either process?

      This is an interesting question that we cannot currently answer. Perhaps helical 2D gliding requires more myosin-generated force than circular 2D gliding, but this is pure speculation at this point. Whatever the explanation, the observation is striking and we believe should be reported as it shows a clear effect of the compound on motility in the widely-used 2D trail deposition assay.

      *13) Figure 7: Besides the major point raised above for panel C, the information carried by the Figure could be stronger if an additional panel is introduced regarding the interesting assay on the preserved structural stability of the MyoA mutant over the WT MyoA (currently in SupFig7) *

      Former Suppl. Figure 7 (now Suppl. Figure 9) addresses one particular explanation for the differential effects of the mutation in the in vitro motility assay (Figure 6) and the parasite 3D motility assay (Figure 7). The data in Suppl. Figures 14A and 14B address two other possibilities. For consistency with the other figures and clarity of the narrative, we would prefer to leave the data in Suppl. Figure 9 as a supplemental figure.

      14) Material and Methods - Parasite motility assays: remove the duplicated [16] reference.

      Done.

      *15) The discussion starts with the ongoing debate on mechanisms underlying zoite motility; We found that the work of Pavlou et al. (ACS nano, 2020) should be part of the references listed there, as it brings evidence that a specific traction polar force is required probably in concert with microtubule storage energy at the focal point, a result that questions the prevailing model. *

      This was another oversight; the citation has been added (line 307).

      *16) Concerning the C3-20 and C3-21 compounds, the sentence "they have no effect on the activity of the recombinant TgMyoA (AK and GEW, unpublished data)" in the paragraph starting by "There have been only two previously..." should be refrained unless showing the results. *

      We have removed reference to this unpublished work, as suggested (lines 338-340).

      17) If possible, the authors should expand more on the effect of KNX-002 on Plasmodium falciparum and its homolog PfMyoA.

      We have expanded our discussion of these preprint data from others on lines 356-361.

      Reviewer #3

      *1) The T130A IC50 was done on the mutagenized clone 5. The authors currently don't have data showing IC50 on the independently generated T130A mutant, to see if the IC50s are similar to one another, or if there were additional resistance mutations present in clone 5. *

      Because we did not insert the T130A mutation into a fluorescent parasite background, we cannot directly compare its IC50 in the fluorescence-based growth assay to that of the line generated by chemical mutagenesis. Plaque assays do not require fluorescent parasites but, in our hands, these assays lack the sensitivity to reproducibly detect the expected subtle (50. While we agree that it would be interesting to know if the mutant generated by chemical mutagenesis contains any additional resistance-conferring mutations, not having this information does not alter the conclusion that the T130A mutation alone reduces the sensitivity of the motor to KNX-002 (Figures 6-9). See also response to Reviewer 1, comment 1; a discussion of the value of determining what other resistance mechanisms are available to the parasite for this class of compounds is now included on lines 416-422.

      *2) For 3D motility assays, it is currently unclear from the data and text what the expected maximal inhibition of motility would be; e.g., would parasites depleted of MyoA display 0% motility. Understanding the dynamic range of this assay could help clarify whether this residual 5% motility explains why parasites treated with 20 uM KNX-002 can still form small plaques. This could be achieved by referencing previous work that assesses 3D motility after depletion of a critical motility factor. *

      A small fraction of TgMyoA knockout parasites are still capable of motility in 3D (13% when normalized to wildtype for displacements > 2 μm [ref. 46]), so the dynamic range of the 3D assay for TgMyoA-deficient parasites compared to wild-type parasites is 0.13-1.0. The 13% residual motility of the TgMyoA knockout parasites is now referred to on lines 419-420. Treatment of wild-type parasites with 20 μM KNX-002 results in a fractional motility of ~0.24 compared to untreated controls (Figures 4 and 7). This less than complete inhibition compared to the knockout is not surprising, since motor activity is not completely inhibited at 20 μM compound (Figures 1 and 6) and parasite growth as assayed either by the fluorescence-based method (Figure 2A) or plaquing (Figure 8) shows greater inhibition at 40 and 80 μM compound than at 20 μM.
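One way to read these numbers against the assay's dynamic range is to rescale the treated value between the untreated control (1.0) and the knockout floor (0.13). The linear rescaling below is an illustrative assumption, not a calculation reported in the manuscript, and the function name is hypothetical.

```python
def fraction_of_max_inhibition(treated_fraction, knockout_fraction):
    """Rescale motility so 0 = no inhibition and 1 = knockout-level inhibition."""
    return (1.0 - treated_fraction) / (1.0 - knockout_fraction)


KNOCKOUT_FRACTION = 0.13  # residual 3D motility of TgMyoA knockout (from the response)
TREATED_FRACTION = 0.24   # approximate motility at 20 uM KNX-002 (from the response)

rescaled = fraction_of_max_inhibition(TREATED_FRACTION, KNOCKOUT_FRACTION)
print(f"Fraction of knockout-level inhibition reached: {rescaled:.2f}")
```

On this assumed scale, 20 μM compound reaches roughly 87% of the inhibition seen in the knockout, consistent with the statement that inhibition at this concentration is incomplete.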

      *3) It would be informative for the authors to discuss the rationale for the selected treatment regime. Since many drug-treatments involve daily dosing, was the two-dose regime based on poor tolerance of the compound in mice or other considerations? *

      See response to reviewer 2, comment 11; the rationale for this dosing regimen has now been added to the Methods (lines 715-721).

      *4) Track length is not considered as a parameter in the filament sliding assays (Fig. 6) or the 3D motility assays (Fig. 7). These may be valuable parameters for the authors to examine; however, the time frames analyzed might be insufficient to capture track lengths. Could the authors include analyses of track lengths or discuss the technical limitations of their assays? *

      In the in vitro motility assays, almost all of the actin filaments move for the entire 60s of video recording so trajectory length is directly proportional to speed and therefore does not provide any additional information. For the parasite 3D motility assay, we have added a new figure (Suppl. Figure 12) showing the effect of the compound on the displacement of wild-type and T130A parasites, along with new text describing these data (lines 269-273).

      *5) When discussing the minor discrepancies between the results with recombinant protein and parasite motility, the authors could consider the relative concentration of motors in the pellicle; i.e. it might be necessary to inhibit a greater % of all the motors to truly block motility, perhaps consistent with the higher compound concentrations needed to affect parasite motility. *

      This possible explanation has been added to the Discussion (lines 403-406).

      6) The authors should include the IC50 data for all 5 KNX-002 resistant clones in the supplementary data. While the 5/26 clones showed >2.5-fold increase in IC50 for KNX-002, it's unclear how the IC50 of the single clone harboring the T130A MyoA mutation compared to the other resistant clones.

      A figure has been added showing these data (new Suppl. Figure 8).

      *7) For plaque assays, the authors should indicate how much DMSO was used for 0 KNX-002 conditions. It should presumably be the corresponding concentration at 80 µM drug and if not, that control should be performed to account for effects of DMSO at higher concentrations at all drug concentrations tested. *

      In all experiments involving treatment with compound, the compound was serially diluted in DMSO to the appropriate range of concentrations prior to dissolving it in aqueous buffer for the experiment itself, enabling an equivalent amount of DMSO to be added to all samples in that experiment, including the DMSO-only vehicle controls. This clarifying statement and the final range of DMSO concentrations in each of the different types of experiments have been added to the Methods section (lines 486-491).

      *8) Authors should indicate the origins of their hexokinase for counter screens. *

      The hexokinase used was from Millipore Sigma (#H6380); the supplier has been added on line 507.

      *9) Authors should indicate µM on graphs. *

      The μM label has been added to the graphs where it was missing (Figures 2 and 5, Suppl. Figures 4, 6, 8).

      *10) In Figures 2A, 5A, and 5B, the use of colored lines (e.g., of different hues) could make the graphs more legible. *

      We have experimented with color as a way to discriminate between the different doses on these graphs, but found the use of 8 different colors to be more distracting than helpful. The color-coding approach would be even less useful for readers who have color vision deficiency (including one of our authors). Symbol groupings have been added to the right of all growth curves to improve the legibility of the graphs.

      *11) In Figure 2C, it isn't clear which cell line was treated with sodium azide to generate the positive control. *

      It was the HFF cells that were treated with azide as a positive control; Figure 2C has been modified to make this clear.

      *12) In the discussion, "a" is missing in the phrase "...mutation is likely to have long range structural impact..." *

      Done.

      *13) The abbreviation of species (spp.) should be followed by a period. *

      Done.

      Other

      Further SAR analyses using an optimized actin-dependent myosin ATPase assay resulted in minor changes to Suppl. Figure 3 and Figure 3, with no significant changes to the conclusions. The text has been modified accordingly (lines 155-161, 178-180).

      All other changes to the manuscript not noted above were editorial in nature, made to either improve clarity or correct minor errors in the previous version.

    1. Reviewer #2 (Public Review):

      The authors aim to make a reliable plate-based system for imposing drought stress (which for experiments like this would be better referred to as low water potential stress). This is an admirable goal as a reliable experimental system is key to conducting successful low-water potential experiments and some of the experimental systems in use have problems. They compare several treatments but seem to be unaware that such comparisons need to be based on the measurement of water potential as the fundamental measure of how severe the level of water limitation is. Only by comparing things at the same water potential can one determine if the methods used to impose the low water potential are introducing confounding factors. In this manuscript, they compare several agar-plate-based treatments to what they view as a baseline experiment of plants subjected to soil drying. However, that baseline soil drying (vermiculite drying, to be precise) experiment illustrates many of the problems present in the molecular drought literature in that they give no information on plant or soil water potential or water content. Thus, there is no way to know how severe the drought stress was in that experiment and no way for any other lab to reproduce it. It is directly akin to doing a heat stress experiment and not reporting the actual temperature.

      They compare transcriptome data from this soil drying experiment to transcriptome data from agar plates with PEG, mannitol or salt added. However, this comparison is problematic, because none of the treatments being compared are at the same water potential (as mentioned above). Also, the PEG-infused agar plates have limitations in that no buffer is added and it is not clear that anything is done to check or control the pH. Adding PEG to the solution will reduce the pH. Thus, in their unbuffered PEG plates, the plants are almost certainly exposed to low pH stress and this can explain the supposed difference they observe between PEG and other treatments, especially since the plants are left in such stressful pH conditions for a relatively long period. It is also problematic that the comparison between soil drying and plate-based treatments is at different times (5 vs 14 days). They also show an over-reliance on the GO annotations of drought-induced gene expression. This GO annotation is based on experiments using very severe stress for a short time period. It is notorious for not accurately reflecting what happens on longer-term exposure to more moderate levels of low water potential stress. Thus, for example, we would not expect many of the canonical drought regulation genes (RD29A and similar genes) to be upregulated in the longer-term treatments, as their expression is induced rapidly but also rapidly declines back to near baseline as the plant acclimates to the low water potential stress.

      The authors have not always considered literature that would be relevant to their topic. For example, there are a number of studies that have reported (and deposited in the public database) transcriptome analysis of plants on PEG-plates or plants exposed to well-controlled, moderate severity soil drying assays (for the latter, check the paper of Des Marais et al. and others; for the former, Verslues and colleagues have published a series of studies using PEG-agar plates). They also overlook studies that have recorded growth responses of wild type and a range of mutants on properly prepared PEG plates and found that those results agree well with results when plants are exposed to a controlled, partial soil drying to impose a similar low water potential stress. In short, the authors need to make such comparisons to other data and think more about what may be wrong with their own experimental designs before making any sweeping conclusions about what is suitable or not suitable for imposing low water potential stress.

      To solve the problem of using these other systems to impose low water potential stress, the authors propose the seemingly logical (but overly simplistic) idea of adding less water to the same mix of nutrients and agar. Because the increased agar concentration does not substantially influence water potential (the agar polymerizes and thus is not osmotically active), what they are essentially doing is using a concentrated solution of macronutrients in the growth media to impose stress. This is a rediscovery of an old proposal that concentrated macronutrient solutions could be used to study the osmotic component of salt stress (see older papers of Rana Munns). There are also effects of using very hard agar that are of unclear relationship to actual drought stress and low water potential. Thus, I see no reason to think that this would be a better method to impose low water potential.

    2. Author Response:

      eLife assessment

      This work is an attempt to establish conditions that accurately and efficiently mimic a drought response in Arabidopsis grown on defined agar-solidified media - an admirable goal as a reliable experimental system is key to conducting successful low water potential experiments and would enable high-throughput genetic screening (and GWAS) to assess the impacts of environmental perturbations on various genetic backgrounds. The authors compare transcriptome patterns of plants subjected to water limitation imposed using different experimental systems. The work is valuable in that it lays out the challenges of such an endeavor and points out shortcomings of previous attempts. However, a lack of water relations measurements, incomplete experimental design, and lack of critical evaluation of these methods in light of previous results render the proposed new methodology inadequate.

      We thank eLife for the initial assessment and comments on our work. In our revised manuscript we plan to address the main concerns raised by reviewers. Specifically, we plan to perform water relations measurements for all our treatment assays, as well as explore the separate effects that agar hardening and nutrient concentration have in our low-water agar assay. We will also provide a more in-depth critical review of our results compared to previously published results.

      Reviewer #1 (Public Review):

      High-throughput genetic screening is a powerful approach to elucidate genes and gene networks involved in a variety of biological events. Such screens are well established in single-celled organisms (i.e. CRISPR-based K/O in tissue culture or unicellular organisms; screens of natural variants in response to drugs). It is desirable to extend such methodology, for example to Arabidopsis where more than 1000 ecotypes from around the Northern hemisphere are available for study. These ecotypes may be locally adapted and are fully sequenced, so the system is set up for powerful exploration of GxE. But to do so, establishing consistent "in vitro" conditions that mimic ecologically relevant conditions like drought is essential. 

      The authors note that previous attempts to mimic drought response have shortcomings, many of which are revealed by 'omics-type analysis. For example, three treatments thought to induce osmotic stress (the addition of PEG, mannitol, or NaCl) fail to elicit a transcriptional response that is comparable to that of bona fide drought. As an alternative, the authors suggest using a low-water agar assay, which, in the things they measure, does a better job of mimicking osmotic stress responses. The major problem with this assay, however, is that it introduces another set of issues; for example, changing agar concentration can lead to mechanical effects, as illustrated nicely in the work of Olivier Hamant's group.

      We thank the reviewer for their comments. We hypothesize that our low-water agar assay is able to replicate drought gene expression patterns through a combination of hardened agar and higher nutrient concentration. However, we did not explore the separate effects each of these factors may play in eliciting such responses. Thus, in our revised manuscript, we will explore what role the mechanical effects of changing agar concentration have on root gene expression. However, we suspect that the mechanical effects introduced by hard agar do not introduce another issue per se, but in fact may help with replicating the transcriptional effects seen under drought.

      Reviewer #2 (Public Review):

      […] The authors have not always considered literature that would be relevant to their topic. For example, there are a number of studies that have reported (and deposited in the public database) transcriptome analysis of plants on PEG-plates or plants exposed to well-controlled, moderate severity soil drying assays (for the latter, check the paper of Des Marais et al. and others; for the former, Verslues and colleagues have published a series of studies using PEG-agar plates). They also overlook studies that have recorded growth responses of wild type and a range of mutants on properly prepared PEG plates and found that those results agree well with results when plants are exposed to a controlled, partial soil drying to impose a similar low water potential stress. In short, the authors need to make such comparisons to other data and think more about what may be wrong with their own experimental designs before making any sweeping conclusions about what is suitable or not suitable for imposing low water potential stress.

      To solve the problem of using these other systems to impose low water potential stress, the authors propose the seemingly logical (but overly simplistic) idea of adding less water to the same mix of nutrients and agar. Because the increased agar concentration does not substantially influence water potential (the agar polymerizes and thus is not osmotically active), what they are essentially doing is using a concentrated solution of macronutrients in the growth media to impose stress. This is a rediscovery of an old proposal that concentrated macronutrient solutions could be used to study the osmotic component of salt stress (see older papers of Rana Munns). There are also effects of using very hard agar that are of unclear relationship to actual drought stress and low water potential. Thus, I see no reason to think that this would be a better method to impose low water potential.

      We thank the reviewer for their comments. In our revised manuscript, we will address points regarding plant and soil water potential; similar concerns were also raised by Reviewers 1 and 3. We note that we report vermiculite water content in Supplementary Table 4.

      We would like to clarify that both the PEG media and overlay solution were buffered. We did not include this in the written description of the methods, but will do so in our revised manuscript.

      We agree with the reviewer’s concern that it may be problematic to compare the transcriptomic profiles of seedlings and mature plants. In light of this, we plan to explore what effects our treatment media have on mature rosettes.

      We note that we do not claim that PEG is unable to produce low-water potential responses similar to partial soil drying. Indeed, we indicate that it is a good technique for eliciting phenotypes comparable to drought at the physiological level (line 48). Rather, we claim that PEG is unable to produce gene expression responses that are sufficiently similar to partial vermiculite drying.

      Reviewer #3 (Public Review):

      […] The authors observed that gene expression responses of roots in their 'low-water agar' assay resembled more closely the water deficit in pots compared to the PEG, mannitol, and salt treatments (all at the highest dose). In particular, 28 % of PEG led to the down-regulation of many genes that were up-regulated under drought in pots. Through GO term analysis, it was pointed out that this may be due to the negative effect of PEG on oxygen solubility since downregulated genes were over-represented in oxygen-related categories. The data also show that the treatment with abscisic acid on plates was very good at simulating drought in roots. Gene expression changes in shoots showed generally a high concordance between all treatments at the highest dose and water deficit in pots, with mannitol being the closest match. This is surprising, since plants grow in plates under non-transpiring conditions, while a mismatch between water loss by transpiration and water supply via the roots leads to drought symptoms such as wilting in pot and field-grown plants. The authors concluded that their 'low-water agar' assay provides a better alternative to simulate drought on plates.

      Strengths: 

      The development of a more robust assay to simulate drought on plates to allow for high-throughput screening is certainly an important goal since many phenotypes that are discovered on plates cannot be recapitulated in soil. Adding less water to the media mix and thereby increasing agar strength and nutrient concentration appears to be a good approach since nutrients are also concentrated in soils during water deficit, as pointed out by the authors. To my knowledge, this approach has not specifically been used to simulate drought on plates previously. Comparing their new 'low-water agar' assay to popular treatments with PEG, mannitol, salt, and abscisic acid, as well as plants grown in pots on vermiculite, led to a comprehensive overview of how these treatments affect gene expression changes that surpasses previous studies. It is promising that the impact of 'low-water agar' on the shoot size of 20 diverse Arabidopsis accessions shows some association with plant fitness under drought in the field. Their methodology could be powerful in identifying a better substitute for plate-based high-throughput drought assays that have an emphasis on gene expression changes.

      Weaknesses: 

      While the authors use a good methodological framework to compare the different drought treatments, gene expression changes were only compared between the highest dose of each stress assay (Fig. 2B, 3B). From Fig. 1F it appears that gene expression changes depend significantly on the level of stress that is imposed. Therefore, their conclusion that the 'low-water agar' assay is better at simulating drought is only valid when comparing the highest dose of each treatment and only for gene expression changes in roots. Considering how comparable different levels of stress were in this study leads to another weakness. The authors correctly point out that PEG, mannitol, and salt are used due to their ability to lower the water potential through an increase in osmotic strength (L. 45/46). In soils, water deficit leads to lower water potential, due to the concentration of nutrients (as pointed out in L. 171), as well as higher adhesion forces of water molecules to soil particles and a decline in soil hydraulic conductivity for water, which causes an imbalance between supply and demand (see Juenger and Verslues, The Plant Cell 2022 for a recent review). While the authors selected three different doses for each treatment that are commonly used in the literature, these are not necessarily comparable on a physiological level. For example, 200 mM mannitol has an approximate osmotic potential of around -5 bar (Michel et al. Plant Physiol. 1983) whereas 28 % PEG has an osmotic potential closer to -10 bar (Michel et al. Plant Physiol. 1973). It also remains unclear how the increase in agar concentration versus the increase in nutrient concentration in the 'low-water agar' affect water potentials. 
      For these reasons it cannot be known whether a better match of the 'low-water agar' at the 28% dose to water deficit in pots for roots in comparison to the other treatments is due to a good match in stress levels with the 'low-water agar' or adverse side effects of PEG, mannitol, or salt on gene regulation. Lastly, since only two biological replicates for RNA sequencing were collected per treatment, it is not possible to know how much variance exists and if this variance is greater than the treatments themselves.

      We thank the reviewer for their comments. In our statistical analyses, we found that dose-responsive genes (as fit by a linear model) were very similar to those genes found differentially expressed at the highest dose. Thus, for clarity, we decided to simply present the genes differentially expressed at the highest dose. We see now that this might have been an oversimplification. In our revised manuscript, we will present genes that are dose responsive across the range of treatment doses, thus providing more evidence that lower doses of low-water agar are also capable of simulating drought (as is suggested by overlap analysis of Figure 2A).

      Additionally, we will explore the osmotic potential of each of our different assays to provide a better benchmark of how comparable our treatments are (as similarly requested by Reviewers 1 and 2). Lastly, to address concerns regarding the size of variance in gene expression, we will sequence a third replicate of RNA.

    1. Plyometric Training

      Numerous books and articles have been written about plyometric training for athletes. However, very few offer detailed progressive programs that take into account the need for a system of training that can be applied to a broad range of athletes. Instead, you get a smorgasbord of exercises and opinions. Although the works of Chu, Radcliffe and Gambetta were outstanding at the time of their writing, very little has been written in the last ten years that connects our current knowledge of functional training with how to design and implement a system of plyometric exercises. In order to fully understand plyometrics, we must look at basics like terminology, volume and frequency.

      Terminology

      The first topic that needs to be addressed in plyometric training is terminology. The language of plyometrics must be universal so that any coach or athlete can view the program of any other coach or athlete and understand the exercises, ideally without photos or video. The discrepancies in terminology were first brought to my attention by Mike Clark of the National Academy of Sports Medicine. Clark pointed out in a 2000 lecture that many coaches used names to describe plyometric exercises that were not properly descriptive of the movement.

      Skip- single leg takeoff with two foot contacts

      Although many might view these descriptions as simple and common sense, I realized that I had inadvertently misclassified exercises. We had always referred to two-legged jumps over hurdles as hurdle hops. I believe that this was and still is a common error among many strength and conditioning and track coaches. Clark made the facetious point that "bunnies don't hop, they jump."

      Many might view this as a minor discrepancy, but a call from a coach in California made me realize the cost of "minor discrepancies".

      The coach in question called me and said "Boy, are your guys great athletes, I can't get one guy on my team to do those thirty inch hurdle hops you guys do." I quickly realized that my "minor discrepancy" had caused this coach to try to perform an exercise with one leg that we had been doing with two. He had his athletes hurdle hopping as the program indicated while I had mine hurdle jumping.

      A small detail? Maybe.

      The reality is that an athlete could have been badly injured because of my incorrect use of descriptive terminology.

      Categories of Exercises

      After looking at terminology, the next area to examine is the categories of the different types of jumps, hops and bounds. I believe that this is the major failing of the most popular commercially available ACL injury prevention programs.

      The two most popular, the Santa Monica PEP program and the Sportsmetrics program, focus almost exclusively on jumps with no emphasis on bounds or hops. The reality is that the mechanism of the ACL tear is most frequently in a single leg hop (actually a redundancy as the term hop denotes single leg) or bound scenario, not a double leg jump.

      A sound plyometric program must include a balance of exercises from each terminology category. Athletes must perform a balance of jumps, hops and bounds. In addition, hops must be done forward, at 45-degree angles, and potentially side to side. It should be noted that hopping medially and hopping laterally are entirely different in both the muscles stressed and the injury prevention potential.

      Medial hops (hops toward the midline) are more difficult and provide much needed stress to the hip stabilizers.

      Volume

      One question that begs to be answered revolves around the volume of jumps. Volume is measured by the number of jumps per session and has most frequently been measured by the number of foot contacts. Recently we have seen lots of recommendations for what are being referred to as extensive plyometrics. The concept basically advocates a high volume of "little jumps" to build up to more intense plyometrics. My feeling is that the term extensive plyometrics is a bit of an oxymoron. The whole idea of plyometrics is to facilitate explosive contractions with the eventual goal of reduced ground contact time.

      One of the major failings of many plyometric programs is too high a number of foot contacts. Extensive plyometrics not only doesn't solve this problem but more than likely exacerbates it. We also have to ask whether some of the "extensive plyometrics" recommended are valuable, necessary, or even really plyometrics. Although in a technical sense all movements involve the stretch-shortening cycle, I'm not sure that jump rope or line hops prepare the tissue properly for the more intense activity to follow.

      We try to keep the number of jumps, hops and bounds at roughly 25 per day and 100 per week and never use extensive plyometrics in a preparatory phase.

      Intensity

      Instead, we use a relatively constant volume of drills that progress in intensity. The intensity of plyometric training is difficult to measure and really involves understanding the difference between a program of controlled jump training and a true plyometric program. Many exercises that we consider to be plyometric in nature are actually simply jumping exercises. A box jump is really just a jump. In order to be "truly" plyometric there needs to be a reactive component. For that reason, our program is probably better described as a "progression to plyometrics" program.

      Controlling the intensity of plyometric exercises is actually based on controlling how gravity is allowed to enter the picture and to act on the body. Jumps up to a box or hops up to a box are the lowest intensity as they involve a strong concentric contraction but minimize eccentric stress by not allowing the body to "in effect" come down. With box jumps and box hops, what goes up does not really come down. The body is accelerated up to a height but not allowed to travel back down. The athlete jumps up and steps down, thereby effectively negating the effect of gravity as an accelerating force.

      What we do know is that mistakes in plyometric progressions will manifest themselves primarily as patellofemoral issues. This could be due to tendon loading issues or to overstress of the patellofemoral joint, but in either case the issue is too much jumping (or hopping) and/or drills that are not properly progressed in intensity. Volume is frequently the enemy, particularly in athletes who already experience a high volume of foot contacts in practice or training. Professor Jill Cook points out that peak tendon stress is at the point of switching from eccentric to concentric contraction. The goal of our progression is to prepare for that point in a more controlled and thoughtful manner.

      Chu's early work classified intensity of jumps based on whether the jumps were done in place or covered horizontal distance. Although this early quantification system of in-place, short, and long was state of the art in the eighties, our increased analysis of the effects of physics on the body leads us to a system that I believe better describes the effect of jumps. I prefer classifying jumps as gravity reduced or gravity enhanced and then moving to semi-elastic (bounce) and elastic (rebound or continuous). Early plyometric descriptions left no room for jumps that were actually not plyometric in nature.

      The following videos illustrate our progression

      Frequency

      One of the first questions when discussing frequency and plyometrics relates to the NSCA position statement. I find it intriguing that the NSCA once published such a short-sighted piece. In the initial position statement the NSCA took the position that plyometrics should only be done twice per week. This has since been amended to read that the same joints should not be worked on consecutive days. The NSCA takes no position on intensity or volume other than to indicate that depth jumps may be too intense for larger athletes. My feeling is that plyometrics can be performed up to four times per week but must be divided into linear and multi-directional days. Linear plyometrics involve pure sagittal plane jumps and hops, while multi-directional plyometrics work in the frontal and transverse planes.

      Transverse Plane Plyometrics

      I believe that athletes must do decelerative work in the transverse plane but think that transverse plane jumps and hops must be approached with great care. It must be noted that in many cases the transverse plane exercises recommended look very much like the injury mechanisms we are trying to avoid.

      Age/Level of Experience

      Another interesting point in the NSCA statement relates to the development of a proper strength base for plyometrics. No one has defined what proper is. Previously, foolish, short-sighted recommendations were made regarding strength base. Some writers recommended a certain number of weeks of strength training prior to beginning a plyometric program; others recommended a certain strength level prior to undertaking a plyometric program. It is my feeling that strength training and plyometric training can be done concurrently, provided common sense is used.

      The reality is that young athletes begin intense plyometric programs without a strength training base or a required strength level every day. Both gymnastics and figure skating involve intense plyometric type activity from very young ages. The key is to manage the effect of gravity on the body. The keys to a plyometric program are simple:

      Good plyometrics are quiet. Failure to land quietly indicates that the athlete lacks eccentric strength and that the exercise is inappropriate. All that may be necessary is to decrease the height of the obstacle involved. Athletes should only jump onto boxes that they can land on quietly.

      PS: Athletes should always jump from and land in the same position.

    1. Reviewer #3 (Public Review):

      This manuscript reveals that opioid suppression of breathing could occur via multiple mechanisms and at multiple sites in the pontomedullary respiratory network. The authors show that opioids inhibit an excitatory pontomedullary respiratory circuit via three mechanisms: 1) postsynaptic MOR-mediated hyperpolarization of KF neurons that project to the ventrolateral medulla, 2) presynaptic MOR-mediated inhibition of glutamate release from dorsolateral pontine terminals onto excitatory preBötC and rVRG neurons, and 3) postsynaptic MOR-mediated hyperpolarization of the preBötC and rVRG neurons that receive pontine glutamatergic input.

      This manuscript describes in detail a useful method for dissecting the relationship between the dorsolateral pons and the rostral medulla, which will be useful for various researchers. It's also great to see how many different methods have been applied to improve the accuracy of the results.

      1. Relationship between the dorsolateral pons and rostral ventrolateral medulla.

      The methods of this paper show a very precise relationship between the presence of opioid receptors and the dorsolateral pons and rostral ventrolateral medulla. For opioid receptors, based on the expression of Oprm1, the use of genetically modified mice with anterograde or retrograde viruses carrying additional fluorescent colors showed both anterograde and retrograde projections, revealing a relationship between the dorsolateral pons and rostral ventrolateral medulla.

      For example, to visualize dorsal pontine neurons expressing Oprm1, Oprm1Cre/Cre mice were crossed with Ai9tdTomato Cre reporter mice to generate Ai9tdT/+ oprm1Cre/+ mice (Oprm1Cre/tdT mice) expressing tdTomato in neurons that also express MOR at any point during development. In addition, the retrograde virus encoding Cre-dependent expression of GFP (retrograde AAV-hSIN-DIO-eGFP) was injected into the respiratory center of Oprm1Cre/+ mice and into the ventral respiratory neuron group, showing that KF neurons expressing Oprm1 project to the respiration-related nucleus of the ventrolateral medulla.

      However, although the authors have accounted for it, the virus may spread beyond the intended injection site. It is important to note that the injection site was marked with an anterograde virus encoding a different fluorescent color, mCherry, and that the extent of the injection was quantified, which is excellent as a control experiment.

      In addition, the respiratory center has recently been thought to involve not only preBötC but also pFRG, so describing the relationship with pFRG would be important from the viewpoint of the effect on the respiratory center and the effect on the rhythm.

      2. Electrophysiological approaches and useful methods for target neurons

      Oprm1Cre/+ mice), the authors found abundant Oprm1 + projections in the preBötC region of the medulla oblongata (respiratory center) and sought to determine whether presynaptic opioid receptors inhibit glutamate release from KF terminals to excitatory preBötC and rVRG neurons, since KF neurons in the dorsolateral pons projecting to the ventrolateral medulla oblongata had been shown to be glutamatergic and to have opioid receptors. The authors injected a channelrhodopsin-2-encoding virus (AAV2-hSin-hChR2 (H134R) -EYFP-WPRE-PA) into the dorsolateral pontine KF of vglu2Cre / tdT mice and performed whole-cell voltage-clamp recordings from tdTomato-expressing, excitatory vglu2-expressing preBötC and rVRG neurons contained in acute brain slices. Moreover, both opioid-sensitive and opioid-insensitive KF neurons that project to preBötC and rVRG were visible and recorded using FluoSpheres, which are much more visible in acute brain sections than viral retrograde tracers.

      1) The response to optogenetic stimulation of the KF terminals was blocked by the AMPA-type glutamate receptor antagonist DNQX. In excitatory preBötC and rVRG neurons, the terminals from the dorsal pontine KF were activated by optogenetic stimulation, and the KF synapses to the medullary respiratory neurons were found to be monosynaptic because oEPSCs (optically stimulated EPSCs) were abolished by TTX but were subsequently restored by the application of the K-channel blocker 4AP. Thus, KF neurons have been shown to send monosynaptic glutamatergic projections to excitatory ventrolateral medullary neurons using terminal optogenetic stimulation and receptor and channel inhibitors.

      2) To determine whether opioids inhibit glutamate release from KF terminals to medullary respiratory neurons, we recorded a pair of oEPSCs (50 ms stimulus interval) from excitatory preBötC and rVRG neurons and applied an endogenous opioid agonist, [Met5] enkephalin (ME), to the perfusion solution. ME is preBötC and rVRG neurons, indicating inhibition of glutamate release by presynaptic MOR PPR. Thus, presynaptic opioid receptors have been shown electrophysiologically to inhibit glutamate release from KF terminals to excitatory pre-BötC and rVRG neurons.

      3) Whether excitatory preBötC or rVRG neurons that themselves receive opioid-sensitive glutamatergic synaptic inputs from KF are hyperpolarized by opioids can be determined by monitoring their holding currents.

      4) Since FluoSpheres are much more visible in acute brain sections than viral retrograde tracers and do not spread from the injection site, the authors chose to record from KF neurons retrogradely labeled with FluoSpheres injected into preBötC or rVRG in wild-type mice, allowing them to label KF neurons regardless of Oprm1 expression status and determine the projection patterns of both Oprm1 + and Oprm1- neurons. Whole-cell voltage-clamp recordings from fluorescent KF neurons contained in acute brain slices show that the presence of ME-mediated outward current can identify KF neurons that express functional MORs and are opioid-sensitive, compared to neurons that lack ME-mediated outward current (insensitive). This suggests that both opioid-sensitive and opioid-insensitive KF neurons project to preBötC and rVRG.

      Although much has been written about the relationship between KF neurons and medulla oblongata neurons and their being glutamatergic neurons, detailed descriptions of the recorded neuronal firing patterns are lacking. The authors should describe what firing pattern the recorded neurons had. Without this, it is not possible to tell whether a recorded neuron is a respiratory neuron or another tonically firing neuron, so I don't think its involvement in the respiratory rhythm can be discussed.

      3. Compare the distribution of neurons

      To examine the distribution of Oprm1+ and Oprm1- dorsolateral pontine neurons projecting to the ventrolateral medulla, the authors injected retrograde AAV-hSyn-DIO-eGFP and retrograde AAV-hSyn-mCherry into preBötC and rVRG of Oprm1Cre/+ mice. Oprm1-expressing projection neurons expressed both GFP and mCherry, whereas projection neurons not expressing Oprm1 expressed only mCherry.

      In addition, rostral glutamatergic KF neurons express FoxP2, while MOR-expressing glutamatergic neurons in the lateral parabrachial region that project to the forebrain express the CGRP-encoding gene, Calca. In view of this, the authors performed immunohistochemistry for FoxP2 and CGRP on Oprm1+ KF neurons projecting to the ventrolateral medulla and found that Oprm1+ medulla-projecting KF neurons expressed FoxP2 but not CGRP. CGRP expression was not observed in rostral KF or in medulla-projecting Oprm1+ neurons and neurites, but was strong in lateral parabrachial neurons and their axonal projections. Can the authors describe the relationship between CGRP and FoxP2 expression and the recorded neurons?

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank both reviewers for their constructive criticism and the insightful comments on our manuscript. Reviewer 1 states that:

      “The strength of this manuscript lies in its comprehensive analysis of Bim1 function, the quality of the results and that the experiments are generally well controlled and interpreted.”

      And “the findings of this comprehensive analysis are of great value to the microtubule field, especially for people working in budding yeast.”

      While Reviewer 2 adds:

      “The current study is indeed rich with new insights into the mechanisms by which these molecules function, and will no doubt prove valuable to a number of people in the microtubule/motor/yeast mitosis fields. As someone who is interested in and studies mitosis in budding yeast, I found the study to be interesting.”

      Both reviewers conclude that:

      “…there are useful data in the manuscript that make this an important contribution and that it should definitely be published”

      Both reviewers raised two major areas of concern: 1. A confusing overall structure makes the study hard to follow. 2. A clearer distinction needs to be made between what has already been reported in the literature and what are new insights provided by this study. In this regard, the appropriate citations need to be added at various positions throughout the manuscript.

      In this full revision, we have addressed these major points of criticism of the reviewers as follows:

      We have re-organized and re-focused the manuscript to make it more accessible and easier to follow for the reader. We have followed a suggestion from reviewer 1 and now present all experiments characterizing mitotic spindle phenotypes and how they can be suppressed consecutively in Figures 2-5 and then finish the manuscript with the characterization of the spindle orientation phenotype. This way of ordering by biological pathway allows for a better flow of the manuscript.

      Throughout the text, we have added citations to better indicate the previous state of knowledge and how the presented experiments either confirm or extend previous findings in the field. This helps to put our current study into an overall perspective.

      In addition, we have addressed the specific points raised by both reviewers in full. Please see below our point-by-point answer.

      Reviewer 1:

      There is already a huge body of published information on mitotic spindle positioning via the Kar9 and dynein pathways that grew since the late 1990s. The genetic relationships and molecular interactions between the components of these 2 pathways are well studied (many studies, including Liakopoulos et al. 2003, are not cited by the authors). The authors should make sure to cite and compare to the relevant primary literature when they report findings that have been described before. This will help to distinguish novel findings from validation of previous results.

      We have added relevant citations throughout the manuscript, please see below.

      "The strict dependence of Kar9 and Cik1-Kar3 on the presence of Bim1, as well as the different effects of bim1Δ on nuclear and cytoplasmic Bik1, may reflect the formation of stable complexes between Bim1 and these binding partners in cells." I believe this has already been shown (Kumar et al., 2021 and Manatschal et al., 2016). There are several other instances as well where additional literature should be cited, for example Gardner et al., 2008 and Gardner et al. 2014.

      We have now cited the Manatschal and Kumar papers in this section of the revised manuscript. We have also cited the mentioned Gardner papers later in the manuscript.

      The selection of targets to study in figure 1 doesn't seem to follow the listed criteria. Many proteins included in the study were not found by IP-MS, but some perfect targets according to the listed criteria like Duo1 were not included in the study. In addition, there are more sophisticated ways of finding Bim1 binding motifs in the literature (https://doi.org/10.1016/j.cub.2012.07.047). I suggest the authors declare that they rationally chose to study 21 proteins of interest but remove the claim that their approach was systematic.

      We have changed the wording accordingly and removed the claim of systematic target selection.

      Much of the microscopy data was acquired after release from alpha factor arrest. What is the reason for this perturbation? An exponentially growing culture should mostly consist of mitotic cells anyway. Since this treatment affects cell size and potentially protein levels/concentrations, testing its influence on spindle position as well as levels on MTs for the most relevant proteins of interest would be important to exclude introduction of artifacts.

      In principle that’s correct, but using synchronized cultures has the great advantage that mitotic timing and all the parameters related to it (spindle length etc.) can be quantified much better, and we obtain a larger N and thus better statistics using this approach. In a typical log-phase culture only one third of the cells are in mitosis, and these cells are in very different states of mitosis. Observation times are limited due to fluorescence bleaching and low signal intensity. We therefore feel the benefits of alpha-factor release outweigh the problems, and we compare all mutants under the same conditions.

      Some of the results obtained from bim1Δ cells are a challenge to interpret due to the wide range of processes that involve Bim1 and therefore the potential for many off-target effects, including a global change in microtubule dynamical behavior in both the cytoplasm and the nucleus that will influence the length distributions and microtubule lifetime (and thus number). The authors must carefully consider these caveats.

      We agree in principle and have therefore not only characterized the bim1 deletion, but also more specific bim1 mutants. We also show that some aspects of the bim1 delta phenotypes, but not others, can be rescued by different strategies.

      The results section on page 12 refers to phenotypes of kar9 delete cells with respect to Bim1-GFP on cytoplasmic microtubules. In the figure 3D,F I only found data for Kar9-AID, though. The authors should refer to supplementary figure 5A or even better include quantification similar to figure 3F.

      We have corrected this in the revised text. We refer to the Kar9-AID, for which we have the quantification.

      The observation that cytoplasmic Bim1 localization depends on interaction with its cargo Kar9 (figure 3 + 7) fits into the model that Kumar et al (https://doi.org/10.1016/j.str.2021.06.012) proposed in which Kar9 oligomerization is required for its Bim1 dependent localization to microtubules. It would be valuable to point that out.

      We have now included a sentence that our findings support this model and added the respective citation.

      I don't fully understand the model proposed in Figure 5H and discussion page 26. Based on figure 5E, it does not look like there is a higher concentration of Bik1 along the lattice in bim1 delete. So how would Bik1 increase Kip2 processivity if its levels are only increased due to a MT length change? If Kip2 was not fully processive, you would rather expect to see less of it at the tip of a longer microtubule in bim1 delete. The model suggested by Chen et al (https://doi.org/10.7554/eLife.48627.001) suggests that Kip2 only gets loaded at the minus-end and processively walks towards the +end without falling off. Are the authors suggesting that bim1 deletion changes this behavior?

      We have rephrased this section in results and discussion and more clearly state that there is no increase in Bik1 per MT length unit in the bim1 deletion. We have amended the discussion and grant that we currently cannot explain by which molecular mechanisms Bik1 may contribute to the observed increase in Kip2 plus-end localization under conditions of a bim1 deletion.

      I don't see evidence for independent pools of Bik1 in the cytoplasm and nucleus as claimed on top of page 21. Total Bik1 levels on cytoplasmic microtubules seem to be well explained by their length. Please explain better or remove the statement.

      We have removed the respective statement from the revised manuscript.

      The experiments in supplementary figure 7B are difficult to interpret. The localization on cytoplasmic microtubules is different, but probably explained by the formation of Bim1 heterodimers. Therefore this experiment is difficult to interpret and should be removed.

      As requested, we have removed this experiment from the revised manuscript.

      Top of page 24: Kar9 localization in metaphase depends exclusively on SxIP, not on LxxPTPh (Manatschal 2016). The paragraph should be removed as it is not supported by published data or sufficiently by the authors to merit the conclusion.

      We have reformulated this to avoid a misunderstanding. We merely show that in the context of the artificial GCN4 construct a fragment just including the LxxPTPh motif is sufficient for Bim1-dependent localization to microtubules in nucleus and cytoplasm. This makes no statement about localization determinants of the authentic Kar9 protein.

      Top of page 26: The genetic interactions between the Kar9 pathway and the dynein pathway were already well known before this work. Please reformulate accordingly.

      We have re-written this section and introduce the two pathways with the respective citations in the very beginning of the section before describing the experiments.

      Page 27, second paragraph: There is no selective pressure to evolve compensation mechanisms for gene deletions. I suggest the authors consider that Kar9 and dynein are partially redundant, with Kar9 acting to position the spindle prior to metaphase and dynein to maintain spindle position in the mother and bud compartments in late metaphase and anaphase. The authors should consider the quantitative analysis of Kar9 and dynein dependent spindle positioning reported in Shulist et al. 2017 and the method for analysis of spindle length and position in 3D in Meziane et al. 2021.

      We have rephrased the section on the partially redundant Kar9 and Dynein pathways. See below our answer for measuring spindle length.

      In addition, it is not clear to me which results suggest that the relocalization of Bik1 is required in the bim1 delete. Why would wild type levels not be sufficient for dynein pathway function? The authors have not conclusively shown that nuclear migration relies on upregulating the dynein pathway in bim1Δ cells. If there is no supporting data, the paragraph should be removed.

      In this revised manuscript we have phrased our observations more carefully and acknowledge the limitations regarding molecular insights. We present indications for increased levels of Dynein-Dynactin pathway components at plus-ends in the bim1 deletion cells, but it is indeed unclear whether an increased Bik1 level in the cytoplasm is required to achieve this.

      Please provide more details about intensity quantification on page 35. Were these measured on sum or max projected stacks? What was the method of background subtraction?

      Analysed images are optical axis integration scans over 3 μm taken on a Deltavision microscope. This procedure gives a sum projection. Local background was determined for every cell by drawing a line under the signal curve derived from a line scan. The background line connects regions that are still within the cell but outside of the spindle (or microtubule). We added a sentence in the materials and methods section under point 2.
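
The background estimate described in this reply can be sketched as follows. This is only a minimal illustration, assuming the background line is a straight line between two anchor positions that lie inside the cell but outside the spindle; the function name, the anchor indices and the intensity values are hypothetical and are not taken from the manuscript.

```python
def background_subtracted_signal(profile, left, right):
    """Integrate a line-scan peak above a straight local baseline.

    profile: intensities along the line scan (taken from a sum projection)
    left, right: indices of in-cell, off-spindle positions flanking the signal
    """
    y0, y1 = profile[left], profile[right]
    total = 0.0
    for i in range(left, right + 1):
        # Linear interpolation of the background between the two anchor points.
        baseline = y0 + (y1 - y0) * (i - left) / (right - left)
        total += profile[i] - baseline
    return total


# Hypothetical line scan: sloping cellular background with a peak in the middle.
scan = [10, 11, 12, 30, 50, 31, 16, 17, 18]
signal = background_subtracted_signal(scan, left=2, right=6)
print(signal)
```

Because the baseline is fitted per cell, a cell with a brighter cytoplasm does not inflate the measured spindle signal, which is the point of using a local rather than a global background.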

      Are the spindle lengths in Figure 2E measured in 2D or 3D? Bim1 deletion might lead to more misalignment of the spindles in z due to inactivation of the Kar9 pathway and thus partially explain the shorter spindles. The measurements should therefore be performed in 3D.

      As we have used optical axis integration (OAI) on the Deltavision microscope and obtained a sum projection of this virtual stack, the spindles were measured in 2D and we don’t have the information to measure in 3D (this would require a regular stack). We show that different aspects of spindle length can be restored by alternative strategies. These are unlikely to influence just spindle orientation. In addition, we see that the Bim1 deletion has an effect on the size of a nascent bipolar spindle when spindle orientation is similar to wild-type cells. We agree that z-misalignment may affect absolute values of spindle size in the Bim1 deletion in late metaphase and that it would be better to measure in 3D. However, we think in this case it is unlikely to affect the conclusions of this study.
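
To illustrate the reviewer's concern about 2D measurements: a spindle tilted out of the focal plane appears shorter in a projection than it is in 3D. The sketch below assumes hypothetical spindle pole coordinates (in μm) of the kind a regular z-stack would provide; the OAI sum projections used in the study do not contain this z information, so the numbers are purely illustrative.

```python
import math


def length_3d(pole_a, pole_b):
    """Pole-to-pole distance using x, y and z."""
    return math.dist(pole_a, pole_b)


def length_2d(pole_a, pole_b):
    """Pole-to-pole distance as seen in a projection (z is discarded)."""
    return math.dist(pole_a[:2], pole_b[:2])


# Hypothetical pole positions (x, y, z) in micrometres.
spb_1 = (0.0, 0.0, 0.0)
spb_2 = (1.2, 0.9, 0.8)

projected = length_2d(spb_1, spb_2)
true_length = length_3d(spb_1, spb_2)
print(projected, true_length)
```

With these made-up coordinates the projected length is about 1.5 μm while the 3D length is about 1.7 μm, so any mutant with more z-misalignment would show systematically shorter spindles in 2D.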

      The authors should try to shorten the text. There is a lot of redundancy between results and discussion sections.

      We have shortened the text to avoid redundancy (before >43000 characters, now around 41000 characters), and we have decreased the number of main figures from 9 to 8.

      Data that lead to conclusions already supported by the literature should be moved to the supplementary material.

      In the course of re-organizing the manuscript we have tried to do this.

      Reviewer 2:

      "Robustness of Ndc80 loading might be achieved by the coexistence of multiple kinetochore assembly pathways or alternatively determined by intrinsic Ndc80 properties." Wouldn't Ndc80 levels be determined by Ndc80 kinetochore loading, and not by end-binding proteins? This seems to be the more likely means to regulate Ndc80 levels.

      We have removed this statement from the revised manuscript.

      "Upon analyzing the associations in the cytoplasm, we found that Kar9-3xGFP foci on bud-directed cytoplasmic microtubules were abolished in the bim1Δ strain, consistent with earlier reports." It would be helpful if the authors commented on how the localization of some of these proteins is affected by bim1Δ on the mother-directed plus ends. Although I understand the need to account for one class of plus end for the sake of consistency (and the distinct behaviors of the mother vs bud-directed plus end), the text as written leaves me wondering about the other plus end.

      We have noticed that the bim1 deletion led to the loss of asymmetric distribution on cytoplasmic microtubules for a number of components. Most prominent are Bik1, Kip2 and proteins of the dynein-dynactin complex. We felt that further analysing this phenotype was beyond the scope of this study.

      "The CAP-Gly domain construct, expressed from a BIM1 promoter, almost exclusively localized to the spindle of yeast cells." For clarity, the authors should explicitly state that the CAP-Gly domain in question is from Bik1. Although this can be deduced, this was not abundantly clear.

      We have clarified this in the text and in the figure.

      "In addition to Ase1, we followed the kinetochore proteins Ndc80-GFP and Sgo1-GFP which specifically marks kinetochores that lack tension." This sentence should add "the latter of which..." to clarify that Sgo1, but not Ndc80, exhibits this behavior.

      We have added the phrase “the latter of which” to clarify this point.

      "We observed that bim1Δ cells had mispositioned kinetochores with a bright Sgo1-GFP signal that was much stronger than in wild-type cells." I don't see the mispositioned kinetochores described here. Are the authors referring to the fact that Sgo1 is brighter, which suggests tension-free KTs? If so, this should be clearly stated as such, since the authors have not explicitly assessed kinetochore "positioning".

      We have rephrased the sentence to clarify. We refer to a lack of bi-lobed Ndc80 signal and a bright Sgo1-GFP signal as two aspects of the phenotype.

      "We speculate that Bim1-Bik1 in a complex with its cargo Cik1-Kar3 is active after bi-polar spindle formation but before late metaphase and Ase1 can partially substitute for nuclear Bim1 functions." I struggled to grasp the reasoning for these conclusions. I assume the former point (the timing for Bim1-Bik1-Cik1-Kar3) is due to the localization dynamics of Bim1 and Bik1, while the latter (Ase1 can substitute for Bim1) is due to the synthetic interaction between Bim1 and Ase1 (I needed to look this latter point up myself). Or is this latter point due to the brighter spindle Ase1-GFP intensity? In either case, the authors should more clearly state their reasoning.

      We have clarified this statement in the revised discussion.

      The error bars in Figures 3A and 6E (shown as 95% CI) and elsewhere seem very small for the parameters that are being plotted. Spindle length values as shown in Figure 2E cover a broad range (as would be expected for a biological process), and it would be more accurate if the error bars in Fig 3A and 6E reflect this, even if it means they start overlapping each other. I find the error as shown to be misleading to your readers, and unless the authors have very good reason to use 95% CI (which is not as meaningful as standard deviation), then I would encourage them to use standard deviation.

      We prefer to use CI for the spindle length plots over time for consistency and to avoid overlap, which would make the graphs difficult to read. We have changed the text to provide the standard deviation instead of the standard error of the mean for spindle length and metaphase duration (see the point below).
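
The difference between the three error measures discussed here can be made concrete with a small calculation. The sketch below uses made-up metaphase durations (not data from the manuscript) and a normal approximation (z = 1.96) for the 95% CI, which is an assumption of this example; it shows that the SD describes the spread of individual cells, whereas the SEM and the CI describe the uncertainty of the mean and shrink as N grows.

```python
import math
import statistics


def summarize(values):
    """Return mean, sample SD, SEM and a normal-approximation 95% CI half-width."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)   # sample standard deviation (n - 1 denominator)
    sem = sd / math.sqrt(n)         # standard error of the mean
    ci95 = 1.96 * sem               # assumes an approximately normal sampling distribution
    return {"n": n, "mean": mean, "sd": sd, "sem": sem, "ci95": ci95}


# Hypothetical metaphase durations in minutes (illustrative only), 50 cells.
durations = [30, 38, 41, 44, 45, 47, 49, 52, 55, 69] * 5
stats = summarize(durations)
print(stats)
```

For this toy sample the SD is several-fold larger than the CI half-width, so a value reported as mean ± SEM or mean ± CI looks far tighter than the cell-to-cell variation actually is.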

      The same is true for the values stated throughout the text (e.g., for mitotic timing "47 ± 2 min" for metaphase duration; for distance between SPB and bud neck ± 0.1 μm, etc). I am highly skeptical that metaphase duration (for example) ranged from only 46-48 minutes. Please use standard deviation to give a more accurate description of the range of values for these parameters.

      In the revised manuscript, we now give the mean values plus/minus standard deviation, instead of the standard error of the mean, as requested. In addition, the range of values is directly visible from the individual data points in the plots.

      "Unexpectedly, the kar9 deletion mutant displayed a slightly accelerated metaphase progression relative to wild-type cells (26 ± 1 min) (Figure 3C). This could be attributed to an increased level of Bim1 on the metaphase spindle of kar9Δ (or Kar9-AID) cells." The authors should give us more rationale to explain the "attributing the increased levels of Bim1" point here. Do they think that the levels of spindle-associated Bim1 impact metaphase duration somehow? If so, how?

      We have added a sentence speculating about how this could be accomplished.

      "Overall, our cell biology data suggested that major nuclear Bim1 functions are conducted in a complex with Cik1-Kar3, while Bik1 and Kar9 have a smaller impact, probably affecting the nuclear-cytoplasmic distribution of Bim1." Although I understand and agree with the former conclusion (that Bim1 functions are conducted via Cik1-Kar3), the latter was confusing to me. Did the authors mean that "Bim1 impacts Bik1 and Kar9 to a lesser extent", rather than vice versa? The authors are discussing Bim1 functioning via Cik1, but then switch to discussing how Bik1 and Kar9 affect Bim1.

      We have removed the second part of the sentence from the revised manuscript.

      "Next, we compared the comparing genetic interaction profile of a bim1 deletion to that of various other factors by reanalyzing the synthetic genetic interaction data..." Remove "comparing".

      Thanks for pointing out this typo, we have removed it in the revised manuscript.

      As someone who is unfamiliar with the analysis shown in Figure 3H, I think it would be useful to list a Pearson correlation value for two genes that are not functionally related. This would help define a lower limit for this analysis.

      For functionally unrelated genes, the Pearson correlation between genetic interaction (GI) profiles is very close to zero. The graph below depicts the Pearson correlation between the GI profile of Bim1 and the GI profile of every yeast gene (data used for the graph are taken from thecellmap.org).
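
As an illustration of what such a profile correlation measures, the sketch below computes the Pearson correlation between short, made-up genetic interaction score vectors. The gene names and scores are hypothetical, and skipping positions where either profile has no score is an assumption of this example rather than a description of how the published dataset was processed.

```python
import math


def pearson(x, y):
    """Pearson correlation over positions where both profiles have a score."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical interaction scores of three genes against the same six partners.
gene_a = [-0.8, -0.1, 0.3, None, -0.6, 0.0]
gene_b = [-0.7, 0.0, 0.2, -0.4, -0.5, 0.1]   # profile similar to gene_a
gene_c = [0.2, -0.3, 0.2, 0.4, -0.1, 0.0]    # profile unrelated to gene_a

r_related = pearson(gene_a, gene_b)
r_unrelated = pearson(gene_a, gene_c)
print(r_related, r_unrelated)
```

Genes acting in the same pathway tend to share interaction partners and therefore give a high coefficient, while an unrelated pair stays near zero, which is the lower reference point the reviewer asks for.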

      The axes for the plots in Figure 5E and 5I are very confusing to me. I don't understand what I'm looking at. Why does it go from 0 to 1, and then back to 0-1 again? I don't see how this can account for MTs of different lengths. Normalizing all MT length values to 1 would do this, no?

      We have clarified the labelling in the revised manuscript. The x-axis gives the relative position from either the plus-end or the spindle pole body (both set to position 0) in micrometres. This allowed us to compare fluorescence intensities on cytoplasmic microtubules of different lengths in wild-type and bim1 deletion cells.

      "These observations are consistent with the idea that Bik1 acts as a processivity factor for Kip2: If more Bik1 is present on the lattice, then more Kip2 molecules are able to reach plus-ends without detachment." Perhaps I'm misunderstanding the plot shown in Figure 5E, but it seems to indicate that the levels of lattice-bound Bik1 are the same in BIM1 and bim1Δ cells (higher SPB-localized levels, though). There are also lower levels of Bik1 at the plus ends in bim1Δ cells. So, if Bik1 were a processivity factor for Kip2, this would suggest that they would remain bound at plus ends as well, which these data suggest is not the case…

      We have added a section to the discussion that deals with this point and we speculate about the reasons why Kip2 is increased at plus-ends, while Bik1 is not.

      The data on the CH-Cik1 fusion is very compelling, and indeed supports their hypothesis that Bim1's main role in the nucleus is to target Cik1 to the spindle MT plus ends. That being said, it would be a simple, but important task to ensure that this fusion behaves as suggested (restores Cik1 plus end binding in cells). Otherwise, it can't be ruled out that this fusion is rescuing bim1Δ functions by some other means. However, as stated above, it's unclear how much was already known about this fusion from the lab's previous work.

      In our previous study (Kornakov et al., 2020) we have shown that the CH-Cik1Delta74 fusion indeed is sufficient to enrich Kar3 at plus ends. We expect the same to be true for this slightly different fusion construct. We have added a respective sentence to the results section.

      Regarding the p1-p6 promoter data: p6 is missing from Figure S6A, in spite of it being referenced in the text and the figure.

      Thanks for pointing this out, we have corrected that in the revised manuscript and do not refer to p6 anymore.

      "Exogenously expressed Ase1 displayed a similar level and kinetics of localization compared to the endogenous protein, indicating that binding sites for microtubule crosslinkers are not a limiting factor on the budding yeast spindle." Specifically, the authors show that binding sites for Ase1 may not be limiting (the overlapping 95% CI bars in Fig S6B suggest this is not significant), not all crosslinkers. The authors should not use such broad language to describe results from one experiment with one crosslinker.

      We have rephrased to make clear that our statement only refers to Ase1.

      "We found that all bim1 mutants were less well recruited to the metaphase spindle compared to the wild-type protein, indicating that Bim1-interacting proteins strongly contribute to Bim1 localization." Can the authors rule out that the localization defects of these mutants are due to compromised MT binding by the Bim1 mutants? Also, regarding this statement: "To test that the observed recruitment defects of bim1 mutants are not a result of a compromised spindle or microtubule structure, we examined their localization in a situation when GFP-tagged mutants were covered with the unlabeled wild-type allele. Indeed, in this situation, the Bim1 mutants displayed very similar localization profiles (Supplementary Figure 7B)." I wasn't sure what these results were similar to: the wild-type protein, or the mutant without the presence of WT Bim1? The lack of quantitation made this difficult to determine.

      At the request of reviewer 1, we have removed the analysis of Bim1-GFP mutants over an unlabelled Bim1 wild-type from the manuscript.

      The zoom crops for many of the images (Fig 1F and C, 3D, 5J, etc) are not labeled. I realize the legends indicated what was what, but it would be much easier for the reader if these panels were labeled in the figure.

      We have indicated the channel by a respective frame around the zoom throughout the manuscript. We think this makes orientation easier.

      "While in vitro reconstitution experiments have suggested that Bim1 is required to fully reconstitute the Kip2-dependent loading of the Dynein-Dynactin complex to microtubule-plus ends in vitro (Roberts et al., 2014), our experiments indicate that it may contribute relatively little to this pathway in cells." Work from other labs has also shown that Bim1 is dispensable for dynein function in cells. This should be noted by the authors, and the appropriate work cited (see work from the Lee and Pellman labs; in fact, work from the Lee lab showed that Kip2 is dispensable for plus end binding of dynein).

      We have re-written this section and now also refer to the Markus 2009 paper (Wei-Lih Lee lab).

      References are missing throughout the text. I have listed a few examples below:

      "We have previously shown that the phenotype of Bim1-binding deficient Cik1 mutants can be rescued by fusing the CH-domain to this Cik1 mutant (cik1-Δ74)."

      We have listed the citation of our 2020 paper (Kornakov et al.).

      "We constructed a series of strains expressing an extra copy of Ase1-GFP under different constitutive promoters of increasing strength (p1 to p6)"; where did these promoters come from?

      They were selected based on a systematic analysis of promoter strength in Shaw et al., 2019, DOI: 10.1016/j.cell.2019.02.023. We have added that citation to the methods section.

      "double point mutation exchanging two conserved residues in the EBH domain (bim1 Y220A E228A) is predicted to eliminate all EBH-dependent cargo interactions, but does not affect protein dimerization."

      We have cited the Honnappa 2009 paper here.

      "A deletion of the terminal five amino acids is predicted to prevent binding of the CAP-Gly domain of Bik1 to Bim1. The combination of both mutations is expected to simultaneously prevent both types of interaction."

      We have cited the Stangier 2018 paper here.

      "Spindle positioning in budding yeast is achieved via two pathways, one relying on the protein Kar9 which interacts with the actin-based motor Myo2." Yin et al 2000 should be added (in addition to Hwang et al).

      We have now included the Yin et al. 2000 citation.

      "For nuclear migration to occur efficiently, the Dynein-Dynactin complex must be enriched at the plus-ends of cytoplasmic microtubules..." Should cite work from the Lee lab here.

      We now cite Markus and Lee, 2011 as an example.

      "These long microtubules can interact with the bud cortex and initiate pulling events to move the nucleus (Omer et al., 2018)." Many papers pre-dating the Omer study found this to be the case, including work from the Cooper lab (see Adames et al). These studies should be cited either in place of the Omer study, or in addition.

      We have cited additional studies besides the Omer paper.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The authors of the study performed a systematic assessment of the role of Bim1 in the MT-binding activity and function of a large number of nuclear and cytoplasmic MT-associated proteins (MAPs), as well as their role during mitosis and spindle positioning. For example, they find that the reliance of MT-binding activity of several MAPs varies from complete reliance on Bim1, to almost no role (in some cases, loss of Bim1 even increases MAP-MT binding). The density and quality of the data, and the large number of players analyzed by the study, are certainly impressive, and there is no doubt a lot of valuable information contained within that will be of use to many people in the MAP/mitosis/yeast cell biology community. However, I feel the manuscript can be greatly improved following some significant revisions. In particular, although some of their findings are indeed interesting and useful, and can be used to reliably draw conclusions, it is difficult to parse out what is novel, and what is a rehashing of old data. For instance, the role of Bim1 in Bik1/Kip2 targeting was described years ago (Carvalho et al), and I was surprised to see that the CH-Cik1 fusion was previously described by the authors' lab a couple of years ago (see note below regarding lack of appropriate citations and lack of description of previous knowledge). Also, how much did we already know about the Bim1 truncations shown in Figure 7 and S7, and how they might disrupt binding to partners? Finally, regarding this statement in the Discussion: "Our analysis indicates that Bim1 contributes to both of these processes as part of two key protein complexes (Figure 9A): Bim1-Kar9-Myo2 in the cytoplasm and Bim1-Bik1-Cik1-Kar3 in the nucleus." As far as I know, these things have been known for many years; their work might help to support these findings, but the statement as written misleads readers into believing that the present work proves these old concepts.

      One of the main issues with reading a manuscript with so much data about so many different players and pathways is that this leads to a situation in which each story is only superficially covered, with only minimal depth or detail. This made the paper somewhat difficult for me to follow (and I am a fan of budding yeast mitosis!), especially given the frequent switching from one pathway to another (e.g., the Cik1 section started on page 12 appears to be continued on page 17, only after talking about the spindle orientation story in between the two Cik1 sections). I'm not sure what to suggest, but the manuscript can be improved if the authors try to refocus some of the sections to make it easier to follow one story at a time, for a particular molecule (e.g., Cik1) or pathway (spindle orientation). In addition to explicitly describing what is already known about a particular molecule/pathway, the writing can be greatly improved by introducing their reasoning for the experiments in question. Some of the sections lack sufficient rationale for me to understand the justification for their experiments (e.g., why try to overexpress Ase1 to rescue bim1∆ phenotypes, as described on page 19?).

      Although there is likely much to learn from this study, I felt that some conclusions were a little bold (see below), while alternative hypotheses were not addressed (perhaps Bim1 simply competes for MT binding with some of these factors, thus accounting for them increasing their spindle-binding behavior?). For example, the authors make a point that loss of Bim1 enhances dynein-dynactin function. However, it is important to note that mutations in tubulin (tub2-430∆) and other MAPs (Kar9 or Ase1, the latter of which the authors point out) also lead to increased dynein activity (see work by Yeh et al., 2000, and work from the Moore lab). It is unknown whether mutations to these genes affect dynein targeting in cells similar to what the authors describe here. Thus, a direct causal relationship between their bim1∆ phenotypes and enhanced dynein activity is unclear, and at best is speculative. It's also worth noting that overexpression of Bik1 has been shown to actually reduce Dhc1 localization to plus ends in cells (see Markus et al 2011), which would argue against a simple mechanism of increasing Bik1 correlating with increasing dynein localization and activity.

      Below are some specific points.

      1. "Robustness of Ndc80 loading might be achieved by the coexistence of multiple kinetochore assembly pathways or alternatively determined by intrinsic Ndc80 properties." Wouldn't Ndc80 levels be determined by Ndc80 kinetochore loading, and not by end-binding proteins? This seems to be the more likely means to regulate Ndc80 levels.
      2. "Upon analyzing the associations in the cytoplasm, we found that Kar9-3xGFP foci on bud-directed cytoplasmic microtubules were abolished in the bim1Δ strain, consistent with earlier reports." It would be helpful if the authors commented on how the localization of some of these proteins is affected by bim1∆ on the mother-directed plus ends. Although I understand the need to account for one class of plus end for the sake of consistency (and the distinct behaviors of the mother vs bud-directed plus end), the text as written leaves me wondering about the other plus end.
      3. "The CAP-Gly domain construct, expressed from a BIM1 promoter, almost exclusively localized to the spindle of yeast cells." For clarity, the authors should explicitly state that the CAP-Gly domain in question is from Bik1. Although this can be deduced, this was not abundantly clear.
      4. "In addition to Ase1, we followed the kinetochore proteins Ndc80-GFP and Sgo1-GFP which specifically marks kinetochores that lack tension." This sentence should add "the latter of which..." to clarify that Sgo1, but not Ndc80, exhibits this behavior.
      5. "We observed that bim1Δ cells had mispositioned kinetochores with a bright Sgo1-GFP signal that was much stronger than in wild-type cells." I don't see the mispositioned kinetochores described here. Are the authors referring to the fact that Sgo1 is brighter, which suggests tension-free KTs? If so, this should be clearly stated as such, since the authors have not explicitly assessed kinetochore "positioning".
      6. "We speculate that Bim1-Bik1 in a complex with its cargo Cik1-Kar3 is active after bi-polar spindle formation but before late metaphase and Ase1 can partially substitute for nuclear Bim1 functions." I struggled to grasp the reasoning for these conclusions. I assume the former point (the timing for Bim1-Bik1-Cik1-Kar3) is due to the localization dynamics of Bim1 and Bik1, while the latter (Ase1 can substitute for Bim1) is due to the synthetic interaction between Bim1 and Ase1 (I needed to look this latter point up myself). Or is this latter point due to the brighter spindle Ase1-GFP intensity? In either case, the authors should more clearly state their reasoning.
      7. The error bars in Figures 3A and 6E (shown as 95% CI) and elsewhere seem very small for the parameters that are being plotted. Spindle length values as shown in Figure 2E cover a broad range (as would be expected for a biological process), and it would be more accurate if the error bars in Fig 3A and 6E reflect this, even if it means they start overlapping each other. I find the error as shown to be misleading to your readers, and unless the authors have very good reason to use 95% CI (which is not as meaningful as standard deviation), then I would encourage them to use standard deviation.
      8. The same is true for the values stated throughout the text (e.g., for mitotic timing "47±2 min" for metaphase duration; for distance between SPB and bud neck ±0.1 µm, etc). I am highly skeptical that metaphase duration (for example) ranged from only 46-48 minutes. Please use standard deviation to give a more accurate description of the range of values for these parameters.
      9. "Unexpectedly, the kar9 deletion mutant displayed a slightly accelerated metaphase progression relative to wild-type cells (26±1 min) (Figure 3C). This could be attributed to an increased level of Bim1 on the metaphase spindle of kar9Δ (or Kar9-AID) cells." The authors should give us more rationale to explain the "attributing the increased levels of Bim1" point here. Do they think that the levels of spindle-associated Bim1 impact metaphase duration somehow? If so, how?
      10. "Overall, our cell biology data suggested that major nuclear Bim1 functions are conducted in a complex with Cik1-Kar3, while Bik1 and Kar9 have a smaller impact, probably affecting the nuclear-cytoplasmic distribution of Bim1." Although I understand and agree with the former conclusion (that Bim1 functions are conducted via Cik1-Kar3), the latter was confusing to me. Did the authors mean that "Bim1 impacts Bik1 and Kar9 to a lesser extent", rather than vice versa? The authors are discussing Bim1 functioning via Cik1, but then switch to discussing how Bik1 and Kar9 affect Bim1.
      11. "Next, we compared the comparing genetic interaction profile of a bim1 deletion to that of various other factors by re-analyzing the synthetic genetic interaction data..." Remove "comparing".
      12. As someone who is unfamiliar with the analysis shown in Figure 3H, I think it would be useful to list a Pearson correlation value for two genes that are not functionally related. This would help define a lower limit for this analysis.
      13. The axes for the plots in Figure 5E and 5I are very confusing to me. I don't understand what I'm looking at. Why does it go from 0 to 1, and then back to 0-1 again? I don't see how this can account for MTs of different lengths. Normalizing all MT length values to 1 would do this, no?
      14. "These observations are consistent with the idea that Bik1 acts as a processivity factor for Kip2: If more Bik1 is present on the lattice, then more Kip2 molecules are able to reach plus-ends without detachment." Perhaps I'm misunderstanding the plot shown in Figure 5E, but it seems to indicate that the levels of lattice-bound Bik1 are the same in BIM1 and bim1∆ cells (higher SPB-localized levels, though). There are also lower levels of Bik1 at the plus ends in bim1∆ cells. So, if Bik1 were a processivity factor for Kip2, this would suggest that they would remain bound at plus ends as well, which these data suggest is not the case.
      15. The data on the CH-Cik1 fusion is very compelling, and indeed supports their hypothesis that Bim1's main role in the nucleus is to target Cik1 to the spindle MT plus ends. That being said, it would be a simple, but important task to ensure that this fusion behaves as suggested (restores Cik1 plus end binding in cells). Otherwise, it can't be ruled out that this fusion is rescuing bim1∆ functions by some other means. However, as stated above, it's unclear how much was already known about this fusion from the lab's previous work.
      16. Regarding the p1-p6 promoter data: p6 is missing from Figure S6A, in spite of it being referenced in the text and the figure.
      17. "Exogenously expressed Ase1 displayed a similar level and kinetics of localization compared to the endogenous protein, indicating that binding sites for microtubule crosslinkers are not a limiting factor on the budding yeast spindle." Specifically, the authors show that binding sites for Ase1 may not be limiting (the overlapping 95% CI bars in Fig S6B suggest this is not significant), not all crosslinkers. The authors should not use such broad language to describe results from one experiment with one crosslinker.
      18. "We found that all bim1 mutants were less well recruited to the metaphase spindle compared to the wild-type protein, indicating that Bim1-interacting proteins strongly contribute to Bim1 localization." Can the authors rule out that the defects in localization of these mutants are due to compromised MT binding by the Bim1 mutants? Also, regarding this statement: "To test that the observed recruitment defects of bim1 mutants are not a result of a compromised spindle or microtubule structure, we examined their localization in a situation when GFP-tagged mutants were covered with the unlabeled wild-type allele. Indeed, in this situation, the Bim1 mutants displayed very similar localization profiles (Supplementary Figure 7B)." I wasn't sure what these results were similar to: the wild-type protein, or the mutant without the presence of WT Bim1? The lack of quantitation made this difficult to determine.
      19. The zoom crops for many of the images (Fig 1F and C, 3D, 5J, etc) are not labeled. I realize the legends indicated what was what, but it would be much easier for the reader if these panels were labeled in the figure.
      20. "While in vitro reconstitution experiments have suggested that Bim1 is required to fully reconstitute the Kip2-dependent loading of the Dynein-Dynactin complex to microtubule-plus ends in vitro (Roberts et al., 2014), our experiments indicate that it may contribute relatively little to this pathway in cells." Work from other labs has also shown Bim1 is dispensable for dynein function in cells. This should be noted by the authors, and the appropriate work cited (see work from the Lee and Pellman labs; in fact, work from the Lee lab showed that Kip2 is dispensable for plus end binding of dynein).
      21. References are missing throughout the text. I have listed a few examples below:
        • a. "We have previously shown that the phenotype of Bim1-binding deficient Cik1 mutants can be rescued by fusing the CH-domain to this Cik1 mutant (cik1-Δ74)."
        • b. "We constructed a series of strains expressing an extra copy of Ase1-GFP under different constitutive promoters of increasing strength (p1 to p6)"; where did these promoters come from?
        • c. "double point mutation exchanging two conserved residues in the EBH domain (bim1 Y220A E228A) is predicted to eliminate all EBH-dependent cargo interactions, but does not affect protein dimerization."
        • d. "A deletion of the terminal five amino acids is predicted to prevent binding of the CAP-Gly domain of Bik1 to Bim1. The combination of both mutations is expected to simultaneously prevent both types of interaction."
        • e. "Spindle positioning in budding yeast is achieved via two pathways, one relying on the protein Kar9 which interacts with the actin-based motor Myo2." Yin et al 2000 should be added (in addition to Hwang et al).
        • f. "For nuclear migration to occur efficiently, the Dynein-Dynactin complex must be enriched at the plus-ends of cytoplasmic microtubules..." Should cite work from the Lee lab here.
        • g. "These long microtubules can interact with the bud cortex and initiate pulling events to move the nucleus (Omer et al., 2018)." Many papers pre-dating the Omer study found this to be the case, including work from the Cooper lab (see Adames et al). These studies should be cited either in place of the Omer study, or in addition.
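
      Points 7 and 8 above turn on the difference between the standard deviation, which describes the spread of individual measurements, and a 95% confidence interval, which describes uncertainty in the estimated mean and narrows as the sample size grows. The following is a minimal sketch of that difference, assuming a normal approximation (the 1.96 factor) and using made-up metaphase durations; the function name `spread_summaries` and all values are hypothetical and are not taken from the manuscript.

```python
import math
import statistics


def spread_summaries(values):
    """Return (mean, sample SD, half-width of an approximate 95% CI of the mean)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample standard deviation (n - 1 denominator)
    sem = sd / math.sqrt(n)         # standard error of the mean
    ci_half_width = 1.96 * sem      # normal approximation; a t quantile is more exact for small n
    return mean, sd, ci_half_width


# Hypothetical metaphase durations in minutes (illustrative only, not data from the study).
durations = [31, 38, 42, 44, 47, 47, 49, 52, 55, 65]
mean, sd, ci = spread_summaries(durations)
print(f"mean = {mean:.1f} min, SD = {sd:.1f} min, 95% CI half-width = {ci:.1f} min")
```

      Because the confidence interval scales with one over the square root of the number of cells, a large sample can produce a very narrow interval even when individual values vary widely, which is consistent with the reviewer's observation that the reported intervals look small next to the scatter in Figure 2E.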

      Referees cross-commenting

      It seems that one of my major concerns is reflected in Reviewer #1's review: that a lot of the findings described in the manuscript have been published elsewhere, and are not novel. In spite of this, I do think there are useful data in this manuscript that make this an important contribution, and that it should definitely be published. However, this would first require a significant re-writing with appropriate description of known vs unknown, and additional citations.

      Significance

      The current study aims to clarify the role of Bim1 (EB1 homolog in budding yeast) in the various pathways in which it has been implicated. To achieve this aim, the authors assess the localization of numerous other microtubule-associated proteins in cells with and without Bim1. In addition to high quality localization data (e.g., intensity values), the authors perform a number of cell biological assessments (e.g., mitotic spindle length values before, during and after anaphase), genetic assessments (synthetic interaction assays), and in vitro binding assays. The current study is indeed rich with new insights into the mechanisms by which these molecules function, and will no doubt prove valuable to a number of people in the microtubule/motor/yeast mitosis fields. As someone who is interested in and studies mitosis in budding yeast, I found the study to be interesting.

    1. we cannot see why the regulations made by ourselves should not, on the contrary, be a protection and a benefit for every one of us. And yet, when we consider how unsuccessful we have been in precisely this field of prevention of suffering, a suspicion dawns on us that here, too, a piece of unconquerable nature may lie behind - this time a piece of our own psychical constitution.

      I surprisingly agree with that statement. I think he's saying that we may try to create rules and boundaries for ourselves and our relationships as a protection mechanism, however we still end up suffering because of it.

    2. And yet, when we consider how unsuccessful we have been in precisely this field of prevention of suffering, a suspicion dawns on us that here, too, a piece of unconquerable nature may lie behind - this time a piece of our own psychical constitution.

      Freud introduces the first two sources of suffering as something physical and uncontrollable. I think here he is also saying human nature and behavior is also something that can't easily be controlled.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We would like to thank the reviewers for their thorough and positive assessment of our work. We also thank them for their careful review of our manuscript. Our responses to their specific comments are provided in the lines below.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The manuscript entitled "Metastatic potential in clonal melanoma cells is driven by a rare, early-invading subpopulation" by Kaur and colleagues provides a phenotypical analysis of the invasive potential of established melanoma cell lines at the single-cell level. The aim of the study was to answer the question of whether even homogeneous tumor cells bear the intrinsic potential to give rise to cells with high invasive (and therefore potentially metastatic) capacity in the absence of selection pressure from the tumor microenvironment.

      The authors used clones from two different melanoma cell lines (to prevent the accumulation of random (epi)genetic changes during cultivation) and performed invasion assays with Matrigel-coated transwell inlays to differentiate between cells that were able to invade early (up to 8 h, approx. 1% of the total cell population) or late (8-24 h; approx. 3% of the total cell population) after plating. Comparative RNA sequencing of early invaders and non-invaders populations revealed a high expression of SEMA3C in early invaders, which was then established as marker in the used cell lines. Interestingly, in vivo models using NSG mice injected with a mixture of early and late invading melanoma cells revealed that both contributed similarly to the primary tumor, while metastatic cells in the lung consisted almost exclusively of early invaders. Subsequent ATAC sequencing revealed an increase of binding sites for the transcription factor NKX2.2 in the early invaders. Functional analyses revealed that a knockout of NKX2.2 led to an increase in both invasion and proliferation. Finally, the authors showed with different sorted early and late invaders as well as SEMA3Chigh and SEMA3Clow expressers that pro-invasive features go along with reduced proliferation potential in accordance with previously published data. However, they decrease with time, thus demonstrating a reversion of the phenotype and high plasticity.

      Major comments:

      In general, the paper contains novel and interesting data, is concisely written and supported by replicates. The key conclusion, the presence of a small proportion of highly invasive cells in a seemingly homogeneous cell population and their striking requirement for lung metastasis, is very convincing. In vitro, SEMA3C was confirmed as a marker for the early invaders in two independent cell lines. However, a few questions remain open, as detailed below:

      We thank the reviewer for their positive assessment of our work. We also thank them for their careful review of our manuscript. Our responses to their specific comments are provided in the lines below.

      1) The relevance of NKX2.2 in the early invaders is currently unclear to me.

      The ATAC sequencing data revealed a high enrichment of accessible NKX2.2 binding sites in early invaders, and data were tested by comparative RNA sequencing of control cells and cells with NKX2.2 ko (Figure 2). The Figure legend of Figure 2 says: "NKX2.2 is a transcription factor that promotes the invasive subpopulation", but the data don't support this (ko leads to reduced invasion). Accordingly, the authors also state in the Results part "... the direction of the effect is the opposite of what one might have expected".

      To set the role of NKX2.2 into context, it would be useful to confirm the actual involvement of NKX2.2 in the invasive phenotype and clarify whether NKX2.2 might even suppress some pro-invasive genes. I would advise investigating the protein levels and/or protein localization of NKX2.2 and perhaps performing ChIP experiments on selected pro-invasive genes that play a role in the early invaders.

      The reviewer has raised some excellent points about our studies of NKX2.2 and its role in invasion. Indeed, we were also surprised that NKX2.2 had the opposite effect to the one expected (its peaks are enriched for accessibility in the early invaders in FS4, but knockout leads to increased invasion). Because this was a hypothesis we tested, we chose to leave the result in, in the interest of full disclosure.

      The reviewer has also made some nice suggestions about how to further explore the role of NKX2.2 in regulation (e.g. ChIP-seq). Owing to the complexity of validating and performing this assay, we felt these experiments were beyond the scope of the current manuscript; we hope to explore these possibilities more fully in the future.

      Another excellent suggestion the reviewer made was to look at the regulatory capacity of NKX2.2 to directly demonstrate the link between NKX2.2 regulation and expression differences between early- and late-invading cells. To establish this connection, we used a gene set from the Molecular Signatures Database (MSigDB: https://www.gsea-msigdb.org/gsea/msigdb/human/geneset/NKX2_2_TARGET_GENES.html) consisting of genes with an NKX2.2 binding site within their promoter (TSS -1000 bp to TSS +100 bp), as identified by the Gene Transcription Regulation Database (GTRD; paper here: https://pubmed.ncbi.nlm.nih.gov/33231677/). We used Fisher’s exact test to ask whether the overlap between these NKX2.2-regulated genes and the genes differentially expressed between early-invading cells and their respective parental population was larger than one would expect by chance in each cell line. Indeed, the p-values using this approach were 3.937e-16 and 0.037 for the FS4 and 1205Lu cell lines, respectively. These results, combined with the motif analysis of our ATAC-seq data, demonstrated that the activity of NKX2.2 is relevant in the early-invading state. We thank the reviewer for the suggestion and feel this additional analysis has improved our conclusions about NKX2.2.
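
      The overlap test described above can be sketched as a one-sided Fisher's exact test, which for a 2x2 table is the upper tail of the hypergeometric distribution. This is only an illustration under stated assumptions: the response does not say whether the test was one- or two-sided or which background gene universe was used, the function name `overlap_p_value` is hypothetical, and every count below is invented rather than taken from the study.

```python
from math import comb


def overlap_p_value(universe, n_targets, n_de, n_overlap):
    """P(overlap >= n_overlap) when n_de genes are drawn at random from a
    universe containing n_targets annotated target genes (hypergeometric
    upper tail, equivalent to a one-sided Fisher's exact test)."""
    max_overlap = min(n_targets, n_de)
    # Count the ways to obtain each overlap size k at or above the observed one.
    tail = sum(
        comb(n_targets, k) * comb(universe - n_targets, n_de - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / comb(universe, n_de)


# Hypothetical counts: 20,000 background genes, 1,500 annotated targets,
# 400 differentially expressed genes, 55 of which are annotated targets.
p = overlap_p_value(universe=20000, n_targets=1500, n_de=400, n_overlap=55)
print(f"one-sided p-value for the overlap: {p:.3g}")
```

      With SciPy available, `scipy.stats.fisher_exact(table, alternative="greater")` applied to the corresponding 2x2 contingency table should give the same one-sided value. The result depends strongly on the chosen background universe, so that choice is worth reporting alongside the p-value.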

      We also checked whether NKX2.2 levels differed between early- and late-invading cells across a panel of cell lines (Fig. 2C). We found that in 4/6 of these lines, NKX2.2 expression was higher in the early invaders. These results further support the case that NKX2.2 is an important positive regulator of invasion in multiple contexts.

      “In order to establish the generality of our results, we measured NKX2.2 expression levels across multiple cell lines by single molecule mRNA FISH. We found that the early invaders had higher levels of NKX2.2 expression in four out of the 6 lines tested (Fig. 2C), demonstrating the generality of our results and strengthening the case that NKX2.2 is a potential regulator of early invasiveness. The role of NKX2.2 as a regulator of early invasiveness was further established through comparative analysis between genes with NKX2.2 promoter region binding sites (-1000 bp to +100 bp relative to the transcription start site (TSS) as annotated by the Gene Transcription Regulation Database (GTRD)) and genes differentially expressed in early-invading and parental cells. Analysis using Fisher's exact test revealed a significant overlap between GTRD annotated genes regulated by NKX2.2 and genes expressed in FS4 (****p=3.937e-16) and 1205Lu (*p=0.037) early-invading cells. These results, in complement with our results from ATAC-sequencing motif analysis, further supported the relevance of NKX2.2 regulation in the early-invading state.”

      2) The sequencing data are currently accessible via a Dropbox link. They should be deposited instead in a data repository.

      We thank the reviewer for noting this problem. We have uploaded all data to the SRA/GEO at the following links:

      https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE224772

      https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE224769

      https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE224771

      Minor comments:

      1) The cell line used for Supplementary Figure 4 should be named in the figure legend.

      We thank the reviewer for the suggestion. We have included the name of the cell line in the figure legend for Supplementary Figure 4. The text reads as follows:

      “A. FS4 melanoma cells were sorted based on SEMA3C expression. Cells were live-imaged for ~10 days every hour and single cells were tracked manually for cell position, cell division and lineage. Lineages were traced manually from single cells. Cell speed was calculated for each cell using the average distance traveled over time.”

      2) In Figures 4H-M and Supplementary Figure 4D-I, the authors describe data performed in "sister" and "cousin" cells. It would be useful to provide a definition for both in the main text or figure legend.

      This is a very good point. We have provided the following definitions in the main text, and have changed the wording from “sister” to “sibling” to avoid gendered terminology:

      “(sibling cells are defined as those that share a common parent cell, and cousin cells are defined as those that share a common grandparent.)”
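
      The definitions above can be made concrete with a small sketch that classifies pairs of tracked cells from a child-to-parent map. This is an assumption about how lineage data might be represented, not the authors' tracking code; the function name `relationship` and the cell identifiers are hypothetical. Sibling cells also share a grandparent, so the sketch checks for a shared parent first.

```python
def relationship(parent_of, a, b):
    """Classify two cells as 'sibling', 'cousin', or 'other' given a
    dictionary mapping each cell id to the id of its parent cell."""
    if a == b:
        return "other"
    parent_a, parent_b = parent_of.get(a), parent_of.get(b)
    if parent_a is not None and parent_a == parent_b:
        return "sibling"  # same parent cell
    grand_a, grand_b = parent_of.get(parent_a), parent_of.get(parent_b)
    if grand_a is not None and grand_a == grand_b:
        return "cousin"   # different parents, same grandparent
    return "other"


# Hypothetical lineage: c0 divides into c1 and c2; c1 into c3 and c4; c2 into c5 and c6.
lineage = {"c1": "c0", "c2": "c0", "c3": "c1", "c4": "c1", "c5": "c2", "c6": "c2"}
print(relationship(lineage, "c3", "c4"))
print(relationship(lineage, "c3", "c5"))
```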

      3) Discussion: "This lack of permanence may reflect the fact that the invasive cells are not subjected to stress-in our case, cells merely pass through a transwell, which may be the reason for the "burning in" of the phenotype in the case of resistance."

      This sentence is misleading - please clarify.

      We apologize for the confusion caused by this sentence. We have now changed it to the following:

      “It is interesting that the early-invading cells eventually revert to the population average even after going through the transwell. Such a result contrasts with our previous work (Shaffer et al., 2017b), in which a rare subpopulation became permanently therapy resistant and did not revert even after several weeks off-treatment. One possibility is that the stress of undergoing therapy treatment induces a transcriptional rewiring, and this rewiring is not induced by the migration through transwells. Further studies will be required to test these hypotheses.”

      Furthermore, there are some errors in the references to the Figures throughout the paper. These should be corrected:

      We thank the reviewer for their detailed reading and finding these issues. We have now fixed them all in our revised manuscript.

      4) Results, section "NKX2.2 is a transcription factor that promotes the invasive subpopulation".

      Here the authors write: "...we performed RNA sequencing on the NKX2.2 knockout cells and compared the effects on gene expression to the gene expression differences between early vs. non- invaders across the two cell lines." This sentence should contain the reference to Supplementary Figure 3B-D (which is otherwise not referred to).

      We thank the reviewer for their detailed reading and noticing this issue. We have now referenced Supplementary Figure 3B-D in the text cited above.

      5) Results: "Overexpression of SEMA3C in FS4 cells revealed no changes in invasiveness, suggesting that SEMA3C is a marker with no functional relevance to invasiveness per se; Fig. 1D, Fig. 2A-B)"

      The correct reference should be: Suppl. Fig. 1D, Fig. 2A-B. Also, in the current manuscript version the authors jump from Figures 1 to Figure 2 A,B, before coming back to Figure 1. To avoid this, I would advise to shift the current Figure 2A, B to Figure 1 or the supplementary information.

      We thank the reviewer for pointing out this error in the reference to these figures. Figure 2A-B is now referenced as “Supp. Fig. 1 E-F”. The figure legend has also been updated.

      6) Results: "We then sampled lungs from mice at various times post-injection to look for metastatic cells (Fig.1F, Suppl. Fig. 2B,C)."

      As Supplementary Figure 2B, C does not show metastasis, but rather primary tumor growth, I would advise the following wording: "We then sampled lungs from mice at various times post-injection to look for metastatic cells (Fig.1F) and overall tumor growth (Suppl. Fig. 2B,C)."

      We thank the reviewer for their advice to reword the sentence cited above. We have now edited the text to read as suggested by the reviewer. In addition, Supp. Fig. 2B,C is now referenced as Supp. Fig. 2C,D.

      "We then sampled lungs from mice at various times post-injection to look for metastatic cells (Fig.1F) and overall tumor growth (Supp. Fig. 2C,D)."

      7) Results: "Interestingly, NKX2.2 knockout cells showed markedly increased invasion and proliferation (Fig. 2A,B), suggesting a change in regulation of both processes. "

      The correct reference is Fig. 2C, D.

      The reviewer is right that we only have results in one cell line, and fully agree that the results in FS4 are only correlative. We have now weakened the language in the abstract and the results to emphasize that this result held in 1205Lu cells only.

      • Given the robust literature regarding phenotypic switching in melanoma, the NKX2.2 knockout increasing both invasiveness and proliferation (Figures 2C, 2D) suggests it may not be involved in phenotype switching. Perhaps NKX2.2 is a negative regulator of cell activity/metabolism.

      We thank the reviewer for highlighting the possible connections with metabolism. To explore this possibility, we performed metabolic assays on NKX2.2 knockout and AAVS control cells and observed no significant changes in extracellular acidification rate (B). We did observe some differences in oxygen consumption rate in the cells (A), but the differences do not seem to be large enough or systematic enough to be meaningful given the variation within the controls. We have now included these results in Supp. Fig. 3E-F.

      Note, the data previously referenced as Figure 2C,D is now in Figure 2A,B.

      “NKX2.2 is a transcriptional repressor and activator essential for the differentiation of pancreatic endocrine cells (Habener et al., 2005). In mice, deletion of NKX2.2 prevents the specification of pancreatic islet cells, resulting in the replacement of insulin-expressing β cells and glucagon-expressing α cells with ghrelin-expressing cells; this lack of specification resulted in mortality of newborn mice due to hyperglycemia (Sussel et al. 1998; Prado et al. 2004). Given the link of NKX2.2 with glucose metabolism, we wondered whether NKX2.2 had an effect on metabolic activity, prompting us to test the NKX2.2 knockout lines for metabolic differences in the oxygen consumption rate (OCR; an indicator of oxidative phosphorylation) and the extracellular acidification rate (ECAR; an indicator of glycolysis) of the cells. Seahorse assay analysis revealed no systematic differences in metabolic activity (Supp. Fig. 3E,F).”

      We thank the reviewer for the correction. The reference has now been corrected in the main text.

      Reviewer #1 (Significance (Required)):

      Nature and significance of the advance/ literature context:

      In their manuscript, the authors provide interesting biological data about the presence of intrinsically and reversibly pro-invasive / pro-metastatic melanoma cells in a seemingly homogenous subpopulation. With SEMA3C, they also provide a marker for early invading cells, which might be useful in future studies to identify therapeutic vulnerabilities for this subgroup. This study sheds further light on the functional effects of phenotypic plasticity, which was previously described particularly in the context of therapy resistance, as mentioned by the authors.

      We thank the reviewer for their kind assessment of the impact of our work.

      Audience:

      The study is interesting for scientists from the melanoma field as well as the cancer metastasis field in general.

      Own expertise:

      Melanoma, phenotypic switch, metabolism, signal transduction, stress response

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Metastatic potential in clonal melanoma cells is driven by a rare, early-invading subpopulation

      Kaur et al.

      In this manuscript the authors highlight a small subpopulation of "early-invading" melanoma cells and functionally characterize the nuances of these early cells compared to their slowly invading counterparts. A cell surface marker, SEMA3C and the transcription factor NKX2.2 were associated with differences in the invasive rates. Importantly, the group demonstrates that existence of the invasive subpopulation is not reliant on genetic changes, and thus exhibits plasticity. While the underlying concept surrounding this paper (phenotypic plasticity) is not novel, highlighting a surface marker and transcription factor that may, at least in part, be associated with phenotype plasticity is interesting. However, the current study seems underdeveloped. Specific points of concern are listed:

      Major

      • Only two cell lines are used throughout this study.

      We thank the reviewer for pointing out the need for more cell lines. We have now added two new cell lines to our study, WM793 and WM1799, both of which recapitulate the fundamental phenomenology in question. Although we did not show it in our initial submission, we had originally queried a panel of melanoma cell lines in order to determine their suitability for our study (from which we settled on 1205Lu and FS4). This panel has multiple melanoma cell lines obtained from a variety of melanoma tumor samples from Radial Growth Phase (RGP), Vertical Growth Phase (VGP), and metastatic tissues. We now have included these data in our revised manuscript, since they further support our point.

      “We tested a panel of different melanoma cell lines from Radial Growth Phase (RGP), Vertical Growth Phase (VGP), and metastatic tumor types for the existence of fast invading subpopulations. We used four patient-derived melanoma cell lines, FS4, 1205Lu, WM1799, WM793, all of which have BRAF mutations (V600K for FS4, V600E for 1205Lu, WM1799, and WM793) and are known to be highly invasive in vitro and in vivo (27). Out of the 11 melanoma cell lines tested, the FS4 (not shown) and 1205Lu cell lines displayed the highest levels of fast invading subpopulations (Supp. Fig. 1A).”

      First, we showed that they all have an invasive subpopulation, with 1205Lu and FS4 (not shown) having the most invasive cells. Second, validating a central claim of the manuscript, we showed that many of these cell lines, including WM1799 and WM793, showed much higher levels of both SEMA3C (4/6) and NKX2.2 (4/6) expression in the early invading population as compared to the late invading population.

      Together, these data make a strong case that our findings generalize across multiple cell lines, including RGP and VGP models. We have incorporated new text that reads as follows:

      “In order to establish the generality of our results, we measured expression of the surface marker SEMA3C across the early and late invading subpopulations of a panel of melanoma cell lines. We found that SEMA3C levels were higher in the early invading subpopulation in 4 of the 6 lines tested (Supp. Fig. 1H). These results therefore held across a variety of cell lines and were not a unique feature of a particular patient sample.”

      • The in vivo metastasis assay in figure 1 is difficult to interpret and presents a number of concerns. 1) Only ~50% of early invading cells were labeled with GFP, this confounds many aspects of the experiment. The authors comment that in the primary tumor, as expected "...a roughly equal mix of human melanoma cells that were GFP positive and negative." If there was an expectation of equal proliferative rates in the primary tumor of early and late invading cells, given that only 1/2 of the early cells were GFP+, wouldn't we expect only 25% of the human cells to be GFP+?

      The reviewer has raised a very important quantitative question about our experiments, which we have now addressed with a more thorough set of analyses. Initially, we quantified GFP positivity post-transduction by looking at fluorescent protein levels, for which the threshold was fairly arbitrary, and we potentially could have miscounted many GFP positive cells as GFP negative due to low but non-zero levels of expression. We hence recalculated our positivity rate based on single molecule RNA FISH for GFP and mCherry, given that the technique is sensitive down to even very low levels of expression.

      As can be seen in Supp. Fig. 2B, the vast majority of transduced cells did indeed get the transgene and had some level of expression of GFP/mCherry. At a threshold of 5/10 molecules (GFP/mCherry, respectively), we obtained 88% and 96.15% positivity rates for GFP and mCherry, respectively. At these rates of positivity, we would expect much closer to 50% of the cells being GFP positive in the tumors, as observed. We thank the reviewer for noticing this discrepancy, and feel that our new analysis clears up the confusion and strengthens our results. These results are described in the main text as follows:

      “We labeled the cells with sufficient virus so that 88% of the early invaders were labeled with GFP and 96.15% of the late invaders were labeled with mCherry (Supp. Fig. 2B). We then sampled lungs from mice at various times post-injection to look for metastatic cells (Fig.1F) and overall tumor growth (Supp. Fig. 2C,D).”
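
      The arithmetic behind this exchange can be written out as a short sketch. It assumes, as the reviewer does, a 1:1 mix of early and late invaders with equal growth in the primary tumor, and that only early invaders carry GFP; the function name is hypothetical and chosen for illustration only.

```python
def expected_gfp_fraction(early_fraction, gfp_labeling_rate):
    """Expected fraction of all human cells that score as GFP positive.

    Assumes only early invaders carry GFP and that early and late
    invaders grow at the same rate, so the mix stays at early_fraction.
    """
    return early_fraction * gfp_labeling_rate


# Reviewer's scenario: 1:1 mix, about half of the early invaders labeled.
print(expected_gfp_fraction(0.5, 0.50))  # 0.25

# Revised RNA FISH estimate: 88% of the early invaders labeled.
print(expected_gfp_fraction(0.5, 0.88))  # 0.44
```

      Under these assumptions, the higher labeling rate moves the expected GFP-positive share from 25% to 44%, which is much closer to the roughly equal mix reported for the primary tumor.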

      2) The authors note technical difficulties in detecting mCherry in sections. It seems as though this forced them to use an RNA FISH probe to identify human vs. mouse and by extension/negative selection the human FISH positive, GFP negative cell represented a mCherry stained late-invading cell. This is not ideal and seems over complicated. If the population of interest was engineered to express mCherry, why not directly probe for mCherry?

      The reviewer has raised an important point about our experimental design. Indeed, we attempted multiple times and in multiple ways to detect mCherry protein directly. We tried multiple times with multiple antibodies, but the signal was simply not detectable. Hence, we arrived at the experimental design we outlined. We felt that a fully transparent disclosure of the issues was preferable, even if it did make the design sound overly complex. We will note that our primary result—that the vast majority of the metastatic cells are GFP positive and hence derived from fast invaders—is robust to any detection issues for mCherry.

      3) Given the poor initial labeling/transduction of the early invaders, how can the authors be confident that all human cells without GFP signal are late invaders?

      The reviewer raises a great point that is addressed by our GFP and mCherry RNA FISH analysis above, showing that the transduction efficiency was actually quite a bit higher than initially thought due to low but non-zero GFP signal being counted as GFP negative. With the much higher transduction efficiencies we have now validated, we believe that the vast majority of human cells with no GFP signal should be late invaders.

      • The authors may have missed an opportunity to study FS4 clone F6 and 1205 clone E11. What is the SEMA3C and NKX2.2 status of these clones? Are they able to revert expressions?

      The reviewer has pointed out an interesting opportunity for further exploration. Unfortunately, because they were identified as part of an initial screening study, those particular clones were not kept for subsequent analysis. However, in our revised manuscript, we have now worked up multiple additional cell lines (WM1799 and WM793), both of which had high expression levels of both SEMA3C (Supp. Fig. 1H, shown above) and NKX2.2 (Fig. 2C) in the early invading subpopulation. Currently, we do not have data on reversion experiments for these two cell lines, but we would expect them to behave similarly to the other cell lines we examined in this study.

      • The lack of statistical analysis/comparisons throughout the paper needs to be addressed.

      We thank the reviewer for pointing out these deficiencies. We have now added statistical comparisons throughout.

      • In figures 1E and 3B, why do the parental (homogenous) cells demonstrate less invasiveness than the selected for the SEMA3C low or "late-invaders" respectively?

      This is an important point that the reviewer has raised. The finding did occur in every replicate, so we assume it is biological and not statistical. We have now included the following language in the discussion noting the issue and some possible explanations.

      “It is worth noting that, while the SEMA3C-high (early-invading) subpopulation drove the highly invasive phenotype, the SEMA3C-low (late-invading) subpopulation also displayed a somewhat more invasive phenotype than the parental population. It is unclear what the underlying cause of this difference in invasive behavior is between the SEMA3C-low and parental populations. One possibility is that paracrine signaling between cells in the parental population confers them with less invasive potential than when the cells are isolated into early- and late-invading subpopulations. Another possibility is that technical factors associated with the sorting of SEMA3C-low cells from the parental population alter their invasive properties, thus making them distinct from the parental population.”

      • Conclusions that NKX2.2 knockout increases invasiveness and proliferation are based on 1 cell line. The comparisons done with FS4 early and late invading cells in Figure 1F may be supportive but is correlative in nature.

      The reviewer is right that we only have results in one cell line, and we fully agree that the results in FS4 are only correlative. We have now weakened the language in the abstract and the results to emphasize that this result held in 1205Lu cells only.

      • Given the robust literature regarding phenotypic switching in melanoma, the NKX2.2 knockout increasing both invasiveness and proliferation (figures 2C, 2D) suggests it may not be involved in phenotype switching. Perhaps NKX2.2 is a negative regulator of cell activity/metabolism.

      We thank the reviewer for highlighting the possible connections with metabolism. To explore this possibility, we performed metabolic assays on NKX2.2 knockout and AAVS control cells and observed no significant changes in extracellular acidification rate (B). We did observe some differences in oxygen consumption rate in the cells (A), but the differences do not seem to be large enough or systematic enough to be meaningful given the variation within the controls. We have now included these results in Supp. Fig. 3E-F.

      Note, the data previously referenced as Figure 2C,D is now in Figure 2A,B.

      “NKX2.2 is a transcriptional repressor and activator essential for the differentiation of pancreatic endocrine cells (Habener et al., 2005). In mice, deletion of NKX2.2 prevents the specification of pancreatic islet cells resulting in the replacement of insulin-expressing β cells and glucagon-expressing α cells with ghrelin-expressing cells; This lack of specification resulted in mortality of newborn mice due to hyperglycemia (Sussel et al. 1998; Prado et al. 2004). Given the link of NKX2.2 with glucose metabolism, we wondered whether NKX2.2 had an effect on metabolic activity prompting us to test the NKX2.2 knockout lines for metabolic differences in the oxygen consumption rate (OCR; an indicator of oxidative phosphorylation) and the extracellular acidification rate (ECAR; an indicator of glycolysis) of the cells. Seahorse assay analysis revealed no systematic differences in metabolic activity (Supp. Fig. 3E,F).”

      • Given that sorted SEMA3C high levels did not revert to parental FS4 levels, yet the invasive phenotype reverted to parental-like behavior undermines the usefulness of SEMA3C as a marker of invasiveness.

      The reviewer has brought up an important point. We were able to show that 1205Lu cells had SEMA3C levels revert to those of the parental. The reviewer is right that FS4 did not, which may be because it takes longer for FS4 to revert. It is true that the phenotypic behavior did revert. We have seen similar things in our therapy resistance work (Shaffer et al. 2017, etc.). One possible reason is that the phenotype is governed by multiple factors, and so the phenotype can revert before the expression of SEMA3C. We still think that SEMA3C is a good marker, just perhaps context dependent. We have added text to the discussion to make these important points.

      “We note that SEMA3C levels in FS4-SEMA3C-high cells did not revert to the parental levels within two weeks. This incomplete reversion may be because SEMA3C takes longer to revert than the tested time period. Interestingly, the invasive phenotype did revert in this time period, suggesting that there may be multiple factors associated with the phenotype beyond SEMA3C. It may thus be that SEMA3C is a marker of the early-invading population, but only in certain contexts.”

      Minor

      • How does SEMA3C and/or NKX2.2 expression (here 1.5% of FS4 cells were noted as "SEMA3C high") of metastatic cell lines (FS4 and 1205) compare to RGP and VGP cell lines?

      The reviewer has asked a great question about radial and vertical growth phase cells. We have tested several other cell lines to determine which were suitable for transwell assays. We have now included two figures (Supp. Fig. 1H and Fig. 2C) showing the SEMA3C and NKX2.2 status of each of these cell lines (parental cells) and their different subpopulations (early invaders and late invaders); see also Reviewer #2, Major point 1. We found that the same pattern of SEMA3C-high cells held for both RGP and VGP cell lines.

      • There were a number of instances throughout the manuscript that were not clear, colloquial, or simply unnecessary - i.e. description of transwell assay.

      The reviewer has raised a good point about our language. We have gone through and tried to improve the clarity and precision. As for descriptions of the various assays, we have found that some readers of our papers are unfamiliar with these assays, so we elected to keep those descriptions in. We hope the reviewer does not object too strenuously.

      • The authors only analyze/mention lung metastases. Were metastases observed at other sites?

      The reviewer has posed a very good question about whether metastasis occurred at other locations. We stained additional tissues (liver and kidney) that were collected from the same mice and stained as per our lung invasion assays. As shown in our new Supplemental Fig. 2E, we found a similar pattern with the vast majority of metastatic cells being GFP positive; i.e., early-invaders, just as was the case for lung. We thank the reviewer for this helpful suggestion.

      “In the lung, however, we saw predominantly GFP-positive cells, showing that the vast majority of cells that migrated from the primary tumor site were initially early invading cells (Fig. 1I,J). The number of GFP cells in the lung was variable, but generally increased with time. The liver and kidney also showed an enrichment of GFP-positive cells (early invaders), suggesting that the metastatic potential of these cells is not limited to any one particular metastatic location (Supp. Fig. 2E). Thus, we established that the highly invasive subpopulation was able to drive metastasis in vivo.”

      • What is PE indicating in Figure 1D?

      Apologies, PE refers to the channel we used for the sorting on the FACS machine and stands for “Phycoerythrin”. To avoid any confusion, we have omitted the “PE” text on the y-axis of Fig. 1D.

      • The number of invaded cells seems to vary quite a bit between experiments - Parental 1205 cells in Fig 2C = ~200, yet 1205 clone F6 and the non-clonal 1205 cell line demonstrate ~10,000. Similar differences observed with FS4 cells - Parental Fig 1E vs. Empty control Figure 2A.

      The reviewer has a good eye; indeed, there is wide variability in the number of invading cells. We have now remarked on this variability in the results section:

      “We note that the number of invading cells varied significantly between experiments. This variability is due to the fact that we employed transwell dishes with different growth areas, ranging from 0.33 cm2 to 4.67 cm2, leading us to collect different cell numbers for individual experiments. The cell density per cm2, however, was kept constant between experiments.”
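
      As a rough illustration of how membrane area alone scales raw counts at a fixed seeding density, the sketch below normalizes counts to cells per cm2. Only the two growth areas come from the text above; the function name and the example counts are hypothetical.

```python
def cells_per_cm2(invaded_cells, growth_area_cm2):
    """Normalize a raw invaded-cell count by the transwell growth area."""
    return invaded_cells / growth_area_cm2


# Growth areas quoted in the text (cm2).
smallest_area, largest_area = 0.33, 4.67

# At a constant density, raw counts scale with area.
print(round(largest_area / smallest_area, 1))  # 14.2

# Hypothetical counts: the same density gives very different raw numbers.
print(round(cells_per_cm2(330, smallest_area)))   # 1000
print(round(cells_per_cm2(4670, largest_area)))   # 1000
```

      Comparing densities rather than raw counts is one way to put experiments run in different dish formats on a common scale.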

      Note that Figure 2C and Figure 2A are now referenced as Figure 2A and Supplemental Figure 1F, respectively.

      Reviewer #2 (Significance (Required)):

      This work contributes to the growing fields of phenotypic plasticity and intratumoral heterogeneity. The authors claim to have identified a surface marker SEMA3C and a transcription factor NKX2.2 that may play a role in driving invasive proclivity. Importantly, the group demonstrates that changes in these proteins are not genetic, and therefore represent "intrinsic differences" that are a property of the tumor. Furthermore, the authors indicate how the present observations of early invading cells parallels drug resistance phenomena as their previous works highlights intrinsically resistant subpopulations (Shaffer et al., Nature 2017, Torre et al., Nature Genetics 2021 and others.). Taken together, the current and previous work underscores the importance of cell to cell non-genetic variability in disease progression and response to therapy.

      We thank the reviewer for their kind comments on the significance of our manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this study, Kaur et al. intended to use a strategy similar to one the same group had developed (https://www.nature.com/articles/nature22794) to identify the subpopulation in melanoma responsible for metastasis. In brief, the melanoma cell population was subjected to selection for a specific phenotype (transwell migration dubbed as "invasiveness" behavior). By comparing the early and late invaders, a cell marker was identified that distinguishes the highly invasive subpopulation. A series of experiments were devised to validate the metastatic function of the highly invasive cells and delineate the signaling that drove this phenotype. The authors concluded that this rare subpopulation originated from transcriptional fluctuation, and that invasiveness is a trade-off with cell growth. Therefore, as the cells grow, the phenotype reverts over time to low invasiveness.

      Consistency is the most important factor for evaluating observation over temporal and spatial range. Therefore, several controls need to be clarified before further investigation in mechanisms:

      1) If the rare invader cells are arising from gene expression fluctuation, the SEMA3C-low population of parental line should generate SEMA3C-high invader subpopulation over time. This should be addressed.

      The reviewer has made an excellent point. Indeed, it is the case that the SEMA3C-low population starts to regenerate the high invader subpopulation over time. We have re-graphed Figure 3D to demonstrate this fact more clearly (see Supplemental Fig. 5A,B), showing that the SEMA3C-low population regenerates many more SEMA3C-high cells after 14 days.

      2) Both early and late invader cells exhibited higher invasiveness than the parental line (Fig. 3B). Therefore, the in vivo metastatic potential of the three lines should be compared to validate the role of the invader cells in the metastatic function.

      We thank the reviewer for their comment about testing all three populations in the in vivo context. It is an excellent suggestion, but in order to fully control the experiment, we would need to add all three populations in three separate colors. Given the difficulties we had with getting even the two colors to work together, we think it is beyond the scope of our current efforts to attempt this complex experiment. We have added the following caveat to the text:

      “For unknown reasons, the parental population consistently showed lower invasiveness than the early- and late-invading subpopulations. Given that we did not test the parental population for invasiveness in vivo, future studies may address the sources and mechanisms by which the parental population differs and how those differences manifest in vivo.”

      3) To evaluate the possible intervention of cellular function by fluorescent proteins (https://doi.org/10.1016/j.ccell.2022.01.015), admix of GFP- and mCherry-labeled populations of early invader cells should be used as a control in Fig. 1F. Noticeably, the labeling ratio of the two populations was not even in Fig. 1F.

      The reviewer has brought up an important point about the potential differences brought about by the fluorescent proteins themselves. At this point, it is difficult to redo these complex in vivo experiments, but we can appeal to the fact that the admixture is maintained throughout time as the primary tumor site still has a roughly equal ratio of GFP and mCherry cells in it (Fig. 1I and Supp. Fig. 2E).

      4) When the invader cells were expanded and passed, their invasiveness will revert to the level similar to parental line in 14 days (Fig. 3B). The isolated cells were expanded for further testing and manipulation in Fig. 1C and 1F, respectively. How long was the period for cell expansion in these experiments?

      We thank the reviewer for bringing up an important question about the details of cell expansion. For the RNA-seq, the cells were directly processed upon going through the transwell, so there was no expansion period. We have made sure to outline this more carefully in our methods section (see below).

      “RNA sequencing and analysis:

      RNA collection and library prep: Each treatment/sample was tested in 3 separate biological replicates. Upon passing through the transwell, cells were immediately collected and processed for RNA sequencing. Total RNA isolation was performed using phenol-chloroform extraction followed by RNA cleanup using the RNeasy Micro (Qiagen 74004) kit. For transwell assays, library preparation was performed using the NEBNext single-cell/low input RNA library prep kit (E6420L, NEB). For NKX2.2 CRISPR experiments, library preparation was done using the NEBNext Poly(A) mRNA Magnetic Isolation Module (NEB E7490L) integrated with the NEBNext Ultra II RNA Library Prep Kit for Illumina (NEB E7770L).

      Mouse tumor implantation and growth:

      All mouse experiments were conducted in collaboration with Dr. Meenhard Herlyn at The Wistar Institute, Philadelphia, PA. NSG mice (NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ) were bred in-house at The Wistar Institute Animal Facility. All experiments were performed under approval from the Wistar Institute Care and Use Committee (protocol 201174). As in the case of RNA sequencing experiments, cells were not expanded prior to injection into the mouse, but were collected and implanted right after passing through the transwell. 50,000 melanoma cells were suspended in DMEM with 10% FBS and injected subcutaneously in the left flank of the mouse.”

      5) If invasiveness and growth are trade-off, why did the mCherry-labeled cells not dominate the population of primary tumors in Fig. 1J?

      Note that Figure 1J is now referenced as Figure 1I. The reviewer brings up a good point. For potential explanations, first, the difference in growth rate is not large, so we would not necessarily expect mCherry cells to dominate on this timescale. Also, we believe that in vivo, the tradeoff may be mitigated by other factors and cell-cell interactions that are not present in vitro. We have added a note on this point to the results.

      “(Note that these numbers were similar despite the slightly increased growth rate of the late-invading subpopulation; we assume this is due to the relatively small difference and cell-cell interactions that could prevent one population from dominating the other.)”

      6) In Fig. 1G, why RNA FISH was not used to detect mCherry-labeled cells?

      Another excellent point. RNA FISH in tissue sections can often be rather challenging due to various reasons including RNA degradation, and mCherry RNA signal was hard to definitively show in these sections. Hence, we opted for MALAT1, which is very heavily expressed and hence provided a strong and reliable signal.

      “For technical reasons, the mCherry cells were not detectable due to the fluorescence of the mCherry protein not being visible in the mouse sections. Nevertheless, we were able to detect late invaders in the population by using a human-specific MALAT1 RNA FISH probe that binds only to human MALAT1 RNA and not mouse MALAT1 RNA (28).”

      7) In vivo cycling (harvesting the cells from metastatic site and implanting them to the primary site in mouse models) has been employed to select metastatic sublines from a parental line. Could in vivo cycling make the early invader phenotype fixed?

      The reviewer has raised a very interesting point about cycling and selection. Indeed, the 1205Lu cells were derived from repeated cycling of invasive lung cells. That is probably the reason that these cells were useful for our assay, because the percentage of early-invading cells was higher. Nevertheless, the cells still have a significant proportion of late invaders, suggesting that the phenotype has not yet been fixed in the population. Perhaps with further cycling, such a fixation could be achieved. We have now noted this possibility in our discussion.

      “It is also possible that repeated cycles of selection, even of non-genetic phenotypes, could lead to an increased fraction of invasive cells. Indeed, 1205Lu cells were derived by exactly such repeated cycles, which presumably are the reason they have a higher percentage of invasive cells; however, despite these repeated rounds of selection, most cells are still not highly invasive, suggesting that it is difficult for this property to fully fix in the population.”

      **Referees cross-commenting**

      Both reviewers' questions are important for adequate controls.

      Reviewer #3 (Significance (Required)):

      There are several studies trying to identify subpopulation responsible for the metastasis of melanoma and other types of cancer, and a few mechanisms have been revealed. However, the significance depends on if the results can be validated on clinical data. It is lacking in this study.

      We thank the reviewer for their statement of interest in the problem. We agree that it is helpful to link these results to clinical data. We did perform TCGA analyses of several different genes, including SEMA3C, that emerged from our data, and there were no systematic relationships to phenotype. Of course, the relationship to clinical data is complex and many important factors are not obvious from the TCGA data, so we do not think that necessarily diminishes our results. Rather, we think our results raise a conceptual point that there can be rare cells with non-genetic differences that can drive metastasis. Further work will be required to translate these results to the clinic.

      We have added the following to the main text:

      “We found that the SEMA3C-high cells were far more invasive, intrinsically, than SEMA3C-low cells and the population overall, thus demonstrating that cells vary intrinsically in their invasiveness, and the very invasive subpopulation is marked by the expression of SEMA3C (Fig. 1E). Note, overexpression of SEMA3C in FS4 single cell clones revealed no changes in invasiveness, suggesting that SEMA3C is a marker with no functional relevance to invasiveness per se (Fig. 1D; Supp. Fig. 1E-G). We verified the expression levels of the genes identified in our RNA sequencing study in The Cancer Genome Atlas (TCGA) data. We combined the list of differentially expressed genes in early invaders with the gene set enrichment analysis (GSEA) “Hallmarks of cancer epithelial-mesenchymal transition” and compared expression in primary vs. metastatic TCGA samples, finding no appreciable difference (Fig. 5A-B). These data suggest that these markers do not have obvious clinical correlates. Moreover, Kaplan Meier analysis comparing the survival time (days to death) between patient cohorts with either high or low SEMA3C expression levels revealed that SEMA3C does not predict survival time post-diagnosis, as both survival curves (p=0.898) follow comparable trends between the two cohorts (Fig. 5C). However, conceptually, our results raise the possibility that a rare, non-genetically defined subpopulation of cells may drive metastasis due to its increased degree of invasiveness, which further data collection efforts in patient samples may help validate.”
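
      For readers unfamiliar with the Kaplan Meier analysis mentioned above, the following is a minimal sketch of the standard product-limit estimator. It is not the analysis pipeline used for Fig. 5C, the toy data are invented for illustration, and it does not compute the p-value that compares two curves.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with a death.

    durations: follow-up time per patient (e.g. days to death or last contact)
    events:    1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve


# Toy cohort (hypothetical): three patients, the second one censored.
toy_curve = kaplan_meier([100, 200, 300], [1, 0, 1])
print(toy_curve)
```

      In a comparison like the one described, one such curve would be estimated per cohort (SEMA3C high and SEMA3C low) and the two curves compared with a separate statistical test.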

    1. Author Response

      Reviewer #1 (Public Review):

      1) I was confused about the nature of the short-term plasticity mechanism being modeled. In the Introduction, the contrast drawn is between synaptic rewiring and various plasticity mechanisms at existing synapses, including long-term potentiation/depression, and shorter-term facilitation and depression. And the synaptic modulation mechanism introduced is modeled on STDP (which is a natural fit for an associative/Hebbian rule, especially given that short-term plasticity mechanisms are more often non-Hebbian).

      Indeed, because of its associative nature, the modulation mechanism was envisioned to be STDP-like, i.e. on faster time scales than the complete rewiring of the network (via backpropagation) but slower time scales than things like STSP which, as the reviewer points out, are usually not considered associative. One thing we do want to highlight is that backpropagation and the modulation mechanism are certainly not independent of one another. During training, the network’s weights that are being adjusted by backpropagation are experiencing modulations, and said modulations certainly factor into the gradient calculation.

      We have edited the abstract and introduction to try to make the distinction of what we are trying to model clearer.

      1) cont: On the other hand, in the network models the weights being altered by backpropagation are changes in strength (since the network layers are all-to-all), corresponding more closely to LTP/LTD. And in general, standard supervised artificial neural network training more closely resembles LTP/LTD than changing which neurons are connected to which (and even if there is rewiring, these networks primarily rely on persistent weight changes at existing synapses).

      Although we did not highlight this particular biological mechanism because we wanted to keep the updates as general as possible, one could view the two mechanisms through the lens of early versus late LTP. We have added a passage to the discussion section on how the associative modulation mechanism and backpropagation might map biologically onto this mechanism.

      1) cont: Moreover, given the timescales of typical systems neuroscience tasks with input coming in on the 100s of ms timescale, the need for multiple repetitions to induce long-term plasticity, and the transient nature/short decay times of the synaptic modulations in the SM matrix, the SM matrix seems to be changing on a timescale faster than LTP/LTD and closer to STP mechanisms like facilitation/depression. So it was not clear to me what mechanism this was supposed to correspond to.

      We note that although the structure of the tasks certainly resembles known neuroscience experiments that happen on shorter time scales (and with the introduction of the 19 new NeuroGym tasks, even more so), we did not have a particular time scale for task effects in mind. So each piece of “evidence” in the integration tasks may indeed occur over significantly slower time scales and could abstractly represent multiple repetitions in order to induce (say) early phase LTP.

      Given that the separation between the two plasticity mechanisms may be clearer for STSP, and indeed many of the tasks we investigate may more naturally be mapped to tasks that occur on time scales more relevant to STSP, we have introduced a second modulation rule that is only dependent upon the presynaptic firing rates. See our response to the Essential Revisions above for additional details on these new results.

      2) A number of studies have explored using short-term plasticity mechanisms to store information over time and have found that these mechanisms are useful for general information integration over time. While many of these are briefly cited, I think they need to be further discussed and the current work situated in the context of these prior studies. In particular, it was not clear to me when and how the authors' assumptions differed from those in previous studies, which specific conclusions were novel to this study, and which conclusions are true for this specific mechanism as opposed to being generally true when using STP mechanisms for integration tasks.

      We have added additional works to the related works section and expanded the introduction to better convey how our work differs from previous studies. Briefly, our assumptions differ from those of previous studies mainly in that we considered a network that relies only on synaptic modulations to do computations, rather than a network with both recurrence and synaptic modulations. This allowed us to isolate the computational power and behavior of computing with synaptic modulations alone.

      It is hard to say which of the conclusions are generally true when using STP mechanisms for integration tasks without a comprehensive comparison of the various models of STP on the same tasks we investigated here. That being said, we believe we have presented in this work conclusions that are not present in other works (as far as we are aware), including: (1) a demonstration of the strength of computing with synaptic connections on a large variety of sequential tasks, (2) an investigation into the dynamics of such computations and how they might manifest in neuronal recordings, and (3) a brief look at how these different dynamics might be computationally beneficial in neuroscience-relevant areas. We also note that one reason for the simplicity of our mechanism is that we believe it captures many effects of synaptic modulations (e.g. gradual increase/decrease of synaptic strength that eventually saturates) with a relatively simple expression, and so we believe other STP mechanisms would yield qualitatively similar results. We have edited the text to try to clarify when conclusions are novel to this study and when we are referencing results from other works.

      Reviewer #2 (Public Review):

      On the other hand, the general principle appears (perhaps naively) very general: any stimulus-dependent, sufficiently long-lived change in neuronal/synaptic properties is a potential memory buffer. For instance, one might wonder whether some non-associative form of synaptic plasticity (unlike the Hebbian-like form studied in the paper), such as short-term synaptic plasticity which depends only on the pre-synaptic activity (and is better motivated experimentally), would be equally effective. Or, for that matter, one might wonder whether just neuronal adaptation, in the hidden layer, for instance, would be sufficient. In this sense, a weakness of this work is that there is little attempt at understanding when and how the proposed mechanism fails.

      We have tried to address whether the simplicity of the tasks considered in this work may be a reason for the MPN’s success by training it on 19 additional neuroscience tasks (see response to Essential Revisions above). Across all these additional tasks, we found the MPN performs comparably to its RNN counterparts.

      To address whether associativity is necessary in our setup we have introduced a version of the MPN that has modulation updates that are only presynaptic dependent. We call this the “MPNpre” and have added several results across the paper addressing its computational abilities (again, additional details are provided above in Essential Revisions). We find the MPNpre has dynamics that are qualitatively the same as its MPN counterpart and has very comparable computational capabilities.
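      The distinction between an associative rule and a presynaptic-only rule can be made concrete with a minimal sketch. The update forms below (a leaky accumulator with decay `lam` and modulation learning rate `eta`) are assumptions chosen for illustration rather than expressions quoted from the manuscript, and the function names are hypothetical.

```python
# Illustrative sketch only: the exact update expressions are assumed, not quoted
# from the manuscript. M has one row per postsynaptic unit and one column per
# presynaptic unit.

def update_associative(M, pre, post, lam, eta):
    """Hebbian-like update: each entry depends on both pre- and postsynaptic activity."""
    return [
        [lam * M[i][j] + eta * post[i] * pre[j] for j in range(len(pre))]
        for i in range(len(post))
    ]


def update_presynaptic(M, pre, lam, eta):
    """Presynaptic-only update: every row receives the same input-dependent change."""
    return [
        [lam * M[i][j] + eta * pre[j] for j in range(len(pre))]
        for i in range(len(M))
    ]


# Hypothetical activity values for a 3-input, 2-hidden-unit layer.
M0 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
pre = [1.0, 0.0, 2.0]
post = [0.5, -1.0]

M_assoc = update_associative(M0, pre, post, lam=0.9, eta=0.1)
M_pre = update_presynaptic(M0, pre, lam=0.9, eta=0.1)
```

      In this sketch the associative update produces rows that differ with the postsynaptic activity, whereas the presynaptic-only update writes the same change into every row.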

      Certainly, some of the tasks we consider may also be solvable by introducing other forms of computation such as neuronal adaptation. Indeed, we believe the ability of the brain to solve tasks in so many different ways is one of the things that makes it so difficult to study. Our work here has attempted to highlight one particular way of doing computations (via synapse dynamics) and compared it to one particular other form (recurrent connections). Extending this work to even more forms of computation, including neuronal dynamics, would be very interesting and further help distinguish these different computational methods from one another.

      Reviewer #3 (Public Review):

      Because the MPN is essentially a low-pass filter of the activity, and the activity is the input - it seems that integration is almost automatically satisfied by the dynamics. Are these networks able to perform non-integration tasks? Decision-making (which involves saddle points), for instance, is often studied with RNNs.

      We have tested the MPN on 19 additional supervised learning tasks found in the NeuroGym package (Molano-Mazon et al., 2022), which includes several decision-making-based tasks, and added these results to the main text (see response to Essential Revisions above, and also Figs. 7i & 7j). Across all tasks we investigated, we found the MPN performs at comparable levels to its RNN counterparts.

      Manuel Molano-Mazon, Joao Barbosa, Jordi Pastor-Ciurana, Marta Fradera, Ru-Yuan Zhang, Jeremy Forest, Jorge del Pozo Lerida, Li Ji-An, Christopher J Cueva, Jaime de la Rocha, et al. “NeuroGym: An open resource for developing and sharing neuroscience tasks”. (2022).

      The current work has some resemblance to reservoir computing models. Because the M matrix decays to zero eventually, this is reminiscent of the fading memory property of reservoir models. Specifically, the dynamic variables encode a decaying memory of the input, and - given large enough networks - almost any function of the input can be simply read out. Within this context, there were works that studied how introducing different time scales changes performance (e.g., Schrauwen et al 2007).

      Thank you for pointing out this resemblance and work. In our setup, the fact that lambda is the same for the entire network means all elements of M decay uniformly (though the learned modulation updates may allow for the growth of M to be non-uniform). One modification that we think would be very interesting to explore is the effect on the dynamics of non-uniform learning rates or decays across synapses. In this setting, the M matrix could have significantly different time scales and may even further resemble reservoir computing setups. We have added a sentence to the discussion section discussing this possibility.

      Another point is the interaction of the proposed plasticity rule with hidden-unit dynamics. What will happen for RNNs with these plasticity rules? I see why introducing short-term plasticity in a "clean" setting can help understand it, but it would be nice to see that nothing breaks when moving to a complete setting. Here, too, there are existing works that tackle this issue (e.g., Orhan & Ma, Ballintyn et al, Rodriguez et al).

      Thank you for pointing out these additional works. They are indeed very relevant, and we have added them all to the text where appropriate.

      Here we believe we have shown that either recurrent connections or synaptic dynamics alone can be used to solve a wide variety of neuroscience tasks. We don’t believe a hybrid setting with both synaptic dynamics and recurrence (e.g. a Vanilla RNN with synaptic dynamics) would “break” any part of this setup. Since the network could learn to suppress either computational mechanism, it could simply solve the task by relying on only one of the two. For example, it could use a strictly non-synaptic solution by driving eta (the learning rate of the modulations) to zero, or it could use a non-recurrent solution by driving the influence of recurrent connections to be very small. Orhan & Ma mention they have a hard time training a Vanilla RNN with Hebbian modulations on the recurrent weights for any modulation effect that goes back more than one time step, but unlike our work they rely on a fixed modulation strength.

      Indeed, we think how networks with multiple computational mechanisms will solve tasks is a very interesting question to be further investigated, and a hybrid solution may be likely. We believe our work is valuable in that it illuminates one end of the spectrum that is relatively unexplored: how such tasks could be solved using just synaptic dynamics. However, what type of solution a complete setup ultimately lands on is likely largely dependent upon both the initialization and the training procedure, so we felt exploring the dynamics of such networks was outside the scope of this work.

      One point regarding biological plausibility - although the model is abstract, the fact that the MPN increases without bounds is hard to reconcile with physical processes.

      Note that although the MPN expression does not have explicit bounds, in practice the exponential decay eventually balances the SM matrix updates, and so we observe a saturation in its size (Fig. 4c, except for the case of lambda=1.0, which is not considered elsewhere in the text). However, we explicitly added modulation bounds to the M matrix update expression and did not find that this significantly changed the results (see comments on Essential Revisions above for details).
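      The balance between decay and updates can be illustrated with a scalar toy model. The sketch below assumes a leaky update of the form m_t = lam * m_{t-1} + eta * drive with a constant drive; this is a simplification introduced here for illustration, not the M matrix expression from the manuscript, and all names and values are hypothetical.

```python
def simulate_modulation(lam, eta, drive, steps):
    """Iterate a scalar leaky accumulator and return its value at every step."""
    m = 0.0
    trace = []
    for _ in range(steps):
        # Decay the previous value, then add the (constant) update.
        m = lam * m + eta * drive
        trace.append(m)
    return trace


# With lam < 1 the decay balances the updates and the value saturates.
saturating = simulate_modulation(lam=0.9, eta=0.1, drive=1.0, steps=500)

# With lam = 1 there is no decay, so the value keeps growing linearly.
unbounded = simulate_modulation(lam=1.0, eta=0.1, drive=1.0, steps=500)

# Fixed point of the assumed update for lam < 1: eta * drive / (1 - lam).
fixed_point = 0.1 * 1.0 / (1.0 - 0.9)
```

      Under these assumptions the saturating trace converges to `fixed_point`, while the `lam=1.0` trace grows by `eta * drive` at every step, which matches the qualitative exception noted for that case.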

    1. Compiled ratings, author response, and editorial comment


      Ratings and predictions

      Ratings (1-100)

      <table>
        <tr> <td> </td> <td>Evaluator 1 </td> <td> </td> <td>Evaluator 2 </td> <td> </td> <td>Evaluator 3 </td> <td> </td> </tr>
        <tr> <td>Rating category </td> <td>Rating (0-100) </td> <td>90% CI </td> <td>Rating (0-100) </td> <td>90% CI </td> <td>Rating (0-100) </td> <td>Confidence </td> </tr>
        <tr> <td>Overall assessment </td> <td>40 </td> <td>20-60 </td> <td>80 </td> <td>60-90 </td> <td>65 </td> <td>Medium </td> </tr>
        <tr> <td>Advancing knowledge and practice </td> <td>30 </td> <td>20-60 </td> <td>80 </td> <td>70-90 </td> <td>70 </td> <td>Medium </td> </tr>
        <tr> <td>Methods: Justification, reasonableness, validity, robustness </td> <td>50 </td> <td>40-60 </td> <td>70 </td> <td>50-90 </td> <td>Not qualified </td> <td> </td> </tr>
        <tr> <td>Logic & communication </td> <td>60 </td> <td>40-75 </td> <td>85 </td> <td>65-95 </td> <td>80 </td> <td>Medium-to-high </td> </tr>
        <tr> <td>Open, collaborative, replicable </td> <td>70 </td> <td>40-75 </td> <td>73 </td> <td>50-95 </td> <td>Not qualified </td> <td> </td> </tr>
        <tr> <td>Relevance to global priorities </td> <td>90 </td> <td>60-95 </td> <td>85 </td> <td>70-90 </td> <td>80 </td> <td>High </td> </tr>
      </table>

      Journal predictions (1-5)

      <table>
        <tr> <td> </td> <td>Evaluator 1 </td> <td> </td> <td>Evaluator 2 </td> <td> </td> <td>Evaluator 3 </td> <td> </td> </tr>
        <tr> <td>Prediction metric </td> <td>Rating (0-5) </td> <td>90% CI </td> <td>Rating (0-5) </td> <td>90% CI </td> <td>Rating (0-5) </td> <td>Confidence </td> </tr>
        <tr> <td>What ‘quality journal’ do you expect this work will be published in? </td> <td>2 </td> <td>1-2 </td> <td>3.5 </td> <td>3-5 </td> <td>3.5 </td> <td>Medium </td> </tr>
        <tr> <td>On a ‘scale of journals’, what tier journal should this be published in? </td> <td>2 </td> <td>1-2 </td> <td>4 </td> <td>3-5 </td> <td>3.5 </td> <td>High </td> </tr>
      </table>

      Author response

      To start with, we would like to commend the format and the reviewer comments, which were of extremely high quality. The evaluations provided well-thought-out and constructively critical analysis of the work, pointing out several assumptions that could affect the findings of the paper while also recognizing the value of the work in spite of some of these assumptions. Research in this space is difficult due to the highly interdisciplinary nature of the questions being asked and the major uncertainties that need to be addressed. We value good epistemics and understand that it takes many people critically looking at a problem to achieve this, which is what motivated our participation in the Unjournal pilot. A format that allows work to be published and reviewed in an open, nuanced manner can reduce the friction of working on such questions and speed up communal sense-making on important questions. We are excited to have participated and look forward to seeing how Unjournal progresses. We hope that the future work highlighted by the reviewers, addressing the assumptions and issues of the paper, will be undertaken by external parties who are better equipped to critically analyse this area of research, improving epistemics in relation to nuclear risk, resilient foods, AGI safety and the greater existential risk space.

      To clarify, AGI safety was chosen as the comparator because it is considered the greatest x-risk by many. The comparison of the cost-effectiveness of resilient foods with that of AGI safety was therefore intended to highlight the merit of resilient foods and motivate further investment, not to motivate redirecting funding from AGI safety to resilient foods.

      We have included responses to aspects of the evaluations below.

      Evaluation 1

      Structure of cost-effectiveness argument

      • The biggest issue with interpretability this causes is that I struggle to understand what features of the analysis are making resilient food appear cost-effective because of some feature of resilient food, and which are making resilient food appear cost-effective because of some feature of AI. The methods used by the authors mean that a mediocre case for resilient food could be made to look highly cost-effective with an exceptionally poor case for AI, since their central result is the multiplier of value on a marginally invested dollar for resilient food vs AI. This is important, because the authors’ argument is that resilient food should be funded because it is more effective than AI Risk management, but this is motivated by AI Risk proponents agreeing AI Risk is important – in scenarios where AI Risk is not worth investing in then this assumption is broken and cost effectiveness analysis against a ’do nothing’ alternative is required. For example, the authors do not investigate scenarios where the benefit of the intervention in the future is negative because “negative impacts would be possible for both resilient foods and AGI safety and there is no obvious reason why either would be more affected”. While this is potentially reasonable on a mathematical level, it does mean that it would be perfectly possible for resilient foods to be net harmful and the paper not correctly identify that funding them is a bad idea – simply because funding AI Risk reduction is an even worse idea, and this is the only given alternative. If the authors want to compare AGI risk mitigation and resilient foods against each other without a ‘do nothing’ common comparator (which I do not think is a good idea), they must at the very least do more to establish that the results of their AI Risk model map closely to the results which cause the AI Risk community to fund AI Risk mitigation so much. As this is not done in the paper, a major issue of interpretability is generated.

      We could have compared to the Open Philanthropy last dollar if that had been available at the time of publishing ($200 trillion per world saved or 0.05 basis points of existential risk per $billion): https://forum.effectivealtruism.org/posts/NbWeRmEsBEknNHqZP/longterm-cost-effectiveness-of-founders-pledge-s-climate. Our median for spending $100 million is ~2x10^-10 far future potential increase per dollar, or 500 basis points per $billion, or ~10,000 times as cost-effective. Ours is about 500 times as cost effective as the upper bound on that page.
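      The unit conversion behind the benchmark figure can be checked with a short calculation. The sketch below assumes that one "world saved" corresponds to 100% of existential risk, i.e. 10,000 basis points; that definition and the variable names are assumptions made here for illustration. The 500 basis points per $billion figure is taken as quoted above rather than re-derived.

```python
BASIS_POINTS_PER_WORLD = 10_000   # assumption: one world saved = 100% = 10,000 bp
DOLLARS_PER_BILLION = 1e9


def bp_per_billion(cost_per_world_usd):
    """Convert a cost per world saved into basis points of risk reduced per $billion."""
    worlds_per_dollar = 1.0 / cost_per_world_usd
    return worlds_per_dollar * DOLLARS_PER_BILLION * BASIS_POINTS_PER_WORLD


# Benchmark quoted in the text: $200 trillion per world saved.
benchmark_bp = bp_per_billion(200e12)

# Figure quoted in the text for resilient foods (taken as given).
quoted_resilient_foods_bp = 500.0

ratio = quoted_resilient_foods_bp / benchmark_bp
```

      Under the stated assumption, `benchmark_bp` evaluates to about 0.05 and `ratio` to about 10,000, in line with the figures quoted in the response.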

      • More generally, this causes the authors to have to write up their results in a non-natural fashion. As an example of the sort of issues this causes, conclusions are expressed in entirely non-natural units in places (“Ratio of resilient foods mean cost effectiveness to AGI safety mean cost effectiveness” given $100m spend), rather than units which would be more natural (“Cost-effectiveness of funding resilient food development”). I cannot find expressed anywhere in the paper a simple table with the average costs and benefits of the two interventions, although a reference is made to Denkenberger & Pearce (2016) where these values were presented for near-term investment in resilient food. This makes it extremely hard for a reader to draw sensible policy conclusions from the paper unless they are already an expert in AGI risk and so have an intuitive sense of what an intervention which is ‘3-6 times more cost-effective than AGI risk reduction’ looks like. The paper might be improved by the authors communicating summary statistics in a more straightforward fashion.

      Figure 5 reports far future potential increase per $, which is an absolute value. That said, we acknowledge that the presentation of findings throughout could have been made more straightforward for non-expert readers, and we will aim to communicate summary statistics in a more accessible way in future work.

      Continuing on from this point, I don’t understand the conceptual framework that has the authors consider the value of invested dollars in resilient food at the margin. The authors’ model of the value of an invested dollar is an assumption that it is distributed logarithmically. Since the entire premise of the paper hinges on the reasonability of this argument, it is very surprising there is no sensitivity analysis considering different distributions of the relationship between intervention funding and value. Nevertheless, I am also confused as to the model even on the terms the authors describe; the authors’ model appears to be that there is some sort of ‘invention’ step where the resilient food is created and discovered (this is mostly consistent with Denkenberger & Pearce (2016), and is the only interpretation consistent with the question asked in the survey). In which case, the marginal value of the first invested dollar is zero because the ’invention’ of the food is almost a discrete and binary step. The marginal value per dollar continues to be zero until the 86 millionth dollar, where the marginal value is the entire value of the resilient food in its entirety. There seems to be no reason to consider the marginal dollar value of investment when a structural assumption made by the authors is that there is a specific level of funding which entirely saturates the field, and this would make presenting results significantly more straightforward – it is highly nonstandard to use marginal dollars as the unit of cost in a cost-effectiveness analysis, and indeed is so nonstandard I’m not certain fundamental assumptions of cost-effectiveness analysis still hold.

      In the survey, we ask about the job of spending $100 million, but we then refer to the cost per life saved paper, which discusses separate interventions of research, planning, and piloting. Some of these interventions, such as early stage research, do not cost very much money and increase the probability of success, which is why we argue that marginal thinking makes sense. For instance, significant progress has been achieved in the last year in prioritizing the most cost-effective resilient foods that also feed a lot of people. This could lead to the development and deployment of much more effective food production methods for such scenarios.

      Methods

      The presentation of the sensitivity analysis as ‘number of parameters needed to flip’ is nonstandard, but a clever way to intuitively express the level of confidence the authors have in their conclusions. Although clever, I am uncertain if the approach is appropriately implemented; the authors limit themselves to the 95% CI for their definition of an ‘unfavourable’ parameter, and I think this approach hides massive structural uncertainty with the model. For example, in Table 5 the authors suggest their results would only change if the probability of nuclear war per year was 4.8x10^-5 (plus some other variables changing) rather than their estimate of 7x10^-3 (incidentally, I think the values for S model and E model are switched in Table 5 – the value for pr(nuclear war) in the table’s S model column corresponds to the probability given in the E model).

      This appears to be a coincidence. The lowest 5th percentile value of all nuclear war probabilities was used, which was given by the furthest year into the future with no nuclear war. For the S model this is 49 years into the future and has a value of 4.8x10^-5, and for the E model this is 149 years into the future and has a value of 1.8x10^-4 (see inserted screenshots).

      [Figure: S model 5th percentile of nuclear war probability per year after x years of no nuclear war (lowest probability of nuclear war per year after 49 years of no nuclear war).]

      [Figure: E model 5th percentile of nuclear war probability per year after x years of no nuclear war (lowest probability of nuclear war per year after 149 years of no nuclear war).]

      Third, the authors could have done more to make it clear that the ‘Expert Model’ was effectively just another survey with an n of 1. Professor Sandberg, who populated the Expert Model, is also an author on this paper and so it is unclear what if any validation of the Expert Model could reasonably have been undertaken – the E model is therefore likely to suffer from the same drawbacks as the S model. It is also unclear if Professor Sandberg knew the results of the S Model before parameterising his E Model – although this seems highly likely given that 25% of the survey’s respondents were Professor Sandberg’s co-authors. This could be a major source of bias, since presumably the authors would prefer the two models to agree and the expert parameterising the model is a co-author.

      Professor Sandberg was not shown the S model parameters, to avoid introducing bias. That said, we acknowledge that the small size of the existential risk field and the influence of several highly cited early works, such as the FHI technical report Global Catastrophic Risks Survey, have the potential to introduce anchoring bias.

      Parameter estimates

      Notwithstanding my concerns about the use of the survey instrument, I have some object level concerns with specific parameters described in the model.

      • The discount rate for both costs and benefits appears to be zero, which is very nonstandard in economic evaluation. Although the authors make reference to “long termism, the view that the future should have a near zero discount rate”, the reference for this position leads to a claim that a zero rate of pure time preference is common, and a footnote observing that “the consensus against discounting future well-being is not universal”. To be clear, pure time preference is only one component of a well-constructed discount rate and therefore a discount rate should still be applied for costs, and probably for future benefits too. Even notwithstanding that I think this is an error of understanding, it is a limitation of the paper that discount rates were not explored, given they seem very likely to have a major impact on conclusions.

      Thank you for highlighting this point. This is an important consideration that would make valuable future work.

      • A second concern I have relating to parameterisation is the conceptual model leading to the authors’ proposed costing for the intervention. The authors explain their conceptual model linking nuclear war risk to agricultural decline commendably clearly, and this expands on the already strong argument in Denkenberger & Pearce (2016). However, I am less clear on their conceptual model linking approximately $86m of research to the widescale post-nuclear deployment of resilient foods. The assumption seems to be (and I stress this is my assumption based on Denkenberger & Pearce (2016) – it would help if the authors could make it explicit) that $86m purchases the ‘invention’ of the resilient food, and once the food is ‘invented’ then it can be deployed when needed with only a little bit of ongoing training (covered by the $86m). This seems to me to be an optimistic assumption; there seems to be no cost associated with disseminating the knowledge, or any raw materials necessary to culture the resilient food. Moreover, the model seems to structurally assume that distribution chains survive the nuclear exchange with 100% certainty (or that the materials are disseminated to every household which would increase costs), and that an existing resilient food pipeline exists at the moment of nuclear exchange which can smoothly take over from the non-resilient food pipeline.

      Denkenberger & Pearce (2016) does not include post-GCR costs; it considers only R&D, response and preparedness planning, and related pre-disaster costs. This pre-disaster work would likely result in post-disaster expenditure being significantly lower than for stored food.

      I have extremely serious reservations about these points. I think it is fair to say that an economics paper which projected benefits as far into the future as the authors do here without an exploration of discount rates would be automatically rejected by most editors, and it is not clear why the standard should be so different for existential risk analysis. A cost of $86m to mitigate approximately 40% of the impact of a full-scale nuclear war between the US and a peer country seems prima facie absurd, and the level of exploration of such an important parameter is simply not in line with best practice in a cost-effectiveness analysis (especially since this is the parameter on which we might expect the authors to be least expert). I wouldn’t want my reservations about these two points to detract from the very good and careful scholarship elsewhere in the paper, but neither do I want to give the impression that these are just minor technical details – these issues could potentially reverse the authors’ conclusions, and should have been substantially defended in the text.

      We agree that this estimate from the published work is likely low and have since updated our view on cost upwards. The nuclear war probability utilized does not include other sources of nuclear risk such as accidental detonation of nuclear weapons leading to escalation, intentional attack, or dyads involving China.

      Evaluation 2

      The Methods section is well organised and documented, but once in a while it lacks clarity and it uses terminology that may or may not be appropriate. Here’s a list of things I found a bit confusing:

      • Terminology
        • The submodels for food and AGI are said to be “independent”; is this meant in a probabilistic way? Are there no hidden/not modelled variables that influence both?

      In reality we anticipate that there are a myriad of ways in which nuclear risk and AGI would interact with one another. Are AI systems implemented in nuclear command and control? If so when and how does this change nuclear war probability? What will data sets used to train AI systems post nuclear exchange look like compared to present? Post nuclear exchange will there be greater pressure to utilize autonomous systems? How many/which chip fabs will be destroyed during a nuclear exchange?

      Capturing such interactions in the model in a rigorous way would have required a considerable section within the paper, which was beyond the scope of what could be included. We raised that the submodels are independent to make people aware of this simplifying assumption.

      We believe that investigating the interdependence of x-risks is an important open question that would make valuable future work.

      • The “expert” model was quite confusing for me, maybe because “Sandberg” and the reference number after “Sandberg” don’t match, or maybe because I was expecting a survey vs. expert judgement quantification of uncertainty. As I said (structured) expert judgement is one of my interests: https://link.springer.com/book/10.1007/978-3-030-46474-5

      There is an error in the referencing. This should have linked to the following Guesstimate model: Denkenberger, D., Sandberg, A., Cotton-Barrat, O., Dewey, D., & Li, S. (2019b). Food without the sun and AI X risk cost effectiveness general far future impact publication. Guesstimate. https://www.getguesstimate.com/models/11691

      • In the caption of fig 2, “index nodes” and “variable nodes” are introduced. Index nodes are later described, but I don't think I understood what was meant by “variable” nodes. Aren’t all probabilistic nodes variable?

      This language comes from Analytica's taxonomy of node types; it simply describes what the nodes are in the Analytica implementation. See this link for more information: https://docs.analytica.com/index.php/Create_and_edit_nodes

      • Underlying assumptions/definitions
      • The structure of the models is not discussed. How did you decide that this is a robust structure (no sensitivity to structure performed as far as I understood)

      An earlier model only considered collapse and nonrecovery of civilization as the route to far future impact. The current model develops that structure further and is more inclusive.

      • What is meant by “the data from surveys was used directly instead of constructing continuous distributions”?

      Instead of sampling from a distribution created from the survey data, the model randomly draws a survey response value from the index of values for each of the 32000 model runs.
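      The difference between the two sampling approaches can be sketched as follows. This is a minimal illustration in Python, not the Analytica implementation: the survey values, the choice of a normal distribution for the fitted alternative, and the function names are all hypothetical.

```python
import random
import statistics

N_RUNS = 32000  # number of model runs stated in the response


def draw_directly(responses, n_runs, rng):
    """Each run uses one actual survey response, picked uniformly at random."""
    return [rng.choice(responses) for _ in range(n_runs)]


def draw_from_fitted_normal(responses, n_runs, rng):
    """Alternative: fit a continuous distribution (here a normal) and sample from it."""
    mean = statistics.mean(responses)
    sd = statistics.stdev(responses)
    return [rng.gauss(mean, sd) for _ in range(n_runs)]


# Hypothetical survey responses for a single parameter.
survey_responses = [0.01, 0.05, 0.10, 0.30]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
direct_draws = draw_directly(survey_responses, N_RUNS, rng)
fitted_draws = draw_from_fitted_normal(survey_responses, N_RUNS, rng)
```

      Direct draws can only take values that a respondent actually gave, so no distributional shape is imposed on the data, whereas the fitted alternative produces intermediate and out-of-range values that depend on the assumed distribution.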

      It is great that the models are available upon request, but it would be even better if they would be public so the computational reproducibility could be evaluated as well.

      The models are available at the following links.

      S-model: https://www.getguesstimate.com/models/13082

      E-model: https://www.getguesstimate.com/models/11691


      Editorial note

      Evaluators were asked to follow the general guidelines available here. They were also provided with this document with additional resources specific to the paper, rationale for its selection, and an ‘editorial’ first pass of aspects of the paper to consider.

      Note that this evaluation was organized during the Unjournal pilot phase and was managed manually using several Google Docs. The format may differ from future evaluations that will be managed with the Kotahi Platform.

      The evaluations were conducted on the version of the article published in the International Journal of Disaster Risk Reduction; however, we have posted the evaluations on the preprint for accessibility.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      RC-2022-01661

      Response to reviewers:

      Review Commons questions and Reviewers’ comments verbatim in plain text.

      Authors’ responses in bold text. Line numbers refer to numbers in the marked-up manuscript. For in-text citations in this document, see the bibliography at the bottom of this document.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Cells within multicellular organisms are mutually dependent on each other - cells of one type or in one location provide signals that can regulate the health and differentiation of the target cells that receive those signals. Such signalling can operate bi-directionally, emphasizing the co-dependence of cells upon each other. The ovarian follicle provides an excellent model system to study intercellular signaling and its consequences, in this case between the oocyte and the somatic granulosa cells that surround it. Oocytes secrete members of the TGFbeta growth factor family that are required for normal differentiation of the granulosa cells, which in turn is necessary for normal development of the oocyte. Here the authors show that adding TGFB-type growth factors (cumulin or BMP15) to the culture medium during in vitro maturation increases the fraction of oocytes that can reach the blastocyst stage (improved developmental competence) and alters the pattern of protein landscape in both the (cumulus) granulosa cells and the oocyte. Changes in the mitochondria and parameters relevant to energy metabolism are also altered. They conclude that these changes underpin the acquisition of developmental competence by the oocytes.

      Major issues: The authors are world leaders in this field and therefore exceptionally well-qualified to carry out the proposed work. There are a number of issues, however, that limit the confidence with which conclusions may be drawn.

      First, the experimental strategy makes drawing inferences about the role of cumulin and BMP15 challenging. Maturing oocytes express GDF9 and BMP15 (the components of cumulin). Thus, the experiments are not comparing presence vs absence of cumulin and BMP15, but rather comparing oocytes and cumulus cells exposed to supra-physiological levels of these factors to controls that are exposed to physiological levels. In other words, the experimental setup detects changes that occur in response to higher than normal levels of the factors. Ideally, one would have complementary experiments where GDF9 and BMP15 were deleted from the system, to illustrate the effects of their absence. This would be a massive additional undertaking, however. Yet, without such experiments, relying on the results of the overexpression approach to understand the functions of cumulin and BMP15 at physiological levels is risky. RESPONSE #1 We appreciate these insightful perspectives. We apologise for not making it clear that the model used is not in fact an overexpression model. This is because, by removing the cumulus-oocyte complex from the follicle and studying it in vitro (oocyte IVM), secretion of these growth factors by the oocyte is notably compromised, so the controls are not exposed to normal physiological levels as suggested by the reviewer. This loss of normal secretion ex vivo is evidenced by: 1) in Mester B. et al [1], Figure 2, we showed that mouse oocytes matured in vitro (i.e. as per the current study) are essentially devoid of the mature domain BMP15 protein, and the same will therefore hold for cumulin, as cumulin contains one subunit of BMP15; and 2) mammalian cumulus-oocyte complexes explanted and cultured in vitro by IVM benefit (in terms of developmental competence) from the addition of exogenous oocyte-secreted factors such as BMP15, GDF9 and cumulin, demonstrating that they are rate-limiting under IVM conditions. We were the first to demonstrate this in 2006 [2], which has subsequently been verified in many papers, including in the current paper for cumulin. The exact extent to which the controls are deficient in BMP15 and cumulin is unclear, as there are not yet reliable mouse ELISAs for these, but the model is an add-back model rather than an overexpression model. We have now added text at lines 150-152 and in the Fig 1 legend, to make this point clearer.

      Re using complementary deletion, knock-out or antagonist-type experiments: we agree this would be ideal. However, this is likely impossible as cumulin is a non-covalent heterodimer of BMP15 and GDF9 (as first named and characterised by us: Mottershead DG et al [3]). Hence, to knock out cumulin one needs to knock out either or both of BMP15 and GDF9, making it impossible to discriminate the actions of the heterodimer from the homodimers. In support of this, reviewer #3 made exactly this point, and stated “Such functional analysis cannot be done using gene knockout mouse lines…… only functional work as the one presented in this manuscript can find the mechanisms of action of these hormones”. This issue is further complicated by the fact that BMP15 and GDF9 are thought to exist as homodimers, as well as monomers, including in equilibrium in heterodimeric form as cumulin (also noted by Reviewer #3). Furthermore, there is no cumulin-specific antagonist, e.g. a cumulin-specific neutralizing antibody. Small molecule signaling inhibitors (e.g. Smad2/3 or Smad1/5/8 antagonists) certainly block cumulin actions, but they simultaneously also block GDF9 or BMP15 actions. Collectively, these structural peculiarities of cumulin, unique within the TGFβ superfamily, make it complex to interrogate its mechanisms of action, to the extent that others have largely focused on BMP15 or GDF9 homodimer actions only, when in reality, cumulin is likely the key natural protagonist responsible for oocyte paracrine signalling. We have added a paragraph to this effect to the discussion, at lines 417-423, including acknowledging the experimental limitations of the study dictated by having to deal with a noncovalent heterodimer.

      Second, the granulosa cells and oocytes interact throughout the prolonged period of growth, and this is the time when the beneficial effects of the granulosa cells on the oocyte have been most clearly documented. Yet the experiments focus on the much shorter period of meiotic maturation. This is when oocyte-granulosa cell interaction is being down-regulated, even if not entirely disrupted. RESPONSE #2: Indeed, oocyte-granulosa interaction is absolutely essential during oocyte growth, development and meiotic maturation, for healthy oocyte function, including the orchestrated down-regulation of oocyte-granulosa interactions during the latter phase. As pioneered by John Eppig and others, including ourselves [4] (ref has 673 citations), the master conductors of this dynamic oocyte-granulosa interaction during oocyte meiotic maturation are the oocyte-secreted factors. Hence, these factors are critical at this stage, and we maintain that this is a very important phase of oocyte development to study.

      Third, the data reported illustrate associations or correlations, but no experiments test the function of the changes in the proteome or of the changes in the morphology of the mitochondria or ER. Which if any of these is linked to the improved development of the oocytes after fertilization is unknown. Moreover, no experiments address how the growth factors cause the observed changes, which occur over a period of a few hours. RESPONSE #3 This is true. The study is already very large and has many functional experiments (e.g. oocyte respiration, oocyte MS, etc.) that follow up the findings from the proteomic analysis. Hence, the study has taken a global cellular metabolism approach, e.g. we show that cumulin downregulates oxidative phosphorylation globally, c.f. pathways within OXPHOS. We found an abundance of individual proteins altered in this period (see figure 4) and to follow up on the actions and consequences of individual proteins would: 1) at best show small incremental effects, as metabolism of such a cellular syncytium is vastly complex and inter-connected, 2) further increase the size of what is already a large study, and 3) detract from the more important holistic effects on cumulus-oocyte complex metabolism, which must act as a whole, interacting entity to support the complexities of early life post-fertilization.

      Taken together, these issues unfortunately limit the potential impact of the work. But the amount of work required to address them would be substantial and not really feasible for this manuscript. The best route may be to present the work as an overexpression study that has identified associations, with a discussion that acknowledges the limitations of this approach. RESPONSE #4 This is not an over-expression study – see RESPONSE #1 above. We have added text in the discussion at lines 417-423 that acknowledges the limitations imposed on the study by the impossibility of conducting a killer knockout experiment of cumulin.

      Minor issues: The text of the manuscript should be revised in a number of places. 32: We characterized the molecular mechanisms by which two model OSFs, cumulin and BMP15, regulate oocyte maturation and cumulus-oocyte cooperativity. --Mechanistic studies were not performed. RESPONSE #5 The scope of this work was to: (a) identify global changes to protein expression, and (b) use this data to implement follow-up experiments on some of the lead indicators, such as metabolism (respiration, small molecule metabolic markers) and cellular morphology. This work provides the groundwork, insight and rationale for future additional studies of specific mechanisms of COC interactions. As discussed at RESPONSE #1, these studies are as close as anyone can probably get currently to mechanistic studies of a NOVEL noncovalent heterodimer, when the noncovalent homodimers are also in play, as also noted by reviewer #3 who specifically references mechanisms: “…… only functional work as the one presented in this manuscript can find the mechanisms of action of these hormones”.

      In some instances, in the interests of brevity, we made remarks based on our data, but without specifying details in the text. To redress this, we have now added specific details which illustrate and justify our statements based on the data collected (see RESPONSES #6, #7, #9 below). For greater clarity, we have also restructured our supplementary data set to cover the analysis progression from full raw proteomic data to differentially expressed proteins, to use of differentially expressed proteins in network analysis. The supplementary data set now includes the full proteomics lists for both cells and treatments (Supplementary Tables S1, S2, S3, S4), protein sequences confidently identified by both proteomic software platforms (Supplementary Tables S5, S6), differentially expressed proteomics lists for both cells and treatments (Supplementary Tables S7, S8, S9, S10), and the differentially expressed protein list used for the network analysis (supplementary Table S11). The Table S11 lists are intended to let readers perform their own analyses, if they so wish, since they can simply copy and paste the list into the on-line STRING platform. Finally, the reanalysed network analysis outputs, based on the differentially expressed proteins shown in supplementary Table S11, are shown in supplementary Tables S12 and S13.

      40: Collectively, these data demonstrate that OSFs remodel cumulus cell metabolism during oocyte maturation in preparation for ensuing fertilization and embryonic development. --No mechanistic studies demonstrate this. RESPONSE #6 There is no mention of mechanism in this sentence at line 40 and we have provided exhaustive evidence that cumulus cell metabolism is remodelled as stated (Figures 4B and 4C). For example, of the 59 upregulated proteins in the cumulus cells of cumulin treated COC (Figure 4C and supplementary Table S11), 38 (i.e. 64%) are involved in primary metabolic processes (supplementary Table S12), including amino acid metabolism (GOT2, SHMT1, CTH, MAT2B), lipid and steroid metabolism (CERS5, DHCR7, HSD17b4), aldehydes metabolism (RDH11), nucleotides biosynthesis (RRM1, GMPR2), glycans biosynthesis and protein glycosylation (UGDH, GFPT2, GALNT2), and the respiratory chain (mt-ND1). The cellular macromolecule metabolic process is also a significantly enriched network, involving 26 out of the 59 upregulated proteins (i.e. 44%, Figure 4C and supplementary Table S11) and includes processes such as protein complex assembly (TM9sF4, DHX30, AP2M1), RNA metabolism and mRNA processing (DDX17, DDX5, DDX39b, PRPF19, PRPF6, HNRNPF, CPSF6). To help clarify the specificity of our findings, we have added this text to the revised manuscript (lines 465-474).

      46: Oocyte-secreted factors downregulate protein catabolic processes, and upregulate DNA binding, translation, and ribosome assembly in oocytes. --No direct evidence is provided. RESPONSE #7 The proteomic data provides direct evidence that these processes are involved. Sentence modified at lines 47-48 to be more specific re processes. Additional text has been included (revised manuscript lines 434-443) to provide specific details of the differentially expressed proteins involved in each of these processes.

      48: Oocyte-secreted factors alter mitochondrial number... --Need to establish that the MitoTracker is a suitable tool to measure the number of mitochondria. RESPONSE #8 We recognise that total mitochondrial uptake of the MitoTracker Orange dye could reflect mitochondrial function (polarity) and/or mitochondrial number, given the manufacturer’s (Thermo Fisher) statement that “MitoTracker™ Orange CMTMRos is an orange-fluorescent dye that stains mitochondria in live cells and its accumulation is dependent upon membrane potential”, as we specified in several places in the original manuscript (Lines 354-355, 366-367 and 235 of the marked up manuscript version). However, we agree that in several places in the manuscript we also indicated that MitoTracker was being used as a measure of mitochondrial number. To avoid this ambiguity, we have made some clarifications in the text (revised manuscript lines 235, 351-352, 377, 481-482, and in Figure 5B legend). Given the extensive and diverse metabolic changes indicated by the proteomic data, our aim was to explore the potential role of mitochondria in response to cumulin and BMP15 treatment of COCs, which we did by use of EM morphology studies (figure 5A), mitochondrial respiration (figures 6B and 6C), quantification of energy metabolites, such as ATP, NAD and related compounds, by mass spectrometry (figure 6D), metabolites identified in multispectral unmixing studies (figure 7) and mitochondrial function using MitoTracker (figure 5B). Collectively this data suggested a modest downturn of energy metabolism, particularly in cumulin treated COCs. This downturn did not cause a change in net energy charge in COCs (figure 6D) despite a reduction in redox ratio in both cells (figure 7A and 7B) and respiration in COCs (Figure 6B and 6C), and could reflect adaptive changes in response to cumulin and BMP15, reflecting metabolic plasticity/Warburg effect, as explained in the discussion (revised manuscript lines 453-551).

      79: ...for maintaining genomic stability and integrity of the oocyte... 83: ...minimizing secondary production of potentially DNA damaging free radicals. --Please provide supporting references from the literature. RESPONSE #9 References have been added (lines 82 and 85 of the revised manuscript)

      373: This study provides a detailed exploration of the mechanisms by which oocyte-secreted factors... --No mechanistic studies were performed. RESPONSE #10 We respectfully disagree. One of many mechanisms we have studied here is OXPHOS. We have shown this is how OSFs change metabolism – that is a mechanism. As discussed at RESPONSE #1, these studies are as close as anyone can probably get currently to mechanistic studies of a noncovalent heterodimer, when the noncovalent homodimers are also in play, as also noted by reviewer #3 who specifically references mechanisms: “…… only functional work as the one presented in this manuscript can find the mechanisms of action of these hormones”. Please also refer to the comments in RESPONSE #5.

      383: Collectively, these data demonstrate that oocyte paracrine signaling remodels COC metabolism in preparation for ensuing fertilization and embryonic development. --Studies do not show that the differences observed between control and treatment groups are related to fertilizability or embryonic development. RESPONSE #11 The data in Fig 2C, 2D show exactly that; that the difference between control and treatment (cumulin) is an increase in embryonic development. It does not show fertilizability, so we removed that at lines 41 and 415.

      396: suggesting that cumulin affects meiosis in the oocyte and may increase meiotic fidelity... --This statement is highly speculative. RESPONSE #12 We accept this critique - reference to meiosis and meiotic fidelity removed, line 435 (revised manuscript).

      409: ...lacks the machinery for amino acid uptake... --Is the oocyte unable to take up any amino acids or only certain amino acids? RESPONSE #13 Thank you for noting this as this sentence is too absolute. Oocytes have a very poor capacity to take up most or even all AAs, which are instead supplied to the oocyte via cumulus cells. Sentence modified at lines 455-456 to be less absolute.

      In general, the manuscript is written clearly. However, in several places, technical terms or jargon will make tough going for readers who are not already familiar with the techniques being used. These should be explained using language that will be understood by journal readers who are unfamiliar with the details of the techniques. Examples include:

      51: define metabolic workload using scientific terms.

      RESPONSE #14 “metabolic workload” rephrased to “metabolic processes”. Lines 52-53.

      67: metabolically 'inept' requires more precision. RESPONSE #15 “metabolically inept” rephrased to “metabolically dependent on surrounding granulosa cells” [5]. Line 69.

      262: explain 'multispectral analysis' RESPONSE #16 A citation has been added, which explains the technique (ref [6] at the end of this response letter, which is the same paper as citation [34] in the revised manuscript; lines 111 and 217; revised manuscript). A detailed explanation of this technique has also been added in the supplementary information, under the section “Multispectral microscopy”.

      268: how is 'limited' overlap defined. RESPONSE #17 The phrase “distinct profiles, with limited overlap between…” has been rephrased to “distinct profiles, between…” (line 279 of the revised manuscript), as the main point is that the patterns/profiles across treatments are different, and we did not quantify the extent of overlap.

      318: define higher workload RESPONSE #18 The phrase “…implying a higher workload for both organelles” has been replaced with a more specific explanation: “We suggest that such changes in morphology may be related to the remarkable increase in a diversity of metabolic processes which we observed (Figure 4C and supplementary Table S12), since ER morphology and architecture is known to be highly dynamic in response to environmental and developmental factors which affect cells” [7] (Lines 342-345).

      324: provide documentation or citations to support the assertion that the intensity of MitoTracker staining is an accurate proxy for the number of mitochondria.

      RESPONSE #19 Please refer to the explanation under RESPONSE #8.

      358: Multispectral discrimination modelling utilised cellular image features from the autofluorescent profiles of oocytes and cumulus cells. --Please clarify this methodology and provide support for its utility.

      RESPONSE #20 The supplementary information section (Multispectral microscopy, lines 239-258) has been expanded and clarifications provided as to the wavelengths of the channels, the features used and the unsupervised nature of algorithms.

      360: intersection of union of 5-22%

      RESPONSE #21 This is a measure of the extent of overlap of the data distributions for each class (treatment), i.e. of how different they are. The ellipse (Fig 3D) represents one standard deviation around the central mean value for that data set. The overlap of two ellipses is quantified by their intersection over union (IoU) value, which is the area of the intersection of the two ellipses divided by the area of their union (the combined shape, with the overlapping region counted once). IoU values range from 0% for fully separated ellipses to 100% for fully overlapping ellipses. Hence, a 5% IoU represents a low level of overlap of data distribution between treatments. Brief explanatory text has now been added at lines 387-388.
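      As an illustration of the IoU measure described in this response, the sketch below approximates the IoU of two ellipses by counting grid points. It is not the authors' analysis code; the ellipse parameterisation (centre, semi-axes, rotation angle), the grid resolution and the example ellipses are all assumptions made for the example.

```python
import math

def inside(px, py, ellipse):
    """True if point (px, py) lies within the ellipse (cx, cy, a, b, theta)."""
    cx, cy, a, b, theta = ellipse
    dx, dy = px - cx, py - cy
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Rotate the point into the ellipse's own axis frame.
    u = dx * cos_t + dy * sin_t
    v = -dx * sin_t + dy * cos_t
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0

def ellipse_iou(e1, e2, steps=300):
    """Approximate intersection over union by sampling grid-cell centres."""
    # Bounding box guaranteed to contain both ellipses, whatever their rotation.
    r1, r2 = max(e1[2], e1[3]), max(e2[2], e2[3])
    x_min = min(e1[0] - r1, e2[0] - r2)
    x_max = max(e1[0] + r1, e2[0] + r2)
    y_min = min(e1[1] - r1, e2[1] - r2)
    y_max = max(e1[1] + r1, e2[1] + r2)
    step_x = (x_max - x_min) / steps
    step_y = (y_max - y_min) / steps
    inter = union = 0
    for i in range(steps):
        x = x_min + (i + 0.5) * step_x
        for j in range(steps):
            y = y_min + (j + 0.5) * step_y
            in1 = inside(x, y, e1)
            in2 = inside(x, y, e2)
            inter += in1 and in2   # point lies in both ellipses
            union += in1 or in2    # point lies in at least one ellipse
    return inter / union if union else 0.0

# Hypothetical one-standard-deviation ellipses for two treatment groups.
control = (0.0, 0.0, 2.0, 1.0, 0.0)
treated = (3.5, 0.5, 2.0, 1.0, math.pi / 6)
print(f"IoU = {100 * ellipse_iou(control, treated):.1f}%")
```

      Grid counting is used here because it handles rotated ellipses without any geometry library; a finer grid gives a closer approximation at the cost of run time.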

      Comments on Figures. Fig. 3A, B. The total number of proteins and the number of differentially expressed proteins among the treatment groups don't match between A and B. For example, A (Mascot-Sheffield) indicates that 17 proteins were differentially expressed between untreated and cumulin-treated oocytes. B shows (138 + 74) expressed in the untreated but not cumulin-treated and (156 + 87) expressed in the cumulin-treated but not untreated. Please account for this difference. RESPONSE #22 The panels in Fig 3A and Fig 3B each contain different representations of the information contained within the proteomics dataset, and explain different aspects of the data. The Venn diagram panels in Figure 3B display the level of overlap of specific proteins identified in each cell, treatment and software subgroup. The degree of overlap in each cluster is high (i.e., 76 – 78% for Mascot/scaffold and 95 - 97 % for PD2.4) as would be expected within the same cell type and analysis approach, where the main variable is cell treatment. We agree that the total numbers in the Venn diagrams did not exactly match the total numbers in Figure 3A, which likely resulted from using slightly different parameters during data processing. We have now used exactly the same data set in panels A and B (the full PD2.4 and Mascot/scaffold datasets are shown in the supplementary proteomics summary Excel spreadsheet), so that total numbers are now identical, which will hopefully avoid any confusion in comparing across panels. However, the main conclusion to be drawn from Fig 3B remains unchanged, in that it shows that by far the majority of identified proteins overlap between treatments (control, BMP, cumulin), regardless of cell type or data analysis approach. It should also be noted that Figure 3B has no information about protein fold change/differential expression, and only represents numbers of proteins confidently identified, and the level of overlap of identified proteins between treatments. Only panel 3A shows differential protein expression relative to the respective control groups.

      Fig. 3D. What do the circles represent and how were their parameters (size, position) established? RESPONSE #23 The separation of data distributions for each class is shown by an ellipse for each cluster, which encompasses one standard deviation around the central mean values. This text has now been added to the Fig 3 legend.

      Reviewer #1 (Significance (Required)): These studies identify changes in cumulus cells and oocytes that occur in response to addition of cumulin or BMP15 to the culture medium during in vitro maturation. While the data are new, the significance of the advance is limited by (i) the fact that the control group were exposed to physiological levels of GDF9 and BMP15, so this is essentially an over-expression study and (ii) no mechanistic studies experimentally tested how the observed changes (eg, in quantity of a specific protein) affect the developmental potential of the oocytes or cumulus cells. RESPONSE #24 We thank the reviewer for their perspectives; however, we respectfully disagree on both counts. We have rebutted these 2 concerns: point (i) at RESPONSE #1, and point (ii) at RESPONSE #5 above.

      Reviewer expertise: growth and meiotic maturation of the mammalian oocyte

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: The report by Richani et al. presents research carried out in mice, in which they treated cumulus-oocyte complexes with either BMP15 or cumulin. Upon treatment they evaluated a series of biologically relevant parameters in oocytes and cumulus cells. Their findings indicate that the treatment with these molecules alters the molecular composition of oocytes and cumulus cells (proteome and metabolome), mitochondrial morphology in cumulus cells and overall oxygen consumption in COCs.

      Major comments: - Are the key conclusions convincing? - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? * part of the discussion related to metabolic pathways being up regulated due to the treatments needs to be revised. For instance, it is hard for me to grasp how a pathway with 2 proteins achieved FDR significance below 0.01, as I see in figure 4c

      RESPONSE #25 Network enrichment was performed using the open access software STRING (https://string-db.org/ [8]), and we have now provided additional information on how we utilised STRING in the supplementary information section, under “Gene Ontology Network Enrichment Analysis” (lines 176-217). STRING utilises information available in the Gene Ontology (GO) database (http://geneontology.org/docs/ontology-documentation/) to determine: (a) how many of the differentially expressed proteins identified in the proteomics experimental data fall into specific networks, (b) how much enrichment this represents relative to a random network of the same size, and (c) whether the enrichment is statistically significant based on the FDR statistic. The size of each GO network within the background set (whole genome or other) will therefore be a major determinant of whether the number of proteins identified in the proteomics experiment represents significant enrichment of a particular network. A few proteins identified within a small background network will represent greater enrichment (and lower FDR score) than the same number of proteins in a much larger network. In fact the “count in network” is often approximately the inverse of the enrichment strength (see supplementary Table S12, within the supplementary dataset Excel spreadsheet). Note that only significantly differentially expressed proteins were used for the network analysis presented in this paper, so even in the case where just 2 proteins are significantly enriched in a network (e.g., “Farnesyl diphosphate metabolic process” identified in the GO biological process section of BMP15 treated cumulus cells) they represent two upregulated proteins in a small network, so the functional/biological significance of this is likely quite high.
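      The point that two proteins in a small network can be more strongly enriched than two proteins in a large one can be illustrated numerically. The sketch below is not STRING's code: it assumes a hypergeometric over-representation test and takes enrichment strength as log10(observed/expected), and the background size, list size and network sizes are hypothetical numbers chosen only for the example. It also reports raw p-values, without the multiple-testing correction that an FDR involves.

```python
from math import comb, log10

def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance of at least k network members in a list of n proteins
    drawn from a background of N proteins, K of which belong to the network."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total

def enrichment(k, N, K, n):
    """Return (strength, p_value) for k observed hits in one network."""
    expected = n * K / N          # hits expected by chance alone
    strength = log10(k / expected)
    return strength, hypergeom_sf(k, N, K, n)

# Hypothetical numbers: 20,000 background proteins, 60 upregulated proteins.
N, n = 20000, 60
small = enrichment(k=2, N=N, K=10, n=n)     # 2 hits in a 10-protein network
large = enrichment(k=2, N=N, K=2000, n=n)   # 2 hits in a 2,000-protein network
print(f"small network: strength={small[0]:.2f}, p={small[1]:.2e}")
print(f"large network: strength={large[0]:.2f}, p={large[1]:.2e}")
```

      With the same two hits, the small network gives a positive strength and a small p-value, while the large network gives a negative strength because more than two hits would be expected by chance.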

      In revision of the manuscript we noticed that we had likely originally used the full lists of differentially expressed proteins for network analysis, rather than separating up and downregulated proteins as intended. Furthermore an updated version of STRING is now available, with improvements in the method of correction for multiple testing within the FDR output (STRING version 11.5, current since August 12, 2021). We have therefore revised the STRING network analyses, and have provided a list of the STRING input proteins (supplementary Table S11), STRINGv11.5 gene ontology (GO) functional enrichments for up and downregulated proteins in BMP and cumulin treated cumulus cells and oocytes respectively (supplementary Tables S12 and S13), and replaced the very large Figure 4C and D heatmaps (submitted version) with a summary (new Figure 4C; revised version). The updated heat maps can still be viewed in supplementary Tables S12 and S13 (the heatmaps now being the updated ones, deriving from our review response).

      * In the discussion the authors use the term "oocyte secreted factors" a lot (one example lines 490, 515, 516, 517), but they should specify BMP15 and cumulin, because these were their treatments. *Including in the title, you did not evaluate all oocyte paracrine factors, just BMP15 and cumulin RESPONSE #26 “Oocyte secreted factors (OSFs)” replaced with BMP15 and cumulin throughout the manuscript where we refer specifically to our treatments, results or discussion of results, except where we refer to “these OSFs” (eg line 34), and not where we refer to the principle of OSF signalling more generically. Regarding the latter, we wish to retain the title as is, because it refers to the principle more generally and BMP15 and cumulin are prototypical oocyte secreted factors.

      - Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. NA

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. NA

      • Are the data and the methods presented in such a way that they can be reproduced? *no, in some instances, the methods are not described, see my comment below about enrichment analysis. RESPONSE #27 Addressed next below

      • Are the experiments adequately replicated and statistical analysis adequate? *I was not able to access enrichment analysis.

      RESPONSE #28 The method of Network Enrichment is now described in more detail in the supplementary methods section. See previous explanation under RESPONSE #25 above.

      *lines 241-242: "MitoTracker staining and data from metabolite analysis by mass spectrometry were analysed by one-way ANOVA with Tukey's (parametric data) or Kruskal-Wallis (non- parametric data) post-hoc tests. " Specify which test was used for which data RESPONSE #29 Post-hoc test for MitoTracker data was Tukey’s, as already stated in Figure 5 legend. Post-hoc test for metabolite analyses was Kruskal-Wallis – text now added to Figure 6 legend.

      Minor comments: - Specific experimental issues that are easily addressable. NA

      • Are prior studies referenced appropriately? Yes

      • Are the text and figures clear and accurate? *lines 178-180: "expressed proteins list was further analyzed using STRING software to explore clustering and enrichment of specific molecular functions, and biological pathways. Detailed methodology and rationale for this approach is provided in the supplementary methods." I did not read text in the supplementary materials indicating how enrichment analysis was carried out.

      RESPONSE #30 Our apologies for this oversight. We have now provided additional information on how we utilised STRING in the supplementary information, in a new section titled “Gene Ontology Network Enrichment Analysis” (lines 176-217).

      * What was the concentration of treatment for the samples used for proteome and mascot/scaffold experiments?

      RESPONSE #31 The two bioinformatic analyses were conducted on common biological samples, so naturally the treatment concentrations were also the same. Text modified at line 175 to make this clearer.

      * lanes 263 and 264: "Cell types and treatment conditions can be clearly distinguished based on these orthogonal global approaches." I did not see what is the basis for this statement

      RESPONSE #32 The sentences immediately following this (i.e. lines 274-281) elaborated the basis for this statement, particularly where it is explicitly stated “Proteomic heat maps (Fig. 3C) and multispectral analysis plots (Fig. 3D) both show distinct profiles, between controls, BMP15 and cumulin treated COCs, in both cell types.”, at lines 277-281.

      The data for the two global approaches are shown in Figure 3C (heat maps generated by PD2.4 comparing differences in protein abundance across treatments, shown separately for cumulus cells and oocytes), and Figure 3D (linear discriminant analysis comparing differences in multispectral imaging data across treatments, shown separately for cumulus cells and oocytes). Both of these global analyses show clear differences in distribution pattern between controls (untreated) and treated samples (BMP15 and cumulin), in both oocytes and cumulus cells. The approaches are (a) global, since each relates to analysis of the complete cell extracts (as opposed to targeting a specific component/analyte), and (b) orthogonal because different and unrelated measurement techniques are used i.e., proteomics (mass spectrometry) and multispectral imaging (spectroscopy).

      *I did not understand the discrepancy between the numbers observed in Figure 3A and Figure 3B.

      RESPONSE #33 Refer to RESPONSE #22 above. We have checked the data, and revised the Venn diagrams (Figure 3B) with data analysed using identical parameters, for both Figures 3A and 3B, to avoid confusion over protein numbers. We also noticed and corrected a discrepancy with regard to the number of differentially expressed oocyte proteins under the merged data column of Figure 3A.

      * I could not make sense of the shades of green or red that were used in 4C and 4D. Is the reader only supposed to make those comparisons within column?

      RESPONSE #34 Note: Figures 4C and 4D are now Supplementary Tables S12 and S13. The red shades represent network enrichment analysis of upregulated proteins, while the green shades represent network enrichment analysis of downregulated proteins. The colour gradients in each case follow the numerical values for “count in network”, enrichment strength, and lower FDR, with greater colour intensity for higher numbers (and lower FDR). However, we agree that the original four panels (A, B, C and D) comprising Figure 4 made for a very large and potentially overwhelming figure. To simplify the data presentation we have reprocessed the data in STRING (see details under RESPONSE #25 above) and have moved the now considerably shorter network lists (originally displayed as Figures 4C and 4D) to supplementary Tables S12 and S13, and the new Figure 4C provides a network enrichment summary instead. This is likely easier to comprehend, with the marked contrast in networks identified between oocytes and cumulus cells easier to see. The numbers of up and downregulated proteins on which the network analysis is based are also shown in Figure 4C, while the specific proteins used and networks identified are shown in supplementary Tables S11, S12 and S13 (original colour coding retained, and also explained within each table).

      - Do you have suggestions that would help the authors improve the presentation of their data and conclusions?
*Figure 4 is really hard to process. At least in my pdf it spanned 4 pages.

      RESPONSE #35 Indeed Figure 4 was large and has now been shortened. We made considerable effort to present in Fig 4 the vast amount of proteomic data in a summarized, hopefully comprehensible fashion. We have now moved Figs 4C and 4D to the supplementary information, and replaced them with the simplified new Fig 4C (tabular format). Please also see comments under RESPONSES #25 and #34 re this.

      * I did not understand why put networks that are not significant as up-regulated or down-regulated. Besides, as mentioned above, I do not know how significance was assessed.

      RESPONSE #36 Network analysis was performed using only those proteins which were significantly differentially expressed and had a consistent direction of fold change in both mascot/scaffold spectral counting and PD2.4 peak intensity proteomics quantitative approaches. Proteins with no significant expression change (i.e., the majority of proteins, which represented proteins with

      Reviewer #2 (Significance (Required)):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. - Place the work in the context of the existing literature (provide references, where appropriate).

      * This paper is significant because it provided a variety of measurements following the treatment of cumulus cells with BMP15 and cumulin. The authors show that these two oocyte factors can impact the molecular structure, physiology and structure of organelles in cumulus cells. The work is well contextualized with the current literature.

      RESPONSE #37 We thank the reviewer for these positive remarks.

      • State what audience might be interested in and influenced by the reported findings. *Researchers in the field of developmental biology would be most interested in this report.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. * I do not have expertise in hyperspectral analysis. I have been working with cumulus-oocyte complexes for over a decade, mixing technologies in cell biology, microscopy, high-throughput genome, and proteome analysis. We do all our bioinformatics work in-house.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The work is interesting. Cumulin is a heterodimer hormone formed of GDF9 and BMP15. It is the main oocyte secreted factor. Being a heterodimer, gene knockout provides very little information about its mechanism of action. The team has a unique form of cumulin that is stable. This is why I think this work is important. However, I found two technical issues: one regarding mitochondrial count using MitoTracker and the other about comparing gene lists between the two cell types when protein input submitted to mass spectrometry differs between the two cell types. It is expected to find more with more input material. The text would need to be adjusted accordingly. Also, there are a lot of free statements and a lack of precision that is annoying. In my opinion, there are many overstatements that are not supported by the data because the work was not designed to test what is stated. The Discussion is very circular as the same statements come back on the next pages.

      RESPONSE #38 See specific responses below.

      Detailed review:

      The manuscript entitled "Oocyte and cumulus cell cooperativity and metabolic plasticity under the direction of oocyte paracrine factors" reports an in depth analysis of the exposure of cumulus oocyte complexes to either BMP15 or cumulin, the GDF9-BMP15 heterodimer. Impact assessment was done by determining developmental competence of the exposed oocytes, comparative profiling of the proteomes and spectral emissions as well as testing a potential impact at the ultrastructure level by electron microscopy imagery. Mitochondrial respiration as well as abundance of related metabolites was contrasted between the two treatments.

      Overall, the work is interesting. It is very difficult to study hormonal heterodimers because they originate from two different genes and they can naturally be found in a monomeric as well as a dimeric state. Such functional analysis cannot be done using gene knockout mouse lines. Genetic disruption provided the background that GDF9 and BMP15 are key oocyte secreted factors however only functional work as the one presented in this manuscript can find the mechanisms of action of these hormones. RESPONSE #39 We thank the reviewer for these positive comments, especially in relation to the difficulty of getting to the mechanism of actions of a non-covalent heterodimer, and hence the importance of functional experiments in providing mechanistic insights.

      Comments:

      I really appreciated the reference to auto-symbiosis. We often see the reference to a cellular syncytium but this one is interesting. RESPONSE #40 Thank you.

      Although I appreciated the work, two important technical issues (between cell types comparisons and mitochondrial count) have been raised and there is a bit of unnecessary overselling throughout the manuscript. Sticking to the results would keep the value of the work high and wouldn't give that impression of overstatement. RESPONSE #41 Technical issues – see responses below, as well as responses to other reviewers. We have provided additional methodological information for greater clarity, and added specific observations from our data, to support all statements, to avoid the impression of unsubstantiated overstatements.

      Technical issues:

      While the gene/protein enrichment analysis can be influenced by the input material submitted to mass spectrometry, the gene network analysis is influenced by the number of gene/proteins available for the enrichment analysis. It is thus difficult to compare both cell types. RESPONSE #42 We agree that shorter protein lists might be expected to result in fewer networks. However, it is interesting to consider the possible reasons for the shorter list:

      (1) In our case the amount of protein extracted from oocytes (2-3 µg) was much less than from cumulus cells (15-17 µg) as explained in the “Mass Spectrometry for proteomic analysis” section in the Supplementary Information. This is because COCs have many more cumulus cells than oocytes by number as well as total mass. Consequently it was possible to load a larger µg amount of total peptides from cumulus cells onto the nanoLCMSMS system, but it should be noted that on-column loading is not only determined by the total amount of material injected, but also by the limits in capacity of the C18 peptide capture cartridge upstream from the column (which is 1-1.5 µg, estimated from the binding capacity of C18 with a bed volume of 0.35 µL, since the trap cartridges have dimensions of 300 µm ID and 5 mm length; http://tools.thermofisher.com/content/sfs/manuals/Man-M5001-LC-Nano-Capillary-Micro-Columns-ManM5001-EN.pdf and https://www.optimizetech.com/opti-pak-trap-columns/ ). Consequently, the different initial loading of oocyte vs cumulus cell proteins/peptides is likely to have made little if any contribution to proteome coverage, since 2-17 µg all exceed the trap cartridge binding capacity, and consequently 1-1.5 µg was captured and transferred to the nano-column, while the excess was transferred to waste. Based on the capacity limits of the capture cartridge, there were likely enough peptides/proteins in both oocyte and cumulus cell extracts to reach the saturation point, and therefore much more consistent on-column loadings than the initial µg loadings would imply. We have added some additional information re this to the method section (see the section “Mass Spectrometry for proteomic analysis” in the Supplementary Information).

      (2) The expressed proteomes of different cell types may be expected to differ not only in specific proteins expressed but also in the number of different proteins. In a recent study by Marei et al [9], equal amounts of total protein (9.5 µg) from bovine oocytes and matching cumulus cells were prepared for their proteomics comparisons, and interestingly these authors also report about half as many proteins identified in oocytes as compared with cumulus cells, despite equal amounts of total protein used; “A total of 1703 and 1185 proteins were identified in CCs and oocytes, respectively, 679 of which were common.” Furthermore, a transcriptomic study of bovine oocytes and cumulus cells by Moorey et al [10], showed 69 and 128 differentially expressed genes (DEGs) in oocytes and cumulus cells respectively (comparing small vs large cells in each case), pointing to about double the differential gene expression in cumulus cells than oocytes, again implying a larger cumulus cell vs oocyte transcriptome. Our data support these observations, which collectively suggest a real difference in proteome size between oocytes and cumulus cells. If the difference in proteome size is real, then differences in network enrichment are also likely to have biological relevance, despite differences in size of the differentially expressed proteins lists.

      (3) Even if initial protein loading was a contributing factor to the size of the oocyte vs cumulus cell proteomes, it is of note that we observed approximately 2 fold fewer total proteins identified in oocytes as in cumulus cells (Figure 3A, 3B and new Figure 4C), yet the difference between number of identified networks is multiple-fold (a cumulative total of 2 networks identified in BMP15 and cumulin treated oocytes vs 143 networks identified in BMP15 and cumulin treated cumulus cells – see new Figure 4C). Furthermore, there does not seem to be a strictly linear relationship between the number of proteins submitted for network analysis and the numbers of networks identified. For example, 34 upregulated proteins in cumulin treated oocytes identified a single enriched network, while a similar number of 38 upregulated proteins in BMP15 treated cumulus cells, identified a total of 42 networks (new Figure 4C), and similarly cumulin treated cumulus cells had 59 upregulated proteins and 58 downregulated proteins, which resulted in 57 and 23 enriched networks respectively.
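The loading argument in point (1) can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the trap bed is a plain cylinder and treats the cartridge as a hard cap (everything above the binding capacity goes to waste), which is a simplification of real trapping behaviour. The function names are hypothetical; the dimensions, the 1.5 µg upper capacity estimate and the injected amounts are the values quoted above.

```python
import math


def bed_volume_ul(inner_diameter_um: float, length_mm: float) -> float:
    """Volume of a cylindrical trap bed in microlitres (1 mm^3 = 1 uL)."""
    radius_mm = inner_diameter_um / 1000.0 / 2.0
    return math.pi * radius_mm ** 2 * length_mm


def captured_ug(injected_ug: float, capacity_ug: float) -> float:
    """Peptide mass retained by the trap under a hard-cap assumption."""
    return min(injected_ug, capacity_ug)


# Trap dimensions quoted in the response: 300 um ID, 5 mm length.
volume = bed_volume_ul(300, 5)

# Upper end of the quoted 1-1.5 ug capacity estimate.
capacity = 1.5

# Injected ranges quoted in the response (ug of total protein).
loads = {"oocyte": (2.0, 3.0), "cumulus": (15.0, 17.0)}
captured = {
    cell: [captured_ug(amount, capacity) for amount in amounts]
    for cell, amounts in loads.items()
}

print(f"bed volume: {volume:.3f} uL")
print(captured)
```

Under these assumptions the bed volume comes out at about 0.35 µL, matching the figure quoted, and both cell types saturate the trap at the same captured amount.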

      Also, when performing GO terms analysis, the level of "branching" can explain the results. In other words, GO terms are organized in a tree like structure where general elements (e.g. nucleus) are delineated in finer elements (e.g. nuclear function) leading to finer ones (e.g. DNA binding)... to finer ones (e.g. DNA repair)... etc. The number of genes/proteins available in the initial list directly dictates to which level of precision the analysis can reach. In the present work, the number of identified networks may simply reflect the number of elements available in the initial lists. With more info on the cumulus cells side, it is logical to be able to reach finer branches that contain only a few genes. I have looked in the supplemental data files but could not find more info about the background used. Was it all known proteins? Was it all identified proteins where the differentially expressed proteins are compared to the detected proteins? Using the list of detected proteins as background for the analysis could help. Proteome Discoverer generated much less differentially expressed proteins between treatments than Mascot/Scaffold (2-17 vs. 74-390). Maybe use the Mascot/Scaffold data using the same number of top genes (e.g. 87) between both cell types. Then it would be much more comparable.

      RESPONSE #43 Please also refer to the explanations under RESPONSE #34 and #42 above. We have added an additional explanation of how we performed the enrichment analysis, in the supplementary information section under the heading “Gene Ontology Network Enrichment Analysis”. In the data presented here we used the whole mouse genome as our background set. The number of total proteins identified by Mascot/Scaffold and ProteomeDiscoverer were similar, but actually considerably more differentially expressed proteins were identified using ProteomeDiscoverer (Fig 3A), as expected using peak intensity vs spectral counting [11]. The spectral counting approaches usually identify fewer differentially expressed proteins, but also with a lower quantitative false positive rate, while peak intensity approaches tend to identify more differentially expressed proteins, but with a higher quantitative false positive rate [11]. Our reasoning was therefore to combine proteins which vary in common across both platforms, to maximise the differentially expressed proteins list while simultaneously minimising the quantitative false positive rate. We thank the reviewer for the suggestion of using our full protein list as the background set. Initially we revised our network enrichment analysis (see comments under RESPONSE #25) still using the mouse whole genome, resulting in fewer overall networks, but improved contrast between oocytes and cumulus cells (see summary in new Figure 4C, and network analysis details in supplementary Tables S12 and S13). We then repeated the network analyses using our full protein list (4450 proteins identified in both oocytes and cumulus cells; see background list in Supplementary Table S11) as the background set. With this we similarly found no enriched GO networks for BMP15 and cumulin treated oocytes, and only 6 and 1 enriched networks in BMP15 and cumulin treated cumulus cells, respectively. We suggest that detecting network enrichment against a cell specific background list may not give us the same level of “contrast” as can be achieved when comparing against the whole mouse genome.
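The effect of the background set on enrichment can be illustrated with a generic over-representation test. This is a minimal sketch using a one-sided hypergeometric test, not a reproduction of STRING's own enrichment procedure. The list size (59 upregulated proteins) and the detected-protein background (4450) are taken from the response; the genome size, term sizes and overlap are hypothetical numbers chosen only to show the direction of the effect.

```python
from math import comb


def enrichment_p(overlap: int, term_size: int, list_size: int, background: int) -> float:
    """One-sided hypergeometric P(X >= overlap).

    background: number of genes/proteins in the background set
    term_size:  background members annotated to the term
    list_size:  size of the differentially expressed list
    overlap:    list members annotated to the term
    """
    upper = min(term_size, list_size)
    total = comb(background, list_size)
    hits = sum(
        comb(term_size, i) * comb(background - term_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total


# Hypothetical term: 200 annotated genes genome-wide, 150 of them detected.
# Hypothetical overlap: 6 of the 59 upregulated proteins carry the term.
p_genome = enrichment_p(overlap=6, term_size=200, list_size=59, background=22000)
p_detected = enrichment_p(overlap=6, term_size=150, list_size=59, background=4450)

print(f"whole-genome background:     p = {p_genome:.2e}")
print(f"detected-proteome background: p = {p_detected:.2e}")
```

With these illustrative inputs the same overlap is far less significant against the detected-proteome background, because the term is already over-represented among detected proteins. This is consistent with the loss of "contrast" described above, although the actual magnitude depends on the real annotation counts.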

      Line 226 and 324-328 and line 350: I have never seen the use of MitoTracker Orange to count mitochondria. According to the manufacturer: "MitoTracker Orange CMTMRos is an orange-fluorescent dye that stains mitochondria in live cells and its accumulation is dependent upon membrane potential. The dye is well-retained after aldehyde fixation." It is indicative of mitochondrial potential but it is not a method to count the number of mitochondria within a cell. I do not agree that more fluorescence means more mitochondria. RESPONSE #44 We agree and in places we used ambiguous language re mitochondrial function vs mitochondrial number. We have now clarified and corrected this - please refer to detailed comments and manuscript changes under RESPONSE #8.

      I understand that the MitoTracker data is counterintuitive to the oxygen consumption rate and stable levels of energetic metabolites. However, as the authors mention, mitochondria are known to be capable of switching from aerobic to anaerobic energy production. In some cases, heterogeneity in the mitochondrial population (such as the one in the oocyte) could mean that a mosaic respiratory potential exists where some mitochondria are more aerobic than others... To change the number of mitochondria, either fission or mitophagy must occur. Although mitochondrial DNA replication is done in approximately 2 h and fission/division can occur over 1 h, and protein ubiquitination is done over 12 h-18 h during mitophagy, TEM micrographs (figure 5) do not show elongated mitochondria in the process of division. To detect active mitophagy, protein markers and association with lysosome would be needed. A shift in mitochondrial number may not be the suitable interpretation of the data. RESPONSE #45 Please refer to comments under RESPONSE #8.

      For the spectral data analysis (Figure 3D), how did the three replicates perform? The figure does not show the replication variance relative to the treatment variance. RESPONSE #46 A version of Figure 3D but with the replicates colour-coded has been added to Supplementary Material (Supplementary Figure 2) and the manuscript text has been revised with the information that data from the three replicates are shown, added to the caption to Figure 3D.

      Wording/interpretation issues

      Lines 114-116: "This intercellular cooperativity facilitates oocyte maturation while simultaneously protecting germ-line genomic integrity, in a manner which could not be achieved by a single cell." This is an overstatement because genomic integrity was not assessed. Why consider that the nuclear function found in the proteome contrast is necessarily associated with genomic integrity? Meiosis requires in-depth chromatin handling. What evidence provided from the results is associated with cellular numbers? The presence of cumulus cells is known to support meiosis but it doesn't mean that some of the cellular processes have been imparted to the surrounding somatic cells. The work done for this manuscript does not test any of this claim. RESPONSE #47 We accept this point and agree that the claim re germ-line genomic integrity, in particular, is an overstatement. This has been removed. We maintain however that there is ample evidence in our results that there is clear inter-cellular metabolic cooperativity between oocyte and cumulus cells and that this ultimately leads to an oocyte with improved developmental competence. The sentence has been modified to reflect this, lines 117-118.

      On numerous occasions, the statements are imprecise. For example: Line 274: "More than double the number..." Since doubling a minute value does not mean the same thing as doubling a large value, values, measurements with units and ideally with SEM should be added. RESPONSE #48 Has been rephrased (see line 284 of the revised manuscript)

      Line 287: "... and almost a third more significant networks..." Please add values. RESPONSE #49 Section has been deleted (line 291-300 of the revised manuscript)

      On the same statement, since sample input material to the mass spectrometry is vastly different between cumulus cells and oocytes, is it truly comparable? Could these differences between the two cell types be associated with the amounts of proteins in the extracted samples? Typically, more variable results are obtained with the low input. It sometimes leads to apparently more difference between treatments simply because of low count numbers. On line 292, authors mentioned that protein loading was considered. How was that done? Low input cannot be compensated or normalized. The following statement on line 293 indicates that more proteins were identified in cumulus cells. This is probably due to more input material submitted to mass spectrometry. It is not necessarily a difference in protein diversity between cumulus cells and oocytes. RESPONSE #50 Please refer to detailed explanations under RESPONSES #42 and #43.

      Line 293: "... resulted in the identification of about double the number..." Please add values. RESPONSE #51 Values added at lines 305-306, and additional detail has been added to this section of the manuscript (lines 305-317 revised manuscript). Line 294: "However, there were 4-5 times as many differentially expressed proteins..." Please add values. RESPONSE #52 Values added and additional detail added to this section (new lines 309-312 of the revised manuscript).

      Line 298: "...difference was quite marked..." More factual info should be added. RESPONSE #53 Values added and additional detail has been provided (lines 314-317 of the revised manuscript), as follows: “Cumulin appeared to have a greater impact on proteomic differential expression in both cell types than BMP15 did, with 59 vs 38 and 34 vs 27 upregulated proteins in cumulin vs BMP15 treated cumulus cells and oocytes respectively, and similarly 14 vs 6 downregulated proteins in cumulin vs BMP15 treated cumulus cells and oocytes respectively (Figure 4C)”.

      Line 305: Again, the whole comparison between cell types could be argued from the standpoint of input material subjected to the analysis. Given the point is to state that cumulin has a profound impact on cumulus cells, maybe it is not necessary to compare with the oocyte data. It is logical that an oocyte secreted factor targets the neighbouring cells. The point can be made without raising the question about the potential issue of input material. RESPONSE #54 We agree with the reviewer's point that it is logical that OSFs should target cumulus cells, with lesser impact on the oocyte. Nonetheless the treatments were performed on COCs, and even though the OSFs are targeting the cumulus cells, the cumulus cell response is ultimately expected to impact oocytes. Therefore, it is relevant to look at proteomic changes to both cell types and also the related network analysis. We have however rephrased this section, to be more specific as to which data we are reporting, and have included additional citations (lines 325-334 of the revised manuscript).

      Line 317-317: "... exhibited more rounded and swollen mitochondria..." How was that determined? In the periphery of the oolemma, mitochondria aggregate in clusters which can be quite different from one another. Maybe proportions of different shapes of mitochondria could be provided if enough mitochondria are counted from the EM micrographs. RESPONSE #55 These are subjective observations of the typical morphological features seen in response to the different treatments. This is the typical application of TEM. Quantitative features of mitochondria are better assessed using confocal microscopy than TEM, which is the complementary approach we took using MitoTracker in the companion Figure 5B, the text for which immediately follows the TEM results. We altered the text at the sentence in question to note that these are subjective observations (line 340).

      Line 169: What do you mean by "The results were merged based on consistency..."? This seems to be a trivial way to analyse the data. RESPONSE #56 The majority of published papers reporting data dependent analysis (DDA) proteomics results utilise just a single quantitative method (i.e., either spectral counting or peak intensity). This certainly simplifies reporting, and avoids confronting uncomfortable discrepancies between different analytical approaches. However, we reasoned that robust expression change data would maintain consistency, despite the orthogonal quantitative methods. We consider it a notable strength of the approach used here that we have utilised a differentially expressed proteins list which includes only those proteins with consistent direction of fold-change in both the spectral counting and peak intensity workflows. Please also refer to comments under RESPONSE #43, re spectral counting vs peak intensity quantitative methods in data dependent analysis (DDA) proteomics.

      Line 170: "A further requirement was that at least one, if not both methods..." Again, when did you decide to use one method or to use both? Why not use the common ground from both methods? RESPONSE #57 Refer also to RESPONSE #43. In fact the main question being asked in many/most proteomics experiments is whether there is a real expression change between treatment groups. Therefore fold-change is the most pertinent common ground across disparate quantitative methodology, and indeed commonality of fold-change was the basis for merging the datasets. Since integrating peak areas is a very different approach to counting the number of spectra, then this difference in approach can make a big difference to the p-values, and is the reason why spectral counting is less sensitive to detect differential expression. For similar reasons the fold-change ratio may differ somewhat between these quantitative methodologies. However direction of fold-change is a minimum requirement for demonstrating consistent trends, hence we used this as the common ground for merging the datasets.
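The merging rule described in RESPONSES #56 and #57 can be sketched as a simple filter. This is an illustration under stated assumptions, not the authors' actual pipeline: the quoted requirement is truncated in the reviewer's comment, so the sketch assumes it means "significant in at least one of the two methods", and the 0.05 threshold, the data layout and the protein labels P1-P5 are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Quant(NamedTuple):
    """One protein's result from one quantitative workflow."""
    log2_fc: float   # log2 fold change, treated vs control
    p_value: float   # significance of the expression change


def merge_consistent(
    spectral: Dict[str, Quant],
    intensity: Dict[str, Quant],
    alpha: float = 0.05,  # assumed threshold, not stated in the response
) -> List[str]:
    """Keep proteins quantified by both workflows whose fold changes point
    the same way and which are significant in at least one workflow."""
    merged = []
    for protein in sorted(spectral.keys() & intensity.keys()):
        a, b = spectral[protein], intensity[protein]
        same_direction = a.log2_fc * b.log2_fc > 0
        significant = a.p_value < alpha or b.p_value < alpha
        if same_direction and significant:
            merged.append(protein)
    return merged


# Hypothetical inputs: spectral counting vs peak intensity results.
spectral = {
    "P1": Quant(0.8, 0.20),    # up, not significant here
    "P2": Quant(1.1, 0.01),    # up, significant
    "P3": Quant(-0.4, 0.30),   # down, not significant
    "P5": Quant(-1.3, 0.02),   # down, significant
}
intensity = {
    "P1": Quant(1.2, 0.003),   # up, significant -> P1 kept
    "P2": Quant(-0.9, 0.02),   # direction disagrees -> P2 dropped
    "P3": Quant(-0.5, 0.40),   # neither significant -> P3 dropped
    "P4": Quant(2.0, 0.001),   # absent from spectral counting -> dropped
    "P5": Quant(-1.0, 0.01),   # down, significant -> P5 kept
}

merged = merge_consistent(spectral, intensity)
print(merged)
```

Requiring agreement in direction rather than in magnitude or p-value reflects the point made above that the two workflows can differ in sensitivity while still reporting the same trend.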

      Line 384: Is the paracrine signaling remodeling COC metabolism or is it enhancing the rate at which it is done? I believe this switch in metabolism occurs in untreated COCs. RESPONSE #58 We see the reviewer's point in this subtle difference in wording. We agree that there is a switch in metabolism in untreated COCs during maturation; our point is that this process of changing metabolism is further remodelled by oocyte paracrine signals, to the overall betterment of the oocyte in terms of competence. We have edited this sentence to make this point clearer (lines 413-415). Our data on energy charge, respiration, energy metabolite levels (Figure 6), redox potential (Figure 7) and mitotracker intensity (Figure 5) are all presented in comparison with “untreated” cells, and our conclusion that there is remodelling of metabolism is therefore relative to “untreated” COCs.

      The Discussion is somewhat circular. Section will need to be adjusted if the Mitotracker-based mitochondrial count and between cell types gene/protein lists comparisons are removed.

      Accounts for mitochondrial counts: (lines 387-393) (lines 424-427) (line 463).

      RESPONSE #59 All references to Mitotracker solely in the context of mitochondrial counts have been altered to describe Mitotracker as an indicator of mitochondrial function/polarity and/or counts.

      Accounts for comparisons of gene lists length between cell types: (lines 389-391 and 475-477 and 496-499).

      RESPONSE #60 Please see comments under RESPONSE #53 and the new Figure 4C.

      Line 395: "... a substantial number of oocyte upregulated proteins..." Please provide number. RESPONSE #61 Additional specific proteins have been listed to support our claims of effects on specific processes (see lines 435-443 of the revised manuscript). Also see comments under RESPONSE #7.

      Line 397: The data was not designed to test the potential of cumulin to preserve meiotic fidelity. This is an overstatement since DNA binding is part of the normal course of events during meiosis. Again, cumulin could accelerate the kinetics of meiosis. RESPONSE #62 Reference to meiosis and meiotic fidelity removed, line 435.

      Line 402-405: the work was not designed to determine if cumulin would shift work allocation between the oocyte and the cumulus cells. Showing that cumulin drives meiosis is interesting by itself.

      RESPONSE #63 Not clear that any change is requested or needed. This sentence is interpreting the significance of the results, as required in a Discussion.


      Line 453-455: the link with the epigenome is an overstatement. RNA and DNA processing pathways are general cellular processes.

      RESPONSE #64 The link to the epigenome was a reference to some published work. However it was linked to observations in the current data, and additional information has now been added to the updated manuscript to explain this further, as follows (currently lines 509-516):

      "These included significantly enriched networks of RNA binding, helicase activity, ribonucleoprotein complex biogenesis, and mRNA processing (supplementary Tables S11 and S12; upregulated proteins RNF20, SHMT1, DHX30, DDX17, DDX5, PRPF19, RPS4X, NOP58, DDX39B, HNRNPF, RPS271, NOP56, PRPF6, POLR2B, CPSF6, OOEP), as well as upregulation of key epigenetic regulators (HDAC2 and UHRF1; see supplementary Table S11), histone modifying protein MTA2, and significant network enrichment of the spliceosomal complex (supplementary Table S12; proteins DDX5, PRPF19, HNRNPF, PRPF6, POLR2B), which has been linked to epigenetic regulation [12]."

      Minor details

      Line 36: I suggest being more precise about the "nuclear" function that is affected in the oocyte. Given that oocytes are transcriptionally quiescent at this stage, some might argue that it is a vague statement.

      RESPONSE #65 Information relating to specific oocyte upregulated proteins and their cellular roles has been added to the updated manuscript (currently lines 434-443).

      DNA binding and ribosomal constituents (Fig. 4A, 4C),

      In vitro should be in italic because it is Latin. RESPONSE #66 Corrected throughout

      Lines 125-126: are the batch numbers relevant to anything? RESPONSE #67 We would assume so, for the historical record. These are in-house produced proteins; cumulin is complex to produce and only a few labs worldwide have made it.

      Line 168: Mascor = Mascot RESPONSE #68 Corrected

      Line 168: a reference for the software? RESPONSE #69 URL and published references added (lines 172-175 revised manuscript)

      Line 178: need a reference for the software? RESPONSE #70 URL and published references added (line 185)

      Line 187: Need a complete source for "Procure, 812" RESPONSE #71 Added

      Line 188: Need a complete source for "Diatome" RESPONSE #72 Added

      Line 197: Need a complete source for "Cell-Tak" RESPONSE #73 Added

      Line 232: though = through RESPONSE #74 Corrected

      Line 243: define OCR RESPONSE #75 Added

      Line 268: If I am not mistaken, it is not a multispectral analysis. The multispectral values were analysed through a principal component analysis. RESPONSE #76 Data were analysed through linear discriminant analysis (LDA). This information has been added in line 278.

      Line 363: What is the "behaviour" of an oocyte and cumulus cells? RESPONSE #77 replaced with “function”

      Line 512-513: Maybe add more on the fact that most clinics use ovulated eggs and do not perform IVM. However, IVM is needed in specific contexts such as PCOS. RESPONSE #78 Edited accordingly; lines 575-577.

      Reviewer #3 (Significance (Required)):

      Cumulin is the most potent oocyte secreted factor. Its mechanism of action is still unknown.

      I have been working on the mammalian oocyte for the past 25 years.

      References

      1. Mester, B., et al., Oocyte expression, secretion and somatic cell interaction of mouse bone morphogenetic protein 15 during the peri-ovulatory period. Reprod Fertil Dev, 2015. 27(5): p. 801-11.
      2. Hussein, T.S., J.G. Thompson, and R.B. Gilchrist, Oocyte-secreted factors enhance oocyte developmental competence. Dev Biol, 2006. 296(2): p. 514-21.
      3. Mottershead, D.G., et al., Cumulin, an Oocyte-secreted Heterodimer of the Transforming Growth Factor-beta Family, Is a Potent Activator of Granulosa Cells and Improves Oocyte Quality. J Biol Chem, 2015. 290(39): p. 24007-20.
      4. Gilchrist, R.B., M. Lane, and J.G. Thompson, Oocyte-secreted factors: regulators of cumulus cell function and oocyte quality. Hum Reprod Update, 2008. 14(2): p. 159-77.
      5. Sugiura, K., F.L. Pendola, and J.J. Eppig, Oocyte control of metabolic cooperativity between oocytes and companion granulosa cells: energy metabolism. Dev Biol, 2005. 279(1): p. 20-30.
      6. Campbell, J.M., et al., Multispectral autofluorescence characteristics of reproductive aging in old and young mouse oocytes. Biogerontology, 2022. 23(2): p. 237-249.
      7. Schwarz, D.S. and M.D. Blower, The endoplasmic reticulum: structure, function and response to cellular signaling. Cell Mol Life Sci, 2016. 73(1): p. 79-94.
      8. Szklarczyk, D., et al., STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res, 2019. 47(D1): p. D607-D613.
      9. Marei, W.F.A., et al., Proteomic changes in oocytes after in vitro maturation in lipotoxic conditions are different from those in cumulus cells. Sci Rep, 2019. 9(1): p. 3673.
      10. Moorey, S.E., et al., Differential Transcript Profiles in Cumulus-Oocyte Complexes Originating from Pre-Ovulatory Follicles of Varied Physiological Maturity in Beef Cows. Genes (Basel), 2021. 12(6).
      11. Ramus, C., et al., Benchmarking quantitative label-free LC-MS data processing workflows using a complex spiked proteomic standard dataset. J Proteomics, 2016. 132: p. 51-62.
      12. Luco, R.F., et al., Epigenetics in alternative pre-mRNA splicing. Cell, 2011. 144(1): p. 16-26.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      The work is interesting. Cumulin is a heterodimer hormone formed of GDF9 and BMP15. It is the main oocyte secreted factor. Because it is a heterodimer, gene knockout provides very little information about its mechanism of action. The team has a unique form of cumulin that is stable. This is why I think this work is important. However, I found two technical issues: one regarding mitochondrial count using MitoTracker and the other about comparing gene lists between the two cell types when the protein input submitted to mass spectrometry differs between the two cell types. It is expected to find more with more input material. The text would need to be adjusted accordingly. Also, there are a lot of loose statements and a lack of precision that is annoying. In my opinion, there are many overstatements that are not supported by the data because the work was not designed to test what is stated. The Discussion is very circular, as the same statements come back on the next pages.

      Detailed review:

      The manuscript entitled "Oocyte and cumulus cell cooperativity and metabolic plasticity under the direction of oocyte paracrine factors" reports an in depth analysis of the exposure of cumulus oocyte complexes to either BMP15 or cumulin, the GDF9-BMP15 heterodimer. Impact assessment was done by determining developmental competence of the exposed oocytes, comparative profiling of the proteomes and spectral emissions as well as testing a potential impact at the ultrastructure level by electron microscopy imagery. Mitochondrial respiration as well as abundance of related metabolites was contrasted between the two treatments.

      Overall, the work is interesting. It is very difficult to study hormonal heterodimers because they originate from two different genes and they can naturally be found in a monomeric as well as a dimeric state. Such functional analysis cannot be done using gene knockout mouse lines. Genetic disruption provided the background that GDF9 and BMP15 are key oocyte secreted factors; however, only functional work such as that presented in this manuscript can find the mechanisms of action of these hormones.

      Comments:

      I really appreciated the reference to auto-symbiosis. We often see the reference to a cellular syncytium but this one is interesting.

      Although I appreciated the work, two important technical issues (between cell types comparisons and mitochondrial count) have been raised and there is a bit of unnecessary overselling throughout the manuscript. Sticking to the results would keep the value of the work high and wouldn't give that impression of overstatement.

      Technical issues:

      While the gene/protein enrichment analysis can be influenced by the input material submitted to mass spectrometry, the gene network analysis is influenced by the number of gene/proteins available for the enrichment analysis. It is thus difficult to compare both cell types.

      Also, when performing GO term analysis, the level of "branching" can explain the results. In other words, GO terms are organized in a tree-like structure where general elements (e.g. nucleus) are delineated in finer elements (e.g. nuclear function) leading to finer ones (e.g. DNA binding)... to finer ones (e.g. DNA repair)... etc. The number of genes/proteins available in the initial list directly dictates the level of precision the analysis can reach. In the present work, the number of identified networks may simply reflect the number of elements available in the initial lists. With more info on the cumulus cells side, it is logical to be able to reach finer branches that contain only a few genes. I have looked in the supplemental data files but could not find more info about the background used. Was it all known proteins? Was it all identified proteins, where the differentially expressed proteins are compared to the detected proteins? Using the list of detected proteins as background for the analysis could help. Proteome Discoverer generated far fewer differentially expressed proteins between treatments than Mascot/Scaffold (2-17 vs. 74-390). Maybe use the Mascot/Scaffold data using the same number of top genes (e.g. 87) between both cell types. Then it would be much more comparable.
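      The question about the background can be illustrated with a minimal sketch of a one-sided hypergeometric enrichment test. This is not the authors' pipeline: the function name and all counts below are hypothetical, chosen only to show how the same overlap can look strongly enriched against "all known proteins" yet unremarkable against the detected proteome.

```python
from math import comb


def hypergeom_enrichment_p(background_size, term_size, list_size, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    background_size: proteins in the chosen background
    term_size:       background proteins annotated with the GO term
    list_size:       differentially expressed proteins submitted
    overlap:         submitted proteins annotated with the GO term
    """
    upper = min(term_size, list_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 87 top proteins, 8 of which carry the term.
# Background A: all known proteins (20000, of which 200 carry the term).
p_all_known = hypergeom_enrichment_p(20000, 200, 87, 8)
# Background B: detected proteins only (2000, of which 150 carry the term).
p_detected = hypergeom_enrichment_p(2000, 150, 87, 8)

print(f"all known proteins as background: p = {p_all_known:.2e}")
print(f"detected proteins as background:  p = {p_detected:.2f}")
```

      With these assumed numbers the term is expected in fewer than one of the 87 proteins under background A but in about six or seven under background B, so the same eight hits are far less surprising once the background is restricted to what the instrument could detect.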

      Line 226 and 324-328 and line 350: I have never seen the use of MitoTracker Orange to count mitochondria. According to the manufacturer: "MitoTracker™ Orange CMTMRos is an orange-fluorescent dye that stains mitochondria in live cells and its accumulation is dependent upon membrane potential. The dye is well-retained after aldehyde fixation." It is indicative of mitochondrial potential but it is not a method to count the number of mitochondria within a cell. I do not agree that more fluorescence means more mitochondria.

      I understand that the MitoTracker data is counterintuitive to the oxygen consumption rate and stable levels of energetic metabolites. However, as the authors mention, mitochondria are known to be capable of switching from aerobic to anaerobic energy production. In some cases, heterogeneity in the mitochondrial population (such as the one in the oocyte) could mean that a mosaic respiratory potential exists where some mitochondria are more aerobic than others... To change the number of mitochondria, either fission or mitophagy must occur. Although mitochondrial DNA replication is done in approximately 2 h and fission/division can occur over 1 h, and protein ubiquitination is done over 12 h-18 h during mitophagy, TEM micrographs (figure 5) do not show elongated mitochondria in the process of division. To detect active mitophagy, protein markers and association with lysosomes would be needed. A shift in mitochondrial number may not be the suitable interpretation of the data.

      For the spectral data analysis (Figure 3D), how did the three replicates perform? The figure does not show the replication variance relative to the treatment variance.

      Wording/interpretation issues

      Lines 114-116: "This intercellular cooperativity facilitates oocyte maturation while simultaneously protecting germ-line genomic integrity, in a manner which could not be achieved by a single cell." This is an overstatement because genomic integrity was not assessed. Why consider that the nuclear function found in the proteome contrast is necessarily associated with genomic integrity? Meiosis requires in-depth chromatin handling. What evidence provided from the results is associated with cellular numbers? The presence of cumulus cells is known to support meiosis but it doesn't mean that some of the cellular processes have been imparted to the surrounding somatic cells. The work done for this manuscript does not test any of these claims.

      On numerous occasions, the statements are imprecise. For example: Line 274: "More than double the number..." Since doubling a minute value does not mean the same thing as doubling a large value, values, measurements with units and ideally with SEM should be added.

      Line 287: "... and almost a third more significant networks..." Please add values.

      On the same statement, since sample input material to the mass spectrometry is vastly different between cumulus cells and oocytes, is it truly comparable? Could these differences between the two cell types be associated with the amounts of proteins in the extracted samples? Typically, more variable results are obtained with the low input. It sometimes leads to apparently more difference between treatments simply because of low count numbers. On line 292, the authors mention that protein loading was considered. How was that done? Low input cannot be compensated or normalized. The following statement on line 293 indicates that more proteins were identified in cumulus cells. This is probably due to more input material submitted to mass spectrometry. It is not necessarily a difference in protein diversity between cumulus cells and oocytes.

      Line 293: "... resulted in the identification of about double the number..." Please add values.

      Line 294: "However, there were 4-5 times as many differentially expressed proteins..." Please add values.

      Line 298: "...difference was quite marked..." More factual info should be added.

      Line 305: Again, the whole comparison between cell types could be argued from the standpoint of input material subjected to the analysis. Given the point is to state that cumulin has a profound impact on cumulus cells, maybe it is not necessary to compare with the oocyte data. It is logical that an oocyte secreted factor targets the neighbouring cells. The point can be made without raising the question about the potential issue of input material.

      Line 317-317: "... exhibited more rounded and swollen mitochondria..." How was that determined? In the periphery of the oolemma, mitochondria aggregate in clusters which can be quite different from one another. Maybe proportions of different shapes of mitochondria could be provided if enough mitochondria are counted from the EM micrographs.

      Line 169: What do you mean by "The results were merged based on consistency..."? This seems to be a trivial way to analyse the data.

      Line 170: "A further requirement was that at least one, if not both methods..." Again, when did you decide to use one method or to use both? Why not use the common ground from both methods?

      Line 384: Is the paracrine signaling remodeling COC metabolism or is it enhancing the rate at which it is done? I believe this switch in metabolism occurs in untreated COCs.

      The Discussion is somewhat circular. Sections will need to be adjusted if the MitoTracker-based mitochondrial count and between cell types gene/protein lists comparisons are removed.

      Accounts for mitochondrial counts: (lines 387-393) (lines 424-427) (line 463).

      Accounts for comparisons of gene lists length between cell types: (lines 389-391) (lines 475-477) (lines 496-499).

      Line 395: "... a substantial number of oocyte upregulated proteins..." Please provide number.

      Line 397: The data was not designed to test the potential of cumulin to preserve meiotic fidelity. This is an overstatement since DNA binding is part of the normal course of events during meiosis. Again, cumulin could accelerate the kinetics of meiosis.

      Line 402-405: The work was not designed to determine if cumulin would shift work allocation between the oocyte and the cumulus cells. Showing that cumulin drives meiosis is interesting by itself.

      Line 453-455: the link with the epigenome is an overstatement. RNA and DNA processing pathways are general cellular processes.

      Minor details

      Line 36: I suggest being more precise on the "nuclear" function that is affected in the oocyte. Given that oocytes are transcriptionally quiescent at this stage, some might argue that it is a vague statement.

      In vitro should be in italic because it is Latin.

      Lines 125-126: are the batch numbers relevant to anything?

      Line 168: Mascor = Mascot

      Line 168: a reference for the software?

      Line 178: need a reference for the software?

      Line 187: Need a complete source for "Procure, 812"

      Line 188: Need a complete source for "Diatome"

      Line 197: Need a complete source for "Cell-Tak"

      Line 232: though = through

      Line 243: define OCR

      Line 268: If I am not mistaken, it is not a multispectral analysis. The multispectral values were analysed through a principal component analysis.

      Line 363: What is the "behaviour" of an oocyte and cumulus cells?

      Line 512-513: Maybe add more on the fact that most clinics use ovulated eggs and do not perform IVM. However, IVM is needed in specific contexts such as PCOS.

      Significance

      Cumulin is the most potent oocyte secreted factor. Its mechanism of action is still unknown.

      I have been working on the mammalian oocyte for the past 25 years.

    1. Author Response

      Reviewer #1 (Public Review):

      Reviewer 1 confirmed the view that your paper provides new insight into YTHDC1 function in regulating SC activation/proliferation but added that some of the data could be improved to fully support the conclusions. Specifically:

      The title "Nuclear m6A Reader YTHDC1 Promotes Muscle Stem Cell Activation/Proliferation by Regulating mRNA Splicing and Nuclear Export" seems a bit overstated. Their data are not sufficient to show that YTHDC1 regulates nuclear export. From Figure 6 we can see that export of some mRNAs was inhibited upon YTHDC1 loss, but intron retention also occurs on these mRNAs, for example, Dnajc14. Since intron retention could lead to mRNA nuclear retention, the mRNA export inhibition may be caused by splicing deficiency. From the data provided we cannot conclude that YTHDC1 directly affects mRNA export. I think they should not emphasize this point in the title.

      Thanks for the suggestion. It is true that in our initial submission, we had more data to support YTHDC1 regulation of mRNA splicing but not enough on nuclear export. It will take a substantial amount of time and effort to thoroughly dissect both mechanisms. Nevertheless, we argue that our data do provide evidence of YTHDC1 regulation of nuclear export. For example, in Figures 6 C, H, and M, only ~20% of the target mRNAs (such as Dnajc14) showed alteration in both splicing and export upon YTHDC1 loss, while the majority of the export targets showed no splicing deficiency. For example, Btbd7 and Tiparp in Figure 6 N showed no intron retention. In addition, we have now performed Co-IP experiments to validate the interaction between YTHDC1 and THOC7 (new result added in Figure 7L), which provides extra evidence to support YTHDC1 function in regulating mRNA nuclear export. We thus would like to keep the original title in order to reflect the multifaceted function of YTHDC1 in muscle stem cells.

      The mechanism of YTHDC1 promoting muscle stem cell activation/proliferation is not solidified. The authors could strengthen their evidence through bioinformatics analysis or give more discussion. Besides, the previous work done by Zhao and colleagues (Zhao et al., Nature 542, 475-478 (2017)) reported that another m6A reader, Ythdf2, promotes m6A-dependent maternal mRNA clearance to facilitate zebrafish maternal-to-zygotic transition. Does YTHDC1 regulate mRNA clearance during SC activation/proliferation? The authors should explore this possibility by deep-seq data analysis and give some discussion.

      Thanks for the critical comment. For the first concern, we think YTHDC1 promotes muscle stem cell activation/proliferation through its multi-level gene regulatory capabilities in both transcriptional and post-transcriptional processes and the myriad targets it regulates. In addition, with the newly added data, we believe that YTHDC1’s function is largely dependent on its synergism with hnRNPG (Figure 7 K). We have added the discussion in lines 421-427 of the revised text. For the second question, our data showed that YTHDC1 predominantly localizes in the nucleus of SCs and myoblasts (Figure 1 F&G), thus it may not have a role in regulating mRNA clearance in the cytoplasm like YTHDF2. Nevertheless, there are a few existing reports [1, 2] suggesting its possible role in mRNA degradation and stability, which may arise from its transient shuttling to the cytoplasm of cells. We have now added this point in lines 469-472 of the revised text.

      Reviewer #2 (Public Review):

      Reviewer 2 was similarly positive stating that several tour-de-force techniques were used to examine m6A and the biological consequence in satellite cells and that there was a large amount of data supporting the conclusions with only a few minor weaknesses.

      General points: The main body is lengthy, and some content can be reduced or condensed. For example, RNA-seq was used to determine gene expression in WT and cKO cells, but the purpose of this is not well justified given that YTHDC1 mainly functions to regulate splicing and nuclear export of mRNA rather than controlling their expression levels. Does the RNA-seq data suggest that YTHDC1 may also regulate gene expression independent of m6A reader function?

      Thanks for the comment. We have now revised the entire text to condense the content. Nevertheless, we must point out that the purpose of the RNA-seq is to provide extra evidence for the proliferation defect of the YTHDC1 KO cells but not to search for the underlying mechanism. We have now revised in lines 159-160 to clarify this.

      Reference:

      1. Shima, H., Matsumoto, M., Ishigami, Y., Ebina, M., Muto, A., Sato, Y., Kumagai, S., Ochiai, K., Suzuki, T. & Igarashi, K. S-Adenosylmethionine Synthesis Is Regulated by Selective N(6)-Adenosine Methylation and mRNA Degradation Involving METTL16 and YTHDC1. Cell Rep 21, 3354-3363 (2017).
      2. Zhang, Z., Wang, Q., Zhao, X., Shao, L., Liu, G., Zheng, X., Xie, L., Zhang, Y., Sun, C. & Xu, R. YTHDC1 mitigates ischemic stroke by promoting Akt phosphorylation through destabilizing PTEN mRNA. Cell Death Dis 11, 977 (2020).
      3. He, P.C. & He, C. m(6) A RNA methylation: from mechanisms to therapeutic potential. EMBO J 40, e105977 (2021).
      4. Widagdo, J., Anggono, V. & Wong, J.J. The multifaceted effects of YTHDC1-mediated nuclear m(6)A recognition. Trends Genet 38, 325-332 (2022).
      5. Sheng, Y., Wei, J., Yu, F., Xu, H., Yu, C., Wu, Q., Liu, Y., Li, L., Cui, X.L., Gu, X., Shen, B., Li, W., Huang, Y., Bhaduri-Mcintosh, S., He, C. & Qian, Z. A Critical Role of Nuclear m6A Reader YTHDC1 in Leukemogenesis by Regulating MCM Complex-Mediated DNA Replication. Blood (2021).
      6. Cheng, Y., Xie, W., Pickering, B.F., Chu, K.L., Savino, A.M., Yang, X., Luo, H., Nguyen, D.T., Mo, S., Barin, E., Velleca, A., Rohwetter, T.M., Patel, D.J., Jaffrey, S.R. & Kharas, M.G. N(6)-Methyladenosine on mRNA facilitates a phase-separated nuclear body that suppresses myeloid leukemic differentiation. Cancer Cell 39, 958-972 e958 (2021).
      7. Chen, C., Liu, W., Guo, J., Liu, Y., Liu, X., Liu, J., Dou, X., Le, R., Huang, Y., Li, C., Yang, L., Kou, X., Zhao, Y., Wu, Y., Chen, J., Wang, H., Shen, B., Gao, Y. & Gao, S. Nuclear m(6)A reader YTHDC1 regulates the scaffold function of LINE1 RNA in mouse ESCs and early embryos. Protein Cell 12, 455-474 (2021).
      8. Xiao, W., Adhikari, S., Dahal, U., Chen, Y.S., Hao, Y.J., Sun, B.F., Sun, H.Y., Li, A., Ping, X.L., Lai, W.Y., Wang, X., Ma, H.L., Huang, C.M., Yang, Y., Huang, N., Jiang, G.B., Wang, H.L., Zhou, Q., Wang, X.J., Zhao, Y.L. & Yang, Y.G. Nuclear m(6)A Reader YTHDC1 Regulates mRNA Splicing. Mol Cell 61, 507-519 (2016).
      9. Webster, M.T., Manor, U., Lippincott-Schwartz, J. & Fan, C.M. Intravital Imaging Reveals Ghost Fibers as Architectural Units Guiding Myogenic Progenitors during Regeneration. Cell Stem Cell 18, 243-252 (2016).
      10. Yankova, E., Blackaby, W., Albertella, M., Rak, J., De Braekeleer, E., Tsagkogeorga, G., Pilka, E.S., Aspris, D., Leggate, D., Hendrick, A.G., Webster, N.A., Andrews, B., Fosbeary, R., Guest, P., Irigoyen, N., Eleftheriou, M., Gozdecka, M., Dias, J.M.L., Bannister, A.J., Vick, B., Jeremias, I., Vassiliou, G.S., Rausch, O., Tzelepis, K. & Kouzarides, T. Small-molecule inhibition of METTL3 as a strategy against myeloid leukaemia. Nature 593, 597-601 (2021).
      11. Otto, A., Schmidt, C., Luke, G., Allen, S., Valasek, P., Muntoni, F., Lawrence-Watt, D. & Patel, K. Canonical Wnt signalling induces satellite-cell proliferation during adult skeletal muscle regeneration. J Cell Sci 121, 2939-2950 (2008).
      12. Liu, J., Gao, M., He, J., Wu, K., Lin, S., Jin, L., Chen, Y., Liu, H., Shi, J., Wang, X., Chang, L., Lin, Y., Zhao, Y.L., Zhang, X., Zhang, M., Luo, G.Z., Wu, G., Pei, D., Wang, J., Bao, X. & Chen, J. The RNA m(6)A reader YTHDC1 silences retrotransposons and guards ES cell identity. Nature 591, 322-326 (2021).
      13. Xu, W., Li, J., He, C., Wen, J., Ma, H., Rong, B., Diao, J., Wang, L., Wang, J., Wu, F., Tan, L., Shi, Y.G., Shi, Y. & Shen, H. METTL3 regulates heterochromatin in mouse embryonic stem cells. Nature 591, 317-321 (2021).
      14. Roberson, P.A., Romero, M.A., Osburn, S.C., Mumford, P.W., Vann, C.G., Fox, C.D., McCullough, D.J., Brown, M.D. & Roberts, M.D. Skeletal muscle LINE-1 ORF1 mRNA is higher in older humans but decreases with endurance exercise and is negatively associated with higher physical activity. J Appl Physiol (1985) 127, 895-904 (2019).
      15. Mumford, P.W., Romero, M.A., Osburn, S.C., Roberson, P.A., Vann, C.G., Mobley, C.B., Brown, M.D., Kavazis, A.N., Young, K.C. & Roberts, M.D. Skeletal muscle LINE-1 retrotransposon activity is upregulated in older versus younger rats. Am J Physiol Regul Integr Comp Physiol 317, R397-R406 (2019).
    1. Author Response

      Reviewer #1 (Public Review):

      Laurent et al. generate genotyping data from 259 individuals from Cabo Verde to investigate the histories and patterns of admixture in the set of islands that make up Cabo Verde. The authors had previously studied admixture in an earlier study but in a smaller set of individuals from two cities on one island (from Santiago) in Cabo Verde. Here, the authors sample from all the islands of Cabo Verde to study admixture in these islands and reveal that there is a varied picture of admixture in that the demographic histories are distinct amongst this set of islands.

      I found the article interesting and clearly written, and I like that it highlights that admixture is a dynamic process that has manifested differently in distinct geographical regions, which will be of broad interest. It also highlights how genetic ancestry patterns are correlated with the populations that were in power/enslaved during colonial times and proposes that certain social practices (e.g. legally enforced segregation) might have affected the distribution/length of runs of homozygosity.

      We thank the reviewer for this positive and encouraging appreciation of our work.

      My main suggestion is that the authors provide a set of hypotheses regarding admixture that may explain their observations, and it would be nice to see if at least one of these has some support using simulations. Could the authors run simulations under their proposed demographic model for populations in Cabo Verde vs what we would expect in a pseudo-panmictic population with two sources of admixture? The authors probably already have simulations they could use. And then see how pre/post admixture founding events change patterns of ancestry.

      As suggested by the reviewer, in the revised version of the manuscript, we conducted the same MetHis-ABC scenario-choice and posterior parameter inference considering the 225 Cabo Verde-born individuals as a single random-mating population, in addition to our main results considering each island of birth separately. Most interestingly, we find that our ABC inferences fail to accurately reconstruct the detailed admixture history of Cabo Verde when it is considered as a whole instead of for each island of birth separately. This is due to admixture histories substantially differing across islands of birth of individuals, also consistent with the significantly differentiated genetic patterns within Cabo Verde obtained from ADMIXTURE, local-ancestry inferences, ROH, and isolation-by-distance analyses. These results are now implemented throughout the revised version of the manuscript and in supplementary figures and tables. See in particular Results L758-769, and Appendix 1-figures and tables, Figure 7-figure supplement 1-3, and Appendix 5-table 10.

      Reviewer #2 (Public Review):

      In this article, the authors leveraged patterns in the empirical genomic data and the power of simulations and statistical inferences, and aimed to address a few biologically and culturally relevant questions about the Cabo Verde population's admixture history during the TAST era. Specifically, the authors provided evidence on which specific African and European populations contributed to the population per island, whether the genetic admixture history parallels language evolution, and the best-fitting admixture scenario that answers questions on when and which continental populations admixed on which island, and how that influenced the island population dynamics since then.

      Strengths

      1) This study sets a great example of studying population history through the lens of genetics and linguistics, jointly. Historically most of the genetic studies of population history either ignored the sociocultural aspects of the evidence or poorly (or wrongly) correlated that with genetic inference. This study identified components in language that are informative about cultural mixture (strictly African-origin words versus shared European-African words), and carefully examined the statistical correlation between genetic and linguistic variation that occurred through admixture, providing a complete picture of genetic and sociocultural transformation in the Cabo Verde islands during TAST.

      We thank the reviewer for this very enthusiastic and encouraging comment on our work.

      2) The statistical analyses are carefully designed and rigorously done. I especially appreciate the careful goodness-of-fit checking and parameter error rates estimation in the ABC part, making the inference results more convincing.

      Again, we thank the reviewer for this positive comment.

      Weaknesses

      1) Most of the methods in the main analyses here were previously developed (e.g. MDS, MetHis, RF/NN-ABC). However, when they are introduced and applied here, the authors didn't restate the necessary background (strengths and weaknesses, limitations and usage) of these methods to justify them over other methods. For example, why is ASD-MDS used here to examine the genetic relationship between Cabo Verde populations and other worldwide populations, rather than classic PCA and F-statistics?

      As mentioned in the answer to the general comments, we extensively modified our manuscript in both Results and Material and Methods, to clarify and justify our reasoning for each one of the analyses conducted, and to discuss pros and cons of the methods used. We warmly thank the reviewers for this request, as we believe it allowed us to strongly improve the accessibility of our work in particular for the less specialized audience, as well as equally crucially improve replicability of our work for specialists. See in particular Results L185-193, L245-250, L368-371, L380-386, L495-511, L567-571, L606-621, and the corresponding Material and Methods sections.

      For the particular example of PCA raised by the reviewer: see Results L185-193.

      For that of F-statistics, see Results L368-386. Note that we added the F-stat analysis suggested by the reviewer to the revised version of our manuscript (see detailed answers below), Figure 3-figure supplement 2.

      We believe that these changes substantially strengthen our manuscript and enlarge its potential readership, and we thank, again, the reviewer for this request.

      2) The senior author of this paper has an earlier published article (Verdu et al. 2017 Current Biology) on the same population, using a similar set of methods and drew similar conclusions on the source of genetic and linguistic variation in Cabo Verde. Although additional samples on island levels are added here and additional analyses on admixture history were performed, half of the main messages from this paper don't seem to provide new knowledge than what we already learned from the 2017 paper.

      We substantially modified the text of the revised version of the manuscript to address the concern raised by the reviewer in numerous locations of the Abstract, Introduction and Results and Discussion sections, thus hoping to highlight better what we think is the profound novelty brought by this study. In particular, see Introduction L128-153.

      3) Furthermore, there are a few essential factors that could confound different aspects of the major analyses in this article that I believe should be taken into account and discussed. Such factors include the demographic history of source populations prior to admixture, different scenarios of the recipient population size changes, differences in recombination rates across the genome and between African and European populations, etc.

      We thank the reviewer for these comments, which allowed us to improve the clarity of our manuscript and raise very interesting discussion points that we had overlooked. As indicated in part in the general answer to reviewers above:

      1) We clarified our methods’ design and discussed extensively its limitations with respect to mis-specification of ancestral populations’ sizes. Indeed, ancestral source population sizes are not modeled in our MetHis-ABC approach. Instead, we consider that the observed proxy source populations from Africa and Europe are at the drift-mutation equilibrium and have been large since the initial and recent founding of Cabo Verde in the 1460s, and thus use observed genetic variation patterns in these populations to build virtual gamete reservoirs for the admixture history of Cabo Verde with the MetHis-ABC framework. Therefore, while we cannot evaluate explicitly the influence of ancestral source population size differences on our inferences in Cabo Verde, as we now state in the revised version of our manuscript: “we nevertheless implicitly take the real demographic histories of these source populations into account in our simulations, as we use observed genetic patterns themselves the product of this demographic history to create the virtual source populations at the root of the admixture history of each Cabo Verdean island.” We then discuss the outcome of such an approach, which mimics satisfactorily the real data for ABC inference. See in particular the revised versions of the Material and Methods L1454-1491 novel section “Simulating the admixed population from source-populations for 60,000 independent SNPs with MetHis”, and Results L637-649.

      2) Concerning possible population-size changes in the admixed population in our simulations and ABC inferences, we clarified our Material and Methods and the explanations of our Results to better show that we readily consider various possible scenarios (for each island separately). Indeed, with our MetHis simulation design, given values of model parameters correspond to either a constant, a linearly increasing, or a hyperbolically increasing reproductive size in the admixed population over time. We further clarified our Results and Discussion, pointing out that we indeed find, a posteriori, different demographic regimes among islands.

      Nevertheless, the reviewers are right that we did not test for possible bottlenecks. We thus substantially expanded the Results and Discussion sections in multiple locations to highlight this limitation and the challenges involved in overcoming it in future work. See in particular Material and Methods L1386-1404 section “Hyperbolic increase, linear increase, or constant reproductive population size in the admixed population”, Results L739-742, and Discussion L934-941, and Perspectives.

      3) Finally, concerning recombination rate, we considered only independent SNPs in our simulation and inference process, as is now clarified in multiple locations throughout the text. We further discuss recombination specifically with regard to our ROH analyses, as suggested in the reviewer’s detailed comments. In brief, we note that Figure 8 of Pemberton 2012 (AJHG 91:275-292) shows that the occurrence of long ROH at the same genomic location across individuals is correlated with low recombination rates, although the effect is relatively weak except in extreme recombination cold spots. Unless there were many extreme recombination cold spots that differed among the islands or ancestral populations, we do not anticipate that fine-scale recombination rate differences matter very much for total ROH levels in these data. Similarly, we do not expect large genome-wide differences in mutation rate, and therefore we do not anticipate that minor local variation in mutation rates makes a systematic difference in total ROH levels. We now refer to these important points in the revised version of our Results L414-415.

      Overall, the paper is of interest to the field of human evolutionary genetics: not only does it tell the story of a historically important population, but the methodology behind it also sets a great example for future research studying genetic and sociocultural transformations under the same framework.

      We would like to thank the reviewer for this very encouraging conclusion and for the detailed revision of our work which, we believe, helped us to substantially improve our manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      This work describes a new method, Proteinfer, which uses dilated neural networks to predict protein function, using EC terms and GO terms. The software is fast and the server-side performance is fast and reliable. The method is very clearly described. However, it is hard to judge the accuracy of this method based on the current manuscript, and some more work is needed to do so.

      I would like to address the following statement by the authors: (p3, left column): "We focus on Swiss Prot to ensure that our models learn from human-curated labels, rather than labels generated by electronic annotation".

      There is a subtle but important point to be made here: while SwissProt (SP) entries are human-curated, they might still have their function annotated ("labeled") electronically only. The SP entry comprises the sequence, source organism, paper(s) (if any), annotations, cross-references, etc. A validated entry does not mean that the annotation was necessarily validated manually: but rather that there is a paper backing the veracity of the sequence itself, and that it is not an automatic generation from a genome project.

      Example: 009L_FRG3G is a reviewed entry, and has four function annotations, all generated by BLAST, with an IEA (inferred by electronic annotation) evidence code. Most GO annotations in SwissProt are generated that way: a reviewed Swissprot entry, unlike what the authors imply, does not guarantee that the function annotation was made by non-electronic means. If the authors would like to use non-electronic annotations for functional labels, they should use those that are annotated with the GO experimental evidence codes (or, at the very least, not exclusively annotated with IEA). Therefore, most of the annotations in the authors' gold standard protein annotations are simply generated by BLAST and not reviewed by a person. Essentially the authors are comparing predictions with predictions, or at least not taking care not to do so. This is an important point that the authors need to address since there is no apparent gold standard they are using.
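
      The filtering the reviewer suggests can be sketched as follows. This is only an illustration: the annotation records and protein identifiers are hypothetical toy data, and the set of codes is the standard list of GO experimental evidence codes (including the high-throughput variants), not something taken from the manuscript.

```python
# GO experimental evidence codes, including high-throughput variants.
EXPERIMENTAL_CODES = {
    "EXP", "IDA", "IPI", "IMP", "IGI", "IEP",
    "HTP", "HDA", "HMP", "HGI", "HEP",
}

def non_iea(annotations):
    """Keep annotations whose evidence code is anything other than IEA."""
    return [a for a in annotations if a[2] != "IEA"]

def experimental_only(annotations):
    """Keep annotations backed by a GO experimental evidence code."""
    return [a for a in annotations if a[2] in EXPERIMENTAL_CODES]

# Hypothetical (protein, GO term, evidence code) records for illustration.
annotations = [
    ("protein_1", "GO:0004252", "IEA"),  # electronic annotation only
    ("protein_1", "GO:0003677", "IDA"),  # inferred from direct assay
    ("protein_2", "GO:0005739", "ISS"),  # curated, sequence similarity
    ("protein_3", "GO:0003677", "IMP"),  # inferred from mutant phenotype
]

kept_non_iea = non_iea(annotations)
kept_experimental = experimental_only(annotations)
print(len(kept_non_iea), len(kept_experimental))  # 3 2
```

      The two filters correspond to the stricter option (experimental codes only) and the more lenient one (anything not annotated with IEA) mentioned in the comment above.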

      The above statement is relevant to GO. But since EC is mapped 1:1 to GO molecular function ontology (as a subset, there are many terms in GO MFO that are not enzymes of course), the authors can easily apply this to EC-based entries as well.

      This may explain why, in Figure S8(b), BLAST retains such a high and even plateau of the precision-recall curve: BLAST hits are used throughout as the gold standard, and therefore BLAST performs so well. This is in contrast to, say, CAFA assessments, which use as a gold standard only those proteins that have experimental GO evidence codes, and in which BLAST therefore performs much more poorly.

      We thank the reviewer for this point. We regret if we gave the impression that our training data derives exclusively, or even primarily, from direct experiments on the amino acid sequences in question. We had attempted to address this point in the discussion with this section:

      "On the other hand, many entries come from experts applying existing computational methods, including BLAST and HMM-based approaches, to identify protein function. Therefore, the data may be enriched for sequences with functions that are easily ascribable using these techniques which could limit the ability to estimate the added value of using an alternative alignment-free tool. An idealised dataset would involved training only on those sequences that have themselves been experimentally characterized, but at present too little data exists than would be needed for a fully supervised deep-learning approach."

      We have now added a sentence early in the manuscript reinforcing this point:

      "Despite its curated nature, SwissProt contains many proteins annotated only on the basis of electronic tools."

      We have also removed the phrase "rather than labels generated by a computational annotation pipeline" because we acknowledge that this could be read to imply that computational approaches are not used at all for SwissProt which would not be correct.

      While we agree that SwissProt contains many entries inferred via electronic means, we nevertheless think its curated nature makes an important difference. Curators as far as possible reconcile all known data for a protein, often looking for the presence of key residues in the active sites. There are proteins where electronic annotation would suggest functions in direct contradiction to experimental data, which are avoided due to this curation process. As one example, UniProt entry Q76NQ1 contains a rhomboid-like domain typically found in rhomboid proteases (IPR022764) and therefore inputting it into InterProScan results in a prediction of peptidase activity (GO:0004252). However this is in fact an inactive protein, as discovered by experiment, and so is not annotated with this activity in SwissProt. ProteInfer successfully avoids predicting peptidase activity as a result of this curated training data. (For transparency, ProteInfer is by no means perfect on this point: there are also cases in which UniProt curators have annotated single proteins as inactive but ProteInfer has not learnt this relationship, due to similar sequences which remain active).

      We had also attempted to address this point by comparing with phenotypes seen in a specific high-throughput experimental assay ("Comparison to experimental data" section).

      We have now added a new analysis in which we assess the recall of GO terms while excluding IEA annotation codes. We find that at the threshold that maximises F1 score in the full analysis, our approach is able to recall 60-75% (depending on ontology) of annotations. Inferring precision is challenging due to the fact that only a very small proportion of the possible function*gene combinations have in fact been tested, making it difficult to distinguish a true negative from a false negative.
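
      The evaluation described above can be sketched as follows, under stated assumptions: the scored candidates are hypothetical toy data, the threshold search is a simple scan over observed scores, and none of the function names come from the ProteInfer code base.

```python
def f1_at(threshold, scored):
    """F1 over (score, is_true) pairs when predicting score >= threshold."""
    tp = sum(1 for s, y in scored if s >= threshold and y)
    fp = sum(1 for s, y in scored if s >= threshold and not y)
    fn = sum(1 for s, y in scored if s < threshold and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_threshold(scored):
    """Observed score that maximises F1 over the full set of candidates."""
    return max(sorted({s for s, _ in scored}), key=lambda t: f1_at(t, scored))

def recall_at(threshold, true_scores):
    """Fraction of true annotations recovered at the given threshold."""
    return sum(s >= threshold for s in true_scores) / len(true_scores)

# Hypothetical (score, is_true, is_iea) records; is_iea is None for negatives.
records = [
    (0.95, True, True), (0.90, True, False), (0.80, True, True),
    (0.70, False, None), (0.60, True, False), (0.40, False, None),
    (0.10, False, None), (0.05, True, False),
]

# The threshold is chosen on ALL annotations, as in the full analysis.
threshold = best_threshold([(s, y) for s, y, _ in records])

# Recall is then measured only on true annotations that are not IEA.
non_iea_scores = [s for s, y, iea in records if y and iea is False]
non_iea_recall = recall_at(threshold, non_iea_scores)
print(threshold, round(non_iea_recall, 2))
```

      Only recall is reported for the non-IEA subset because, as noted above, untested function and gene combinations cannot be reliably counted as negatives.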

      "We also tested how well our trained model was able to recall the subset of GO term annotations which are not associated with the "inferred from electronic annotation" (IEA) evidence code, indicating either experimental work or more intensely-curated evidence. We found that at the threshold that maximised F1 score for overall prediction, 75% of molecular function annotations could be successfully recalled, 61% of cellular component annotations, and 60% of biological process annotations."

      Pooling GO DAGs together: It is unclear how the authors generate performance data over GO as a whole. GO is really 3 disjoint DAGs (molecular function ontology or MFO, Biological Process or BPO, Cellular component or CCO). Any assessment of performance should be over each DAG separately, to make biological sense. Pooling together the three GO DAGs which describe completely different aspects of the function is not informative. Interestingly enough, in the browser applications, the GO DAG results are distinctly separated into the respective DAGs.

      Thank you for this suggestion. To answer the question of how we were previously generating performance data: this was simply by treating all terms equivalently, regardless of their ontology.

      We agree that it would be helpful to the reader to split out results by ontology type, especially given clear differences in performance.

      We now provide PR-curve graphs split by ontology type.

      We have also added the following text:

      "The same trends for the relative performance of different approaches were seen for each of the directed acyclic graphs that make up the GO ontology (biological process, cellular component and molecular function), but there were substantial differences in absolute performance (Fig S10). Performance was highest for molecular function (max F1: 0.94), followed by biological process (max F1: 0.86) and then cellular component (max F1: 0.84)."

      Figure 3 and lack of baseline methods: the text refers to Figures 3A and 3B, but I could only see one figure with no panels. Is there an error here? It is not possible at this point to talk about the results in this figure as described. It looks like Figure 3A is missing, with Fmax scores. In any case, Figure 3(b?) has precision-recall curves showing the performance of predictions is the highest on Isomerases and lowest in hydrolases. It is hard to tell the Fmax values, but they seem reasonably high. However, there is no comparison with a baseline method such as BLAST or Naive, and those should be inserted. It is important to compare Proteinfer with these baseline methods to answer the following questions: (1) Does Proteinfer perform better than the go-to method of choice for most biologists? (2) does it perform better than what is expected given the frequency of these terms in the dataset? For an explanation of the Naive method which answers the latter question, see: ( https://www.nature.com/articles/nmeth.2340 )

      We apologise for the errors in figure referencing in the text here. This emerged in part from the two versions of text required to support an interactive and legacy PDF version. We had provided baseline comparisons with BLAST in Fig. 5 of the interactive version (correctly referenced in the interactive version) and in Fig. S7 of the PDF version (incorrectly referenced as Fig 3B).

      We have now moved the key panel of Fig S7 to the main-text of the PDF version (new Fig 3B), as suggested also by the editor, and updated the figure referencing appropriately. We have also added a Naive frequency-count based baseline. This baseline would not appear in Fig 3B due to axis truncation, but is shown in a supplemental figure, new Fig S9. We thank the reviewer and the editor for raising these points.
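
      A frequency-count baseline of the kind mentioned above can be sketched as follows. This is a minimal illustration, assuming the common definition in which every test protein receives each term with a score equal to that term's frequency in the training set; the labels are hypothetical toy data and the function names are not from the manuscript.

```python
from collections import Counter

def naive_scores(train_labels):
    """Score each term by the fraction of training proteins carrying it."""
    n = len(train_labels)
    counts = Counter(term for labels in train_labels for term in set(labels))
    return {term: count / n for term, count in counts.items()}

def precision_recall(scores, test_labels, threshold):
    """Every test protein receives the same predicted terms (score >= threshold)."""
    predicted = {term for term, s in scores.items() if s >= threshold}
    tp = sum(len(predicted & set(labels)) for labels in test_labels)
    n_predicted = len(predicted) * len(test_labels)
    n_true = sum(len(set(labels)) for labels in test_labels)
    precision = tp / n_predicted if n_predicted else 0.0
    recall = tp / n_true if n_true else 0.0
    return precision, recall

# Hypothetical label sets for illustration.
train = [{"term_A", "term_B"}, {"term_A"}, {"term_A", "term_C"}, {"term_B"}]
test = [{"term_A"}, {"term_A", "term_B"}, {"term_C"}]

scores = naive_scores(train)
precision, recall = precision_recall(scores, test, threshold=0.5)
print(scores["term_A"], precision, recall)  # 0.75 0.5 0.75
```

      Because the baseline ignores the query sequence entirely, any method that uses sequence information should be expected to exceed it.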

      Reviewer #2 (Public Review):

      In this paper, Sanderson et al. describe a convolutional neural network that predicts protein domains directly from amino acid sequences. They train this model with manually curated sequences from the Swiss-Prot database to predict Enzyme Commission (EC) numbers and Gene Ontology (GO) terms. This paper builds on previous work by this group, where they trained a separate neural network to recognize each known protein domain. Here, they train one convolutional neural network to identify enzymatic functions or GO terms. They discuss how this change can deal with protein domains that frequently co-occur and more efficiently handle proteins of different lengths. The tool, ProteInfer, adds a useful new tool for computational analysis of proteins that complements existing methods like BLAST and Pfam.

      The authors make three claims:

      1) "ProteInfer models reproduce curator decisions for a variety of functional properties across sequences distant from the training data"

      This claim is well supported by the data presented in the paper. The authors compare the precision-recall curves of four model variations. The authors focus their training on the maximum F1 statistic of the precision-recall curve. Using precision-recall curves is appropriate for this kind of problem.

      2) "Attribution analysis shows that the predictions are driven by relevant regions of each protein sequence".

      This claim is very well supported by the data and particularly well illustrated by Figure 4. The examples on the interactive website are also very nice. This section is a substantial innovation of this method. It shows the value of scanning for multiple functions at the same time and the value of being able to scan proteins of any length.

      3) "ProteInfer models create a generalised mapping between sequence space and the space of protein functions, which is useful for tasks other than those for which the models were trained."

      This claim is also well supported. The print version of the figure is really clear, and the interactive version is even better. It is a clever use of UMAP representations to look at the abstract last layer of the network. It was very nice how each sub-functional class clustered.

      The interactive website was very easy to use with a good user interface. I expect it will be accessible to experimental and computational biologists.

      The manuscript has many strengths. The main text is clearly written, with high-level descriptions of the modeling. I initially printed and read the static PDF version of the paper. The interactive form is much more fun to read because of the ability to analyze my favorite proteins and zoom in on their figures (e.g. Figure 8). The new Figure 1 motivates the work nicely. The website has an excellent interactive graphic showing how the number of layers in the network and the kernel size change how data is pooled across residues. I will use this tool in my teaching.

      We are grateful for these comments. We are excited that the reviewer hopes to use this figure for teaching, which is exactly the sort of impact we hoped for this interactive manuscript. We agree that the interactive manuscript is by far the most compelling version of this work.

      The manuscript has only minor weaknesses. It was not clear if the interactive model on the website was the Single CNN model or the Ensemble CNN model.

      We thank the reviewer for pointing out the ambiguity here. The model shown on the website is a Single CNN model, and is chosen with hyperparameters that achieve good performance whilst being readily downloadable to the user's machine for this demonstration without use of excessive bandwidth. We have added additional sentences to address this better in the manuscript.

      "When the user loads the tool, lightweight EC (5MB) and GO model (7MB) prediction models are downloaded and all predictions are then performed locally, with query sequences never leaving the user's computer. We selected the hyperparameters for these lightweight models by performing a tuning study in which we filtered results by the size of the model's parameters and then selected the best performing models. This approach uses a single neural network, rather than an ensemble. Inference in the browser for a 1500 amino-acid sequence takes < 1.5 seconds for both models."

      Overall, ProteInfer will be a very useful resource for a broad user base. The analysis of the 171 new proteins in Figure 7 was particularly compelling and serves as a great example of the utility and power of ProteInfer. It complements leading tools in a very valuable way. I anticipate adding it to my standard analysis workflows. The data and code are publicly available.

      Reviewer #3 (Public Review):

      In this work, the authors employ a deep convolutional neural network approach to map protein sequence to function. The rationales are that (i) once trained, the neural network would offer fast predictions for new sequences, facilitating exploration and discovery without the need for extensive computational resources, (ii) that the embedding of protein sequences in a fixed-dimensional space would allow potential analyses and interpretation of sequence-function relationships across proteins, and (iii) predicting protein function in a way that is different from alignment-based approaches could lead to new insights or superior performance, at least in certain regimes, thereby complementing existing approaches. I believe the authors demonstrate i and iii convincingly, whereas ii was left open-ended.

      A strength of the work is showing that the trained CNNs perform generally on par with existing alignment based-methods such as BLASTp, with a precision-recall tradeoff that differs from BLASTp. Because the method is more precise at lower recall values, whereas BLASTp has higher recall at lower precision values, it is indeed a good complement to BLASTp, as demonstrated by the top performance of the ensemble approach containing both methods.

      Another strength of the work is its emphasis on usability and interpretability, as demonstrated in the graphical interface, use of class activation mapping for sub-sequence attribution, and the analysis of hierarchical functional clustering when projecting the high-dimensional embedding into UMAP projections.

      We thank the reviewer for highlighting these points.

      However, a main weakness is the premise that this approach is new. For example, the authors claim that existing deep learning "models cannot infer functional annotation for full-length protein sequences." However, as the proposed method is a straightforward deep neural network implementation, there have been other very similar approaches published for protein function prediction. For example, Cai, Wang, and Deng, Frontiers in Bioengineering and Biotechnology (2020), the latter also being a CNN approach. As such, it is difficult to assess how this approach differs from or builds on previous work.

      We agree that there has been a great deal of exciting work looking at the application of deep learning to protein sequences. Our core code has been publicly available on GitHub since April 2019, and our preprint has now been available for more than a year. We regret the time taken to release a manuscript and for it to reach review: this was in part due to the SARS-CoV-2 pandemic, in the scientific response to which the first author was heavily involved. Nevertheless, we believe that our work has a number of important features that distinguish it from much other work in this space.

      ● We train across the entire GO ontology. In the paper referenced by the reviewer, training is with 491 BP terms, 321 MF terms, and 240 CC terms. In contrast, we train with a vocabulary of 32,102 GO labels, and the majority of these are predicted at least once in our test set.

      ● We use a dilated convolutional approach. In the referenced paper, the network instead has fixed dimensions. Such an approach means there is an upper limit on how large a protein can be input into the model, and also means that this maximum length defines the computational resources used for every protein, including much smaller ones. In contrast, our dilated network scales to any size of protein, but when used with smaller input sequences it performs only the calculations needed for this size of sequence.

      ● We use class-activation mapping to determine regions of a protein responsible for predictions, and therefore potentially involved in specific functions.

      ● We provide a TensorFlow.JS implementation of our approach that allows lightweight models to be tested without any downloads.

      ● We provide a command-line tool that provides easy access to full models.
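
      The length-independence of a dilated convolutional network can be illustrated with a toy, single-channel sketch. Everything here is an assumption made for illustration: the kernel weights, the dilation schedule, the ReLU activation and the use of mean pooling are placeholders, not the architecture described in the manuscript.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, length-preserving 1D dilated convolution (odd kernel size)."""
    half = (len(kernel) // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            pos = i - half + j * dilation
            if 0 <= pos < len(signal):  # positions outside the sequence count as 0
                acc += weight * signal[pos]
        out.append(acc)
    return out

def receptive_field(kernel_size, dilations):
    """Number of input positions that can influence one output position."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

def embed(signal, kernel, dilations):
    """Stack dilated layers, then pool over positions to a fixed-size output."""
    x = list(signal)
    for d in dilations:
        x = [max(0.0, v) for v in dilated_conv1d(x, kernel, d)]  # ReLU
    return sum(x) / len(x)  # mean pooling; the pooling operator is an assumption

kernel = [0.2, 0.5, 0.2]   # hypothetical weights
dilations = [1, 2, 4, 8]   # hypothetical dilation schedule

short = embed([0.1 * i for i in range(12)], kernel, dilations)
long_ = embed([0.01 * i for i in range(300)], kernel, dilations)
print(receptive_field(3, dilations))  # 31
```

      Both the 12-position and the 300-position inputs yield a single pooled value, and the work done in each layer is proportional to the input length, so short sequences do not pay for a fixed maximum size.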

      We have made some changes to bring out these points more clearly in the text:

      "Since natural protein sequences can vary in length by at least three orders of magnitude, this pooling is advantageous because it allows our model to accommodate sequences of arbitrary length without imposing restrictive modeling assumptions or computational burdens that scale with sequence length. In contrast, many previous approaches operate on fixed sequence lengths: these techniques are unable to make predictions for proteins larger than this sequence length, and use unnecessary resources when employed on smaller proteins."

      We have added a table that sets out the vocabulary sizes used in our work (5,134 for EC and 32,109 for GO):

      "Gene Ontology (GO) terms describe important protein functional properties, with 32,109 such terms in Swiss-Prot (Table S6) that cover the molecular functions of proteins (e.g. DNA-binding, amylase activity), the biological processes they are involved in (e.g. DNA replication, meiosis), and the cellular components to which they localise (e.g. mitochondrion, cytosol)."

      A second weakness is that it was not clear what new insights the UMAP projections of the sequence embedding could offer. For example, the authors mention that "a generalized mapping between sequence space and the space of protein functions...is useful for tasks other than those for which the models were trained." However, such tasks were not explicitly explained. The hierarchical clustering of enzymatic proteins shown in Fig. 5 and the clustering of non-enzymatic proteins in Fig. 6 are consistent with the expectation of separability in the high-dimensional embedding space that would be necessary for good CNN performance (although the sub-groups are sometimes not well-separated. For example, only the second level and leaf level are well-separated in the enzyme classification UMAP hierarchy). Therefore, the value-added of the UMAP representation should be something like using these plots to gain insight into a family or sub-family of enzymes.

      We thank the reviewer for highlighting this point. There are two types of embedding which we discuss in the paper. The first is the high-dimensional representation of the protein that the neural network constructs as part of the prediction process. This is the embedding we feel is most useful for downstream applications, and we discuss a specific example of training the EC-number network to recognise membrane proteins (a property on which it was not trained): "To quantitatively measure whether these embeddings capture the function of non-enzyme proteins, we trained a simple random forest classification model that used these embeddings to predict whether a protein was annotated with the intrinsic component of membrane GO term. We trained on a small set of non-enzymes containing 518 membrane proteins, and evaluated on the rest of the examples. This simple model achieved a precision of 97% and recall of 60% for an F1 score of 0.74. Model training and data-labelling took around 15 seconds. This demonstrates the power of embeddings to simplify other studies with limited labeled data, as has been observed in recent work (43, 72)."
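
      As a quick consistency check on the figures quoted above, the F1 score is the harmonic mean of precision and recall, which the following short sketch evaluates for the reported values (the function name is ours):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision 97% and recall 60%, as reported for the random forest classifier.
reported_f1 = f1_score(0.97, 0.60)
print(round(reported_f1, 2))  # 0.74
```

      The result, 2 × 0.97 × 0.60 / 1.57 ≈ 0.74, matches the reported F1 score.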

      As the reviewer points out, there is a second embedding created by compressing this high-dimensional representation down to two dimensions using UMAP. This embedding can also be useful for understanding the properties seen by the network, for example the GO terms highlighted in Fig. 7, but in general it will contain less information than the higher-dimensional embedding.

      The clear presentation, ease of use, and computationally accessible downstream analytics of this work make it of broad utility to the field.

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, Wei & Robles et al seek to estimate the heritability contribution of Neanderthal Informative Markers (NIM) relative to SNPs that arose in modern humans (MH). This is a question that has received a fair amount of attention in recent studies, but persistent statistical limitations have made some prior results difficult to interpret. Of particular concern is the possibility that heritability (h^2) attributed to Neanderthal markers might be tagging linked variants that arose in modern humans, resulting in overestimation of h^2 due to Neanderthal variants. Neanderthal variants also tend to be rare, and estimating the contribution of rare alleles to h^2 is challenging. In some previous studies, rare alleles have been excluded from h^2 estimates.

      Wei & Robles et al develop and assess a method that estimates both total heritability and per-SNP heritability of NIMs, allowing them to test whether NIM contributions to variation in human traits are similar or substantially different than modern human SNPs. They find an overall depletion of heritability across the traits that they studied, and found no traits with enrichment of heritability due to NIMs. They also developed a 'fine-mapping' procedure that aims to find potential causal alleles and report several potentially interesting associations with putatively functional variants.

      Strengths of this study include rigorous assessment of the statistical methods employed with simulations and careful design of the statistical approaches to overcome previous limitations due to LD and frequency differences between MH and NIM variants. I found the manuscript interesting and I think it makes a solid contribution to the literature that addresses limitations of some earlier studies.

      My main questions for the authors concern potential limitations of their simulation approach. In particular, they describe varying genetic architectures corresponding to the enrichment of effects among rare alleles or common alleles. I agree with the authors that it is important to assess the impact of (unknown) architecture on the inference, but the models employed here are ad hoc and unlikely to correspond to any mechanistic evolutionary model. It is unclear to me whether the contributions of rare and common alleles (and how these correspond with levels of LD) in real data will be close enough to these simulated schemes to ensure good performance of the inference.

      In particular, the common allele model employed makes 90% of effect variants have frequencies above 5% -- I am not aware of any evolutionary model that would result in this outcome, which would suggest that more recent mutations are depleted for effects on traits (of course, it is true that common alleles explain much more h^2 under neutral models than rare alleles, but this is driven largely by the effect of frequency on h^2, not the proportion of alleles that are effect alleles). Likewise, the rare allele model has the opposite pattern, with 90% of effect alleles having frequencies under 5%. Since most alleles have frequencies under 5% anyway (~58% of MH SNPs and ~73% of NIM SNPs) this only modestly boosts the prevalence of low frequency effect alleles relative to their proportion. Some selection models suggest that rare alleles should have much bigger effects and a substantially higher likelihood of being effect alleles than common alleles. I'm not sure this situation is well-captured by the simulations performed. With LD and MAF annotations being applied in relatively wide quintile bins, do the authors think their inference procedure will do a good job of capturing such rare allele effects? This seems particularly important to me in the context of this paper, since the claim is that Neanderthal alleles are depleted for overall h^2, but Neanderthal alleles are also disproportionately rare, meaning they could suffer a bigger penalty. This concern could be easily addressed by including some simulations with additional architectures to those considered in the manuscript.

      We thank the reviewers for their thoughtful comments regarding rare alleles, and we agree that our RARE simulations only moderately boosted the enrichment of rare alleles in causal mutations. To address this, we added new simulations, ULTRA RARE, in which SNPs with MAF < 0.01 constitute 90% of the causal variants. As in our previous simulations, we use 100,000 and 10,000 causal variants to mimic highly polygenic and moderately polygenic phenotypes, and heritabilities of 0.5 and 0.2 for highly and moderately heritable phenotypes. We similarly ran three replicate simulations for each combination and partitioned the heritability with Ancestry only annotation, Ancestry+MAF annotation, Ancestry+LD annotation, and Ancestry+MAF+LD annotation. Our Ancestry+MAF+LD annotation remains calibrated in this setting (see Figure below). We believe this experiment strengthens our paper and have added it as Fig S2.
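
      The selection of causal variants in an architecture of this kind can be sketched as follows. This is only an illustration under simplifying assumptions: the allele frequencies are randomly generated toy values, SNPs are treated as independent (no LD), effects are drawn from a standard normal distribution, and the function names are hypothetical rather than taken from the authors' simulation code.

```python
import random

def pick_causal(mafs, n_causal, rare_fraction=0.9, rare_cutoff=0.01, seed=0):
    """Choose causal SNP indices so that rare_fraction have MAF < rare_cutoff."""
    rng = random.Random(seed)
    rare = [i for i, p in enumerate(mafs) if p < rare_cutoff]
    other = [i for i, p in enumerate(mafs) if p >= rare_cutoff]
    n_rare = round(n_causal * rare_fraction)
    return rng.sample(rare, n_rare) + rng.sample(other, n_causal - n_rare)

def genetic_variance(mafs, effects):
    """Additive genetic variance, assuming independent SNPs (no LD)."""
    return sum(2 * mafs[i] * (1 - mafs[i]) * b * b for i, b in effects.items())

def scaled_effects(mafs, causal, h2, seed=0):
    """Draw effects, then rescale so genetic variance equals h2 (unit trait variance)."""
    rng = random.Random(seed)
    raw = {i: rng.gauss(0.0, 1.0) for i in causal}
    scale = (h2 / genetic_variance(mafs, raw)) ** 0.5
    return {i: b * scale for i, b in raw.items()}

# Toy allele frequencies: 600 rare SNPs and 400 more common SNPs.
rng = random.Random(1)
mafs = [rng.uniform(0.001, 0.0099) for _ in range(600)]
mafs += [rng.uniform(0.01, 0.5) for _ in range(400)]

causal = pick_causal(mafs, n_causal=100)
effects = scaled_effects(mafs, causal, h2=0.5)
n_rare_causal = sum(1 for i in causal if mafs[i] < 0.01)
print(n_rare_causal, round(genetic_variance(mafs, effects), 3))
```

      In this toy setting 90 of the 100 causal variants are rare, and the effects are rescaled so that the total genetic variance equals the target heritability.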

      While we agree that these architectures are ad hoc and are unlikely to correspond to realistic evolutionary scenarios, we have chosen them to span the range of possible architectures, so that the skew towards common or rare alleles that we have explored is extreme. The finding that our estimates are calibrated across the range that we have explored leads us to conclude that our inferences should be robust.

      More broadly, we concur with the reviewer that our results (as well as others in the field) may need to be revisited as our view of the genetic architecture of complex traits evolves. The methods that we propose in this paper are general enough to explore such architectures in the future by choosing a sufficiently large set of annotations that match the characteristics across NIMs and MH SNPs. A practical limitation to this strategy is that the use of a large number of annotations can result in some annotations being assigned a small number of SNPs which would, in turn, reduce the precision of our estimates. This limitation is particularly relevant due to the smaller number of NIMs compared to MH SNPs (around 250K vs around 8M).

      Reviewer #2 (Public Review):

      The goal of the work described in this paper is to comprehensively describe the contribution of Neanderthal-informative mutations (NIMs) to complex traits in modern human populations. There are some known challenges in studying these variants, namely that they are often uncommon, and have unusually long haplotype structures. To overcome these, the authors customized a genotyping array to specifically assay putative Neanderthal haplotypes, and used a recent method of estimating heritability that can explicitly account for differences in MAF and LD.

      This study is well thought-out, and the ability to specifically target the genotyping array to the variants in question and then use that information to properly control for population structure is a massive benefit. The methodology also allowed them to include rarer alleles that were generally excluded from previous studies. The simulations are thorough and convincingly show the importance of accounting for both MAF and LD in addition to ancestry. The fine-mapping done to disentangle effects between actual Neanderthal variants and Modern human ones on the same haplotype also seems reasonable. They also strike a good balance between highlighting potentially interesting examples of Neanderthal variants having an effect on phenotype without overinterpreting association-based findings.

      The main weakness of the paper is in its description of the work, not the work itself. The paper currently places a lot of emphasis on comparing these results to prior studies, particularly on its disagreement with McArthur, et al. (2021), a study on introgressed variant heritability that was also done primarily in UK Biobank. While they do show that the method used in that study (LDSR) does not account for MAF and LD as effectively as this analysis, this work does not support the conclusion that this is a major problem with previous heritability studies. McArthur et al. in fact largely replicate these results that Neanderthal variants (and more generally regions with Neanderthal variants) are depleted of heritability, and agree with the interpretation that this is likely due to selection against Neanderthal alleles. I actually find this a reassuring point, given the differences between the variant sets and methods used by the two studies, but it isn't mentioned in the text. Where the two studies differ is in specifics, mainly which loci have some association with human phenotypes; McArthur et al. also identified a couple groups of traits that were exceptions to the general rule of depleted heritability. While this work shows that not accounting for MAF and LD can lead to underestimating NIM heritability, I don't follow the logic behind the claim that this could lead to a false positive in heritability enrichment (a false negative would be more likely, surely?). There are also more differences between this and previous heritability studies than just the method used to estimate heritability, and the comparisons done here do not sufficiently account for these. A more detailed discussion to reconcile how, despite its weaknesses, LDSR picks up similar broad patterns while disagreeing in specifics is merited.

      We agree with the reviewer that our results are generally concordant with those of McArthur et al. 2021 and this concordance is reassuring given the differences across our studies. The differences across the studies, wherein McArthur et al. 2021 identify a few traits with elevated heritability while we do not, could arise due to reasons beyond the methodological differences such as differences in the sets of variants analyzed. We have partially explored this possibility in the revised manuscript by analyzing the set of introgressed variants identified by the Sprime method (which was used in McArthur et al. 2021) using our method: we continue to observe a pattern of depletion with no evidence for enrichment. We hypothesize that the reason why LDSR picks up similar overall patterns despite its limitations is indicative of the nature of selection on introgressed alleles (which, in turn, influences the dependence of effect size on allele frequency and LD). Investigating this hypothesis will require a detailed understanding of how the LDSR results depend on parameters such as the MAF threshold on the regression SNPs and the LD reference SNPs, and on the choice of the LD reference panel.

      Not accounting for MAF and LD can underestimate NIM heritability but can both underestimate and overestimate heritability at MH SNPs. Tests that compare per-SNP heritability at NIMs to that at MH SNPs can therefore lead to false positives in the direction of both enrichment and depletion.

      We have now written in the Discussion: “In spite of these differences in methods and NIMs analyzed, our observation of an overall pattern of depletion in the heritability of introgressed alleles is consistent with the findings of McArthur et al. The robustness of this pattern might provide insights into the nature of selection against introgressed alleles”

      In general this work agrees with the growing consensus in the field that introgressed Neanderthal variants were selected against, such that those that still remain in human populations do not generally have large effects on phenotypes. There are exceptions to this, but for the most part observed phenotypic associations depend on the exact set of variants being considered, and, like those highlighted in this study, still lack more concrete validation. While this paper does not make a significant advance in this general understanding of introgressed regions in modern populations, it does increase our knowledge in how best to study them, and makes a good attempt at addressing issues that are often just mentioned as caveats in other studies. It includes a nice quantification of how important these variables are in interpreting heritability estimates, and will be useful for heritability studies going forward.

    1. Author Responses

      Reviewer #1 (Public Review):

      The authors present a very detailed short report on a previously undocumented behaviour where flying squirrels are believed to have created grooves in various species of nuts to aid their secure storage in the crotch or forks of twigs. The behaviour is suggested to have evolved as an adaptive strategy in this population of flying squirrels because of the challenges for nut caching in a rainforest environment.

      Thanks

      Using detailed photographs, GPS locations, measurements and camera trap videos, the authors describe the behaviour in great depth providing a useful base for comparative and future studies. However, the weakest point of this study is that the authors did not detect any squirrels making the grooves and only monitored nuts once they were cached. Therefore more research needs to be done to ascertain who, how and where the grooves are produced in the first place.

      Three new videos are attached showing that both squirrel species rotate and carve the nuts to create the grooves. In the new videos, we can also observe that squirrels re-fix the nuts between the twigs by carving them. These direct observations better support the claim. See Supplementary Media files 6-8.

      This work will be of great interest to scholars of animal behaviour and cognition and draws attention to a novel behaviour that warrants further study in similar species.

      Yes, it is. Thanks

      Reviewer #2 (Public Review):

      The authors describe observations of an innovative food caching behavior attributed to two species of flying squirrels and likened the behavior to architectural joints used by humans. The discovery of nuts stored in the crook of shrub branches, facilitated by indented rings seemingly carved by squirrels, possibly represents an interesting food handling innovation that may function to prevent spoilage in a damp tropical ecosystem.

      Thanks!

      I applaud the efforts to survey the area multiple times after the initial discovery, and the use of trail cameras to try capture evidence of animal associations. For what is in essence a natural history note, the authors did a great job of trying to gather a variety of supporting evidence. The videos capturing squirrels visiting and retrieving the cached nuts were compelling, and the shaking of the shrubs demonstrating the difficulty in dislodging the nuts helps build the case that the nuts are cached effectively.

      Thanks!

      The most glaring gap in the evidence is that there is no direct observation of the squirrels actually performing this nut carving behavior, only associating with the nuts after they have been cached. There must be more documentation provided to explicitly link the causality between squirrels and this caching innovation.

      We have included three additional videos to demonstrate that squirrels of both species rotate and carve the nuts to create the grooves. These new videos also show that squirrels can fit the nuts between twigs by carving the nuts. We think that these direct observations clearly support our claim, but agree that it was an oversight not to include them in the first draft. See Supplementary Media files 6-8.

      The second major weakness is more to do with writing style and could be addressed with significant revisions to phrasing and development of ideas. This is namely to do with the claim that this is somehow an evolved behavior, without providing evidence that 1) it is indeed the squirrels performing this behavior, 2) that it confers some kind of fitness benefit, and 3) hard evidence that this caching method does indeed prevent decomposition/germination in comparison to the more traditional caching methods of these species. Given the limited geographic range of the observations, I wonder how much of this is actually attributable to learning and/or innovation by these individuals. These ideas are not developed fully, and sometimes the writing wanders between learning and evolution without exploring the deep links between the two concepts.

      1) As above, three new videos establish that the squirrels do, in fact, carve the nuts. See Supplementary Media files 6-8.

      2) We added more description to the discussion to suggest how this behavior likely confers a fitness benefit. At this point, however, it is correct to say that we have no hard evidence to demonstrate this, and thus we’ve attempted to ‘tighten up’ the discussion accordingly so that our arguments (and their limitations) are more understandable.

      3) We revised the statistics about the proportion of nuts that were fresh during each of the surveys, and added some references about how long the nuts take to germinate in natural conditions. L163-172.

      Third, the connection to architecture is attention-grabbing, but I'd like to see this fleshed out a bit more with more text description (and a visual here would help immensely).

      We added more description about how the grooving, caching and checking processes were performed by squirrels and how the principles of this suspension are similar to the mortise-tenon joint as employed by humans. L186-202. As above, three new videos are attached.

      Ultimately this work stands to potentially contribute a fascinating piece of evidence into the growing literature on animal cognition, spatial awareness, caching behavior, innovation, and adaptation, but currently, the claims are unsupported by the evidence presented.

      Thank you for your comments about the potential importance of our work on this interesting system. In this version we try to focus more tightly on the aspects for which we have new information to interpret.

      Reviewer #3 (Public Review):

      The authors were trying to describe and document the grooving behaviour of nuts in two species of flying squirrels (Hylopetes phayrei electilis and H. alboniger) and to relate this behaviour to tool use or to the squirrels being smart. To achieve these objectives, the authors conducted three field surveys. They also later set out a camera to capture animal species that interacted with these nuts. They found that these nuts with grooves are fixed between twigs and can be found in different small plant species. Both species of squirrels made grooves in nuts. Shallower grooves are found in nuts that are fixed on live than on dead trees. Ellipsoid nuts have deeper grooves than oblate nuts. They concluded that these nut grooving behaviours are evolved or learned in those flying squirrel populations, and related these behaviours to tool use as well as to the squirrels being smart.

      Thanks!

      One strength of this work is that the data were collected in the field, which may provide hard evidence with video footage showing the two flying squirrel populations made grooves on nuts as well as fixing them between twigs. This evidence will stimulate new interest in understanding the causes and consequences of such nut grooving behaviour. It may be bold to claim that such behaviour involves advanced cognition or cognitive processes without proper, systematic experiments. Accordingly, whether the squirrels are 'smart' remains unclear. The authors did well in describing and documenting the nut grooving behaviours of the two species of flying squirrels, which has achieved their first aim. However, as mentioned above, whether such behaviour is 'smart' will need more systematic investigations.

      We have removed the description of cognition or cognitive processes from the paper, which now focuses on the grooving behaviour. “Smart” has also been removed, with other words used instead.

    1. Author Response

      Reviewer #1 (Public Review):

      Because of the importance of brain and cognitive traits in human evolution, brain morphology and neural phenotypes have been the subject of considerable attention. However, work on the molecular basis of brain evolution has tended to focus on only a handful of species (i.e., human, chimp, rhesus macaque, mouse), whereas work that adopts a phylogenetic comparative approach (e.g., to identify the ecological correlates of brain evolution) has not been concerned with molecular mechanism. In this study, Kliesmete, Wange, and colleagues attempt to bridge this gap by studying protein and cis-regulatory element evolution for the gene TRNP1, across up to 45 mammals. They provide evidence that TRNP1 protein evolution rates and its ability to drive neural stem cell proliferation are correlated with brain size and/or cortical folding in mammals, and that activity of one TRNP1 cis-regulatory element may also predict cortical folding.

      There is a lot to like about this manuscript. Its broad evolutionary scope represents an important advance over the narrower comparisons that dominate the literature on the genetics of primate brain evolution. The integration of molecular evolution with experimental tests for function is also a strength. For example, showing that TRNP1 from five different mammals drives differences in neural stem cell proliferation, which in turn correlate with brain size and cortical folding, is a very nice result. At the same time, the paper is a good reminder of the difficulty of conclusively linking macroevolutionary patterns of trait evolution to molecular function. While TRNP1 is a moderate outlier in the correlation between rate of protein evolution and brain morphology compared to 125 other genes, this result is likely sensitive to how the comparison set is chosen; additionally, it's not clear that a correlation with evolutionary rate is what should be expected. Further, while the authors show that changes in TRNP1 sequence have functional consequences, they cannot show that these changes are directly responsible for size or folding differences, or that positive selection on TRNP1 is because of selection on brain morphology (high bars to clear). Nevertheless, their findings contribute strong evidence that TRNP1 is an interesting candidate gene for studying brain evolution. They also provide a model for how functional follow-up can enrich sequence-based comparative analysis.

      We thank the reviewer for the positive assessment. With respect to our set of control genes and the interpretation of the correlation between the evolution of the TRNP1 protein sequence and the evolution of brain size and gyrification, we would like to mention the following: we do think that the set is small, but we took all similarly sized genes with one coding exon that we could find in all 30 species. Furthermore, the control genes are well comparable to TRNP1 with respect to alignment quality and average omega (Figure 1-figure supplement 3). Hence, we think that the selection procedure and the actual omega distribution make them a valid, unbiased set to which TRNP1’s co-evolution with brain phenotypes can be compared. Moreover, we want to point out that by using Coevol, we correlate evolutionary rates, that is, the rate of protein evolution of TRNP1 as measured with omega and the rate of brain size evolution that is modeled in Coevol as a Brownian motion process. We think that this was unclear in the previous version of our manuscript, and appreciate that the reviewer saw some merit in our analyses in spite of it.

      Finding conclusive evidence to link molecular evolution to concrete phenotypes is indeed difficult and necessarily inferential. This said, we still believe that correlating rates of evolution of phenotype and sequence across a phylogeny is one of the most convincing pieces of evidence available.

      Reviewer #2 (Public Review):

      In this paper, Kliesmete et al. analyze the protein and regulatory evolution of TRNP1, linking it to the evolution of brain size in mammals. We feel that this is very interesting and the conclusions are generally supported, with one concern.

      The comparison of dN/dS (omega) values to 125 control proteins is helpful, but an important factor was not controlled. The fraction of a protein in an intrinsically disordered region (IDR) is potentially even more important in affecting dN/dS than the protein length or number of exons. We suggest comparing dN/dS of TRNP1 to another control set, preferably at least ~500 proteins, which have similar % IDR.

      Thank you for this interesting suggestion. As mentioned in the public response to Reviewer #1, we are sorry that we did not explain the rationale of the approach very well in the previous version of the manuscript. As also argued above, we think that our control proteins are an unbiased set as they have a comparable alignment quality and an average omega (dN/dS) similar to TRNP1 (Figure 1-figure supplement 3). While IDR domains tend to have a higher omega than their respective non-IDR counterparts, we do not think that the IDR content should be more relevant than omega itself as we do not interpret this estimate on its own, but its covariance with the rate of phenotypic change. Indeed, the proteins of our control set that have a higher IDR content (D2P2, Oates et al. 2013) do not show stronger evidence to be coevolving with the brain phenotypes (IDR content vs. absolute brain size-omega partial correlation: Kendall's tau = 0.048, p-value = 0.45; IDR content vs. absolute GI-omega partial correlation: Kendall’s tau = -0.025, p-value = 0.68; 88 proteins (71%) contain >0% IDRs; 8 proteins contain >62% (TRNP1 content) IDRs).
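
      For readers unfamiliar with the statistic quoted above, the following is a minimal sketch of Kendall's tau (the tau-b variant, which corrects for ties such as proteins sharing 0% IDR content). It is not the authors' analysis code. Which variant and software they used is not stated in the response, and the function name and inputs are hypothetical.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1          # pair tied in x (possibly also in y)
        if dy == 0:
            ties_y += 1          # pair tied in y (possibly also in x)
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    denom = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    if denom == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denom
```

      In the setting described in the response, `x` would hold the IDR content of each control protein and `y` the absolute value of that protein's partial correlation between omega and the brain phenotype. A tau near zero, as reported, means that proteins with more disordered sequence do not rank systematically higher.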

      Reviewer #3 (Public Review):

      In this work, Z. Kliesmete, L. Wange and colleagues investigate TRNP1 as a gene of potential interest for the evolution of the mammalian cortex. Previous evidence suggests that TRNP1 is involved in self-renewal, proliferation and expansion in cortical cells in mouse and ferret, making this gene a good candidate for evolutionary investigation. The authors designed an experimental scheme to test two non-exclusive hypotheses: first, that evolution of the TRNP1 protein is involved in the apparition of larger and more convoluted brains; and second, that regulation of the TRNP1 gene also plays a role in this process alongside protein evolution.

      The authors report that the rate of TRNP1 protein evolution is strongly correlated to brain size and gyrification, with species with larger and more convoluted brains having more divergent sequences at this gene locus. The correlation with body mass was not as strong, suggesting a functional link between TRNP1 and brain evolution. The authors directly tested the effects of sequence changes by transfecting the TRNP1 sequences from 5 different species in mouse neural stem cells and quantifying cell proliferation. They show that both human and dolphin sequences induce higher proliferation, consistent with larger brain sizes and gyrifications in these two species. Then, the authors identified six potential cis-regulatory elements around the TRNP1 gene that are active in human fetal brain, and that may be involved in its regulation. To investigate whether sequence evolution at these sites results in changes in TRNP1 expression, the authors performed a massively parallel reporter assay using sequences from 75 mammals at these six loci. The authors report that one of the cis-regulatory elements drives reporter expression levels that are somewhat correlated to gyrification in catarrhine monkeys. Consistent with the activity of this cis-regulatory sequence in the fetal brain, the authors report that this element contains binding sites for TFs active in brain development, and contains stronger binding sites for CTCF in catarrhine monkeys than in other species. However, the specificity or functional relevance of this signal is unclear.

      Altogether, this is an interesting study that combines evolutionary analysis and molecular validation in cell cultures using a variety of well-designed assays. The main conclusions - that TRNP1 is likely involved in brain evolution in mammals - are mostly well supported, although the involvement of gene regulation in this process remains inconclusive.

      Strengths:

      • The authors have done a good deal of resequencing and data polishing to ensure that they obtained high-quality sequences for the TRNP1 gene in each species, which enabled a higher confidence investigation of this locus.

      • The statistical design is generally well done and appears robust.

      • The combination of evolutionary analysis and in vivo validation in neural precursor cells is interesting and powerful, and goes beyond the majority of studies in the field. I also appreciated that the authors investigated both protein and regulatory evolution at this locus in significant detail, including performing a MPRA assay across species, which is an interesting strategy in this context.

      Weaknesses:

      • The authors report that TRNP1 evolves under positive selection, however this seems to be the case for many of the control proteins as well, which suggests that the signal is non-specific and possibly due to misspecifications in the model.

      • The evidence for a higher regulatory activity of the intronic cis-regulatory element highlighted by the authors is fairly weak: correlation across species is only 0.07, consistent with the rapid evolution of enhancers in mammals, and the correlation in catarrhine monkeys seems driven by a couple of outlier datapoints across the 10 species. It is unclear whether false discovery rates were controlled for in this analysis.

      • The analysis of the regulatory content in this putative enhancer provides some tangential evidence but no reliable conclusions regarding the involvement of regulatory changes at this locus in brain evolution.

      We thank the reviewer for the detailed comments. Indeed, TRNP1 overall has a rather average omega value across the tree and hence also the proportion of sites under selection is not hugely increased compared to the control proteins. This is good because we want to have comparable power to detect a correlation between the rate of protein evolution (omega) and the rate of brain size or GI evolution for TRNP1 and the control proteins. Indeed, what makes TRNP1 special is the rather strong correlation between the rate of brain size change and omega, which was only stronger in 4% of our control proteins. Hence, we do not agree with the weakness of model misspecification for TRNP1 protein evolution.

      We agree that the correlation of the activity induced by the intronic cis regulatory element (CRE) with gyrification is weak, but we dispute that the correlation is due to outliers (see residual plot below) or violations of model assumptions (see new permutation analysis in the Results section). There are many reasons why we would expect such a correlation to be weak, including that an MPRA takes the CRE out of its natural genomic context. Our conclusions do not solely rest on those statistics, but also on independent corroborating evidence: Reilly et al (2015) found a difference in the activity of the TRNP1 intron between human and macaque samples during brain development. Furthermore, we used their and other public data to show that the intron CRE is indeed active in humans and bound by CTCF (new Figure 4 - figure supplement 2).

      We believe that the combined evidence suggests a likely role for the intron CRE for the co-evolution of TRNP1 with gyrification.

    1. Author Response

      Reviewer #2 (Public Review):

      By now, the public is aware of the peculiarities underlying the omicron variant's emergence and dissemination globally. This study investigates the mutational biography underlying how mutation effects and epistasis manifest in binding to therapeutic receptors.

      The study highlights how epistasis and other mutation effect measurements manifest in phenotypes associated with antibody binding with respect to spike protein in the omicron variant. It rigorously tests a large suite of mutations in the omicron receptor binding domain, highlighting differences in how mutation effects affect binding to certain therapeutic antibodies.

      Interestingly, mutations of large effect drive escape from binding to certain antibodies, but not others (S309). The difference in the mutational signature is the most interesting finding, and in particular, the signature of how higher-order epistasis manifests in the partial escape in S309, but less so in the full escape of other antibodies.

      The results are timely, the scope enormous, and the analyses responsible.

      My only main criticisms walk the stylistic/scientific line: many of the authors have pioneered discussions and methods relating to the measurement of epistasis in proteins and other biomolecules. While I recognize that the purpose of this study is focused on the public health implications, I would have appreciated more of a dive into the peculiarity of the finding with respect to epistasis. I think the authors could achieve this by doing the following:

      a) Reconciling discussions around the mutation effects in light of contemporary discussions of global epistasis "vs" idiosyncratic epistasis, etc. Several of the authors of the manuscript have written other leading manuscripts of the topic. I would appreciate it if the authors couched the findings within other studies in this arena.

      We added a discussion related to global epistasis at the end of the “Epistasis Analysis” methods section. We tried to highlight that the cause and relevance of global epistasis phenomena are quite different at the molecular and organismal levels.

      b) While the methods used to detect epistasis in the manuscript make sense, the authors surely realize that the choice of methods used to measure epistasis is a contentious dimension of the field. I'd appreciate an appeal/explanation as to why their methods were used relative to others. For example, the Lasso correction makes sense, but there are other such methods. Citations and some explanation would be great.

      We added more context and justification in the methods section (Epistasis Analysis). We used Lasso correction not particularly to obtain a sparser representation of the epistasis coefficients (an assumption that is not always valid, particularly within proteins) but rather to reduce instabilities created by the Tobit model inference. In this inference, the model coefficients are unbounded. Thus, if one mutation causes a complete binding loss, all epistatic terms associated with this mutation are not constrained and can become very large in magnitude. A Lasso term with a small coefficient constrains these coefficients but will have a limited influence on the other coefficients.
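
      The instability described above can be illustrated with a deliberately simplified sketch. It is not the authors' model: it uses a single coefficient `beta` standing in for a term tied to a mutation that abolishes binding, a left-censored (Tobit-style) likelihood, and a grid search. All names and numerical values (baseline, detection limit, noise scale, penalty strength) are assumptions chosen for illustration.

```python
import math


def log_norm_cdf(z):
    """Log of the standard normal CDF, via the complementary error function."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def penalized_nll(beta, lam, n_censored=5, baseline=3.0, limit=0.0, sigma=1.0):
    """Negative log-likelihood of n_censored measurements that all sit at the
    detection limit, plus an L1 (Lasso) penalty lam * |beta|.

    A censored observation only tells us that baseline + beta + noise <= limit,
    so its likelihood is the normal CDF evaluated at the limit.
    """
    nll = -n_censored * log_norm_cdf((limit - baseline - beta) / sigma)
    return nll + lam * abs(beta)


def grid(depth):
    """Candidate values of beta from 0 down to -depth in steps of 0.1."""
    return [-i / 10 for i in range(10 * depth + 1)]


def argmin_on_grid(lam, candidates):
    """Value of beta on the grid with the smallest penalized objective."""
    return min(candidates, key=lambda b: penalized_nll(b, lam))
```

      Without the penalty (`lam=0`), the best value of `beta` is always the most negative one offered, so `argmin_on_grid(0.0, grid(6))` and `argmin_on_grid(0.0, grid(8))` differ: the estimate runs away as the search range grows. With a small penalty such as `lam=0.5`, both grids give the same finite value. This sketch covers only the stabilising effect; it does not show the second point made above, that a small penalty has limited influence on well-determined coefficients.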

      Lastly (somewhat relatedly), I found myself wanting the discussion to be bolder and more ambitious. The summary, as I read it, is on the nose and very direct (which is appropriate), but I want more: What do the findings say for greater discussions surrounding evolution in sequence space? For discussions of epistasis in proteins of a certain kind? In my view, this data set offers fodder for fundamental discussion in evolutionary biology and evolutionary medicine. I recognize, however, the constraints: such topics may not be within the scope of a single paper, and such discussions may distract from the biomedical applications, which are more relevant for human health.

      But I might say something similar about the biomedical implications: the authors do a good job outlining exactly what happened, but what does this say about patterns (the role of mutations of large effect vs. higher-order epistasis) in some traits vs others? Why might we expect certain patterns of epistasis with respect to antibody binding relative to other pathogenic virus phenotypes?

      We agree that these are interesting questions, and have added a paragraph in the discussion to explore these points.

      In summary: rigorous and important work, and I congratulate the authors.

    1. Author Response

      Reviewer #1 (Public Review):

      In this work, the authors investigate a means of cell communication through physical connections they call membrane tubules (similar or identical to the previously reported nanotubes, which they reference extensively). They show that Cas9 transfer between cells is facilitated by these structures rather than exosomes. A novel contribution is that this transfer is dependent on the pair of particular cell types and that the protein syncytin is required to establish a complete syncytial connection, which they show is open-ended using electron microscopy.

      The data is convincing because of the multiple readouts for transfer and the ultrastructural verification of the connection. The results support their conclusions. The implications are obvious, since it represents an avenue of cellular communication and modifications. It would be exciting if they could show this occurring in vivo, such as in tissue. The implication of this would be that neighboring cells in a tissue could be entrained over time through transfer of material.

      We thank the reviewer for their comments and suggestion. It is possible that the thick tubular connections found in this study also exist in vivo. Previous studies reported that TNT-like structures were found in mouse or human primary tumor cells (PMID: 34494703; PMID: 34795441). Our transfer assays could be adopted to evaluate such transfer in primary cultures and in vivo. We anticipate this for future work.

      Reviewer #2 (Public Review):

      There is a lot of interest in how cells transfer materials (proteins, RNA, organelles) by extracellular vesicles (EV) and tunneling nanotubes (TNTs). Here, Zhang and Schekman developed quantitative assays, based on two different reporters, to measure EV and direct contact-dependent mediated transfer. The first assay is based on transfer of Cas9, which then edits a luciferase gene, whose enzymatic activity is then measured. The second assay is based on a split-GFP system. The experiments on EV trafficking convincingly show that purified exosomes, or any other diffusible agent, are unable to transfer functional Cas9 (either EV-tethered or untethered) and induce significant luciferase activity in acceptor cells. The authors suggest a plausible model by which Cas9 (with the gRNA?) gets "stuck" in such vesicles and is thus unable to enter the nucleus to edit the gene.

      To test alternative pathways of transfer, e.g. by direct cell-cell contact, the authors co-cultured donor and acceptor cells and detect significant luciferase activity. The split GFP assay also showed successful transfer. The authors further characterize this process by biochemical, genetic and imaging approaches. They conclude that a small percentage of cells in the population produce open-ended membrane tubules (which are wider and distinct from TNTs) that can transfer material between cells. This process depends on actin polymerization but not endocytosis or trogocytosis. The process also seems to depend on endogenously expressed Syncytin proteins - fusogens which could be responsible for the membrane fusion leading to the open ends of the tubules.

      The paper provides additional solid evidence to what is already known about the inefficiency of EV-mediated protein transport. Importantly, it provides an interesting new mechanism for contact-dependent transport of cellular material and assigns valuable new information about the possible function of Syncytins. However, the evidence that the proteins and vesicles transfer through the tubules is incomplete and a few more experiments are required. In addition, certain inconsistencies within the paper and with previous literature need to be resolved. Finally, some parts of the text, methods and the figures require re-writing or additional information for clarity.

      Major comments

      1) In Figure 1F, the authors compare the function of exosome-transported SBP-Cas9-GFP vs. transient transfection of SBP-Cas9-GFP. It is not clear if the cells in the transiently transfected culture also express the myc-str-CD63 and were treated with biotin. It is important to determine if CD63-tethering itself affects Cas9 function.

      We thank the reviewer for these comments and suggestions. We now show in Figure 1-figure supplement 1D that CD63-tethering itself does not affect Cas9 function.

      2) The authors do not rule out that TNTs are a mode of transfer in any of their experiments. Their actin polymerization inhibition experiments are also in-line with a TNT role in transfer. This possibility is not discussed in the discussion section.

      Yes, the results in this study do not rule out a role for TNTs in the transfer. At present, we are not aware of conditions that would functionally distinguish transfer mediated by TNTs and thick tubules. We have now included this in the Discussion section.

      3) Issues with the Split GFP assay:

      a) On page 4, line 176, the authors claim that "A mixture of cells before co-culture should not exhibit a GFP signal". However, this result is not presented.

      The results of the mixture experiment are included in Figure 2-figure supplement 1D, E.

      b) The authors show in Figure 2C and F that in MDA/HEK co-culture or only HEK293T co-culture, there are dual-labeled, CFP-mCherry, cells. First - what is the % of this sub-population? Second, the authors dismiss this population as cell adhesion (Page 5, line 192) - but in the methods section they claim they gated for single particles (page 17, line 642), supposedly excluding such events. There is a simple way to resolve this - sort these dual-labeled cells and visualize under the microscope. Finally - why do the authors think that the GFP halves can transfer but not the mature CFP or mCherry?

      The plots in Figure 2C and F are displayed in all-cell mode, not in singlet mode. The percentage of dual-labeled CFP-mCherry cells in singlet mode was 0-0.2%. Thus, most of the signal was from doublets, or cell adhesion. We did not claim that the mature CFP or mCherry cannot be transferred. We suggested that the GFP signal of split-GFP recombination may be a more accurate reflection of cytoplasmic transfer between cells. In contrast, mature CFP or mCherry may simply attach to the cell surface but not enter the other cells.

      c) In the Cas9 experiments - the authors detect an increase in Nluc activity similar in order of magnitude to that of transient transfection with the Cas9 plasmid - suggesting most acceptor cells now express Nluc. However, only 6% of the cells are GFP positive in the split-GFP assay. Can the authors explain why the rate is so low in the split-GFP assay? One possibility (related to item #2 above) is that the split-GFP is transferred by TNTs.

      The Cas9-based Nluc activity assay is more sensitive as it measures an enzyme with a very high turnover number. The split-GFP assay requires a transfer of GFP fragments to produce intact GFP molecules where the signal is not amplified. We think this explains the dramatic increase in signal once Cas9 is transferred. Our cell sorting results suggest that at least 6% of the recipient cells receive transferred protein in the co-cultures. Of course, nothing in either analysis rules out a role for TNTs in this transfer.

      4) The membrane tubules, the membrane fusion and the transfer process are not well characterized:

      a) The suggested tubules are distinct from TNTs by diameter and (I presume, based on the images) that they are still attached to the surface - whereas TNTs are detached. However, how are these structures different from filopodia except that they (rarely) fuse?

      We used TIRF microscopy and found that the thick tubules are not attached to the surface (not shown). Filopodia are much closer in diameter to TNTs (0.1-0.4 micron). The thick tubules we observe are much thicker (2-4 micron in diameter).

      b) Figure 5E shows that the acceptor cell sends out a tubule of its own to meet and fuse. Is this the case in all 8 open-ended tubules that were imaged? Is this structure absent in the closed-ended tubules (e.g. as seen in Figures 6 & 8)?

      Around half of the open-ended tubules appeared to emanate from acceptor cells. The same was seen for closed-ended tubules; for example, in Figure 6E a recipient HEK293T cell projected a short tubule.

      c) The authors suggest a model for transport of the proteins tethered to vesicles (via CD63 tethering). However, the data is incomplete.

      i) They show only a single example of this type of transport, without quantification. How frequent is this event?

      The transport of proteins tethered to vesicles (via CD63 tethering) was found in all 8 open-ended tubules that we detected in this study.

      ii) Furthermore, the labeling does not conclusively show that these are vesicles and not protein aggregates. Labeling of the vesicle - by dye or protein marker will be useful to determine if these are indeed vesicles, and which type.

      In Figure 4B, the moving punctum in a tubular connection appears to contain SBP-Cas9-GFP, Streptavidin-CD63-mCherry, and the cell surface WGA conjugate that may have been internalized into a donor cell endosome, which indicates that the moving punctum is a vesicle. Nonetheless, in general we cannot distinguish the forms of Cas9 that are transferred and become localized to the nucleus of target cells, and we make no claim beyond suggesting the possibility that Cas9 may be transferred as an aggregate.

      iii) The data from Figure 2 suggest (if I understand correctly) transfer of the CD63-tethered half-GFP, further strengthening the idea of vesicular transfer. However, the authors also show efficient transfer of untethered Cas9 protein (Figure 2A and other figures). Does this mean that free protein can diffuse through these tubules? The Cas9 has an NLS so the un-tethered versions should be concentrated in the nucleus of donor cells. How, then, do they transfer? The authors do not provide visual evidence for this and I think it is important they would.

      Based on the results using the Cas9-based luciferase assay (His- or SBP-tagged Cas9) (Figure 2A) and split-GFP assay (free GFP1-10) (Figure 2G), we suggest that free protein could be transferred between cells. Our current imaging approach is not designed to quantify protein diffusion. However, we are able to detect from images that Cas9-GFP does not colocalize exclusively with CD63 or concentrate in the nucleus, but also appears in the cytoplasm. These data indicate that both vesicle association and free diffusion may mediate the transfer through tubules. We thank the referee for emphasizing this issue which we will consider for future work to distinguish the transfer types through tubules.

      iv) In Figures 6 & 8, where transfer is diminished, there are still red granules in acceptor cells (representing CD63-mCherry). Does this mean that vesicles do transfer, just not those with Cas9-GFP? Is this background of the imaging? The latter case would suggest that the red granule moving from donor to acceptor cells in Figure 4 could also be "background". This matter needs to be resolved.

      There are a few red puncta in the acceptor cell in Figure 6B. Since the acceptor cell is close to and overlaps with other donor cells containing CD63-mCherry, the red signal may, as the reviewer suggests, be from donor cells and not the result of transfer through tubular connections. However, in donor-acceptor cultures of HEK293T, where transfer is not observed, little CD63-mCherry signal was seen in acceptor cells (for example, in Figure 6A), even during several hours of observation (Figure 6-figure supplement video). A minor red signal could arise from exosomes secreted by donor cells that are internalized by acceptor cells. Images of single-culture receptor cells were added in Figure 4-figure supplement 1.

      For Figure 8, we used MDA-MB-231 syncytin-2 knock-down cells containing Fluc:Nluc:mCherry as the receptor cell, thus in these experiments the red signal most likely represents mCherry expressed in the acceptor cells.

      In Figure 4, we observed a moving punctum in a tubular connection that contained co-localized green, red, and purple signals, corresponding to SBP-Cas9-GFP, streptavidin-CD63-mCherry, and the WGA conjugate, respectively. The video of punctum transport (Figure 4-figure supplement video) suggests that the red signal is not “background”.

      5) Why do HEK293T not transfer to HEK293T?

      a) A major inexplicable result is that HEK293T express high levels of both Syncytin proteins (Figure 7 - supp figure 1A) yet ectopic expression of mouse Syncytin increases transfer (Figure 7E). Why would that be? In addition, Fig 3A shows high transfer rates to A549 cells - which express the least amount of Syncytin. The authors suggest in the discussion that Syncytin in HEK293T might not be functional without real evidence.

      We cannot yet explain why the basal level of syncytin expressed in HEK293 cells is insufficient to promote open-ended tubular connections between these cells. It could be that the proteins are not well represented in a processed form at the cell surface. Nonetheless, ectopic expression of mouse syncytin-A in HEK293T produced some increased transfer but less than when syncytin-A is ectopically expressed in MDA-MB-231 cells (up to 4-fold vs. 30-fold change of Nluc/Fluc signal) (Figure 7E). Furthermore, we have added new results which show that apparent furin-processed forms of syncytin-A, -1 and -2 can be detected by cell surface biotinylation in transfected MDA-MB-231 cells (Figure 8-figure supplement 1D). All we demonstrate is that syncytin in the acceptor cell is required for fusion and we make no claim that it is the only protein or lipid at the cell surface in the acceptor cell required for fusion. Clearly, more work is essential to establish the complexity of this fusion reaction.

      Syncytin-1 is highly expressed in A549 cells; thus, it is possible that syncytin-1 plays a crucial role in the process in these cells.

      b) In addition - previous publications (e.g. PMID: 35596004; 31735710) show that overexpression of syncytin-1 or -2 in HEK293T cells causes massive cell-cell fusion. The authors do not provide images of the cells, to rule out cell-cell fusion in this particular case.

      Overexpression of syncytin-1 or -2 in cells indeed causes massive cell-cell fusion, while overexpression of syncytin-A induced much less cell fusion than syncytin-1, or -2. We have now added new images shown in Figure 8-figure supplement 1A-C to document these observations. It may be that overexpressed human syncytins are better represented in a furin-processed form in both cell types. In contrast, we did not observe donor-acceptor cell fusion at basal levels of expression of syncytin in HEK293T and MDA-MB-231. For example, the Figure 4-figure supplement video shows that tubular structures were seen to form and break during the course of visualization with a tubule fusion event but no cell fusion to form heterokaryons.

      Reviewer #3 (Public Review):

      In this manuscript, Zhang and Schekman investigated the mechanisms underlying intercellular cargo transfer. It has been proposed that cargo transfer between cells could be mediated by exosomes, tunneling nanotubes or thicker tubules. To determine which process is efficient in delivering cargos, the authors developed two quantitative approaches to study cargo transfer between cells. Their reporter assays showed clearly that the transfer of Cas9/gRNA is mediated by cell-cell contact, but not by exosome internalization and fusion. They showed that actin polymerization is required for the intercellular transfer of Cas9/gRNA, the latter of which is observed in the projected membrane tubule connections. The authors visualized the fine structure of the tubular connections by electron microscopy and observed organelles and vesicles in the open-ended tubular structure. The formation of the open-ended tubule connections depends on a plasma membrane fusion process. Moreover, they found that the endogenous trophoblast fusogens, syncytins, are required for the formation of open-ended tubular connections, and that syncytin depletion significantly reduced cargo Cas9 protein transfer.

      Overall, this is a very nice study providing much clarity on the modes of intercellular cargo transfer. Using two quantitative approaches, the authors demonstrated convincingly that exosomes do not mediate efficient transfer via endocytosis, but that the open-ended membrane tubular connections are required for efficient cargo transfer. Furthermore, the authors pinpointed syncytins as the plasma membrane fusogenic proteins involved in this process. Experiments were well designed and conducted, and the conclusions are mostly supported by the data. My specific comments are as follows.

      1) The authors showed that knocking down actin (which isoform?) in both donor and acceptor cells blocked transfer, and more so in the acceptor cells perhaps due to the greater knockdown efficiency in these cells. However, Arp2/3 complex knockdown in donor cells, but not recipient cells, reduced Cas9 transfer. It would be good to clarify whether the latter result suggests that the recipient cells use other actin nucleators rather than Arp2/3 to promote actin polymerization in the cargo transfer process. Are formins involved in the formation of these tubular connections?

      We thank the reviewer for his/her comments and suggestions. Beta-actin was knocked down in this study. We tried a formin inhibitor, SMIFH2, which resulted in a decrease in Cas9 transfer between cells (Figure 3F).

      2) The authors provided convincing evidence to show that the tubular connections are involved in cargo transfer. Intriguingly, in Figure 4-figure supplement video (upper right), protein transfer appeared to occur along a broad cell-cell contact region instead of a single tubular connection. How often does the former scenario occur? Is it possible that transfer can happen as long as cells are contacting each other and making protrusions that can fuse with the target cell?

      In the Figure 4-figure supplement video (upper right), it may be that several membrane tubes from several different donor cells contact the recipient cell at sites close to one another, resulting in the appearance of a broad cell-cell contact. This was a rare observation. In our quantification, only 8 connections were open-ended in 120 cell-cell contact junctions. Once a connection is open-ended, or plasma membrane fused, cargo transfer is observed.
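
      The count quoted above (8 open-ended connections among 120 cell-cell contact junctions) can be turned into a fraction with a rough uncertainty range. The sketch below is only an illustration: the Wilson score interval and the helper name `wilson_interval` are assumptions introduced here, not an analysis reported in the response.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

# Counts taken from the response: 8 open-ended connections in 120 junctions.
open_ended = 8
contacts = 120

fraction = open_ended / contacts
low, high = wilson_interval(open_ended, contacts)
print(f"open-ended fraction: {fraction:.3f}")
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

      With so few events the interval is wide, which fits the description of open-ended connections as a rare observation.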

      3) The requirement of MFSD2A in both donor (HEK293T) and recipient (MDA-MB-231) cells is consistent with a role for syncytin-1 or 2 in both types of cells. Since HEK293T cells contain both syncytins and MFSD2A but cargo transfer does not occur among these cells, does this suggest that syncytins and/or MFSD2A are only trafficked to the HEK293T cell membrane in the presence of MDA-MB-231 cells?

      A proper answer to this question requires the visualization of syncytins and MFSD2A. The commercial syncytin antibodies were inadequate for immunofluorescence. In advance of the more detailed effort required to tag the genes for endogenous syncytin-1 and -2, we performed live cell imaging and surface biotin labeling of cells transiently transfected to express fluorescently tagged forms of syncytin-1, -2 and -A. We now show that syncytin-A, -1, and -2 partially localize to the plasma membrane or the cell surface of MDA-MB-231 and at points of cell-cell contact. In fact, overexpression of codon-optimized human syncytin-1 and -2 induced dramatic HEK293T cell-cell fusion. However, at basal levels of syncytin expression, HEK293T could not form open-ended tubular connections, which may be because syncytins at basal levels are not well represented in a processed form at the cell surface or because their activity is limited by unknown factors.

      As an independent test of cell surface localization, we used surface biotinylation to show that a fraction of the syncytins can be labeled externally (Figure 8-figure supplement 1D). This fraction shows evidence of proteolytic processing consistent with furin cleavage whereas the overwhelming majority of transfected syncytins detected in a blot of lysates suggests that most remain in the unprocessed precursor form, consistent with the punctate and reticular fluorescence images (Figure 8-figure supplement 1A-C).

      We used IF and GFP-tagged MFSD2A and found this protein partially localized to the plasma membrane of HEK293T cells (Figure 9E, F). Given that the results reveal that cargos can be transferred among MDA-MB-231 cells (Figure 2G), syncytin and its receptor appear to function in transfer among these cells.

    1. I mentioned that I regarded this as a discovery. I would like to amplify that statement. We have known for centuries that catharsis and emotional release were helpful. Many new methods have been and are being developed to bring about release, but the principle is not new. Likewise, we have known since Freud's time that insight, if it is accepted and assimilated by the client, is therapeutic. The principle is not new. Likewise we have realized that revised action patterns, new ways of behaving, may come about as a result of insight. The principle is not new.

      Rogers acknowledges the success of the client-focused therapy approach. In this passage he explains that the approach has developed expectations, a guide so to speak. The therapist has a good idea of, or as he put it can predict, the outcome of the therapy. I think this is important because the therapist can tell whether therapy is indeed working. It serves as a blueprint.

    1. Racism, sexism, ageism, homophobia, some social movements asserted, are distinct forms of oppression with their own dynamics apart from the dynamics of class, even though they may interact with class oppression.

      I like how the author highlights different forms of oppression that people may not think of. We usually think of racism automatically when it comes to oppression, but oppression comes in many forms. Referring back to the beginning of 'Oppression as a structural concept,' I think people would not apply the term oppression to many situations because it is thought that to be oppressed there must be some type of dictator or tyrant in rule, but that is not the case. We tend to forget the meaning of oppression, which according to the Merriam-Webster Dictionary is the "unjust or cruel exercise of authority or power." We see this unjust exercise of authority or power every day in society, when people use their privilege in unjust ways towards the less privileged. One example of this in our society is the murder of George Floyd, in which one person unjustly and inhumanely took advantage of their power and authority over another. This is oppression, and we tend not to think of situations like this one, which happen on a daily basis, as such.

    2. Here she does answer one of my questions. I was wondering how the criteria were going to be applied. According to her, the presence of any of these categories is grounds for labeling the group as oppressed, which makes sense. However, I do think this may be slightly problematic because it almost equalizes oppression between groups. This is not to say that we should be comparing who is more oppressed than whom, BUT I do think there are some groups that need a little more attention than others in certain aspects of life.

    3. Social justice, I shall argue in later chapters, requires not the melting away of differences, but institutions that promote reproduction of and respect for group differences without oppression.

      I agree with this. I don't believe it is possible to eliminate groups, and I don't think it is necessary. We are all unique and our uniqueness helps us identify who we are. We can't strip that away. Only the way we view other people's uniqueness needs to change. Our differences shouldn't be criticized. Some people may not agree with some differences, but that doesn't mean they should disrespect them either. It is unrealistic to think people will accept everything, but as long as differences are respected, then we can have group differences without oppression.

    1. Author Response

      Reviewer #1 (Public Review):

      1) The authors show that there are several classes of Snf1 targets (Fig. 3e), most notably some that are phosphorylated immediately after Snf1 activation by glucose (<5 min) and others that are only phosphorylated after 15 min. In a simple view, all direct Snf1 targets should be phosphorylated immediately after Snf1 activation. Is that the case? What is the overlap between the direct targets found using the OBIKA assay and the slow and fast responding in vivo targets? What about the phosphorylation motif, does it differ between the groups? These points are not discussed in the text except to point out that the direct Snf1 target Msn4 is among the slowly phosphorylated group.

      This is a very good point and we have performed the suggested analysis, which resulted in an interesting finding that we describe now in the text as follows:

      “Notably, of the 145 confirmed target sites, 81 (i.e. 72%) were significantly regulated after both 5 min and 15 min. Of the remaining 64 sites, 32 responded only after 5 min, while the other 32 responded only after 15 min. Some of the former residues are located within Snf1 itself, the β-subunit of the Snf1 complex (i.e. Sip1), the Snf1-targeting kinase Sak1, or Mig1, while some of the latter are located within the known Snf1-interacting proteins such as Gln3, Msn4, and Reg1. These observations indicate that Snf1-dependent phosphorylation initiates, as expected, within the Snf1 complex and then progresses to other effectors. Interestingly, based on the residues that responded exclusively after 5 min, we retrieved a perfect Snf1 consensus motif (i.e. an arginine residue in the -3 position and a leucine residue in the +4 position; Supplementary figure 2A). The one retrieved for the residues that respond exclusively at 15 min, in contrast, significantly deviated from this consensus motif (Supplementary figure 2B). The slight temporal deferral of Snf1 target phosphorylation may therefore perhaps in part be explained by reduced substrate affinity due to consensus motif divergence.”
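
      The consensus described in the quoted passage (arginine at position -3 and leucine at position +4 relative to the phosphorylated residue) can be expressed as a simple positional check. The sketch below is a toy illustration under stated assumptions: the function name `find_motif_sites`, the restriction to serine/threonine, and the example sequence are all hypothetical, and it does not reproduce the motif analysis the authors performed.

```python
def find_motif_sites(sequence, phospho_residues="ST"):
    """Return 1-based positions of S/T residues with R at -3 and L at +4."""
    hits = []
    for i, residue in enumerate(sequence):
        if residue not in phospho_residues:
            continue
        # Skip sites too close to either end to carry both flanking positions.
        if i - 3 < 0 or i + 4 >= len(sequence):
            continue
        if sequence[i - 3] == "R" and sequence[i + 4] == "L":
            hits.append(i + 1)
    return hits

# Hypothetical peptide: the serine at position 6 has R at -3 and L at +4,
# while the serine at position 13 does not match.
toy_sequence = "AARGGSAAALAASGGG"
sites = find_motif_sites(toy_sequence)
print(sites)
```

      A strict match/no-match test like this is cruder than a motif logo; sites that deviate from the consensus, such as those described for the 15 min group, would simply be reported as non-matching.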

      2) The data showing that Snf1-dependent phosphorylation of Pib2 plays a key role in triggering inhibition of TORC1 is convincing but is entirely dependent on a rescue of the TORC1 inhibition defect seen in cells where Snf1 is inhibited. That is, TORC1 is normally inactivated during glucose starvation; this does not occur when Snf1 is inhibited by 2NM-PP1 but does occur when Snf1 is inhibited in a strain carrying a phosphomimetic version of Pib2 (Pib2SESE). This indicates that Pib2 phosphorylation is sufficient to replace Snf1 signaling and inhibit TORC1 during glucose starvation. However, in a simple model, a phosphodead version of Pib2 (SASA) should have the opposite effect. That is, TORC1 should remain active during glucose starvation in the Pib2SASA strain, but that is not the case (Fig. 4g). This point is not discussed in the paper; why do the authors think that TORC1 is inhibited normally in the SASA mutant?

      We fully agree with this statement and have highlighted and discussed this issue now in the last paragraph of the results section (where we think this fits best) as follows:

      “In contrast, the separated and combined expression of Sch9S288A and Pib2S268A,S309A showed, as predicted, no significant effect in the same experiment. Unexpectedly, however, the latter combination did not result in transient reactivation of TORC1, as we observed in glucose-starved, Snf1-compromised cells. This may be explained if TORC1 reactivation relies on specific biophysical properties of the non-phosphorylated serines within Sch9 and Pib2 that may not be mimicked by respective serine-to-alanine substitutions. Alternatively, Snf1 may employ additional parallel mechanisms (perhaps through phosphorylation of Tco89, Kog1, and/or other factors; see above) to prevent TORC1 reactivation even when Pib2 and Sch9 cannot be appropriately phosphorylated. While such models warrant future studies, our current data still suggest that Snf1-mediated phosphorylation of Pib2 and Sch9 may be both additive and together sufficient to appropriately maintain TORC1 inactive in glucose-starved cells.”

      Reviewer #2 (Public Review):

      1) Because PIB2 is a major focus of the manuscript, I was surprised that it was not discussed in the introduction. I think it would be appropriate to discuss prior evidence linking this protein to TORC1.

      We thank the reviewer for this suggestion. Pib2 and its role in TORC1 control is now described in the introduction.

      2) The authors introduce mutations into PIB2 at two sites determined to be phosphorylated by SNF1, at S268 and S309. Somewhat confusing results are obtained, in that the PIB2 null and phosphomimic mutants (S268E and S309E) confer a similar TORC1 phenotype, compared to the S268A S309A mutant. These results require further explanation than simply that "TORC1 inactivation defect in SNF1-compromised cells is due to a defect in PIB2 phosphorylation". This is particularly intriguing given that the opposite results are observed with the SCH9 mutants, where the null and alanine mutants confer a similar phenotype compared to the S to E mutants.

      The finding that both loss of Pib2 and expression of the phosphomimetic allele yield the same phenotype is indeed counterintuitive. Hence, we fully agree with the criticism put forward here. We believe that the underlying reason for our observation is the unique property of Pib2 in having both a C-terminal TORC1-activating domain (CAD) and an N-terminal TORC1-inhibitory domain (NID). We have addressed this point briefly in the discussion ("Our current data favor a model according to which Snf1-mediated phosphorylation of the Kog1-binding domain in Pib2 weakens its affinity to Kog1 and thereby reduces the TORC1-activating influence of Pib2 that is mediated by the C-terminal TORC1-activating (CAD) domain via a mechanism that is still largely elusive"), but now also address this issue in the results section as suggested.

      3) The authors conclude, based on the co-IP data in Figure 4H, that interactions between KOG1 and PIB2 are direct. However, it remains possible that interactions between these proteins are mediated by other components of TORC1 or within cells. This should be addressed.

      Please note that the Kog1-Pib2 interaction has previously been demonstrated by different methods. Accordingly, Pib2 has been shown to interact with Kog1 (or TORC1) not only in co-IP studies in vivo (PMID: 30485160, PMID: 29698392), but also in co-IP studies in vitro (PMID: 29698392, PMID: 28483912, PMID: 34535752). In addition, the Kog1-Pib2 interaction has also been dissected (down to defined domains) by classical two-hybrid analyses (PMID: 28481201). All of these studies are now cited in the introduction where Pib2 is discussed.

      4) The authors demonstrate convincingly that the PIB2 and SCH9 SNF1-specific phospho-site mutants have a detectable effect on TORC1, primarily by examining TORC1-dependent phosphorylation of SCH9. What is unclear is whether phosphorylation at these sites has a significant physiological impact on cells. It appears that the rapamycin hyper-sensitivity displayed in Figure 6E is the only data presented to address this question. It would be appropriate for the authors to comment further on the significance of SNF1-dependent phosphorylation of these two substrates.

      To further address the physiological role of the Snf1-dependent phosphorylation of Sch9 and Pib2 combined, we newly assessed the growth rate of the strain that expresses the Sch9SE and Pib2SESE alleles combined. Accordingly, we found the snf1as pib2SESE sch9SE strain to exhibit a significantly higher doubling time than the snf1as strain on both low-nitrogen-containing media and standard synthetic complete media. This is now included in the text (results section).

      Reviewer #3 (Public Review):

      1) Conceptually, the manuscript shows that Snf1 activity is important for the acute inhibition of TORC1 during glucose starvation. However, this is mainly restricted to 10 and 15 minutes of glucose starvation. After 20 minutes, TORC1 is inhibited by some unknown mechanisms independent of Snf1 (Hughes Hallett et al.). This raises concern regarding the physiological relevance of Snf1-mediated TORC1 inhibition during acute glucose stress. The authors show that this regulation is important for the survival of cells under TORC1 inhibition. How do the authors envision that the acute role of Snf1 has long-term physiological relevance during rapamycin treatment? Providing more support for the physiological relevance of this regulation will make this study of interest to a broad readership.

      Please see our response to point 4 of reviewer #2.

      2) Another major concern of the manuscript is the inconsistencies between the various representative immunoblots and their quantifications. The effect of AMPK activity on TORC1 signaling under glucose starvation seems very subtle. A few specific concerns are mentioned below:

      a) In figure 1A, the increase in TORC1 activity upon inhibition of analogue sensitive Snf1as by 2NM-PP1 is very marginal. Although quantification shows a significant increase, a representative western blot figure should be shown.

      We have replaced the original immunoblots with more representative ones in Figure 1A.

      b) Does deleting Snf1 itself have any effect on TORC1 activity? Lane 4 of figure 1A shows reduced activity compared to lane 1.

      TORC1 activity is generally assessed as the ratio between phosphorylated Sch9 and total Sch9 (see also below under (e)). Accordingly, based on the quantification of 6 blots (we added two more experiments to address this point; Figure 1B), loss of Snf1 has no significant impact on TORC1 activity in exponentially growing cells, as we expected.

      c) To show the effect of Snf1 on the repression of TORC1, the time-course experiments are run on two separate gels in figure 1C. Hence, it is difficult to compare the effect of Snf1 on unscheduled reactivation of TORC1 under glucose starvation.

      Please note that the data of the two blots were cross-normalized to the sample from exponentially growing cells (labeled “Exp”; i.e. the same sample was loaded on the two blots) in order to compare and quantify the effects of Snf1.
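
      The cross-normalization described above, combined with the phosphorylated-to-total Sch9 ratio used as the TORC1 readout, can be sketched as follows. This is a minimal illustration, assuming each lane has been quantified as a pair of band intensities; the function names and all numbers are hypothetical and are not data from the study.

```python
def torc1_activity(phospho, total):
    """TORC1 readout: phosphorylated Sch9 signal divided by total Sch9 signal."""
    if total <= 0:
        raise ValueError("total Sch9 signal must be positive")
    return phospho / total

def normalize_blot(lanes, reference="Exp"):
    """Express each lane's ratio relative to the shared reference lane of the same blot."""
    ratios = {name: torc1_activity(p, t) for name, (p, t) in lanes.items()}
    ref = ratios[reference]
    return {name: ratio / ref for name, ratio in ratios.items()}

# Hypothetical (phospho, total) band intensities in arbitrary units.
# The "Exp" sample is the same lysate loaded on both blots.
blot_a = {"Exp": (800.0, 1000.0), "10 min": (80.0, 1000.0), "15 min": (40.0, 1000.0)}
blot_b = {"Exp": (1200.0, 1500.0), "10 min": (360.0, 1500.0), "15 min": (480.0, 1500.0)}

norm_a = normalize_blot(blot_a)
norm_b = normalize_blot(blot_b)
for lane in blot_a:
    print(lane, round(norm_a[lane], 3), round(norm_b[lane], 3))
```

      Dividing by the shared "Exp" lane removes blot-specific differences in exposure or transfer efficiency, so values from the two gels are on a common scale even though the raw intensities differ.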

      d) In figure 1E, the effect of Reg1 deletion on TORC1 activity seems minor as both phospho- and total levels of Sch9 are reduced.

      As correctly pointed out by this reviewer, we consistently found the total Sch9 levels to be lower in reg1∆ cells when compared to wild-type cells. To assess TORC1 activity, we therefore always determine the ratio between phosphorylated Sch9 and total Sch9, and the respective ratio is significantly different in reg1∆ cells when compared to wild-type cells. We speculate that the reduced Sch9 levels in this mutant are caused by the reduced growth rate (PMID: 22140226) and hence lower protein synthesis rate (to which translation of SCH9 mRNA may be specifically sensitive).

      Since further mechanistic insights are based on these initial findings of figure 1, solidifying these observations is very important.

      3) In figure S1, the analogue sensitive Snf1as shows significant reduction in its activity (reduced S79 phosphorylation of ACC1-GFP). This raises the concern of whether this genetic background is an ideal system to resolve the mechanism of TORC1 suppression.

      The Snf1as allele is indeed hypomorphic, which we acknowledge appropriately in the text. We would like to point out however, that we took great care in each experiment to include the DMSO control that allowed us to unequivocally assign any observed effects to the specific drug-mediated inhibition of Snf1as. Importantly, we think that the hypomorphic nature of the Snf1as allele (which allows normal growth on non-fermentable carbon sources) represents a minor trade-off when compared to the advantages that this allele provides over the use of a snf1∆ strain, which exhibits a fundamentally reprogrammed transcriptome/proteome (PMID: 17981722). Accordingly, this allele allows the assessment of Snf1 inhibition on very short time scales while minimizing confounding large-scale proteome rearrangements that may indirectly affect the studies. Moreover, use of the Snf1as allele also allowed us to compare our results more directly with other phosphoproteome studies that used the same allele (PMID: 25005228, PMID: 28265048). Finally, please also note that our main conclusions (on Snf1-mediated control of TORC1) are corroborated by additional genetic data such as the ones in Figure 1A/E where we use snf1∆ and reg1∆ cells.

      4) In figure 2, during glucose restimulation, there is increased retention of Snf1as-pThr210 in the presence of 2NM-PP1. This suggests that the upstream glucose sensing pathway as well as Snf1 might be more active than in DMSO-treated cells. This also raises concerns regarding the suitability of the genetic background for the study. Can authors comment on why this phosphorylation persists? Does the phosphoproteomic analysis give any hint for this phenotype?

      This is a very good point. In fact, we forgot to mention in the text that the observed effect of the 2NM-PP1 treatment on Snf1-Thr210 phosphorylation has already been studied and mechanistically explained earlier (PMID: 23184934). Accordingly, the entry of the drug into the broader catalytic cleft of the Snf1as mutant causes the catalytic domain to be stabilized in a conformation that prevents dephosphorylation of pThr210 by the dedicated Glc7-Reg1 phosphatase heterodimer. This can be observed whenever we compared 2NM-PP1- and DMSO-treated cells and probed for Snf1-Thr210 phosphorylation. This is, in fact, an independent control for proper 2NM-PP1 functioning. We have now added a sentence (including reference) that pinpoints this issue in the text.

      5) In figure 4H, where authors claim reduced binding of Kog1 to Pib2SESE, levels of Kog1 in input are also reduced. Can authors provide further support using colocalization studies? Also, does Pib2SESE have any defect in forming Kog1 bodies?

      We took great care to load equal amounts of IPed Pib2-myc variants and then normalized the co-IPed Kog1-HA on the IPed Pib2-myc variant levels. The Kog1-HA input levels vary a bit between the 4 experiments, but they are on average not significantly lower in Pib2SESE-myc-expressing cells when compared to WT cells. In addition, in our Co-IP experiments, the beads are saturated with Pib2-myc variants and Kog1-HA levels are generally not limiting. We therefore deem it fair to say that the Pib2SESE has a reduced affinity for Kog1. Based on our experience with other co-localization studies of membrane-bound proteins and protein complexes (e.g. TORC1 versus EGOC), we find it extremely difficult to quantify local interactions by fluorescence microscopy (unless they are close to all or nothing). In this case, where we have a partial defect in the interaction between Kog1 and Pib2SESE, we anticipate that such analyses will not allow us to draw additional conclusions.

      Regarding the issue of Kog1/TORC1-body formation: all of our mutations in PIB2 and SCH9 were introduced (by CRISPR-Cas9) in the genome of our snf1as strain, which was used throughout this study. To analyze Kog1/TORC1-bodies, we have therefore first tried to C-terminally tag KOG1 with GFP in the genome of our strain background (as was done in the original description of Kog1 bodies; PMID: 26439012). However, because all our attempts to create KOG1-GFP in our strain failed, we assumed that this construct may be lethal in our strain background. This is not completely unexpected, as it is known that the Kog1-GFP allele is hypomorphic and temperature sensitive (PMID: 19144819). In an alternative approach, we have therefore set out to study TORC1 body formation in our strains by using a GFP-TOR1 allele that can be integrated into the genome and that expresses functional TORC1 (PMID: 25046117). As we have described earlier, the respective GFP-Tor1 construct localized on vacuolar membranes and on foci that we previously have shown to correspond to signaling endosomes (PMID: 30732525, 30527664). Unexpectedly, however, when we starved the respective cells for glucose, the number of GFP-Tor1 foci increased only marginally (20%) in our strain background over a period of up to 1 hour. Given these various unexpected issues, we prefer not to include any of these preliminary data in the current version of our manuscript, but rather to follow up on these observations in a separate study. We deem this particularly justified as the current literature on TORC1-body and TOROID formation also appears controversial and may need further clarification. For instance, while TORC1-body formation has been suggested to represent a Snf1-dependent process that is dispensable for TORC1 inhibition (PMID: 30485160), TOROID formation has been suggested to represent a Snf1-independent process that is mechanistically linked to TORC1 inhibition (PMID: 28976958).

      6) In figure 5F, where the authors claim the Sch9SE mutant has lower TORC1 activity, the difference is very minor. Furthermore, corresponding lanes also show reduced levels of Snf1as expression. Hence, improved blots are required here. Also, an in vitro kinase assay with full-length Sch9 KD with and without the Ser288 mutation could solidify the observation that phosphorylation of Ser288 indeed affects TORC1-mediated phosphorylation.

      We have replaced the blots in Figure 5F with an alternative set that more clearly highlights the (statistically significant) differences, while also exhibiting more equal Snf1as levels. Regarding the in vitro kinase assays: we have repeatedly tried to perform TORC1 kinase assays on full-length Sch9KD without success. We currently believe that proper TORC1-mediated phosphorylation of Sch9 may have to occur on membranes to which both TORC1 and Sch9 are tethered through phospholipid interactions (PMID: 29237820). We are trying to set up such a system on liposomes, but we assume that this will be a major effort that cannot be resolved in due time.

      7) In figure 6E, the Sch9SE mutant shows no effect in the presence of rapamycin. Thus, in vivo, phosphorylation at Ser288 may not be perturbing the phosphorylation of Sch9 by TORC1.

      When cells are grown on glucose where TORC1 is highly active (as in Fig. 6E or 6A/B in Exp), expression of Sch9SE has no significant effect indeed. However, in glucose-starved cells, where TORC1 activity is low, expression of the Sch9S288E allele clearly and significantly contributes to inhibition of Sch9-Thr737 phosphorylation by TORC1 (Figure 6A/B and Figure 5F/G).

      8) According to the authors' proposed mechanism, TORC1 activity in Pib2SASA or Pib2SASA/Sch9SA backgrounds should be higher during glucose starvation compared to the control strains. However, glucose starvation shows a similar level of reduction in TORC1 activity in these backgrounds. This raises concern regarding the proposed mechanism. The authors mainly base their conclusions on Ser to Glutamate mutants. The authors should be cautious that Ser to Glutamate changes may also affect the protein structure which can confer similar phenotypes. How do the authors justify this discrepancy?

      Please see our response to point 2 of reviewer #1.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors sequence some of the oldest maize macroremains found to date, from lowland Peru. They find evidence that these specimens were already domesticated forms. They also find a lack of introgression from wild maize populations. Finally, they find evidence the Par_N16 sample already carried alleles for lowland adaptation.

      Overall I think this is an interesting topic, the study is well-written and executed for the most part. I have a variety of comments, most important of which revolve around methodological clarity. I will give those comments first.

      1) The authors should say in the Results section how "alleles previously reported to be adaptive to highlands and lowlands, specifically in Mesoamerica or South America" were identified in Takuno et al. 2015. What method was used? I see this partly comes in the Discussion eventually, but it would help to have it in the Results with more detail. The answer to this question would help a skeptical reader decide the appropriateness of the resource, given that many selection scans have been performed on maize genomes, the choice would ideally not be arbitrary.

      This was explained in more detail in the Material and Methods section, to keep the Results and Discussion sections more concise. However, we agree that adding a brief explanation in the Results section would be useful and we have modified the revised version accordingly. Now the relevant part of the section Specific adaptation to lowlands in Mesoamerica and South America reads as follows: “To assess this, we identified in Par_N16 all covered SNPs with alleles previously reported to be adaptive to highlands and lowlands, specifically in Mesoamerica or South America by Takuno and coworkers (Takuno et al., 2015). These authors used genome-wide SNP data from 94 Mesoamerican and South American landraces and identified SNPs with significant FST values to infer which allele was likely adaptive. For example, those SNPs showing significant FST only in Mesoamerica, were characterized as adaptive for lowlands if they were at high frequency in the lowland population and at low frequency in the highland population, and vice versa. The same was applied for South America (Takuno et al., 2015). They identified 668 Mesoamerican and 390 South American previously reported adaptive SNPs, from which 32 and 20 were covered in Par_N16, respectively.”

      2) How were the covered putative adaptive SNPs distributed in the genome? Were any clustered and linked? The random sampled SNPs should be similarly distributed to give an appropriate null.

      The SNPs in Takuno et al. (2015) are in general at a median distance of 353 bp from each other. The 20 adaptive sites covered in Par_N16 for South America (SA) are at a median distance of 8,301,843 bp (approximately 8.3 Mbp), while the 32 for Mesoamerica (MA) are at a median distance of 24,295,968 bp (approximately 24.3 Mbp). Five pairs of SNPs from Mesoamerica lie within 100 bp of each other, but each pair is at a considerable distance (beyond 1 cM) from the other pairs and from other SNPs covered in Par_N16. The same holds for only one SNP pair from South America. Thus, in general, the covered adaptive SNPs are not clustered. For our random samples, the range of genomic distances between SNPs is similar to that of the adaptive SNPs. This shows that our null distributions are adequate for our statistical purposes. The genomic positions of covered adaptive sites in Par_N16 are now included in a new Table in the revised version (Supplementary File 2). We have included these observations in the main text (section Specific adaptation to lowlands in Mesoamerica and South America), as follows: “In general, adaptive SNPs represented in Par_N16 were not clustered. The 20 South American adaptive SNPs are at a median distance of 8,301,843 bp, while the 32 Mesoamerican SNPs are at a median distance of 24,295,968 bp (Supplementary File 2). Five pairs of SNPs from MA lie within 100 bp of each other, but each pair is at a considerable distance (beyond 1 cM) from the other pairs and from other SNPs. The same holds for only one SNP pair from SA. Thus, although at low proportions, the adaptive SNPs in Par_N16 are a bona fide representation of different genomic responses to selection pressures...” and “We analyzed some of these random samples and observed a similar behavior as the adaptive SNPs regarding the range of distances between SNPs (Fig. S18).”
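
      One way such a median distance could be computed is sketched below. The response does not state exactly how the distances were measured, so this sketch assumes gaps between consecutive SNPs on the same chromosome; the function name and the example coordinates are hypothetical.

      ```python
      from collections import defaultdict
      from statistics import median


      def median_neighbor_distance(snps):
          """Median gap (bp) between consecutive SNPs on the same chromosome.

          `snps` is an iterable of (chromosome, position) pairs.
          Assumption: distance means the gap to the next SNP on that chromosome.
          """
          by_chrom = defaultdict(list)
          for chrom, pos in snps:
              by_chrom[chrom].append(pos)

          gaps = []
          for positions in by_chrom.values():
              positions.sort()
              gaps.extend(b - a for a, b in zip(positions, positions[1:]))
          return median(gaps) if gaps else None


      # Hypothetical coordinates, not the SNPs from the study.
      example = [("1", 100), ("1", 150), ("1", 1150), ("2", 500), ("2", 10500)]
      result = median_neighbor_distance(example)
      ```

      Applying the same function to each random SNP sample would give a distribution of median gaps against which the adaptive SNPs can be compared.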

      3) How is genetic similarity calculated? It should be briefly described in the Results.

      This is formally explained in the Material and Methods section, but now we have included a brief description in the Results section (Specific adaptation to lowlands in Mesoamerica and South America) as follows: “The allelic similarity is the average of the frequencies of the Par_N16 alleles in the intersected sites with each test population (see Material and Methods).”
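
      Read literally, the quoted definition amounts to a simple average, sketched here. The data layout (dictionaries keyed by site) and all names are assumptions made for illustration; the authors' actual implementation is described in their Material and Methods and may differ.

      ```python
      def allelic_similarity(sample_alleles, population_freqs):
          """Mean population frequency of the sample's alleles over shared sites.

          sample_alleles:   {site: allele observed in the ancient sample}
          population_freqs: {site: {allele: frequency in the test population}}
          """
          shared = [site for site in sample_alleles if site in population_freqs]
          if not shared:
              return None
          total = sum(
              population_freqs[site].get(sample_alleles[site], 0.0) for site in shared
          )
          return total / len(shared)


      # Hypothetical sites and frequencies, for illustration only.
      sample = {"chr1:100": "A", "chr1:200": "G", "chr2:50": "T"}
      lowland = {
          "chr1:100": {"A": 0.9, "C": 0.1},
          "chr1:200": {"G": 0.5, "A": 0.5},
      }
      similarity = allelic_similarity(sample, lowland)
      ```

      Only the two sites present in both inputs contribute, so the value here is the mean of 0.9 and 0.5.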

      4) It would help for the authors to state why they focus on Par_N16, I did not see this in my reading. Presumably, the analyses done are because of the higher quality data, but it would also help to mention why Par_N16 was sequenced in an additional run.

      Indeed, Par_N16 has an endogenous DNA content of 1.1%, while the other two samples presented a very low DNA content (0.2%). Therefore, we decided to invest more in the best sample, as a cost/benefit decision for additional sequencing. We have included brief explanations of this in the revised text. In the Results section Paleogenomic characterization of ancient maize samples, it reads as follows: “Due to its higher endogenous DNA content (one order of magnitude larger), we further sequenced the Par_N16 library, obtaining 459M additional reads, to generate a total of 851M for this sample (Table 2).” and “To determine if the specific elimination of C to T and G to A modifications could bias the results in favor of maize rather than teosinte alleles, an additional database was generated in which all transitions were eliminated (i.e., only transversions were included) in Par_N16 only, because it was the only sample with enough sequencing data to conduct this experiment.” In the section Tests of gene flow from mexicana, it reads as follows: “Par_N16 was the only sample with enough DNA sequence data to perform this analysis. All the samples showed the same phylogenetic position; therefore, Par_N16 was considered to be representative of ancient Paredones maize.”

      5) In the sections on phylogenetic analysis, introgression, and D statistics, the authors could do a better job specifically indicating how the results support their conclusions.

      Precise indications of how our results support our conclusions are given in the Discussion section. Nevertheless, we added relevant sentences in the specified sections. In the section Relationship between ancient maize, extant landraces, and Balsas teosinte, we added the following: “Thus, based on genome-wide relatedness, Paredones maize clusters with extant domesticated Andean landraces, supporting both a single origin for maize and that these Peruvian samples were already domesticated.” In the section on introgression and D-statistics (Tests of gene flow from mexicana), we improved the last sentence as follows: “These results consistently show the absence of significant gene flow between Par_N16 and mexicana, implying that the lineage that gave rise to Paredones maize left Mesoamerica without relevant introgressions from this teosinte.”
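
      For readers unfamiliar with D-statistics, the standard ABBA-BABA form is D = (nABBA - nBABA) / (nABBA + nBABA) for a tree (((P1, P2), P3), outgroup). The sketch below counts the two site patterns from toy data. It is a generic illustration, not the authors' pipeline: the population assignments, function name, and example sites are assumptions, and real analyses also estimate significance (typically with a block jackknife), which is omitted here.

      ```python
      def d_statistic(sites):
          """Patterson's D from per-site alleles ordered (P1, P2, P3, outgroup).

          Returns None when no ABBA or BABA sites are present.
          """
          abba = baba = 0
          for p1, p2, p3, out in sites:
              if p1 == out and p2 == p3 and p1 != p2:
                  abba += 1  # P2 shares the derived allele with P3
              elif p2 == out and p1 == p3 and p1 != p2:
                  baba += 1  # P1 shares the derived allele with P3
          if abba + baba == 0:
              return None
          return (abba - baba) / (abba + baba)


      # Toy biallelic sites ("A" = outgroup state); not data from the study.
      toy_sites = [
          ("A", "B", "B", "A"),  # ABBA
          ("B", "A", "B", "A"),  # BABA
          ("A", "A", "B", "A"),  # uninformative
          ("A", "B", "B", "A"),  # ABBA
      ]
      d_value = d_statistic(toy_sites)
      ```

      A value of D that is statistically indistinguishable from zero is consistent with no excess allele sharing between P3 and either P1 or P2, which is the sense in which the response reports an absence of significant gene flow.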

      Reviewer #2 (Public Review):

      In this foundational article, the authors conduct an ancient DNA characterization of maize unearthed in archaeological contexts from Paredones and Huaca Prieta in the Chicama river valley of Peru. These maize specimens were recovered by painstakingly controlled excavation. Their context would appear to be beyond reproach though the individual radiocarbon determinations should be subject to further scrutiny.

      1) Radiocarbon determination for at least one of the maize cobs analyzed for aDNA is not a direct date, but dates associated material. The authors should provide a table of the direct dates on the specimens that were analyzed for ancient DNA. They should also specify the type and quantity of material sent and whether the cob, glumes, pith, or husks were submitted for dates. Include δ13C determinations for each cob with laboratory analysis numbers because there is justifiable concern that at least one of these cob dates has a δ13C value suggesting the material dated is not maize. Generally, the δ13C for maize ranges from -14 to -7. One or more of the specimens subjected to ancient DNA analysis in this paper have δ13C values far outside of this confidence interval.

      The indirect radiocarbon date on a maize cob was derived from a single piece of wood charcoal in a hearth directly associated with the analyzed cob, both embedded in a thin intact floor in Unit 20 at the Paredones site. The assay on the charcoal and the floor are in an undisturbed stratigraphic context and are in agreement with assays on other maize and charcoal remains in floors both above and below the hearth. We have included this information in Table 1 in the revised version. The information sought by Reviewer 2 on the studied cobs was published previously in Grobman et al. 2012 and in Dillehay 2017. Since details of the cobs were published, we decided to submit only what we thought were pertinent data for this manuscript.

      As for the δ13C reading of one cob outside of the confidence interval for maize, the dated specimen with this value is a maize husk fragment. Both the macro- and micro-morphology and the ancient DNA analysis of the husk demonstrated it was maize. We do not understand what affected the δ13C value for this specimen. Similarly, three human skeletons from deeper site levels have δ13C values greater than the expected range for human remains.

      2) From the perspective of future scientists being able to repeat the analyses performed here, I would hope that all details of specimen treatment, extraction methods, read length and quality would need to be assiduously described. Routine analytical results should be reported so that comparisons with earlier and future results are facilitated, and not made difficult to decipher or search for.

      The general procedures for accurate ancient DNA extraction were described in Vallebueno-Estrada et al. 2016 and we do not see the need to repeat this information in this article. Specific aspects of sample treatment and DNA extraction of the samples analyzed here are described in the Material and Methods, section on Extraction and sequencing of ancient samples. Results on quality (percentage of endogenous DNA, quality-filtered reads, mapped reads to either repetitive or unique regions, amount of sequence mapped, mapping Phred scores, estimated error rates, percentage of deamination, fragment median lengths, percentage of sites with signatures of molecular damage, number of unique genomic sites covered and their corresponding average sequencing depth) are described in the Results, section Paleogenomic characterization of ancient maize samples. This section also includes the number of SNPs in relation to the reference and the number of intersected SNPs between our samples and the HapMap3 database. In addition, complementary information to this section is included in Tables 2-4 and supplementary Figures S2-S6, as properly referenced in the last mentioned section.

      3) The aDNA analysis may or may not be affected by the anomalous δ13C values, but one would anticipate that standard aDNA extraction and analysis protocols would provide a means by which the preservation of the specimens could be ascertained. For example, deamination and fragmentation rates could be compared, or average read length evaluated against modern-contemporary materials, so that preservation of the Paredones samples relative to that of maize in the CIMMYT germplasm bank and the San Marcos specimens investigated by the same researchers can be evaluated.

      Average read length from contemporary material depends more on the sequencing platform than on sample preservation. For example, Illumina can only read fragments of hundreds of base pairs, while MinION or PacBio can read fragments on the order of kilobases. Also, deamination is not an issue in DNA extracted from modern material (unless bisulfite is used for methylation detection). Comparison with San Marcos samples indicates that Paredones samples are heavily degraded, although this is not a function of time only (humidity, temperature, and pH are among other relevant factors). Therefore, to avoid misleading interpretations, we are not including a comparison with San Marcos samples in the revised version.

      4) The size and shape of the cobs depicted are similar to specimens occurring much later in Mesoamerican assemblages. For example, the approximate rachis diameter of the San Marcos specimens depicted by Valle-Bueno et al. (2016: Fig. 1) averages less than 0.5 cm while the specimens depicted in Valle-Bueno et al. (this manuscript) average 1.0 cm. The former - San Marcos - specimens are dated at 5300-4970 BP cal while the larger - Paredones - specimens date roughly 6777 - 5324 BP cal. The considerable disparity among the smaller more recent specimens compared to the very much larger putatively older specimens suggests the Paredones specimens' radiocarbon determinations are equivocal. The authors point this out but repeatedly state these cobs are the most ancient; a conundrum that should be resolved.

      Radiocarbon determinations in Paredones are not equivocal; on the contrary, they are perfectly in agreement with and supported by the unimpeachable stratigraphy of the site and by more than 150 other radiocarbon and OSL dates from Paredones and nearby excavated contexts. The difference in morphology between the more recent samples from Tehuacan and the more ancient samples from Paredones is exactly the paradox we try to address. Our results indicate that the rapid migration and adaptation of maize to the coast of Peru in comparison with a slower migration and adaptation to Tehuacan lands explains this apparent conundrum. This rapid movement and migration allowed the presence of more “modern” maize in Peru than in Tehuacan on the respective dates. This more rapid maize development also coincides with more rapid and advanced socio-cultural transformations in Peru, including proto-urbanism (i.e., first cities), early religious symbolism, long-distance irrigation canals, and other major innovations that far exceed what was happening in Mesoamerica at the time.

      5) I would suggest the authors consider redating these three specimens and if they do, hope that they will prepare the laboratory personnel with depositional environment information. MacNeish was skeptical about late dates on maize at Tehuacan, at first. Adovasio was initially certain about maize's associated dates from Meadowcroft. One would prefer to be reasonably certain the foundation this article creates is solid; the authors' repeated reference to these cobs as the most ancient in the Americas should be reaffirmed so retraction will not be necessary.

      As discussed in Grobman et al. 2012 and in Dillehay 2017, we do not trust C14 dating of unburned corn remains due to the possible intrusion of fungi in the soft cellular structure of cobs. The chrono-stratigraphically acceptable dates on cobs and other maize remains were taken on burned and hard tissue remains, such as husks. See detailed discussion in Supplementary Materials.

      MacNeish and Adovasio were excavating cave and rock shelter sites, which are known to often have areas of stratigraphically disturbed deposits. Paredones, Huaca Prieta, SR-18 and other Preceramic sites excavated in the study area here contain late to early varieties of maize and radiocarbon assays that are in chrono-stratigraphic agreement. As noted in the main text and in prior publications, these sites are open air localities with clear stratigraphy defined by intact floor and fill sequences, with no tree root, animal burrowing, or other major taphonomic disturbances. There were occasional hearths and pits (i.e., human burials) that intruded into deeper floor-fill sequences but none of the assayed and studied maize samples were derived from these contexts. Once again, we encourage readers to examine the stratigraphy shown in the main text and in Grobman et al. (2012) and Dillehay (2017). Moreover, as noted in the text, there is a growing number of Preceramic sites in South America that date between 6800 and 6000 years ago and later that contain micro-maize remains (see Kistler et al., 2018). Not all of these sites are well-dated and present reliable contexts, but several have good chrono-stratigraphic settings and micro-evidence (e.g., phytoliths, starch grains) indicative of a maize presence at or prior to 6000 years ago.

    1. First, I am a big fan of Chris’ posts. He is our best historian. Second, I did not challenge his ideas but asked for clarification about some terms which I believe are of general interest. Chris is well-positioned to answer my questions. Third, statistical mechanics is more about microscopic systems that do not evolve. As we know, ideas (from concepts to theories) evolve and generally emerge from previous ideas. Emergence is the key concept here. I suggested Phenomics as a potential metaphor because it represents well the emergence of some systems (phenotypes) from pre-existing ones (genotypes).

      reply to u/New-Investigator-623 at https://www.reddit.com/r/antinet/comments/10r6uwp/comment/j6wy4mf/?utm_source=reddit&utm_medium=web2x&context=3

      Ideas, concepts, propositions, et al. in this context are just the nebulous dictionary definitions. Their roots and modern usage have so much baggage now that attempting to separate them into more technical meanings is difficult unless you've got a solid reason to do so. I certainly don't here. If you want to go down some of the rabbit hole on the differences, you might appreciate Winston Perez' work on concept modeling which he outlines with respect to innovation and creativity here: https://www.youtube.com/watch?v=gGQ-dW7yfPc.

      I debated on a more basic framing of chemistry or microbiology versus statistical mechanics or even the closely related statistical thermodynamics, but for the analogy here, I think it works even if it may scare some off as "too hard". With about 20 linear feet of books in my library dedicated to biology, physics, math, engineering with a lot of direct focus on evolutionary theory, complexity theory, and information theory I would suggest that the underlying physics of statistical mechanics and related thermodynamics is precisely what allows the conditions for systems to evolve and emerge, for this is exactly what biological (and other) systems have done. For those intrigued, perhaps Stuart Kauffman's Origins of Order (if you're technically minded) or At Home in the Universe (if you're less technically oriented) are interesting with respect to complexity and emergence. There's also an interesting similar analogy to be made between a zettelkasten system and the systems described in Peter Hoffmann's book Life's Ratchet. I think that if carefully circumscribed, one could define a zettelkasten to be "alive". That's a bigger thesis for another time. I was also trying to stay away from the broad idea of "atomic" and drawing attention to "atomic notes" as a concept. I'm still waiting for some bright physicist to talk about sub-atomic notes and what that might mean... I see where you're going with phenomics, but chemistry and statistical mechanics were already further afield than the intended audience who already have issues with "The Two Cultures". Getting into phenomics was just a bridge too far... not to mention, vastly more difficult to attempt to draw(!!!). 😉 Besides, I didn't want Carol Greider dropping into my DMs asking me why didn't I include telomeres or chancing an uncomfortable LAX-BWI flight and a train/cab ride into Baltimore with Peter Agre who's popped up next to me on more than one occasion.

      Honestly, I was much less satisfied with the nebulousness of "solution of life"... fortunately no one seems to be complaining about that or their inability to grapple with catalysis. 🤷🏼

  3. blogs.baruch.cuny.edu
    1. objective,impersonal,formal,

      This may be surprising to hear from a journalism major, but I don't believe that objectivity truly exists in writing. We can try to remain as impartial as we can, but we are human and prone to error. Bias will slip in. Besides, a work that is truly objective would just be boring! Then there's the question of "seems objective to who?" To journalists like Wesley Lowery, objectivity skews to what a perceived white and male audience will think. When one considers the realm of academia, do these "many professors" think the same? What does that say about academia itself?

    1. Set Clear Classroom Expectations For AI-Generated Writing

      I think this is crucial, along with educating the public, parents, students, and other teachers about what it can and can't do. The more familiar people are with it, the more quickly scaremongering and negative attitudes towards its use might be addressed. Everyone is an expert and has their own views based on what they have read or been told, so we may as well do what we can to promote ethical and sensible/useful examples of its application to support teaching and learning.

    1. Are there symbols for 'supported by' or 'contradicted by' etc. to show not quite formal logical relations in a short hand?

      reply to u/stjeromeslibido at https://www.reddit.com/r/Zettelkasten/comments/10qw4l5/are_there_symbols_for_supported_by_or/

      In addition to the other excellent suggestions, I don't think you'll find anything specific that was used historically for these, but there are certainly lots of old annotation symbols you might be able to co-opt for your personal use.

      Evina Steinova has a great free cheat sheet list of annotation symbols: The Most Common Annotation Symbols in Early Medieval Western Manuscripts (a cheat sheet).

      More of this rabbit hole:

      (Nota bene: most of my brief research here only extends to Western traditions, primarily in Latin and Greek. Obviously other languages and eras will have potential ideas as well.)

      Tironian shorthand may have something you could repurpose as well: https://en.wikipedia.org/wiki/Tironian_notes

      Some may find the auxiliary signs of the Universal Decimal Classification useful for some of these sorts of notations for conjoining ideas.


      Given the past history of these sorts of symbols and their uses, perhaps it might be useful for us all to aggregate a list of common ones we all use as a means of re-standardizing some of them in modern contexts? Which ones does everyone use?

      Here are some I commonly use:

      Often for quotations, citations, and provenance of ideas, I'll use Maria Popova and Tina Roth Eisenberg's Curator's Code:

      • ᔥ for "via" to denote a direct quotation/source— something found elsewhere and written with little or no modification or elaboration (reformulation notes)
      • ↬ for "hat tip" to stand for indirect discovery — something for which you got the idea at a source, but modified or elaborated on significantly (inspiration by a source, but which needn't be cited)

      Occasionally I'll use a few nanoformats, from the microblogging space, particularly

      • L: to indicate location

      For mathematical proofs, in addition to their usual meanings, I'll use two symbols to separate biconditionals (necessary/sufficient conditions)

      • (⇒) as a heading for the "only if" portion of the proof
      • (⇐) for the "if" portion

      Some historians may write 19c to indicate 19th Century, often I'll abbreviate using Roman numerals instead, so "XIX".

      Occasionally, I'll also throw drolleries or other symbols into my margins to indicate idiosyncratic things that may only mean something specifically to me. This follows in the medieval traditions of the ars memoria, some of which are suggested in Cornwell, Hilarie, and James Cornwell. Saints, Signs, and Symbols: The Symbolic Language of Christian Art 3rd Edition. Church Publishing, Inc., 2009. The modern day equivalent of this might be the use of emoji with slang meanings or 1337 (leet) speak.

  4. Jan 2023
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2021-01111

      Corresponding author(s): Esther Stoeckli

      [Please use this template only if the submitted manuscript should be considered by the affiliate journal as a full revision in response to the points raised by the reviewers.

      If you wish to submit a preliminary revision with a revision plan, please use our "Revision Plan" template. It is important to use the appropriate template to clearly inform the editors of your intentions.]

      1. General Statements [optional]

      Dear editors at Review Commons

      Thanks for your patience. We have finally carried out a full revision of our originally submitted manuscript summarizing our findings on the role of Cables1 in axon guidance.

      In our study, we provide in vitro and in vivo evidence for a role of Cables1 as a linker between axon guidance signaling pathways. Commissural axons in the developing spinal cord leave their intermediate target, the floor plate, due to a switch from attraction to repulsion mediated by the specific trafficking of Robo1 receptors to the growth cone surface. The presence of Robo1 on growth cones after contact with the floor plate allows them to respond to Slit, the negative guidance cue associated with the floor plate. After leaving the floor plate on the contralateral side, growth cones respond to a Wnt gradient along the antero-posterior axis. The responsiveness to Wnt of post- but not pre-crossing axons is regulated by the trafficking of Fzd3 receptors to the growth cone membrane of post-crossing axons (Alther et al., 2016), but also by the specific phosphorylation of β-Catenin at tyrosine Y489 by Abl kinase. Cables1 mediates this phosphorylation by transferring Abl kinase from the C-terminus of Robo1 to β-Catenin (this study).

      The revised version of the manuscript contains additional experiments in vitro, in vivo and ex vivo combined with live imaging to further support our conclusion about the role of Cables1 as a linker between Robo/Slit and Wnt signaling.

      It took us longer than expected to carry out these new experiments, as Nikole Zuñiga, the first author of the paper, left the lab after her PhD defense to take up a job in industry. Unfortunately for the study, but fortunately for Giuseppe Vaccaro, he also got a job soon after taking over the project. Therefore, the revision was delayed again. We hope that the additional experiments will solve the issues that were raised by the reviewers. We thank them for their contributions and suggestions.

      Best regards

      Esther Stoeckli

      2. Point-by-point description of the revisions

      This section is mandatory. *Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. *

      Point-by-point response to reviewers’ comments

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): In this work by Zuñiga et al. the authors study the role of the adaptor protein Cables1 on the guidance of post-comissural spinal cord neurons. They hypothesize that commissural axons need Cables1 to leave the floor plate and turn to ascend to the brain. They propose that during this process, Cables1 acts as a linker of two key axon guidance pathways, Slit and Wnt. Cables1 would localize β-catenin phosphorylated at tyrosine 489 to the distal axon and this would be necessary for the correct turning and navigation of post-crossing commissural axons. Although the work may be potentially interesting, there are major issues that authors need to address in order to state their claims:

      -Fig. 2. To visualize the axonal phenotype after downregulation of Cables1 the authors use DiI labelling. This difficults the interpretation of the results as both electroporated and non- electroporated axons are labelled. Since the authors have a Math1::tdTomatoF reporter construct (as in Fig. 3), it would be desirable to use this construct Math1::tdTomatoF in combination with the dsCables1 plasmid to better visualize the phenotype. Alternatively and less preferred, GFP signal should be also shown in Fig.2B experiments.

      We respectfully disagree. Most likely, the reviewer is thinking of a defined nerve with a particular trajectory, where deviations from this pathway or defasciculated growth can easily be visualized after labelling with a fluorescent marker. However, in the spinal cord, the dI1 axons run ventrally more like a ‘curtain’. Therefore, the aberrant behavior of axons is difficult to see. We therefore opted for the alternative suggestion and added the GFP images to visualize clearly that the axons labelled with DiI are from the injected area. We would also like to add that we are extremely careful to inject DiI only into the dI1 population of commissural axons to avoid mixing populations with different trajectories. As the analysis is done by a person blind to the experimental condition, we are convinced that our way of analyzing the phenotype is valid; this approach has been used successfully by many groups for decades. Please also keep in mind that we are always comparing groups of embryos with each other. Furthermore, having axons traced by DiI which were not targeted by dsRNA electroporation would not increase but rather decrease the likelihood of aberrant behavior. Therefore, we are convinced that our method of quantification is valid.

      However, we have added new experiments using live-imaging which also demonstrate that many axons in embryos electroporated with dsCables1 fail to turn properly at the floor-plate exit site (see Movie 2). These experiments provide additional evidence for the validity of our results.

      -Fig. 2B and Supp.Fig.3. Comparable DiI labellings should be shown in the different conditions. The three examples shown in this panel despite different amount of DiI-labeled axons making it difficult to compare them.

      We have exchanged the image of the control-treated embryo in Figure 2 to have more comparable DiI injection sites. However, as we detail in our Material & Method section, the quantification was done in such a way that the number of axons does not matter. We rephrased this paragraph to make this point more clear (lines 630ff). Please also refer to the GFP-expressing control sample shown in Figure 6A.

      We counted a DiI injection site as showing floor-plate stalling when at least 50% of the fibers entering the floor plate failed to reach the exit site. Similarly, ‘No turn’ means that at least 50% of the axons at the exit site failed to turn rostrally. Because these two phenotypes are not independent of each other (100% stalling prevents the analysis of the turning phenotype), we only did a statistical analysis for the DiI injection sites with correctly turning axons. We would also like to point out that we hardly had any injection sites where it was difficult to decide whether the 50% threshold was reached or not.
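      The 50% rule described above can be written down as a minimal sketch. It assumes that per-site axon counts are available, which is not how the scoring was done (the sites were assessed by eye by a person blind to the experimental condition). The function name, its arguments, and the choice to report stalling in preference to turning are illustrative assumptions, not part of the manuscript.

```python
def classify_injection_site(entering, reaching_exit, turning_rostrally, threshold=0.5):
    """Label one DiI injection site using the 50% rule (hypothetical helper).

    entering          -- axons entering the floor plate
    reaching_exit     -- axons that reach the floor-plate exit site
    turning_rostrally -- axons at the exit site that turn rostrally
    """
    if entering <= 0:
        raise ValueError("no axons entered the floor plate")
    if not 0 <= turning_rostrally <= reaching_exit <= entering:
        raise ValueError("counts must satisfy turning <= reaching_exit <= entering")

    # Floor-plate stalling: at least `threshold` of entering fibers fail to reach the exit.
    stalled_fraction = (entering - reaching_exit) / entering
    if stalled_fraction >= threshold:
        # Assumption: turning is not scored when stalling dominates the site.
        return "FP stalling"

    # No turn: at least `threshold` of axons at the exit site fail to turn rostrally.
    no_turn_fraction = (reaching_exit - turning_rostrally) / reaching_exit
    if no_turn_fraction >= threshold:
        return "no turn"
    return "normal"


print(classify_injection_site(20, 8, 8))    # 60% of fibers stall
print(classify_injection_site(20, 18, 6))   # 67% of exiting axons fail to turn
print(classify_injection_site(20, 18, 16))  # neither threshold reached
```

      Treating the labels as mutually exclusive mirrors the point that heavy stalling prevents the turning phenotype from being assessed at that site.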

      -Fig. 2D. An scheme depicting the different phenotypes: "normal", "FP stalling" and "no turn" would help to understand the results. They can use schemes similar to those shown in Fig. 2K Parra et al. 2010.

      As suggested, we have added a scheme outlining the different phenotypes to Figure 2A.

      -Fig. 3A. The open-book drawing is confusing. It seems that they are analyzing open-book preparations in this experiment when this is not the case.

      Now Figure 4: We have changed the schematic explaining our experimental design. We wanted to illustrate that we only took the dorsal-most part of the spinal cord, dissected from open-book preparations of the spinal cord, as explants to avoid the inclusion of other cell types.

      -Fig. 3B. Authors claim that Cables1 is not required in pre-crossing axons as dsCables electroporation does not affect axonal growth of DiI neurons taken at HH22. However, to be sure that Cables1 mRNA levels are downregulated in pre-crossing axons, relative levels of Cables1 mRNA and/or protein should be also determined at HH22 not only at HH25.

      We have clarified the quantification of downregulation efficiency. The qPCR data are taken from HH23, that is one day after electroporation. The Western blot data show differences in protein levels at HH25, that is 2 days after electroporation. In both cases, the downregulation efficiency is about 50%. As we successfully transfected about 50% of the cells in the targeted area (52.5% in n=4 embryos), this means that we removed almost all Cables1 mRNA from the transfected cells. The proportion of transfected cells was determined by counting GFP-positive cells in a single-cell suspension from transfected spinal cords.
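      The reasoning behind this estimate can be made explicit with a short calculation. It is only a sketch under simplifying assumptions that are not stated in the manuscript: all cells in the targeted area express the same amount of Cables1 in controls, untransfected cells are unaffected, and the bulk measurement is proportional to the sum over cells. The function name is hypothetical, and the value of 0.5 for the remaining bulk signal stands in for "about 50%".

```python
def residual_in_transfected(bulk_remaining, transfected_fraction):
    """Estimate the fraction of target left in transfected cells.

    Solves bulk_remaining = (1 - f) * 1 + f * x for x, where f is the
    transfected fraction and x is the residual level per transfected cell.
    """
    if not 0 < transfected_fraction <= 1:
        raise ValueError("transfected_fraction must be in (0, 1]")
    untransfected_signal = 1 - transfected_fraction
    return (bulk_remaining - untransfected_signal) / transfected_fraction


# About 50% of the bulk signal remains; 52.5% of the cells were transfected.
residual = residual_in_transfected(bulk_remaining=0.5, transfected_fraction=0.525)
print(f"{residual:.3f}")  # 0.048, i.e. roughly 95% knockdown per transfected cell
```

      Under these assumptions, a 50% reduction in the bulk signal is consistent with near-complete removal of the target in the transfected half of the population.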

      -Fig. 4. The incapacity of Slit to induce axonal retraction in dsCables1 neurons is used to conclude that Cables1 is required to respond to Slit. However, downregulation of Cables1 by itself is even more effective inhibiting axonal growth than Slit treatment. Upon this strong effect as a background, it is difficult to assay slit response. Authors should point this observation in the manuscript.

      We disagree. There is no significant difference in neurite length between the control neurons in the presence of Slit and the neurons lacking Cables1 (dsCables1), p=0.23, or the neurons lacking Cables1 in the presence of Slit (dsCables1 and Slit), p>0.9999. As seen in the images and also from the measured neurite lengths, axons still show growth and further reduction would have been possible. We would also like to point out that the conclusion from this experiment is that Cables1 is required for the response of axons to either Slit or Wnt.

      To support our claims, we have added another experiment addressing the need for Cables1 for post-crossing axons’ responsiveness to Slit by downregulation of Robo receptors (Figure 10). These experiments confirmed that Slit/Robo signaling is required for the effect of Cables1 on post-crossing axons, in line with our final conclusion that Slit binding to Robo triggers internalization and Cables then transfers Abl from the C-term of Robo to β-Catenin. This results in phosphorylation of β-Catenin at tyrosine489 (β-Catenin pY489) and responsiveness to Wnt5a.

      -Fig. 5B. In this Figure they do not differentiate between FP stalling or no turn phenotypes. A quantification taking into account the different phenotypes as shown in Fig.2D should be included.

      Done, as suggested. This is Figure 6C in the revised manuscript.

      -Fig. 6D,E. As postulated in the manuscript and based on the Rhee, et al. paper, the β-catenin phosphorylation is triggered by Abl quinase upon Slit-Robo signaling. How the authors explain then that isolated cells with axons growing on a plate recapitulate specific distal phosphorilation of β-catenin at Y489 in the absence of Slit signaling? This experiment shows that postcrossing axons contain more phosphorylated β-catenin as an intrinsic characteristic rather than as a consecuence of contact with floor plate signals. Authors should try a similar experiment but exposing the neurons (or explants) to Slit. Also, why β-catenin phosphorylation was not measured at the growth cone?

      In Figure 6D and E (now Figure 7D,E), we compare pre- and post-crossing axons. Post-crossing axons do have ‘a memory’ of their contact with the floor plate, as this contact has changed the localization of Robo receptors to the surface (Philipp et al., 2012; Alther et al., 2016). Floor-plate contact also initiates differences in gene expression (e.g. Hhip expression in a Shh-and Glypican-dependent manner; Wilson and Stoeckli, 2013). The difference in Robo localization has also been described by others (Pignata et al., Cell Rep 29(2019)347).

      In fact, the distal localization of pY-489 β-Catenin is in perfect agreement with our results: The localization of Robo1 on the distal portion of the axon is in line with published data from our own lab but also from the Castellani and the Tessier-Lavigne labs. Our results suggest that Cables is recruited to Abl bound to the C-term of Robo. Cables then transfers Abl to β-Catenin, which is phosphorylated by Abl. Thus pY-489 β-Catenin would be localized predominantly where Robo is localized, i.e. the distal axon. In support of these results, experiments added to the revised version of the manuscript indicate that the response to Slit is required for the increase in β-Catenin pY489 (Figure 10B).

      -Fig. 7. CAG::hrGFP electroporation is not specific for dl1 neurons. This experiment should be performed with Math1::tdTomatoF in order to analyze β-cat pY489 with or without dsCables1 specifically in dl1 neurons. Also, why GFP staining at the growth cones in Fig.7B is not visible in the axon?

      As indicated in our schematic drawing (Figure 7A) we only cultured explants from the dorsal-most part of open-book preparations of spinal cords, making sure that our cultures are not mixtures with more ventral populations of neurons. We opted for CAG::hrGFP because Math1 is a weak promoter and the expression of GFP was very difficult to see after dissociating cells and culturing them in vitro. We used a GFP version that is not farnesylated to avoid interference with axonal staining of pY-489 β-Catenin. Therefore, GFP is not visible in axons with the imaging conditions used.

      -Fig. 8. This experiment does not distinguish whether phosphorylated β-Cat is necessary for the correct navigation of post-crossing commissural axons (as it is claimed in the abstract) or it is also required for midline crossing. As it has been previously shown, correct navigation of post-crossing commisusal axons is a Wnt5 dependent process. As dsCables1 abrogates Wnt5a responsiveness (Fig.4B,C), does the phosphomimetic β-catenin Y489E construc rescue the Wnt5a response in dsCables1 electroporated neurons? Moreover, can the phosphomimetic β-catenin Y489E construc rescue the Slit response in dsCables1 electroporated neurons? Testing these effects on explants as in Fig. 4B,C but including phosphomimetic β-catenin, will help to understand to what extend phosphorylation of β-catenin is important for crossing, turning or both processes.

      Yes, the phosphomimetic Y489E version of β-Catenin reduces the percentage of DiI injection sites with aberrant axonal navigation to control levels (Figure 9 in the revised manuscript). In contrast, a mutant version of β-Catenin that cannot be phosphorylated, β-CateninY489F, cannot rescue the axon guidance phenotype seen in the absence of Cables1.

      -How do the authors envision the mechanism of Cables1/β-catenin mediated crossing and turning? A working model summarizing their hypothesis would help the reader to understand the results.

      **Minor points:** -Homogeneize the term "scale bars" or "bars" in the Figure Legends.

      done

      -Scale bar of insets in Fig.1C is missing.

      The scale bar is now added, we apologize for the mistake.

      -The antisense control for Cables probe should be shown at HH-22/24. Otherwise is not possible to distinguish whether they do not detect signal because is a negative control or because Cables1 is not expressed at HH25.

      We have added the image of an adjacent section hybridized with the sense probe for HH25, in addition to HH22 to clarify that Cables expression is higher during floorplate crossing, exiting and turning rostrally but then levels decrease when post-crossing axons have initiated their growth along the rostro-caudal axis.

      -Figure legend for Fig. 2D is missing

      corrected

      -Fig. 8B right panel is contaminated with growthing axons coming from the below DiI injection. Please replace the picture.

      We have changed the outline of this figure.

      -The quantification of the different phenotypes "FP stalling", "no turn" should be better explained in the Mat and Met section. The sentence " more than 50% of the axons...." is not clear. Was this measured by eye? Otherwise, please indicate the software used to measure.

      Yes, as mentioned above, it was hardly ever a close call. It is very easy for a person blind to the experimental condition to go through the DiI injection sites of an open-book preparation and to assess whether 50% or more of the axons that enter the floorplate reach the exit site, or not. Similarly, it is very easy to do the same for the turning behavior. We have changed the text describing this method of quantification to be more explicit (lines 630ff).

      -Provide the quantification of the WB in Supplementary Fig. 2B normalising to Gapdh.

      Added as Supplementary Figure 2C.

      Reviewer #1 (Significance (Required)): Previous results have demonstrated that Slit-induced modulation of adhesion is mediated by cables that links Robo-bound Abl kinase to N-cadherin-bound betacat (Rhee et al., 2007). Here the authors propose that a similar mechanism is operating in commissural neurons leave the midline after crossing and turn immediately after. The role of Cables in the process has not been previously addressed. Thus, after proper addressing of my main concerns, I consider this paper may advance in our knowlege of how growing axons navigate intermediate targets.

      We appreciate this positive evaluation of our study and hope that the additional experiments and more detailed explanations have helped clarify open questions of the reviewer.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): In this paper entitled "Cables1 links Slit/Robo and Wnt/Frizzled signaling in commissural axon guidance", authors aim to the find the mechanisms the coordinate the floor plate exit and the rostral turning of commissural axons. During development thousands of axons have to navigate long distances to reach their targets and build functional circuits. To facilitate their journey, their paths is divided into small portions by intermediated targets. The most studied intermediated target is the floorplate (FP) at the midline of the ventral spinal cord. Glia cells forming the FP express plethora of guidance cues. Commissural neurons, which have their cells bodies located in the dorsal part of the spinal cord, send their axons towards the FP. These axons are first attracted by the FP which facilitate their entry within the FP. However, they switch this attractive response into a repulsive one in order to exit the FP and turn rostrally to connect their brain targets. In order to ensure that this process will go smoothly, commissural axons have to adapt the composition of their receptors and the signaling pathways to switch from attractiveness to repulsion. So far, many processes have been involved such as the alternative splicing of receptors (Robo3; Chen et al Neuron 2008), protease regulation of receptor expression (Nawabi et al Genes & Dev 2010), trafficking of receptors, or their interaction profiles (Delloye et al Nat Neuro 2015). However, it is still not clear how 2 events (here exit from the FP and rostral turning) are linked. Authors propose an original mechanism that involved the adaptor protein Cables 1. This protein has been shown to link the Robo/Slit1 signaling to Cadherins. Cables regulates the repulsive response to Slit and adhesion by the phosphorylation of b-Catenin by the kinase Abelson (Rhee et al Nat Cell Biol 2007).
The story developed here is very original and interesting: Cables would link the exit of FP (mediated by Robo/Slit signaling) and the rostral turning of the commissural axons (controlled by the Wnt/Fzd pathway. Below I'm proposing some experiments as many questions raised upon reading this beautiful work. The experiments are sound and could be reproducible. The statistic analysis looks fine.

      We thank the reviewer for this positive assessment of our study.

      I would suggest some experiments to strengthen the whole work: •Authors might want to consider to perform some biochemistry experiments to show that Cables is able to interact with Robo1 and Fzd3: are these proteins in the same molecular complex? They could do 2 experiments: one in vitro by transfecting a cell line (such as HEK293 or cos cells) with plasmids coding for Robo1, Cables and Fzd3 or at least Cables and Fzd3 (as for Robo1/Cables they could refers to Rhee et al 2007). Another one would be in vivo: extracting proteins from the pre-crossing stage, the FP and post crossing stage; immunoprecipitation of Cables1 and see whether Robo1 and/or Fzd are pull down with Cables 1.

      We decided not to do these experiments, as we felt that this would go beyond the current study. In fact, for our effects it is not necessary that Cables interacts physically with Robo or Fzd3. The important aspect is that Abl bound to Robo is transferred by Cables to β-Catenin. A direct interaction with Fzd3 is not necessary.

      • From the pictures it seems that most of the axons are stalling in the FP when embryos are electroporated with dsCables1. It would be nice to show more examples of axons that are able to exit the FP but have turning problems. Given the data, as it is presented, it seems that Cables regulates more the FP exit (and therefore, as it was shown in Rhee et al, the responsiveness to Robo/Slit signaling).

      The major phenotype is ‘no turn’. However, as we describe in response to reviewer 1 and in the manuscript, the ‘floorplate stalling’ and the ‘no turn’ phenotypes are not independent of each other. At DiI injection sites, where almost all axons stall in the floorplate, the turning cannot be assessed. Thus, the ‘no turn’ phenotype tends to be underestimated in conditions where floorplate crossing is also affected, as is the case after silencing Cables1.

      In the same line, in Fig 4, Authors need to add a condition using dsCables and ds Fzd in order to see the effect of Cables on axon turning (response to Wnt). As it is this figure supports the role of Cables on FP exit but it's hard to make the link with commissural axon responsiveness to Wnt.

      We believe that experiment 4 clearly demonstrates the absence of the Wnt responsiveness, as axons fail to grow in response to Wnt when they extend from neurons transfected with dsCables1 (Figure 4C). Because dsCables1 alone already abolishes all responsiveness to Wnt, the removal of Fzd at the same time would not change anything.

      • Authors aim to show that Cables is a linker between 2 events: maybe it should be nice to try to disconnect these events. One way would be (if technically possible) to modulated expression of Cables at different stages. What would happen if Cables was down regulated upon FP crossing? Would axons still be able to respond to Wnt? The question I'm wondering about is whether the responsiveness to Slit and Wnt is acquired at the same time or whether axons should become sensitive to Slit and this event will prime them to respond to Slit. In order to address the following experiment could be performed: explants from HH22-HH23 embryos, could be treated with medium containing Slit first and then Wnt or vice et versa and perform some collapse assay.

      Unfortunately, the experiment as proposed by the reviewer is not possible. The axons take on average 5.5 hours to cross the floorplate (entry – exit; Dumoulin et al., 2021). Most importantly, the protein that is already made before axons are at the exit site could not be removed. Therefore, it is not possible to prevent the production of Cables only after axons have crossed the midline. As shown in Figure 1, Cables1 mRNA is present at HH22, that is when axons have reached and are about to enter the floorplate. We also do not believe that the in vitro experiment suggested by the reviewer would work. We would have to wash the cells intensively to remove Slit added to the medium. This would interfere with their potential to grow in response to Wnt immediately after addition. However, we added experiments where we looked at the effect of Wnt after removal of Robo (Figure 10). These experiments demonstrate that responsiveness to Wnt can only be established when axons can respond to Slit, i.e. when Robo is activated.

      • In Fig3 I was wondering whether post crossing axons were growing less because of the change in the regulation of adhesion: Rhee et al shows that Cables is able to modulate adhesion through N-cadherin. It would be interesting to perform immunostaining on these explant cultures to assess any change in adhesion molecules.

      We have not found any changes in the expression levels of Contactin-2 (Axonin-1), NrCAM, or most importantly β1-Integrin, as our cultures grow on laminin.

      • It is not clear whether Robo1 and/or Fzd induces the phosphorylation of b-catenin: is the Robo1/Slit binding induce the phosphorylation of b-cat and this event will prime the axons to respond to Wnt/Fzd? Or Wnt/Fzd is also able to control b-cat phosphorylation?

      We have added an experiment where we remove Robo1 from commissural neurons and compare pY489 β-Catenin levels (Figure 10). Furthermore, we demonstrate that in the absence of Robo1, Wnt has no stimulatory effect on axons (Figure 10C,D). These experiments support our conclusion that Cables1 transfers Abl kinase from the C-terminal part of Robo to β-Catenin, which gets phosphorylated and thus is ready to act in the Wnt signaling pathway.

      • The staining with the antibody needs to be detailed: as it is reported this antibody recognizes "a domain of Cables1 that is 90% identical to the corresponding region of Cables2": it seems that the Cables protein enrichment in the floor plate (around the central canal) is Cables 2 as its mRNA expression matches this profile of expression. The one expressed in the crossing axons might be Cables 1: one way to verify this, is to perform the staining on sections from embryos electroporated with dsCables 1. This is a very important control of the antibody to reinforce this point of the paper.

      We believe that the staining of the cells around the central canal could be due to endfeet of precursors spanning the neural tube from the apical to the basal side. All cells seem to express some Cables1 (Figure 1B,C). As we did not find any effect of Cables2 on commissural axon navigation and we do not use antibodies to functionally interfere with Cables1 function, we did not do this experiment, as the antibody is not able to distinguish the two proteins. Most likely, there is little, if any, Cables2 expressed in the spinal cord during this time window. We still did some functional analyses but found no effect on axon guidance (Supplementary Figure 3).

      • In Figures 3-4: why not performing some co culture of spinal cord explants with COS or HEK 293 cells expressing Slit1 or Wnt? This experiment will provide a clear-cut response to see the role of Cables in axon guidance. As there it is, Fig3 shows a role of Cables in axon growth but not guidance.

      We respectfully disagree that in vitro experiments would help to show guidance versus growth. Guidance can only be shown in vivo. This is what we do. Our in vitro results are only included to address specific responsiveness of axons or expression changes in total β-Catenin or pY489 β-Catenin. But all our conclusions about the role of Cables in axon guidance are demonstrated in vivo. Experiments using co-cultures of axons with COS or HEK cells would be impossible to control for timing and amount of Slit or Wnt release.

      • In Figure 6: my understanding of axon guidance is that every guidance decision happens at the level of the growth cone. However, it seems that in post crossing stage, there is a strong decrease of b-cat and phosphor- b cat within the growth cone compared to the precrossing stage. If beta cat is the effector of Cables to link Robo/Slit and Wnt/Fzd signaling I would expect it to be localized at the growth cone. I think authors should discuss this point. Regarding the normalization, it would be better to counterstaing the neurons with actin and use the measure of its fluorescence to normalize phopho-beta cat.

      There must be a misunderstanding. We do not demonstrate or claim that there is a decrease in β-Catenin or pY489 β-Catenin between pre- and post-crossing axons. We only demonstrate that the distribution of pY489 β-Catenin is clustered in distal post- but not pre-crossing axons. This change in localization of pY489 β-Catenin supports our model that Cables1 transfers Abl kinase to β-Catenin, which is then phosphorylated and prepared for signaling in the Wnt pathway. And, as demonstrated, pY489 β-Catenin and β-Catenin are in the growth cone. However, for quantification we concentrated on the axon, as the difference in growth cone morphology would have complicated the quantification.

      **Minor points:** •In figure 2: it seems that there are few axons labelled with DiI in the dsCables1 condition (Fig2B): it would be the choice of the picture or maybe the downregulation of Cables 1 interfere with the survival of dl1 neurons (even though in supp 1C it is shown that most of the populations are still there with no difference with the control side) or maybe some axons are delayed to reach to FP on time: the picture is focused on the FP: are there any axons still growing in the side of the open book preparation? Again, the picture that could be misleading.

      We have exchanged the images for alternatives with a better matched number of DiI-labelled axons. There is indeed no evidence for cell death, as axons are still there at normal numbers when we analyze open-book preparations a day later than usual. The difference in the number of axons labelled by DiI is only due to the variability in the amount of DiI injected per injection site.

      • In Fig1 legends, maybe Authors wanted to write "At HH18 dl1 commissural neurons start to extend their axons in the ventral spinal cord"?

      No, what we mean is, as shown in Figure 1A, that axons emerge from the cell body at this time. They reach the ventral spinal cord by HH21 and the floor plate by HH22.

      • I would also remove the yellow shadow on the Fig1A: it could be misleading as at first glance the reader might wonder whether there are 2 populations of dl1 neurons.

      We have done as suggested to make the image clearer.

      Reviewer #2 (Significance (Required)): It is still not clear how axons cross the midline. So far, many processes have been involved such as the alternative splicing of receptors (Robo3; Chen et al Neuron 2008), protease regulation of receptor expression (Nawabi et al Genes & Dev 2010), trafficking of receptors, or their interaction profiles (Delloye et al Nat Neuro 2015). However, it is still not clear how 2 events (here exit from the FP and rostral turning) are linked. Authors propose an original mechanism that involved the adaptor protein Cables 1. This protein has been shown to link the Robo/Slit1 signaling to Cadherins. Cables regulates the repulsive response to Slit and adhesion by the phosphorylation of b-Catenin by the kinase Abelson (Rhee et al Nat Cell Biol 2007). The audience that will be interested in this work is the neurodevelopment filed, axon regeneration field and overall people interested in neuronal circuit formation and function. My field of expertise is molecular and cellular neuroscience applied to axon guidance (crossing the FP) in mice models, axon regeneration and circuit formation.

      We are happy to learn about the positive assessment of our work by a specialist.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): In their manuscript, Zuniga and Stoeckli characterize the role of Cables in commissural axon guidance in the developing chick spinal cord. Based on a combination of in vitro outgrowth assays and in vivo dye-tracing experiments, the authors propose that Cables participates in both normal repulsive responses to Slit and attractive responses to Wnt5. Using combinations of low-dose knock down of cables/robo and/or B-catenin, the authors suggest an in vivo link between these pathways. Using IF with phospho-specific antibodies to B-catenin, the authors suggest that there is elevated P-Bcatenin in the post-crossing segments of distal axons. While potentially interesting, the present study falls short of adequately supporting the major claims. In addition, there are several instances where experiments lack appropriate controls.

      **Specific Comments** The conclusions reached by the authors are over-stated given the experiments performed. For example, the authors describe 'silencing' cables throughout the paper; however, the knock down that they achieve is approximately 50%. Indeed, it is quite surprising that such strong effects on growth/guidance can be achieved with a two-fold depletion of the gene product. Nevertheless, the rescue experiments provide nice evidence that dsRNA for Cables is causing a phenotype. This partial knockdown precludes strong conclusions, like for Figure 3, where they state that 'Cables is not required for pre-crossing.' The language needs to be tempered.

      We rephrased the paragraph where we describe the effect of Cables 1 and the efficiency of downregulation to stress that the parameters that we use for electroporation result in around 50% of the cells being successfully transfected (lines 154 – 162, and legend of Supplementary Figure 2). Therefore, finding mRNA levels and protein levels reduced to about half indicates that our method is extremely efficient and removes the targeted mRNA and the protein almost completely. We need to point out here that we always analyze the temporal expression pattern in order to electroporate embryos before the protein of interest has accumulated, as in ovo RNAi obviously does not remove protein but only prevents translation and therefore the synthesis of new protein. As proteins can be extremely stable compared to the timeline of embryonic development, we inject and electroporate dsRNA before we find expression of mRNA.

      Figure 4: the authors use bath application of Slit and Wnt to test effects of cables on Slit and Wnt responses. The observed effect sizes are very small and a single assay of this type does not allow such strong conclusions like 'loss of Cables prevents responsiveness.' Again, it is difficult to imagine that 50% reduction would completely prevent responses, raising questions about the suitability of this assay for measuring responsiveness- perhaps growth cone collapse would give more convincing results.

      As mentioned above, we are almost completely eliminating the targeted protein in the transfected neurons. For the explants, we only looked at the neurons expressing td-Tomato driven by the Math1 promoter. Thus, these neurons were transfected. Obviously, we cannot be sure that 100% of our cells took up the plasmid and the dsRNA, but the chances are very high that this is the case based on the ratio between plasmid and dsRNA.

      Figure 5: The authors should more clearly document the effects they are seeing in these manipulations. As written, all we know is that there are 'significant effects on axon guidance.' What are these effects? Do they see the predicted differences between robo/cables and Bcatenin/cables phenotypes? e.g re-crossing defects in the case of robo and anterior turning defects in the case of B-catenin?

      We have added the analysis of the detailed axon guidance problems seen in the absence of Robo1, Cables1, βCatenin, or combinations, now Figure 6C. Indeed, we find that the phenotype ‘no turn’ is more prevalent in the condition with loss of both Cables and βCatenin. However, as mentioned above in response to a question raised by Reviewer 2, the two phenotypes are not independent of each other. Stalling in the floor plate of the majority of axons prevents the analysis of the turning phenotype. That is why we only use the ‘normal’ DiI injection sites for the statistical analysis.

      Also related to Figure 5: The authors do not validate the dsRNA knockdown of either Robo or B catenin. It is unclear what the interpretation or expectation of the triple knock down condition is.

      We have used the same ESTs to produce dsRNA derived from Robo and βCatenin in our previous publications (Alther et al., Development 143(2016)994; Avilés and Stoeckli Dev Neurobiol 76(2016)190). Therefore, we only repeated the functional experiments to verify reproducibility of the effect but we did not quantify the efficiency of downregulation in detail again.

      Figure 6: For this reviewer, the images showing enhanced P-Catenin in post-crossing distal axons are not convincing. The differences are not obvious by eye and the quantification suggests an ~30% increase. In contrast, a nearly 4-fold increase is reported in Figure 7 for this same measurement. This raises concerns about the reproducibility of this 'phenotype.'

      Staining intensities are subject to batch-to-batch variability. Therefore, the experiments shown in Figure 7 (Figure 6 in the original manuscript) cannot be directly compared to the levels in Figure 8 (previously Figure 7). However, within the experiments, we carefully normalized data. We do not make any claims about absolute staining intensities.

      Also related to Figure 6: No validation of antibody specificity is provided or described.

      Again, please keep in mind that we do not make any claims about absolute values. All our results are based on stainings with the same antibody and comparison between different areas of the same axons. Therefore, the specificity of the antibody is important but not a fundamental aspect of our results.

      Figure 8: As for figure 5, phenotypic documentation is incomplete. In addition, no controls are shown to assure that the different mutant forms of B-catenin are comparably expressed, nor is there an unmutated wild-type control. The authors state that expression of these constructs alone has no effect on normal guidance; however, the supplemental data 6B would seem to indicate that both forms lead to increases in abnormal phenotypes.

      There is an increase in the number of injection sites with aberrant axon guidance; however, this increase was not significant. We cannot exclude the possibility that premature expression, or overexpression, of βCatenin pY489E or βCatenin pY489F does interfere with the endogenous βCatenin pY489. We still decided to keep these experiments in the revised version of the manuscript as they support our conclusion that Cables1 is required for axonal responsiveness to Slit and Wnts, and that this effect is mediated by phosphorylation of βCatenin at Y489. We are aware that this experiment in isolation is not sufficient.

      Reviewer #3 (Significance (Required)): The work builds on in vitro observations from Rhee, 2007 about links between Robo signaling and Cables function. If adequately demonstrated, integration and coordination of Robo and Wnt axon guidance pathways is quite significant.

      We thank the reviewer for this positive assessment.

    1. When we talk here about the poverty level, then, keep in mind that we are talking only about official poverty and that there are many families and individuals living in near poverty who have trouble meeting their basic needs, especially when they face unusually high medical expenses, motor vehicle expenses, or the like.

      This is very eye-opening to think about. Although one may not "officially" be living in poverty, meeting basic needs can still be difficult. Even for those whose basic needs are met, I often wonder when someone in this category was last able to buy an extra or "luxury" for themselves, such as a dessert or a new piece of clothing. Just because someone is not living in poverty does not mean they are not struggling. Even if all basic needs are met, it can be discouraging when you can never splurge on something special for yourself or your family.

    1. And yet researchers have found significant positive results when black and Hispanic students have teachers who match their race or ethnicity: better attendance, fewer suspensions, more positive attitudes, and higher test scores, graduation rates and college attendance. Teachers of color also have higher expectations for students of color, which may fuel the other gains.

      Coming from a white student from a predominantly white school, I had never even considered the relatability aspect of teaching concerning race. I think it is encouraging to know that while ethnic students may feel underrepresented in some schools across America, we as future educators and principals can seek out ethnic teachers to support and relate to students of their own race, as well as try to include students that are different from us when we can.

    1. Author Response

      Reviewer #1 (Public Review):

      Animal colour evolution is hard to study because colour variation is extremely complex. Colours can vary from dark to light, in their level of saturation, in their hue, and on top of that different parts of the body can have different colours as well, as can males and females. The consequence of this is that the colour phenotype of a species is highly dimensional, making statistical analyses challenging.

      Herein the authors explore how colour complexity and island versus mainland dwelling affect the rates of colour evolution in a colourful clade of birds: the kingfishers. Island-dwelling has been shown before to lead to less complex colour patterns and darker coloration in birds across the world, and the authors hypothesise that lower plumage complexity should lead to lower evolutionary rates. In this paper, the authors explore a variety of different and novel statistical approaches in detail to establish the mechanism behind these associations.

      There are three main findings: (1) rates of colour evolution are higher for species that have more complex colour phenotypes (e.g. multiple different colour patches), (2) rates of colour evolution are higher on island kingfishers, but (3) this is not because island kingfishers have a higher level of plumage complexity than their mainland counterparts.

      I think that the application of these multivariate methods to the study of colour evolution and the results could pave the way for new studies on colour evolution.

      We appreciate this positive comment about our manuscript.

      I do, however, have a set of suggestions that should hopefully improve the robustness of results and clarity of the paper as detailed below:

      1) The two main hypotheses tested linking plumage complexity and island-dwelling to rates of colour evolution seem rather disjointed in the introduction. This section should integrate these two aspects better justifying why you are testing them in the same paper. In my opinion, the main topic of the paper is colour evolution, not island-mainland comparisons. I would suggest starting with colours and the challenges associated with the study of colour evolution and then introducing other relevant aspects.

      We implemented this suggestion by reorganizing the introduction to introduce color and the challenges of studying it (para 1), then we discuss plumage complexity (para 2). We follow this with a paragraph about the importance of islands in testing evolutionary hypotheses (para 3), and move on to kingfishers as a model system (para 4) and our hypothesis/predictions (para 5).

      2) Title: the title refers to both complex plumage and island-dwelling, but the potential effects of complexity should apply regardless of being an island or mainland-dwelling species, am I right? Consider dropping the reference to islands in the title.

      We removed “island” from the title.

      3) The results encompass a large variety of statistical results some closely related to the main hypothesis (eg island/mainland differences) tested and others that seem more tangential (differences between body parts, sexes). Moreover, quite a few different approaches are used. I think that it would be good to be a bit more selective and concentrate the paper on the main hypotheses, in particular, because many results are not mentioned or discussed again outside the Results section.

      We removed analyses that we felt were distracting from our main point (e.g., MCMCglmm) and streamlined our approach to use PGLS methods for both rates (phylolm) and multivariate color patterns (d-PGLS). The relevance of sex differences in coloration is also made more clear, as we added details about how we tested for a relationship between male and female coloration and that we use this strong correlation as a justification for averaging color by species (e.g., see lines 369-375).

      4) Related to the previous section, the variety of analytical approaches used is a bit bewildering and for the reader, it is unclear why different options were used in different sections. Again, streamlining would be highly desirable, and given the novel nature of the analytical approach (as far as I know, many analytical approaches are applied for the first time to study colour evolution) it would be good to properly explain them to the reader, highlighting their strengths and weaknesses.

      We appreciate the suggestion and have now included a workflow diagram, as suggested (see Figure 1). We further added considerable detail to the Methods (old length = 502 words, new length = 1355 words) and mention caveats of the approaches we have taken (e.g., line 308: “We used photosensitivity data for the blue tit (Hart et al., 2000) due to the limited availability of sensitivity data for other avian species”).

      5) The Results section contains quite a bit of discussion (and methods) despite there being a separate Discussion section. I suggest either separating them better or joining them completely.

      We appreciate this. We were following other eLife articles that include more discussion within the Results, therefore we would prefer to leave these aspects in place. However, we did move a considerable amount of information from the Results section to the Methods section. In addition, we also reorganized the Results to better match the logical flow of the Introduction. The end result, we hope, is a Results section that is considerably more streamlined.

      6) The main analyses of colour evolutionary rates only include chromatic aspects of colour variation. Why was achromatic variation (i.e. light to dark variation) not included in the analyses? I think that such variation is an important part of the perceived colour (e.g. depending on their lightness the same spectral shape could be perceived as yellow or green, black or grey or white). I realize that this omission is not uncommon and I have done so myself in the past, but I think that in this case, it is highly relevant to include it in the analyses (also because previous work suggests that island birds are darker than their mainland counterparts). This should be possible, as achromatic variation may be estimated using double cone quantum catches (Siddiqi et al., 2004) and the appropriate noise-to-signal ratios (Olsson et al., 2018). Adding one extra dimension per plumage patch should not pose substantial computational difficulties, I think.

      We incorporated this suggestion and we have now fully integrated achromatic color variation into all of our analyses. These new analyses let us compare results to previous work showing that island birds are darker than mainland counterparts. We further discuss the caveats of chromatic and achromatic channels (e.g., lines 313-317: “Although it is possible, in theory, to combine chromatic and achromatic channels of color variation in a single analysis (Pike, 2012), we opted to analyze them separately, as these different channels are likely under different selection pressures (Osorio and Vorobyev, 2005).”).

      7) The methods need to be much better explained. Currently, some methods are explained in the main text and some in the methods section. All methods should be explained in detail in the methods section and I suggest that it would be better to use a more traditional manuscript structure with Methods before Results (IMRaD), to avoid repetition (provided this is allowed by the journal). Whenever relevant the authors need to explain the choice of alternative approaches. Many functions used have different arguments that affect the outcome of the analyses, these need to be properly explained and justified. In general, most readers will not check the R script, and the methods should be understandable to readers that are not familiar with R. This is particularly important because I think that the methodological approach used will be one of the main attractions of the manuscript, and other researchers should be able to implement it on their own data with ease. Judging from the R script, there are quite a few analyses that were not reported in the manuscript (e.g. multivariate evolutionary rates being higher in forest species). This should be fixed/clarified.

      We clarified several methodological details in the manuscript (e.g., added package versions throughout, mention the permutation option used for compare.evol.rates, cited RPANDA) and modified the Methods section considerably to make logical connections among the sections. We also checked and cleaned up the R markdown file to ensure the analyses were in sync with the manuscript analyses.

      Reviewer #2 (Public Review):

      In "Complex plumages spur rapid color diversification in island kingfishers (Aves: Alcedinidae)", Eliason et al. link intraspecific plumage complexity with interspecific rates of plumage evolution. They demonstrate a correlation here and link this with the distinction between island and mainland taxa to create a compelling manuscript of general interest on drivers of phenotypic divergence and convergence in different settings.

      This will be a fantastic contribution to the literature on the evolution of plumage color and pattern and to our understanding of phenotypic divergence between mainland and island taxa. A few key revisions can help it get there. This paper needs to get, fairly quickly, up to a point where the difference between plumage complexity and color divergence is defined carefully. That should include hammering home that one is an intraspecific measure, while one is an interspecific measure. It took me three reads of the paper to be able to say this with confidence. Leading with that point will greatly improve the paper; if that point gets forgotten, then the premise of the paper feels very circular.

      We hope our considerable modifications throughout–including explicitly mentioning that complexity is an intraspecific measure whereas rates are interspecific (e.g., see lines 65, 140, 170, 667)–have made the premise of the paper more clear. We also added a new workflow figure (Figure 1) that includes example species pairs showing cases in which intraspecific plumage complexity and interspecific color divergence could show a negative relationship, rather than a positive one as we predict in the manuscript. We discuss this detail in lines 159-161 (“However, this is not necessarily the case, as there are examples within kingfishers that show simple plumages yet high color divergence, as well as complex plumages with little evolutionary divergence (Figure 1B).”).

      Also importantly, somewhere early on a hypothesized causal pathway by which insularity, plumage complexity, and color divergence interact needs to be laid out. The analyses that currently follow are good ones, and not wrong, but it's challenging to assess whether they are the right ones to run because I'm not following the authors' reasoning very well here. I think it's possible a more holistic analysis could be done here, but I'll refrain from any such suggestions until I better get what the authors are trying to link.

      We overhauled the Introduction. This included adding lines that connect the ideas of complexity and insularity (lines 65-58: “intraspecific plumage complexity (i.e., the degree of variably colored patches across a bird's body) could be a key innovation that drives rates of color evolution in birds and should be considered alongside ecological and geographic hypotheses.”) and insularity and color divergence (lines 69-85). We also rethought the analyses and now include PGLS analyses using tip-based rates that allow us to account for both insularity and complexity in the same analysis.

      We also need something near the top that tells us a bit more about the biogeography of kingfishers. Are kingfisher species always allopatric? I know the answer is no, but not all readers will. What I know less well though is whether your insular species are usually allopatric. I suspect the answer is yes, but I don't actually know.

      Great point. We have added details to the manuscript to clarify this (e.g., line 214: “The number of sympatric lineages ranged from 1–9 on islands, and 6–38 for mainland taxa.”).

      In short, how do the authors think allopatry/sympatry/opportunity for competition link to mainland vs. island link to plumage complexity? And rates of color evolution? Make this clear upfront.

      We believe our revised introduction makes these connections much clearer.

    1. Prosecuting the officer who shot Michael Brown, or investigating and integrating Ferguson’s police department,

      It is much easier for people to take on a single entity or individual, and to feel like they're achieving something, than to take on something systemic. It's important, but too often people miss the bigger picture. In a uhl class I took over the summer a couple of years ago, we discussed how it's often very difficult for people to really conceptualize what a large amount of data means, and to apply that to how it affects individuals. The people being affected become nothing more than numbers. It is therefore easier to focus on Michael Brown or a single police department than it is to reckon with the prospect of millions upon millions of people facing discriminatory practices and policies. An individual like Michael Brown of course deserves attention and justice, but too often we let the isolated incident, or the fact that not as many similar incidents go viral, get in the way of seeing a wider pattern. I think it also has something to do with guilt. Tackling systems that harm people of color to an extent also means admitting that you yourself may be benefiting from those systems or admitting that you have privilege, which is not always an easy thing for everyone to do.

    1. The digital, like any tool, institution, or system across society, from law and medicine to the academy, will be radical or transformative only to the extent that

      This is a crucial point and I think it is something that anybody studying any type of history should be very aware of. As valuable as the data that we find may be as far as developing a factual and statistical basis, it is imperative that we manage to distinguish our analysis of said data by taking into consideration the way others might have viewed it at the time it was gathered.

  5. evergreen0-my.sharepoint.com
    1. So, why should we ask students to read in digital environments given all that we know about how they feel about reading books? If student preferences and dispositions are clear, why should we try to advance practices that may run contrary to those preferences? The conclusion to Baron et al.’s (2017) intensive study may be a good starting point to this conversation, as they suggest that “we should devote serious research attention to the question of what kinds of texts or subject matters make most educational sense in which formats. As in so much of education, one size likely doesn’t fit all” (p. 603)

      I think this is a key idea: we need to look not for a one-size-fits-all solution to our influx of information, but instead should make it easily adaptable to anyone looking to learn. Some texts make sense to hold in your hands, others don't. But I think we should digitize everything for record and accessibility, as well as for comparison of how it is written and how it ages.

    1. A little child of six years was extremely [26] sick in the Mission of saint Michel. His mother was unable to contain her tears, seeing the excess of his pain, and the approach of death to this her only son. " My mother," said to her this child, " why do you weep? Your tears will not give me back my health; but rather let us pray to God together, so that I may [page 111] be very happy in Heaven." After some prayers, his mother said to him, " My son, I must carry thee to Sainte Marie, so that the French may restore thee thy health." " Alas! my mother," said to her this little innocent, " I have a fire burning in my head; could they indeed quench it? I no longer think of life,—have no desire of it for me; but I will warn you of my death, and, when it is near, I will pray you to carry me to Sainte Marie, for I wish to die there, and to be buried there with the excellent Christians." In fact, some days later, this child warned his mother that his death was near, and that it was time to carry him to us. It is the custom in these countries, when any one is near death, to make a solemn feast to which are invited all the friends and the most considerable persons,—about a hundred. The mother would not [27] fail in this obligation,—desiring also to apprise all the people of the sentiments which her son had toward the Faith. This child, having seen the preparations for the feast, said to her: " What! my mother, would you have me sin so nigh to my death? I renounce all these superstitions of the country; I wish to die a good Christian." This child believed that that custom was among the number of those forbidden; and although his mother, an excellent Christian, assured him that there was no evil in that, he would never believe her, and could not resolve to comply with her wish, until the Father who has charge of that Mission had assured him that in that feast there was no sin.
This little Angel was brought to us; and he died in our arms, praying even till death, and telling us that he was going straight to Heaven, and that he would pray to God for us; and he even asked his mother [page 113] for which of his relatives she wished him to pray chiefly, when he should be near God,—saying that no doubt he would be heard. He has been; for, shortly after his death, an uncle of his, one of those most rebellious against the Faith in these countries, and an aunt of his, asked us for instruction, and have become Christians. [28] A little girl of five years, at the Mission of saint Ignace, of Infidel parents, came every day to prayers, morning and evening. She had so constantly adhered to this duty, even against the wishes and the prohibitions of her parents, that we could not refuse her Holy Baptism,—seeing that the spirit of the Faith was abundantly compensating in her for the years that she might lack in order freely to dispose of herself in a matter wherein grace has more right than nature. Some time after, this child fell sick; the Infidel parents, having recourse to the superstitions of the country, sent to fetch the Magician,—or, to speak more correctly, an impostor who made profession of that trade of hell. This juggler does not fail to say, as is his wont, that a certain Demon had reduced their daughter to that state; and that, in order to expel him, it was necessary to present the patient with some embellishments and ornaments of clothing, of which the girls of that age are sufficiently desirous. The little sick girl, although she was very low, nevertheless had strength enough, and her faith gave her courage enough, to belie this impostor. " I am a Christian," she said to her parents; " the Devils have no longer [29] any power over me. I do not consent to the sin that you have just committed, in consulting the Demons; I do not wish their remedies. God alone will cure me; let [page 115] this Magician go away." 
The father and mother, and all those present, were much astonished at this rebuke,—so innocent, but yet so efficacious that they made that juggler withdraw, not wishing to grieve this sick child. But their astonishment increased when, on that very day, this child asked to be carried to the Church, asserting that she would get well,—as, in fact, it happened. This event has been the means of converting the father and the mother, who have adopted their daughter's faith, and have received Baptism after her,—blessing God for having called them with so much gentleness. A young girl of fifteen years, among the most accomplished in the country, still a Catechumen, had been taken captive toward the end of last year's winter; the enemies, however, had spared her life, and she remained with them in her captivity. She was the daughter and sister of two excellent Christians, who had no greater regret in the loss which they had incurred, than that this poor captive had not [30] yet been baptized. She, too, in her captivity did not forget her faith and often exclaimed to God: " My God,—and the God of my mother and my sister, who know you better than I, and who serve you so faithfully,—have pity on me! I have not been baptized; grant me this favor before I die." One day, when this poor afflicted one was in a field of Indian corn, which she was planting for those whose slave she was, she heard voices from Heaven which were singing a ravishing music in the air, from the chant of our Vespers, which she had formerly heard. She looks about her, supposing that some Frenchmen would accost her; but she sees nothing else. She kneels down, and prays to God [page 117] with all her heart; and she conceives a hope of seeing herself delivered from her captivity, though she sees neither means nor any probability of this. Some days afterward, the same thing happens to her; she kneels again, with the same sentiments.
Finally, having for the third time heard these same voices from Heaven,—and feeling her confidence increased, and her courage more animated,—she prays to God and hastens into a road which she [31] did not know, in order to return to these countries, without victuals, without provisions, without escort, but not without the guidance of him alone who had inspired her, and who gave her sufficient strength to arrive here, having traveled more than eighty leagues without any evil encounter. She asked us for Baptism from the day of her arrival; and, seeing the hand of God over her with so much love, we could not put her off. She had come straight to this house of Sainte Marie, although her shorter way would have carried her to the village to which her parents belonged. Since then, she has continually increased in fervor, and cannot grow weary with relating to every one the mercies of God. Often, in her captivity, she found herself solicited to what she could not grant without losing innocence; but never could they draw from her lips even a single word of agreement. She even carried this so far that, seeing her in this disposition, which was not pleasing to those shameless Barbarians, some had often spoken of beating her to death; and she was awaiting that death with patience, preferring to die rather than to commit any sin. This chapter would have no end, if I [32] should relate the effects of grace upon these poor Savages,—[page 119] which we admire every day, and for which we will bless God forever in Heaven, without weariness and without distaste.
I cannot, however, omit a sufficiently prevailing sentiment of many good Christians, who—having lost all their property, their children, and what they had most precious in this world, and being even upon the point of undergoing a voluntary exile from their country which they were forsaking in order to avoid the cruelty of the Iroquois, their enemies—thanked God for it, and said to him: " My God, may you be blessed; I cannot regret these losses, since the Faith has taught me that the love which you have for the Christians is not in regard to the goods of this world, but for eternity. I bless you in my losses, with as good a heart as I have ever done; for you are my Father, and it is enough that I know that you love me, that I should be content with all the evils which can happen to me.

      stories about devout convert Christians- mostly children

    1. Others attribute this fall to another cause, which seems to have some relation to the case of Adam, but falsehood makes up the greater part of it. They say that the husband of Aataentsic, being very sick, dreamed that it was necessary to cut down a certain tree from which those who abode in Heaven obtained their food; and that, as soon as he ate of the fruit, [page 127] he would be immediately healed. Aataentsic, knowing the desire of her husband, takes his axe and goes away with the resolution not to make two trips of it; but she had no sooner dealt the first [88] blow than the tree at once split, almost under her feet, and fell to this earth; whereupon she was so astonished that, after having carried the news to her husband, she returned and threw herself after it. Now, as she fell, the Turtle, happening to raise her head above water, perceived her; and, not knowing what to decide upon, astonished as she was at this wonder, she called together the other aquatic animals to get their opinion. They immediately assembled; she points out to them what she saw, and asks them what they think it fitting to do. The greater part refer the matter to the Beaver, who, through courtesy, hands over the whole to the judgment of the Turtle, whose final opinion was that they should all promptly set to work, dive to the bottom of the water, bring up soil to her, and put it on her back. No sooner said than done, and the woman fell very gently on this Island. Some time after, as she was with child when she fell, she was delivered of a daughter, who almost immediately became pregnant. If you ask them how, you puzzle them very much. At all events, they tell you, she was pregnant. Some throw the blame upon some strangers, [89] who landed on this Island. I pray you make this agree with what they say, that, before Aataentsic fell from the Sky, there were no men on earth. 
However that may be, she brought forth two boys, Tawiscaron and Iouskeha, who, when they grew up, had some quarrel with each other; judge if this does not relate in some way to the murder of Abel. They came to blows, but with very different [page 129] weapons. Iouskeha had the horns of a Stag; Tawiscaron, who contented himself with some fruits of the wild rosebush, was persuaded that, as soon as he had struck his brother, he would fall dead at his feet. But it happened quite differently from what he had expected; and Iouskeha, on the contrary, struck him so rude a blow in the side, that the blood came forth abundantly. This poor wretch immediately fled; and from his blood, with which the land was sprinkled, certain stones sprang up, like those we employ in France to fire a gun,—which the Savages call even to-day Tawiscara, from the name of this unfortunate. His brother pursued him, and finished him. This is what the greater part believe concerning the origin of these Nations.

      comparable attributes to other religious stories

    2. They recognize as head of their Nation a certain woman whom they call Ataentsic, who fell among them, they say, from Heaven. For they think the Heavens existed a long time before this wonder; but they cannot tell you when or how its great bodies were drawn from the abysses of nothing. They suppose, even, that above the arches of the Sky there was and still is a land like ours, with woods, lakes, rivers and fields, and Peoples who inhabit them. They do not agree as to the manner in which this so fortunate descent occurred. [87] Some say that one day, as she was working in her field, she perceived a Bear; her dog began to pursue it and she herself afterwards. The Bear, seeing himself closely pressed, and seeking only to escape the teeth of the dog, fell by accident into a hole; the dog followed him. Aataentsic, having approached this precipice, finding that neither the Bear nor the dog were any longer to be seen, moved by despair, threw herself into it also. Nevertheless, her fall happened to be more favorable than she had supposed; for she fell down into the waters without being hurt, although she was with child,—after which, the Waters having dried up little by little, the earth appeared and became habitable. Others attribute this fall to another cause, which seems to have some relation to the case of Adam, but falsehood makes up the greater part of it. They say that the husband of Aataentsic, being very sick, dreamed that it was necessary to cut down a certain tree from which those who abode in Heaven obtained their food; and that, as soon as he ate of the fruit, [page 127] he would be immediately healed. 
Aataentsic, knowing the desire of her husband, takes his axe and goes away with the resolution not to make two trips of it; but she had no sooner dealt the first [88] blow than the tree at once split, almost under her feet, and fell to this earth; whereupon she was so astonished that, after having carried the news to her husband, she returned and threw herself after it. Now, as she fell, the Turtle, happening to raise her head above water, perceived her; and, not knowing what to decide upon, astonished as she was at this wonder, she called together the other aquatic animals to get their opinion. They immediately assembled; she points out to them what she saw, and asks them what they think it fitting to do. The greater part refer the matter to the Beaver, who, through courtesy, hands over the whole to the judgment of the Turtle, whose final opinion was that they should all promptly set to work, dive to the bottom of the water, bring up soil to her, and put it on her back. No sooner said than done, and the woman fell very gently on this Island. Some time after, as she was with child when she fell, she was delivered of a daughter, who almost immediately became pregnant. If you ask them how, you puzzle them very much. At all events, they tell you, she was pregnant. Some throw the blame upon some strangers, [89] who landed on this Island. I pray you make this agree with what they say, that, before Aataentsic fell from the Sky, there were no men on earth. However that may be, she brought forth two boys, Tawiscaron and Iouskeha, who, when they grew up, had some quarrel with each other; judge if this does not relate in some way to the murder of Abel. They came to blows, but with very different [page 129] weapons. Iouskeha had the horns of a Stag; Tawiscaron, who contented himself with some fruits of the wild rosebush, was persuaded that, as soon as he had struck his brother, he would fall dead at his feet. 
But it happened quite differently from what he had expected; and Iouskeha, on the contrary, struck him so rude a blow in the side, that the blood came forth abundantly. This poor wretch immediately fled; and from his blood, with which the land was sprinkled, certain stones sprang up, like those we employ in France to fire a gun,—which the Savages call even to-day Tawiscara, from the name of this unfortunate. His brother pursued him, and finished him. This is what the greater part believe concerning the origin of these Nations.

      It is interesting how similar the story is to the Christian creation belief, and many others.

    3. The Captains of the village, having heard these stories, sent for me and said, "My nephew, here is what so-and-so says; what dost thou answer to it? We are ruined, for the corn will not ripen. If at least we should die by the hands and arms of our enemies who are ready to burst upon us, well and good,—we should not at any rate pine away; but if, having escaped from their fury, we are exposed to famine, that would be to go from bad to worse. What dost thou think of it? Thou dost not wish to be the cause of our death? besides, it is of as much importance to thee as to us. We are of the opinion that thou shouldst take down that Cross, and hide it awhile in thy Cabin, or even in the lake, so that the thunder and the clouds may not see it, and no longer fear it; and then after the harvest thou mayest set it up again." To this I answered, " As for me, I shall never take down nor hide the Cross [30] where died he who is the cause of all our blessings. For yourselves, if you wish to take it down, consider the matter well; I shall not be able to hinder you, but take care that, in taking it down, you do not make God angry and increase your own misery

      this is very interesting.

    4. They seek Baptism almost entirely as an aid to health. We try to purify this intention, and to lead them to receive from the hand of God alike sickness and health, death and life; and teach them that the life-giving waters of Holy [6] Baptism principally impart life to the soul, and not to the body. However, they have the opinion so deeply rooted that the baptized, especially the children, are no longer sickly, that soon they will have spread it abroad and published it everywhere. The result is that they are now bringing us children to baptize from two, three, yes, even seven leagues away

      It seems the language barrier played a huge part in the understanding of baptism. Makes me think the missionaries may have moved too fast in their efforts to convert the natives.

    1. It’s also easy to blame colleges for not meeting the literacy needs of the populace. Ironically, many state-supported universities are no longer able to offer remedial courses for students who may need some additional help to succeed in college, in part because state legislatures, ready to trim university budgets, do not want to pay for courses that may limit a student’s ability to finish a bachelor’s degree in four years. So the courses that have often helped students prepare for the rigor of academic writing and the sophistication of writing informed by knowledge of rhetorical principles are actually being cut even as the public continues to declare that literacy is in decline. Rather than thinking of writing instruction as a form of triage, inoculation, or clinical diagnostic generated to protect the middle class from the ravages of illiteracy, we benefit from thinking of writing instruction as a means of helping students improve their abilities to engage in public discourse in all its varied forms. What writing teachers have known for generations is that writing is not an end in itself—it is a method of invention that gives shape to our view of the world and empowers us to engage in discourse with our fellow humans. There are few things more important than that. There is no literacy crisis. Instead, the concept of literacy continues to become more complex as we expect people to know how to produce and understand texts in multiple forms, whether written, visual, or otherwise. 
Like all human institutions, education is inherently flawed, and teachers, students, parents and others must always consider ways and initiatives to improve literacy education. Further Reading. For more about the study of literacy in the United States, see Deborah Brandt’s Literacy in American Lives (Cambridge University Press), which offers several case studies of how Americans gain literacy by what Brandt calls sponsors of literacy, people or things that control individuals’ access to literacy instruction. Additionally, see the New London Group’s Multiliteracies (Routledge). The New London Group, a group of ten scholars, acknowledges that technology plays a significant role in how literacy expectations have shifted. For more on how writing scholars are thinking about the transfer

      Why is the government cutting university budgets for the courses that help students prepare for the rigor of academic writing? Are they not interested in literary courses, or do they think it isn't necessary to spend a lot of money on them while other subjects like science and technology are there?

    1. Author Response

      Reviewer #1 (Public Review):

      In this study, the protein composition of exocytotic sites in dopaminergic neurons is investigated. While extensive data are available for both glutamatergic and GABA-ergic synapses, it is far less clear which of the known proteins (particularly proteins localized to the active zone) are also required for dopamine release, and whether proteins are involved that are not found in "classical" synapses. The approach used here uses proximity ligation to tag proteins close to synaptic release sites by using three presynaptic proteins (ELKS, RIM, and the beta4-subunit of the voltage-gated calcium channel) as "baits". Fusion proteins containing BirA were selectively expressed in striatal dopaminergic neurons, followed by in-vivo biotin labelling, isolation of biotinylated proteins and proteomics, using proteins labelled after expression of a soluble BirA construct in dopaminergic neurons as reference. As controls, the same experiments were performed in KO-mouse lines in which the presynaptic scaffolding protein RIM or the calcium sensor synaptotagmin 1 were selectively deleted in dopaminergic neurons. To control for specificity, the proteomes were compared with those obtained by expressing a soluble BirA construct. The authors found selective enrichments of synaptic and other proteins that were disrupted in RIM but not Syt1 KO animals, with some overlap between the different baits, thus providing a novel and useful dataset to better understand the composition of dopaminergic release sites.

      Technically, the work is clearly state-of-the-art, cutting-edge, and of high quality, and I have no suggestions for experimental improvements.

      We thank the reviewer for this summary and for pointing out the high quality of the work.

      On the other hand, the data also show the limitations of the approach, and I suggest that the authors discuss these limitations in more detail. The problem is that there is very likely to be a lot of non-specific noise (for multiple reasons) and thus the enriched proteins certainly represent candidates for the interactome in the presynaptic network, but without further corroboration it cannot be claimed that as a whole they all belong to the proteome of the release site.

      We fully agree with the reviewer. Most importantly, we have changed the final section from “Conclusions” to “Summary of conclusions and limitations” (lines 501-518) to summarize the limitations with equal weight to the conclusions. In the revised manuscript, we also included many specific additional points in this respect throughout the discussion and the results: many hits could be noise (lines 458, 478-479), thresholding affects the inclusion of proteins in the release site dataset (lines 208-215), the seven-day time window could deliver interactors from the soma to the synapse (lines 493-495), specific oddities (for example histones, lines 482-485), iBioID does not deliver an interactome per se but is simply based on proximity (lines 505-508), and several more. We also clearly state that each specific hit needs follow-up studies (lines 501-503: ” Each protein will require validation through morphological and functional characterization before an unequivocal assignment to dopamine release sites is possible.”), and a similar statement was added on lines 374-375.

      Reviewer #2 (Public Review):

      The Kaeser lab has been at the forefront in understanding the mechanism of dopamine release in central mammalian neurons. Assessing dopamine neuron function has been quite difficult due to the limited experimental access to these neurons. Dopamine neurons possess a number of unique functional roles and participate in several pathophysiological conditions, making them an important target of basic research. This study has been designed to describe the proteome of the dopamine release apparatus using proximity biotin labeling via active zone protein domains fused to BirA, to test in which ways its proteome composition is similar or different to other central nerve terminals. The control experiments demonstrating proper localization as well as specificity of biotinylation are very solid, yielding a highly enriched and well characterized proteome database. Several new proteins were identified and the database will very likely be a very useful resource for future analysis of the protein composition of synapses and their function at dopamine and other synapses.

      We thank the reviewer for this positive assessment of our work.

      Major comment:

      The authors find that loss of RIM leads to a major reduction in the number of synaptically enriched proteins, while they did not see this loss of number of enriched proteins in the Syt1-KO's, arguing for an undisrupted synaptome. Maybe I missed this, but which fraction of proteins and synaptic proteins are then co-detected both in the Syt1 and control conditions when comparing the Venn diagrams of Fig. 2 and Fig. 3 Suppl. 2? This analysis may provide an estimate of the reliability of the method across experimental conditions.

      We thank the reviewer for proposing to be clear in the comparison of the control and Syt-1 cKODA data. A direct comparison of hit numbers is included on lines 323-324, with 37% overlap between control and Syt-1 cKODA (vs. 15% between control and RIM cKODA). A direct mapping of this overlap is included in Fig. 4E. We think that this direct comparison is complicated by a number of factors, as outlined below, and did our best to include these complications in the discussion, including the last section (lines 501-518).

      First, to assess overall similarity, the initial comparison should be to assess axonal proteins identified in the BirA-tdTomato samples. These datasets are quite similar, with 671 (control) and 793 (Syt-1 cKODA) proteins detected, and a high overlap of 601 proteins. We think that this indicates that the experiment per se is quite reproducible. The comparison of the release site proteome between control and Syt-1 cKODA is more complicated. We think that the main point of this comparison is that the overall number of hits is quite similar, with 450 hits in the Syt-1 cKODA proteome and 527 hits in the control proteome, and we now show that this similarity holds across multiple thresholds (lines 298-301; ≥ 1.5: Syt-1 cKODA 602 hits, control 991, ≥ 2.0: 450/527, ≥ 2.5: 252/348). Detailed analyses of overlap reveal that known active zone proteins such as Bassoon, CaV2 channels, RIMs, and ELKS proteins are present in both proteomes, but the overlap is partial and incomplete with 191 proteins found in both proteomes. As discussed throughout and summarized on lines 501-518, the reasons for this partial overlap may be manifold. Trivially, it could be explained by noise or non-saturation (“incompleteness”) of the proteome. We also think that the Syt-1 proteome is not expected to be identical because there is a strong release deficit in these mice. If Syt-1 has a dopamine vesicle docking function (which it does at conventional synapses [4]), this could influence the proteome. We note that protein functions in the dopamine axon are not well established, but inferred from studies of classical synapses.

      We have scrutinized the manuscript to not express that the control and Syt-1 cKODA proteomes are identical; we know they are not and discuss the example of α-synuclein specifically (Fig. 6, lines 347-362). Rather, the striking part is that the two proteomes are similar in extent, with hit numbers much higher than in RIM cKODA. Specific hits have to be assessed in a detailed way, one hit at a time, in future studies, as expressed unequivocally on lines 501-503.

      Reviewer #3 (Public Review):

      In this study Kershberg et al use three novel in vivo biotin-identification (iBioID) approaches in mice to isolate and identify proteins of axonal dopamine release sites. By dissecting the striatum, where dopamine axons are, from the substantia nigra and VTA, where dopamine somata are, the authors selectively analyzed axonal compartments. Perturbation studies were designed by crossing the iBioID lines with null mutant mice. Combining the data from these three independent iBioID approaches and the fact that axonal compartments are separated from somata provides a precise and valuable description of the protein composition of these release sites, with many new proteins not previously associated with synaptic release sites. These data are a valuable resource for future experiments on dopamine release mechanisms in the CNS and the organization of the release sites. The BirA (BioID) tags are carefully positioned in three target proteins not to affect their localization/function. Data analysis and visualization are excellent. Combining the new iBioID approaches with existing null mutant mice produces powerful perturbation experiments that lead to strong conclusions on the central role of RIM1 as a central organizer of dopamine release sites and unexpected (and unexplained) new findings on how RIM1 and synaptotagmin1 are both required for the accumulation of alpha-synuclein at dopamine release sites.

      We thank the reviewer for assessing our paper, for summarizing our main findings, and for expressing genuine enthusiasm for the approach and the outcomes.

      It is not entirely clear how certain decisions made by the authors on data thresholds may affect the overall picture emerging from their analyses. This is a purely hypothesis-generating study. The authors made little effort to define expectations and compare their results to these. Consequently, there is little guidance on how to interpret the data and how decisions made by the authors affect the overall conclusions. For instance, the collection of proteins tagged by all three tagging strategies (Fig 2) is expected to contain all known components of dopamine release sites (not at all the case), and maybe also synaptic vesicles (2 TM components detected, but not the most well-known components like vSNAREs and H+/DA-transporters), and endocytic machinery (only 2 endophilin orthologs detected). Whether or not a more complete collection of the components of release sites, synaptic vesicles or endocytic machinery is observed might depend on two hard thresholds applied in this study: (a) "Hits" (depicted in Fig 2) were defined as proteins enriched ≥ 2-fold (line 178) and peptides not detected in the negative control (soluble BirA) were defined as 0.5 (line 175). How crucial are these two decisions? It would be great to know if the overall conclusions change if these decisions were made differently.

      We agree with the reviewer that the thresholding decisions are important and have now better incorporated the rationale for these decisions in the manuscript.

      Two-fold enrichment threshold. As outlined in the response to point 1 in the editorial decision letter, we now include figure supplements to illustrate the composition of the control proteome if we apply 1.5- or 2.5-fold enrichment thresholds (Fig. 2 – figure supplements 1 and 2) instead of the 2.0-fold threshold used in Fig. 2. This leads to more or fewer hits (991 and 348, respectively) compared to the 2.0-fold threshold (527 hits). It is noteworthy that the SynGO-overlap is the highest with the 2.0 threshold (37% vs. 31% at 1.5 and 33% at 2.5, Fig. 2 – figure supplement 3), justifying this threshold experimentally in addition to what was done in previous work [1,2]. These data are now described on lines 208-215 of the manuscript. When we apply these different thresholds to RIM and Syt-1 cKODA datasets, the finding that RIM ablation disrupts release site assembly persists. The following hit numbers were observed in the mutants at the 1.5, 2.0 and 2.5 enrichment thresholds, respectively: RIM cKODA 268, 198 and 82 hits; Syt cKODA 602, 450 and 252 hits. Hence, the extent of the release site proteome remains much smaller after RIM ablation independent of the enrichment threshold, bolstering the conclusion that RIM is an important scaffold for these release sites. This is included in the revised manuscript on lines 298-301.

      Undetected peptides in BirA-tdTomato. We did not express this well enough in the manuscript. The undetected proteins were set to 0.5 such that a protein that was detected with a specific bait but not with BirA-tdTomato could be illustrated with a specific circle size, not to determine inclusion in the analyses. If the average peptide count across repeats with a specific bait was 1, this resulted in inclusion in Fig. 2 and consecutive analyses with the smallest circle size. Hence, this decision was made to define circle size. It did not affect inclusion in Fig. 2 beyond the following two points. If one were to further decrease it, this might result in including peptides that only appeared once as a single peptide for some of the experiments, which we wanted to avoid. If one would set it higher (to 1), this artificial threshold would be equal to proteins that were actually detected experimentally multiple times, which we wanted to avoid as well. We have now clarified this on lines 165-167 and lines 1119-1121.
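      The thresholding described in the two paragraphs above can be illustrated with a minimal sketch. It assumes that enrichment is the mean peptide count with a bait divided by the mean count with BirA-tdTomato, and that an undetected control value is replaced by 0.5. The function names, protein names, and counts are hypothetical and are not taken from the manuscript or its analysis code.

```python
from typing import Dict, Set

# Assumed substitute for proteins with no peptides in the soluble control.
UNDETECTED_FLOOR = 0.5


def fold_enrichment(bait_mean: float, control_mean: float,
                    floor: float = UNDETECTED_FLOOR) -> float:
    """Ratio of bait to control counts; an undetected control uses the floor."""
    denominator = control_mean if control_mean > 0 else floor
    return bait_mean / denominator


def call_hits(bait: Dict[str, float], control: Dict[str, float],
              threshold: float = 2.0) -> Set[str]:
    """Return proteins whose fold enrichment meets or exceeds the threshold."""
    return {
        protein
        for protein, count in bait.items()
        if fold_enrichment(count, control.get(protein, 0.0)) >= threshold
    }


# Hypothetical mean peptide counts, for illustration only.
bait_counts = {"protA": 1.0, "protB": 3.0, "protC": 4.0}
control_counts = {"protB": 2.0, "protC": 1.0}  # protA undetected in control

for t in (1.5, 2.0, 2.5):
    print(t, sorted(call_hits(bait_counts, control_counts, threshold=t)))
```

      In this toy example, a protein with a mean count of 1 with the bait and no peptides in the control reaches exactly 2.0-fold enrichment, which mirrors the statement above that such a protein is included at the 2.0 threshold.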

      Expected proteins. In general, interpreting our dataset with a strong prior of expected proteins is difficult. The literature on release site proteins specifically characterized for dopamine is limited. We have found Bassoon, RIM, ELKS and Munc13 to be present using 3D-SIM superresolution microscopy [5,6], and we indeed found these proteins in the data as discussed on lines 227-232 and lines 423-445 in the revised manuscript. The prediction for vesicular and endocytic proteins is complicated. Release sites are sparse [5,7], and vesicle clusters are widespread in the dopamine axon, in some cases filling most of the axon (for example, see extended vesicle clusters filling much of the dopamine axon in Fig. 7E of [5]). Furthermore, docking in dopamine axons has not been characterized, and it is unclear how frequently vesicles are docked. Hence, it is not clear whether vesicular proteins should be concentrated at release sites compared to the rest of the axon (the BirA-tdTomato proteome we use for normalization). Similar points can be made for proteins for endocytosis and recycling of dopamine vesicles. Within the dopamine system, it is unclear whether the recycling pathway is close to the exocytic sites. One consistent finding across functional studies is that depletion after activity is unusually long-lasting in the dopamine system, for tens of seconds, even after only mild stimulation [5,8–13]. Hence, endocytosis and RRP replenishment might be very slow in these axons. It is not certain that endocytic factors are predeployed to the plasma membrane, and if they are, it is unclear how close to release sites they would be. As such, we agree with the reviewer that the proteome we describe is a hypothesis generator. With the limited knowledge on dopamine release, predictions beyond the previously characterized proteins in dopamine axons are difficult to make.

      We thank the reviewer for suggesting to include a better analysis of different thresholds and for giving us the opportunity to clarify the other points that were raised.

      Given the good separation of the axonal compartment from the somata (one of the real experimental strengths of this study), it is completely unexpected to find two histones being enriched with all three tagging strategies (Hist1h1d and 1h4a). This should be mentioned and discussed.

      We agree with the reviewer and have addressed this point in the manuscript. This could either reflect noise, or there could be more specific reasons behind it. The manuscript now states on lines 482-485: “It is surprising that Hist1h1d and Hist1h4a, genes encoding for the histone proteins H1.3 and H4, were robustly enriched (Fig. 2A). These hits might be entirely unspecific, or their co-purification could be due to biotinylation of H1 and H4 proteins (Stanley et al., 2001). It is also possible that there are unidentified synaptic functions of some of the unexpected proteins.” Ultimately, we do not know why these proteins are enriched, and we state clearly in the section “Summary of conclusions and limitations” that each new hit has to be validated in future studies (lines 501-503).

      It would also help to compare the data more systematically to a previous study that attempted to define release sites (albeit not dopamine release sites) using a different methodology (biochemical purification): Boyken et al (only mentioned in relation to Nptn, but other proteins are observed in both studies too, e.g. Cend1).

      We agree with the reviewer that Boyken et al, 2013 [14] is an important resource for our paper and for the assessment of the proteomic composition of release sites. We have now introduced links and citations to this paper multiple times (for example, on lines 231, 241, 430, 443, 481) and have expanded the discussion of overlap between these proteomes, including on Cend1 (lines 479-482).

      We think that a systematic comparison with Boyken et al, 2013 [14] is complicated because (1) so little is known about dopamine release mechanics and (2) because the approach is very different between the two papers. In respect to (1), most prominently, it is not certain how frequently vesicles are docked in the dopamine axon. Only ~25% of the varicosities contain these release sites, and vesicle docking has not been characterized in striatal dopamine axons to the best of our knowledge. Hence, how a docking site at a classical synapse compares to a dopamine release site remains unclear at the outset. For point (2), the key difference is that “within dataset normalizations” are very different in these two studies. In our iBioID dataset, we normalize to soluble proteins defined as proximity to BirA-tdTomato. In ref. [14], the authors express enrichment over “light”, regular synaptic vesicles purified with the same approach. This has a major impact on the proteome that strongly influences a direct comparison of hits, because there are large differences in the normalization. While each normalization makes sense for the respective paper, it complicates direct comparison.

      With these points in mind, we have compared hits across both datasets class-by-class. For some classes, the datasets have reasonable overlap for ≥ 2-fold enriched proteins: for example for active zone proteins (3 of 7 hits in [14] appear in our control proteome) and adhesion and cell surface proteins (8 of 18). For other classes, the overlap is limited: for example for nucleotide metabolism/protein synthesis (0 of 16 hits in [14] appear in our dataset) and cytoskeletal proteins (5 of 29). We hope the reviewer agrees, that given these factors, the analyses and discussion needed for a systematic comparison goes beyond the scope of our paper. We have instead added a number of references to Boyken et al., 2013 [14], as outlined above, when direct comparison is meaningful.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Manuscript number: RC-2022-01758

      Corresponding author(s): Harbison, Susan and Souto-Maior, Caetano

      1. General Statements [optional]

      We thank the reviewers for their time and care in evaluating our manuscript. They raise several important points, which we have addressed, resulting in a greatly improved manuscript. Please note that we numbered the comments from both reviewers for ease of reference, as we cross-referenced comments in some cases. Reviewer comments are in italics; our responses are provided in plain text.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      *Summary*:

      *The authors of this work generated a Sleep Advanced Intercross Population from 10 extreme sleeper Drosophila Genetics Reference Panel. This new out-bred population was subjected to an artificial selection with the aim of understanding the genes underlying the sleep duration differences between three populations: short-sleep, unselected, and long-sleep. Using analysis of variance the authors identified up to nearly 400 genes that were significantly selected over the various generations and showed opposite trends for long and short sleep, thus potentially relevant for the regulation of sleep duration. 85 of these genes were consistent between male and female sub-populations, suggesting a small number of genetic divergences may underlie sex-independent mechanisms of sleep.

      Given the time-course nature of the generational data obtained, the authors studied potential correlations and interactions between these 85 identified candidate genes. Initially, the authors used pairwise Spearman correlation, noticing how this method could not filter most of the pairwise interactions (around 40% of all possibilities were significant). To overcome the linear limitations of the previous approach, the authors implemented a more complex, non-linear Gaussian process model able to account for pairwise interactions. This new approach was able to identify a smaller number of different, and potentially more informative, correlations between the candidate genes previously identified.

      Lastly, with genetic manipulations, the authors show in vivo that a subset of the candidate genes is causally related with the sleep duration as well as partially validating some of the correlation identified by their new model.

      The authors conclude that, given the non-linear and complex nature of biological systems, simplistic linear approaches may not suffice to fully capture underlying mechanisms of complex traits such as sleep.

      *Major comments*

      1. Most of the work presented focuses on the computational and statistical analysis of different populations submitted (or not) to a process of artificial selection for short or long sleep duration. As such, the amount of potentially relevant biological conclusions to be tested is mostly unfeasible. The authors already present additional experiments to partially support some, though not all, of their findings. Given the manuscript is written as a method innovation, these additional experiments illustrate the potential uses of the method described. *

      Our response: The reviewer raises a very important point, one that is at the very impetus of our work. We agree that it is not possible to test all combinations of genes in all contexts to determine whether they influence sleep or not. In contrast to the situation for circadian rhythms, where the core clock is controlled by just four genes, recent work has concluded that sleep is a set of complex traits influenced by large numbers of genes. Robust computational methods are needed to identify the complex interactions among genes. The current manuscript is a first step towards achieving this goal.

      *(OPTIONAL) However, since one of the focuses of this work is identifying potential gene interactions, it would be interesting if the authors could test a "double knockout" and perhaps demonstrate evidence for epistasis between two of the identified genes. Having access to single mutants, this experiment should be realistic. However, I have no hands-on experience working with Drosophila and I am unable to accurately estimate the amount of resources and time such an experiment could take. My initial guess would be 3-6 months work should suffice. *

      Our response: The reviewer makes an interesting proposal. While such an experiment would provide some additional information, our method does not make any prediction about what a double knockout would do, either to the sleep phenotypes or to gene expression.

      2. In regards to the gene CG1304, it seems to be an important example used throughout the manuscript. It should be carefully re-analyzed, as it was considered for interaction analyses without showing opposite trends for short- and long-sleep populations (see minor comments on figure 2).

      Our response: We are not entirely certain that we understand the reviewer’s point. We note that significant genotype-by-selection-scheme interactions may not manifest as opposite trends, and this is not what is being tested for significance. The likelihood ratio is a test for a significant effect of including sel x gen coefficients for both short and long schemes; therefore, GLM significance may mean that either one or both selection schemes are significantly different from controls, not from each other. We could, for instance, apply three different tests: (i) comparing long to short flies; (ii) comparing short flies to controls; and (iii) comparing long flies to controls, and find that the first test is significant (i.e., short is different from long) and that the other two are not (i.e., neither scheme is found to be different from controls). The opposite could also happen: short and long flies may not be different from each other, while both are different from controls.

      Since we are interested in identifying differences of either scheme from controls, our choice of statistical test is equivalent to performing tests (ii) and (iii) without the need to perform and correct for multiple tests. While there are caveats to this choice (like all choices), linear model-based differential expression analysis has its own caveats, and has limited ability to pick up arbitrary trends, so it serves as a coarse-grained filter for large shifts, since it is too costly (computationally) to run the Gaussian process on 50 million pairwise combinations.
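
      As a rough illustration of the likelihood-ratio comparison described above (not the code used in the manuscript), the sketch below compares a full model with a nested reduced model. It assumes that the full model adds exactly two parameters (one sel x gen slope each for the short and long schemes), so that the test statistic is referred to a chi-square distribution with 2 degrees of freedom. The function name and the log-likelihood values are hypothetical.

```python
import math


def likelihood_ratio_test_df2(loglik_reduced, loglik_full):
    """Likelihood-ratio test of a full model against a nested reduced model.

    Assumes the full model has exactly 2 extra parameters, so the statistic
    is compared with a chi-square distribution with 2 degrees of freedom.
    """
    # Test statistic: twice the gain in log-likelihood (never below zero
    # for properly nested models; clamp to guard against rounding).
    statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    # For 2 degrees of freedom the chi-square survival function has the
    # closed form exp(-x / 2), so no statistics library is needed.
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical log-likelihoods for one gene (illustrative values only).
loglik_reduced = -120.4  # slope for generation only
loglik_full = -112.1     # adds short x generation and long x generation slopes

statistic, p_value = likelihood_ratio_test_df2(loglik_reduced, loglik_full)
print(f"LR statistic = {statistic:.2f}, p = {p_value:.2e}")
```

      A small p-value here says only that the selection-specific slopes improve the fit over a single shared slope; it does not say which scheme differs from controls, or that the two schemes move in opposite directions.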

      *3. One major comment would be that the claim that the Gaussian process method is more sensitive and specific than simpler approaches, though intuitively understandable, does not seem to be fully correct from a strict statistical point of view, given the lack of a gold standard reference to compare if the new method is indeed picking more true positives/negatives. I would consider rephrasing such a statement in the absence of a biologically relevant validation set. *

      Our response: We agree with the reviewer that there is no ‘gold standard’ reference data set with which to compare our findings. We have softened this language a bit in response, where it occurs in both the Abstract and the Results.

      Under Abstract, we changed “Our method not only is considerably more specific than standard correlation metrics but also more sensitive, finding correlations not significant by other methods” to “Our method appears to be not only more specific than standard correlation metrics but also more sensitive, finding correlations not significant by other methods.”

      Under Results, we changed “Therefore, computing correlations between genes using covariance estimates from the Gaussian Processes greatly increases specificity over direct correlations. Furthermore, the Gaussian processes are not only more specific but more sensitive…” to “Therefore, computing correlations between genes using covariance estimates from the Gaussian Processes appears to increase specificity over direct correlations. Furthermore, the Gaussian Processes appear to be more sensitive…”

      *4. Finally, the study appears to be well powered and it is clear that the authors were careful in their explanation of the statistical methods. However, I could not find the copy of the code/script used for the model. Without it, it would be very difficult to fully reproduce the results as both the language used (Stan) and the method itself are not common in the sleep research field. *

      Our response: We thank the reviewer for noticing this, and apologize for this oversight. The code used for analysis has been deposited in GitHub under: https://github.com/caesoma/Multiple-shifts-in-gene-network-interactions-shape-phenotypes-of-Drosophila-melanogaster.

      We have noted the script location in the Data Availability statement. We added a statement to read “All scripts used for the model have been deposited in Git Hub https://github.com/caesoma/Multiple-shifts-in-gene-network-interactions-shape-phenotypes-of-Drosophila-melanogaster.”

      *Minor comments*

      *5. The statistical cut-off used for gene expression hierarchical GLMM after BH correction was 0.001, which is 50 times stricter than the common 0.05. Could the authors comment on how this choice may impact the results compared to those available in the literature and on the rationale for choosing such a value?*

      Our response: An FDR of 0.05 would increase the number of genes identified (3,544 for females; 1,136 for males, with 462 overlapping). The FDR of 0.001 is consistent with the lowest threshold typically used for gene expression data collected during other artificial selection experiments (Mackay et al., 2005; Morozova et al., 2007; Edwards et al., 2006), though thresholds as high as 0.20 have been used (Sorensen et al., 2007). We have extended the last statement of the Methods and Materials section under “Generalized Linear Model analysis of expression data” so that it reads “Model p-values were corrected for multiple testing using the Benjamini-Hochberg method (Benjamini and Hochberg, 1995), with significance defined at the 0.001 level, consistent with the lower threshold applied in other artificial selection studies (Mackay et al., 2005; Morozova et al., 2007; Edwards et al., 2006).”
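
      To make the effect of the threshold concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. It is not the manuscript's code; the function name and the p-values are made up for illustration, and the thresholds 0.001 and 0.05 are the two discussed above.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Made-up raw p-values for five genes (illustrative only).
raw = [0.03, 0.00001, 0.5, 0.0003, 0.02]
adjusted = benjamini_hochberg(raw)

strict_hits = sum(q <= 0.001 for q in adjusted)   # threshold used in the study
lenient_hits = sum(q <= 0.05 for q in adjusted)   # common default threshold
print(adjusted, strict_hits, lenient_hits)
```

      With these toy values the stricter cut-off keeps fewer genes than the 0.05 cut-off, which is the same qualitative trade-off as in the gene counts quoted in the response.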

      *6. Heritability calculations are not mentioned in the methods. Could it be useful to include a small paragraph? Could a small comment be done on the differences in h2 for the short sleep replicates which show ~10x difference? *

      Our response: We thank the reviewer for noticing this omission and apologize for the oversight. We have added the following statements to the Methods and Materials under “Quantitative genetic analyses of selected and correlated phenotypic responses.”

      “We estimated realized heritability h2 using the breeder’s equation:

      h2 = ΣR/ΣS

      where ΣR and ΣS are the cumulative selection response and differential, respectively (Falconer and Mackay, 1996). The selection response is computed as the difference between the offspring mean night sleep and the mean night sleep of the parental generation. The selection differential is the difference between the mean night sleep of the selected parents and the mean night sleep of the parental generation.”

      Additionally, we thank the reviewer for noticing the large difference in the realized heritability between the short sleeping population replicates; the heritability reported for replicate 1 contained a typo and should be 0.169, not 0.0169. Hence, the heritabilities of both replicate populations are quite similar, i.e., 0.169 for replicate 1 and 0.183 for replicate 2. We have corrected this error in the Results.
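
      The breeder's-equation estimate quoted above can be sketched in a few lines. This is only an illustration under the definitions given in the added Methods text (response = offspring mean minus parental-generation mean; differential = selected-parent mean minus parental-generation mean). The function name and all night-sleep values are hypothetical and are not data from the study.

```python
def realized_heritability(generation_means, selected_parent_means):
    """Realized heritability h2 = sum(R) / sum(S).

    generation_means[g] is the mean phenotype of generation g before
    selection; generation g + 1 is the offspring of the parents selected
    from generation g. selected_parent_means[g] is the mean phenotype of
    the parents selected from generation g.
    """
    if len(generation_means) != len(selected_parent_means) + 1:
        raise ValueError("need one more generation mean than selection rounds")
    rounds = range(len(selected_parent_means))
    # R: offspring mean minus the mean of the parental generation.
    responses = [generation_means[g + 1] - generation_means[g] for g in rounds]
    # S: mean of the selected parents minus the mean of their generation.
    differentials = [selected_parent_means[g] - generation_means[g] for g in rounds]
    return sum(responses) / sum(differentials)


# Hypothetical mean night sleep (minutes) under selection for short sleep.
generation_means = [600.0, 590.0, 581.0, 570.0]
selected_parent_means = [540.0, 535.0, 520.0]

h2 = realized_heritability(generation_means, selected_parent_means)
print(f"realized h2 = {h2:.3f}")
```

      Because both sums are negative when selecting for shorter sleep, the ratio is positive, as expected for a heritability.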

      7. In regards to the model implementation, what would be the implications of not enforcing positive semi-definiteness on the co-variance matrix, given that these are strictly positive semi-definite?

      Our response: All covariance matrices are by definition positive semi-definite (PSD), since they cannot yield negative values for the probabilities associated with them, so it would not be possible to relax that assumption generally. The only choice we could make would be on the number of genes included (M) in each multi-channel Gaussian process model, and this in turn would by design enforce positive semi-definiteness on a matrix of size MN (N being the number of generations). As noted in the appendix, “enforcing” positive semi-definiteness on smaller blocks of a larger 2D-array of covariances (which is not itself a covariance matrix) does not imply the latter is PSD and therefore seems like a softer constraint. In practice scaling up to a model where M >> 40 is not trivial from a computational and inference point of view, so the choice of smaller M is in a way imposed on us, and fortunately it is the less limiting one. We provide the appendix as a general clarification on the subtleties of Gaussian Processes, but a comprehensive assessment is beyond the multidisciplinary scope of this article and would require a narrower mathematical/statistical description in a standalone methodological article or technical note.
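
      The point that PSD blocks do not guarantee a PSD whole can be checked on a toy case. The sketch below is a scalar analogue, not the Gaussian process model itself: three variables whose pairwise 2 x 2 correlation blocks are each PSD, while the assembled 3 x 3 array is not. The helper names and the correlation values are hypothetical, and the principal-minor test is only practical for very small matrices.

```python
from itertools import combinations


def determinant(matrix):
    """Determinant by Laplace expansion (fine for tiny matrices)."""
    size = len(matrix)
    if size == 1:
        return matrix[0][0]
    total = 0.0
    for col in range(size):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(minor)
    return total


def is_psd(matrix, tol=1e-12):
    """A symmetric matrix is PSD iff all its principal minors are >= 0."""
    size = len(matrix)
    for k in range(1, size + 1):
        for idx in combinations(range(size), k):
            sub = [[matrix[i][j] for j in idx] for i in idx]
            if determinant(sub) < -tol:
                return False
    return True


# Toy "array of covariances": each pair looks like a valid correlation,
# but the three pairwise values are mutually inconsistent.
full = [
    [1.0, 0.9, 0.9],
    [0.9, 1.0, -0.9],
    [0.9, -0.9, 1.0],
]

pair_blocks_psd = all(
    is_psd([[full[i][i], full[i][j]], [full[j][i], full[j][j]]])
    for i, j in combinations(range(3), 2)
)
full_psd = is_psd(full)
print(pair_blocks_psd, full_psd)
```

      Every pairwise block passes the check while the full array fails it, which is why a blockwise constraint is weaker than a constraint on the whole matrix.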

      8. *The methods mention that PCA projections were performed on the first 3 components, however only the first two are shown. *

      Our response: PCA was performed on 10 components, although the algorithms will commonly compute all components and return only the selected number. The variance of the third component is smaller than ~5% (that of the second PC). In practice PC1 is by itself enough to show the clear separation of expression per sex with ~65% of the variance; PC2 is in fact only shown to improve visualization. Plots of the remaining components will not show clear separation among samples as the variance explained is so small. We have corrected the Methods to indicate that PCA was performed on 10 components rather than 3.

      *9. Figure 1 refers to the mean night sleep time of the population. Could some measurement of variability (se or sd) be represented to provide a general idea of the distribution of the values? Additionally, the standard deviations associated with the CVe estimates are mentioned but not shown explicitly. Could they maybe be added to the text as to illustrate how much such deviations were reduced? *

      Our response: We thank the reviewer for this comment. Including either the standard errors or standard deviations on the plot of the response to selection (Figure 1A) makes visualization unwieldy; thus we have added an additional supplemental table, Supplementary Table S15, that contains the mean night sleep, standard deviation, and number of flies measured for each generation in each replicate population. We also added a plot of the standard deviation in night sleep per generation to Supplemental Figure S2 (letter “Q” in the figure) so that the reduction over time in each population can be seen.

      Under “Data Availability,” we added the following: “Night sleep phenotypes per selection scheme/sex/generation/population replicate are listed in Table S15.”

      *10. Figure 2 shows the linear model fits for gene CG1304. I find this gene on the list of significant genes for both sexes (tables S5/6), but it does not seem to be one that shows opposite trend for short- and long-sleep (tables S7/8). Surprisingly, it shows up again on table S10! However, the text introducing the figure reads like this should be one of the 85 sex-independent genes. Would it be best to provide an example of what a significant gene looks like? *

      Our response: As mentioned in our response to comment #2 above, significance in the likelihood-ratio test does not imply opposite trends between long and short selection schemes. The test compares a model that includes specific slope coefficients for selection scheme by generation (both long and short) with a reduced model in which the only slope is the one associated with generation and is therefore independent of selection scheme.

      11. *Figure 3 would be interesting to have both the GP correlations and the Spearman correlations to illustrate the methodological differences. I would be curious to see at least one pairwise expression scatter-plot as well just to see how they correlate in one plot. *

      Our response: Table S11 contains all (significant and nonsignificant) GP and Spearman values side-by-side for comparison. High correlations are likely to conform to the Spearman assumptions of a monotonic relationship; nevertheless, this will not be so for the majority of genes since the difference in the number of Spearman and GP-significant genes is tenfold or more, so it would be misleading to focus on individual-gene relationships without taking into consideration the transcriptome-wide results for any method employed.

      We would like to stress that there is nothing particularly special about CG1304 in and of itself; furthermore, there are no “representative” genes or figures in this manuscript. Instead, CG1304 is chosen because its GLM and GP fits are illustrative of the limitations and capabilities of each model to pick up certain kinds of trends, and because it is especially instructive of how correlations arise from the GP model, which may not be intuitively clear to all readers.
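
      The monotonicity assumption behind Spearman's correlation, mentioned in the response above, can be seen in a toy example. The sketch below is a plain-Python implementation on made-up values (it is unrelated to the expression data and to the GP model): a monotonic relationship gives a coefficient of 1, while a perfectly deterministic but U-shaped relationship gives 0.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


x = [-3, -2, -1, 0, 1, 2, 3]
monotonic = [v ** 3 for v in x]   # always increasing with x
u_shaped = [v ** 2 for v in x]    # fully determined by x, but not monotonic

rho_monotonic = spearman(x, monotonic)
rho_u_shaped = spearman(x, u_shaped)
print(rho_monotonic, rho_u_shaped)
```

      A rank-based coefficient is therefore blind to some strong non-monotonic dependencies, which is one reason the two methods need not flag the same gene pairs.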

      12. Figures 3S3/4 are described as showing single- and multi-channel models don't change substantially. Would this be expected and why?

      Our response: This is not necessarily expected, as scaling up from a single to a multi-channel model will add additional parameters as well as constraints, like the positive semi-definiteness mentioned in comment #7 above. If that seemed to have considerable impact on the fits it could challenge our assumption that the signal variance parameters estimated from the single-channel are good priors for the same parameters in the two-channel model (although this is not a hard constraint, so in the worst case the result could still only be a slight bias).

      *13. Having built different networks of pairwise associations of genes (projecting on a unified network as illustrated on figure 5), it could be interesting to compare the network topologies at a basic level such as node degrees, overlapping sub-networks, are they potentially scale free as previously described for biological systems, etc. *

      Our response: The reviewer makes an interesting point. Indeed summaries of the network could be useful information about the system level parameters, which are the main results of this paper. We now include the number of connections (i.e., the degree) to each gene in each of the four networks presented in Figure 5 in a new supplemental Table (Table S13). We also plot the distribution of node connectivity below. The distributions do not appear random (i.e., a normal distribution), and appear closer to a power-law or scale-free distribution. However, the small size and low average degree of these networks make a formal test unfeasible, and a recent study suggests that a log-normal distribution is in general more likely than a power-law distribution (Broido et al., Nat Comm, 2019), so we lack the evidence to claim that these networks are scale-free.

      We have added to the Results under “Gaussian Process model analysis uncovers nonlinear trends and specifically identifies covariance in expression between genes”: “Table S13 lists the number of connections (degrees) that each gene has with others in the network. The average number of connections for long-sleeper males was 2.6; the other three networks had average degrees of 2.0 or less (2.0 for long-sleeper females and short-sleeper males; 1.75 for short-sleeper females).”
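
      For readers unfamiliar with the terminology, node degree and average degree can be computed from a list of significant gene pairs as sketched below. The edge list uses placeholder gene names and is purely hypothetical; it does not reproduce any network from Figure 5 or Table S13.

```python
from collections import Counter


def node_degrees(edges):
    """Count how many edges touch each node of an undirected network."""
    degrees = Counter()
    for gene_a, gene_b in edges:
        degrees[gene_a] += 1
        degrees[gene_b] += 1
    return degrees


def average_degree(degrees):
    """Mean number of connections per node (equals 2 * edges / nodes)."""
    return sum(degrees.values()) / len(degrees)


# Hypothetical significant gene pairs (placeholder names, not real results).
edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneB", "geneC"),
    ("geneC", "geneD"),
]

degrees = node_degrees(edges)
mean_degree = average_degree(degrees)
print(dict(degrees), mean_degree)
```

      Note that genes with no significant partner never appear in the edge list, so an average computed this way covers only connected genes.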

      *14. On table S6 I noticed some gene symbols were loaded as dates (1-Dec) *

      Our response: We thank the reviewer for noticing this; the gene symbol is supposed to be dec. We have corrected this in Table S6 (now Table S7).

      15. *In results, the phenotypical response to artificial selection is sometimes described in minutes, other times in hours. Though this is a hurdle, it could make the values easier to compare if they were consistently formatted as minutes (hours). *

      Our response: We are unsure what the reviewer is referring to. We only see one sentence in which we used hours, and that was the concluding sentence under Results, “Phenotypic response to artificial selection.” The remainder of the manuscript refers to sleep times in minutes, phenotypes in all of the figures are plotted as minutes, and all of the supplemental material refers to times in minutes.

      16. *Over 99% of chains converged after three runs. Even though the reasons for the lack of convergence of these chains was not investigated, could this be a relevant effect? 1% of 3570 interactions is still 35 potential interactions. Do the non convergent chains relate with specific genes? *

      Our response: Bayesian MCMC inference is a stochastic algorithm, so there is a finite chance that any given run does not converge. We count a run as converged only if all eight parallel chains converge and mix, as measured by the stringent criterion of the R-hat metric being within 0.05 of unity. Relaxing the interval to 0.1 or 0.2 could still be acceptable, but we chose a stringent threshold to avoid making interpretations on less-than-ideal runs. There is no evidence of any gene-specific problem; usually it would be one out of eight chains that would not mix well and throw off the diagnostic metrics (as with relaxing the metrics, an acceptable approach could be accepting a run with 6-7 chains converging properly, but we decided to rerun all chains, accept only 100% convergence, and accept a possible loss). Non-converging/non-mixing runs are likely to converge eventually, but since we are running tens of thousands of chains (3570 pairwise combinations × 3 schemes × 8 chains), a massively parallel implementation on an HPC cluster is required. Finally, seeing that 145 is ~4% of the total number of interactions, a naïve expectation would be that no more than one of the non-converged interactions would come out significant. While there is a chance that an interesting interaction would have been identified among them, the same can be said for potential false negatives computed using the GLM, which is a consequence of working at a high-throughput scale.
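
      The convergence criterion described above can be illustrated with the classic (non-split) Gelman-Rubin R-hat. This is a simplified stand-in: Stan reports a split-chain variant, so the numbers will not match Stan's output exactly. The simulated chains, function names, and the shift applied to one chain are all assumptions made for illustration.

```python
import math
import random
import statistics


def gelman_rubin_rhat(chains):
    """Classic potential scale reduction factor for one parameter.

    chains is a list of equal-length lists of posterior draws.
    """
    draws_per_chain = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average within-chain sample variance.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B / n: sample variance of the chain means.
    between_over_n = statistics.variance(chain_means)
    pooled = (draws_per_chain - 1) / draws_per_chain * within + between_over_n
    return math.sqrt(pooled / within)


def run_converged(chains, tolerance=0.05):
    """Accept a run only if R-hat lies within the tolerance of 1."""
    return abs(gelman_rubin_rhat(chains) - 1.0) < tolerance


rng = random.Random(0)
# Eight well-mixed chains drawn from the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(8)]
# Same chains, but one of the eight is stuck around a different value.
one_stuck = [list(chain) for chain in mixed]
one_stuck[0] = [draw + 3.0 for draw in one_stuck[0]]

print(gelman_rubin_rhat(mixed), run_converged(mixed))
print(gelman_rubin_rhat(one_stuck), run_converged(one_stuck))
```

      A single poorly mixing chain out of eight is enough to push R-hat well outside the 0.05 band, consistent with the behaviour described in the response.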

      17. The GO terms identified as significantly enriched after pvalue correction point to a clear association of the 85 genes identified with Serine proteases. Could this be discussed further to highlight biological findings of the work in the context of neuronal function or sleep regulation?

      Our response: The reviewer is correct; nine putative Serine proteases are significantly enriched among the 85 genes. All nine exhibit some expression in neurons and in epithelial cells, and all are expressed at the adult stage. The appearance of these enzymes is interesting given their role in proteolysis.

      We have updated the Discussion to read, “Interestingly, our Gene Ontology analysis identified nine genes from the 85-gene network with predicted Serine endopeptidase/peptidase/hydrolase activity: CG1304, CG10472, CG14990, CG32523, CG9676, grass, Jon65Ai, Jon65Aii, and Jon99Fii. All of these genes are expressed in neurons and epithelial cells, and all genes are expressed at the adult stage (Li et al., 2022). Serine proteases are a large group of proteins (257 in Drosophila) that perform a variety of functions (Cao and Jiang, 2018). Their predicted enzymatic activity suggests a putative role in proteolysis. This is an intriguing observation given pioneering work in mammals which suggested a role for sleep in exchanging interstitial fluid and metabolites between the brain and cerebral spinal fluid (Xie et al., 2013). Recent work demonstrated that a similar function is conserved in flies via vesicular trafficking through the fly blood-brain barrier (Artiushin et al., 2018). It would be interesting to determine whether these genes function in this process.”

      *18. Could the authors discuss the little overlap between males/females and short/long sleep for 145 gene pairs identified after the MCMC runs. Similarly, how can the network differences be explained from a biological/evolutionary perspective? *

      Our response: The reviewer asks an interesting question. We did not detect sex-specific responses to artificial selection for long or short sleep in the present experiment. Yet differences in gene expression network pairs between males and females exist, and as the reviewer mentions, we also observed differences in network pairs between long sleepers and short sleepers. These differences reflect an inescapable conclusion: a given sleep duration phenotype can originate from more than one gene expression network configuration.

      19. *In the mutational analyses it is pointed out that CG12560 and Jon65Aii only affect females significantly. However, in the following sentence, the authors claim these two genes had the greatest effect on both sexes, which seems contradictory, at least in the way it is described. *

      Our response: Our wording may have been confusing, given that it came after a comment about Jon65Aii. Our exact statement was “Effects of the Minos insertions on night sleep duration were stronger in females than in males; when sexes were examined separately, only mutations in CG12560 and Jon65Aii affected male night sleep duration.” This was meant to convey that the effects of all Minos insertions were the same directionally for both males and females, but that only CG12560 and Jon65Aii insertions had statistically significant effects on each sex separately. We have re-worded this sentence to read “All Minos insertions had the same directional effect on night sleep for both males and females, but only the CG12560 and Jon65Aii insertions had statistically significant effects on night sleep on each sex separately.”

      20. *Maybe a small comment on how unchanged expression could lead to the observed phenotypical variation could help understanding how Minos mutations effects are biologically mediated for those not familiar with the method. This seems to be the authors' expectation so, could it be non-functional proteins or something else? *

      Our response: The reviewer raises an interesting point. We did not observe changes in gene expression for CG13793, Cyp6a16, or hiw compared to w1118 controls. Thus far, we have examined gene expression relative to the control for a single timepoint, and only in pooled whole flies. Differential gene expression between the Minos mutants and controls might occur at a different timepoint, or in a small set of key neurons that would be undetectable when comparing whole flies.

      We expand on this in Results, under “Mutational analyses confirms the role of candidate genes and interacting expression networks in sleep”: “Potential reasons for the lack of a significant change in gene expression in the remaining lines include: the position of the insertion within the targeted gene, which has variable effects on its expression; the relatively low statistical power of the experiment; confining our observation to a single timepoint during the day; or pooling whole flies, which might obscure gene expression changes occurring at a single-tissue level.”

      *21. The assumption that interacting genes would have their expression ratio changed by the Minos insertion would hold in situations where the affected gene causally interferes with the candidate's expression. As far as I understand, causality cannot be inferred by the proposed method. Thus in a situation where both genes are co-regulated by a third factor, no change in expression ratio is to be expected. How would the authors re-interpret their final result when considering this direct vs indirect interaction distinction? *

      Our response: Our method only gives us the hypothesis that two genes interact based on their correlation, and that is what we test using the Minos insertions. We do not as yet have a way to identify a third gene or factor that might be regulating the two. Given the number of genes affecting sleep, it is quite likely that there are such factors, but we can only report and test what we’ve observed. Any interpretation based on an arbitrary third factor would be purely speculative.

      **Referees cross-commenting**

      22. *I agree with Reviewer #2 comments which, to me, reads as generally pointing out the lack of biological interpretation of the results (and thus connecting this study with previous literature). Adding this component would make the manuscript well-rounded and attractive to a wider audience. *

      Our response: We agree with both reviewers that additional biological interpretation of the results would make the manuscript more attractive to a wider audience. Accordingly, we have added the following paragraph to the Discussion: “The genes we identify herein overlap and extend previous work. Of the 1,140 genes implicated in the generalized linear model, 151 (13.2 percent) overlapped with previous candidate gene, random mutagenesis, gene expression, and genome-wide association studies of sleep and circadian behavior in flies (Pegoraro et al., 2022; Dissel et al., 2015; Seugnet et al., 2017; Shalaby et al., 2018; Thimgan et al., 2010; Thimgan et al., 2018; He et al., 2013; Mallon et al., 2014; Roessingh et al., 2019; Feng et al., 2018; Lee et al., 2021; Khoury et al., 2020; Wu et al., 2018; Harbison et al., 2013; Harbison et al., 2009; Harbison et al., 2017; Harbison et al., 2019). Notably, previous studies identified the genes CG17574, cry, dro, mip120, Mtk, NPFR1, pdgy, PGRP-LC, Shal, and vari as affecting sleep duration (Feng et al., 2018; Dissel et al., 2015; Pegoraro et al., 2022; Thimgan et al., 2018; Mallon et al., 2014; He et al., 2013; Khoury et al., 2020; Harbison et al., 2013). Two genes, ringer and mip120, overlapped with our previous study of DNA sequence variation in flies selected for long and short sleep (Harbison et al., 2017). In that study we identified a polymorphism in an intron of ringer that changed in allele frequency with selection, with increases in the population frequency of the ‘G’ allele with increasing sleep, while the frequency of the ‘A’ allele increased with decreasing sleep. When the selective breeding procedure was relaxed, the frequency of the ‘G’ allele increased in short-sleeping populations, paralleling an increase in sleep (Souto-Maior et al., 2020). One possibility is that this polymorphism contributes to the changes in gene expression in ringer that we observed in the present study.
      Of the 85 genes common to both sexes that we used in the gene interaction networks, 11 (13 percent) appear in other studies of sleep: CG10444, CG2003, CG5142, CG6785, CG9114, CG9676, CR42646, hiw, NPFR1, Tie, and wb (He et al., 2013; Seugnet et al., 2017; Wu et al., 2018; Harbison et al., 2013). Thus, our study corroborates genes known to affect sleep, and identifies new candidate genes for sleep as well.”

      Reviewer #1 (Significance (Required)):

      *This study proposes the application of advanced non-linear methods to study complex traits such as sleep. As implemented, Gaussian Processes are able to identify non-linear correlations between two biological features (e.g. transcripts) over time (e.g. generations), representing an attempt to push the analytical methods available beyond the single gene paradigm. As such, more than the relevance of the biological results themselves, the authors focus on explaining and illustrating the application of the methodological advances obtained, and its relevance to obtain a better understanding of biological systems.

      However the mathematical principles required to understand the implemented method are not trivial and require advanced knowledge of machine learning and statistics. This is a potential barrier, though not an impediment, to its quick and wide adoption by the community. In addition, even if demonstrated to be a valid method when working with Drosophila, the resolution required to perform such a study may be difficult to obtain with other model systems, which would likely require further refinement of the statistical approach.

      The main audience interested in this work would be basic sleep researchers. However, this work is also related to understanding gene selection over an artificial evolutionary process, thus evolutionary and developmental biologists may also be interested. The methodology itself, already used in other fields of study, is a general statistical tool that could be adopted by a broad range of researchers for a diversity of topics. As such, I believe with this work, the authors will be able to stimulate the development and/or rethinking of the available analytical methods to study complex biological systems, though this would likely be done either in collaboration with the authors themselves or by a specific subset of researchers who regularly work with advanced mathematical, statistical and computational principles.

      (disclaimer) My mathematical formation does not reach the PhD level expertise that may be required to fully understand the methodology described. I have never personally worked with D. melanogaster or used Gaussian Processes in a professional setting. As such, I may not be able to fully evaluate/appreciate the more detailed technical aspects of this work.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Souto-Maior et al. report phenotypic and genotypic effects of artificially selecting for short and long sleep in flies. They generated an impressive time-series dataset where one could examine genetic and phenotypic changes across time (generations, total 13 generations) in response to the selection pressure. The authors explored the relationships between pairs of genes in addition to just identifying potential candidate genes involved in the regulation of the amount of sleep.

      Major points:

      1. Harbison et al 2017: This study seems to be a continuation of Harbison et al 2017. There needs to be a clearer approach in the text (introduction?) in elucidating how this study is really advancing the findings of Harbison et al., 2017. Do the two studies use the same selection lines? If not, how are they different? If they are not different, what might cause the phenotypes to evolve differently? For example, day sleep, day bout number do not respond to the selection pressure similarly in both studies etc.

      Our response: We would like to emphasize that this study is not a continuation of the Harbison et al., PLoS Genetics, 2017 paper, where we examined the changes in DNA sequence during artificial selection, and it does not use the same selection lines. The fact that the two studies are different can be seen from an examination of Figure 1A of the current study and Figure 1A of the Harbison et al 2017 study. The trajectories of each population across generations are very different. Out of convenience, we used the same nomenclature to refer to the populations in both studies (L1, L2, S1, S2, etc.), and apologize if this is the source of the confusion. Both studies do originate from the same outbred population, however. This leads to the broader question that the reviewer is asking: should one expect to see the same correlated responses to selection for night sleep among selection lines originating from the same outbred population? The answer is no, not unless the selected trait and the responding trait have a genetic correlation of 1.0. We previously estimated the correlation between day sleep and night sleep to be between 0.29 and 0.38 and between day bout number and night sleep to be -0.05 (Harbison et al., 2013; Harbison et al., 2009). In the Harbison et al. 2017 study we noted that day sleep and day bout number had correlated responses to selection for night sleep, but neither has a correlated response in the current study. The relatively low genetic correlations between these two measures and night sleep explain why we do not see a consistent correlated response among studies.

      We didn’t really elaborate on these observations in the manuscript, and so have added to the Results under “Correlated response of other sleep traits to selection for night sleep” the following: “These correlated responses concur with previous observations we made in selected populations originating from the same outbred population for night sleep and night average bout length, and night sleep and sleep latency (Harbison et al., 2017). However, unlike the previous study, we did not see a correlated response between night sleep and day sleep, and night sleep and day bout number (Harbison et al., 2017). The lack of correlated response reflects the relatively low genetic correlations these two traits have with night sleep (Harbison et al., 2013; Harbison et al., 2009).”

      2. Zeitgeber Time (ZT) for RNA collection: It is puzzling that the study reports that the RNA was collected at 12 PM. I do not understand what this information means; especially in a project where one is working with sleep. The authors might want to report ZT. Also, why a particular ZT was chosen should be discussed. These genes are potential sleep-relevant genes - hence it is not too esoteric to think that the ZT of data collection matters a lot as some of them might be cycling. To get a more appropriate picture, multiple time points of data collection might be even better. The authors seem to have ignored this crucial aspect of a clock/sleep study - time of data collection and how time of data collection might shape your findings.

      Our response: We agree with the reviewer that it would be better to have multiple timepoints for collection, but this is difficult to implement in practice as it would require an additional 5,280 flies per generation (4 pools of 10 flies per sex per population) for 12 timepoints as recommended by Hughes et al., JBR, 2017. We mention collection time in the Methods and Materials because we are aware of the changes in gene expression over the circadian day. 12PM is the midpoint between the start of the lights-on and lights-off period (i.e., ZT6), and was chosen arbitrarily. We have added the ZT notation to the Methods and Materials for clarity.
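      The fly count and the ZT conversion quoted above can be checked with a few lines of arithmetic. This is a minimal sketch, not the authors' calculation: the pool size, pools per sex, and the 12-timepoint target come from the response, while the number of populations (6), the assumption that one of the 12 timepoints is already being collected, and a lights-on time of 06:00 are illustrative assumptions chosen because they reproduce the stated figures.

```python
# Per-generation fly count for extra RNA collection timepoints.
# Stated in the response: 4 pools of 10 flies per sex per population.
# ASSUMED (not stated in the response): 6 populations, and 11 timepoints
# added to the single existing one to reach 12 in total.
POOLS = 4
FLIES_PER_POOL = 10
SEXES = 2
POPULATIONS = 6            # assumption
EXTRA_TIMEPOINTS = 12 - 1  # assumption: one timepoint is already collected


def extra_flies_per_generation():
    """Additional flies needed per generation under the assumptions above."""
    per_population_per_timepoint = POOLS * FLIES_PER_POOL * SEXES
    return per_population_per_timepoint * POPULATIONS * EXTRA_TIMEPOINTS


def zeitgeber_time(clock_hour, lights_on_hour):
    """Hours elapsed since lights-on, wrapped to a 24-hour day."""
    return (clock_hour - lights_on_hour) % 24


print(extra_flies_per_generation())  # 5280
print(zeitgeber_time(12, 6))         # 6, assuming lights-on at 06:00
```

      Under these assumptions the count matches the 5,280 flies cited, and a 12 PM collection corresponds to ZT6.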

      3. Short sleeping flies: Are there reports of flies sleeping this little? "We found 2,830 interactions; 8 of these were one of the 3,570 between the 85 genes, but none of them overlapped with the 145 gene pairs found to be different from controls. The gene interactions we observed may therefore be unique to extreme sleep." What is extreme sleep? How does this study then claim to have identified evolution of potential sleep-relevant gene expression for normal, physiologically relevant sleep?

      Our response: Our statement was not very well worded, and we thank the reviewer for noticing this. What we intended to say was that the lack of overlap between our data and a known protein-protein interaction database may be due to the interactions being unique to sleep as opposed to some other complex trait. We have re-worded this statement to say “The gene interactions we observed may therefore be unique to sleep.”
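      For readers checking the numbers in the quoted sentence, the 3,570 possible interactions equal the number of unordered pairs that can be formed from 85 genes. The sketch below only verifies that count; the function name is illustrative and not taken from the manuscript.

```python
from math import comb


def unordered_pairs(n_genes):
    """Number of distinct gene pairs, ignoring order and self-pairs."""
    return comb(n_genes, 2)


print(unordered_pairs(85))  # 3570
```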

      Minor points:

      4. The article uses an unnecessarily defensive tone to establish their approach to understand underlying mechanisms of sleep 'better' than that of the others (in both introduction and discussion): "In spite the large amount of studies and data generated for many systems, identifying underlying processes is still very rare; this is clear indication that better methods are needed to obtain understanding of biological processes from data." The 'still very rare' part is just factually incorrect and misleading as far as sleep is concerned. Even if we look only at Drosophila studies on sleep, huge progress is being made in terms of genes, neurons and circuits relevant for sleep: both in terms of baseline sleep as an output of the circadian clock and the rebound/homeostatic sleep. Most, if not all, of these elegant and pioneering studies from multiple, independent groups took approaches that did not require artificial selection regimes. As a substitution for their defense, the authors might attempt to present their findings in the context of the existing knowledge of sleep in flies. For example, what about genes already implicated in sleep? Do they show up in their analysis? For example, Sleepless, DATfmn, Sandman, AstA, AstA-receptor, Wide-awake etc. This could really help the manuscript.

      Our response: We certainly did not intend for this statement to suggest that no progress had been made in the identification of genes and circuits for sleep, and we agree that elegant and pioneering approaches have made significant progress in our understanding of the phenomenon. Rather, we were thinking more in terms of fully described biochemical networks. To avoid this interpretation by other readers, we have altered the “still very rare” sentence in the Introduction to read: “Despite the large amount of studies and data generated for many systems, a full understanding of underlying processes has not yet been achieved…”

      We also agree with the reviewer that it would be helpful to put our work in the context of what is already known in flies. We have added the following paragraph to the Discussion to relate the work with previous work on sleep in flies: “The genes we identify herein overlap and extend previous work. Of the 1,140 genes implicated in the generalized linear model, 151 (13.2 percent) overlapped with previous candidate gene, random mutagenesis, gene expression, and genome-wide association studies of sleep and circadian behavior in flies (Pegoraro et al., 2022; Dissel et al., 2015; Seugnet et al., 2017; Shalaby et al., 2018; Thimgan et al., 2010; Thimgan et al., 2018; He et al., 2013; Mallon et al., 2014; Roessingh et al., 2019; Feng et al., 2018; Lee et al., 2021; Khoury et al., 2020; Wu et al., 2018; Harbison et al., 2013; Harbison et al., 2009; Harbison et al., 2017; Harbison et al., 2019). Notably, previous studies identified the genes CG17574, cry, dro, mip120, Mtk, NPFR1, pdgy, PGRP-LC, Shal, and vari as affecting sleep duration (Feng et al., 2018; Dissel et al., 2015; Pegoraro et al., 2022; Thimgan et al., 2018; Mallon et al., 2014; He et al., 2013; Khoury et al., 2020; Harbison et al., 2013). Two genes, ringer and mip120, overlapped with our previous study of DNA sequence variation in flies selected for long and short sleep (Harbison et al., 2017). In that study we identified a polymorphism in an intron of ringer that changed in allele frequency with selection, with increases in the population frequency of the ‘G’ allele with increasing sleep, while the frequency of the ‘A’ allele increased with decreasing sleep. When the selective breeding procedure was relaxed, the frequency of the ‘G’ allele increased in short-sleeping populations, paralleling an increase in sleep (Souto-Maior et al., 2020). One possibility is that this polymorphism contributes to the changes in gene expression in ringer that we observed in the present study.
Of the 85 genes common to both sexes that we used in the gene interaction networks, 11 (13 percent) appear in other studies of sleep: CG10444, CG2003, CG5142, CG6785, CG9114, CG9676, CR42646, hiw, NPFR1, Tie, and wb (He et al., 2013; Seugnet et al., 2017; Wu et al., 2018; Harbison et al., 2013). Thus, our study corroborates genes known to affect sleep, and identifies new candidate genes for sleep as well.”

      Reviewer #2 (Significance (Required)):

      5. I believe that the authors should attempt to put this study in the context of what is already known in sleep in flies and how this study advances the knowledge. And how the knowledge generated by this study would help other sleep researchers, who, for obvious reasons, would like to employ techniques other than artificial selection and big data. The data is elegant. The work seems to be extremely laborious. Nonetheless, as it stands now, this manuscript is only very specific for an audience who work with artificial selection to understand underlying genetics of behavior. In fact, even within the fly sleep field, most people might not find this manuscript very useful.

      Our response: The reviewer may not have considered the wider application of this work. This framework is applicable to any data set of gene expression sampled across time, whether sampled across generation, as we did, or across the 24-hour circadian day, or sampled at other time intervals. We have added a statement to the Discussion to stress this fact: “The Gaussian Processes we apply herein have broad applications to other experimental designs, such as gene expression measured at varying time intervals over the circadian day, or time-based sampling of gene expression responses to drug administration.”

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Souto-Maior et al. report phenotypic and genotypic effects of artificially selecting for short and long sleep in flies. They generated an impressive time-series dataset where one could examine genetic and phenotypic changes across time (generations, total 13 generations) in response to the selection pressure. The authors explored the relationships between pairs of genes in addition to just identifying potential candidate genes involved in the regulation of the amount of sleep.

      Major points:

      1. Harbison et al 2017: This study seems to be a continuation of Harbison et al 2017. There needs to be a clearer approach in the text (introduction?) in elucidating how this study is really advancing the findings of Harbison et al., 2017. Do the two studies use the same selection lines? If not, how are they different? If they are not different, what might cause the phenotypes to evolve differently? For example, day sleep, day bout number do not respond to the selection pressure similarly in both studies etc.
      2. Zeitgeber Time (ZT) for RNA collection: It is puzzling that the study reports that the RNA was collected at 12 PM. I do not understand what this information means; especially in a project where one is working with sleep. The authors might want to report ZT. Also, why a particular ZT was chosen should be discussed. These genes are potential sleep-relevant genes - hence it is not too esoteric to think that the ZT of data collection matters a lot as some of them might be cycling. To get a more appropriate picture, multiple time points of data collection might be even better. The authors seem to have ignored this crucial aspect of a clock/sleep study - time of data collection and how time of data collection might shape your findings.
      3. Short sleeping flies: Are there reports of flies sleeping this little? "We found 2,830 interactions; 8 of these were one of the 3,570 between the 85 genes, but none of them overlapped with the 145 gene pairs found to be different from controls. The gene interactions we observed may therefore be unique to extreme sleep." What is extreme sleep? How does this study then claim to have identified evolution of potential sleep-relevant gene expression for normal, physiologically relevant sleep?

      Minor points:

      The article uses an unnecessarily defensive tone to establish their approach to understand underlying mechanisms of sleep 'better' than that of the others (in both introduction and discussion): "In spite the large amount of studies and data generated for many systems, identifying underlying processes is still very rare; this is clear indication that better methods are needed to obtain understanding of biological processes from data." The 'still very rare' part is just factually incorrect and misleading as far as sleep is concerned. Even if we look only at Drosophila studies on sleep, huge progress is being made in terms of genes, neurons and circuits relevant for sleep: both in terms of baseline sleep as an output of the circadian clock and the rebound/homeostatic sleep. Most, if not all, of these elegant and pioneering studies from multiple, independent groups took approaches that did not require artificial selection regimes. As a substitution for their defense, the authors might attempt to present their findings in the context of the existing knowledge of sleep in flies. For example, what about genes already implicated in sleep? Do they show up in their analysis? For example, Sleepless, DATfmn, Sandman, AstA, AstA-receptor, Wide-awake etc. This could really help the manuscript.

      Significance

      I believe that the authors should attempt to put this study in the context of what is already known in sleep in flies and how this study advances the knowledge. And how the knowledge generated by this study would help other sleep researchers, who, for obvious reasons, would like to employ techniques other than artificial selection and big data.

      The data is elegant. The work seems to be extremely laborious. Nonetheless, as it stands now, this manuscript is only very specific for an audience who work with artificial selection to understand underlying genetics of behavior. In fact, even within the fly sleep field, most people might not find this manuscript very useful.

    1. Author Response

      Reviewer #1 (Public Review):

      Doostani et al. present work in which they use fMRI to explore the role of normalization in V1, LO, PFs, EBA, and PPA. The goal of the manuscript is to provide experimental evidence of divisive normalization of neural responses in the human brain. The manuscript is well written and clear in its intentions; however, it is not comprehensive and is limited in its interpretation. The manuscript is limited to two simple figures that support its conclusions. There is no report of behavior, so there is no way to know whether participants followed instructions. This is important as the study focuses on object-based attention and the analysis depends on the task manipulation. The manuscript does not show any clear progression towards the conclusions and this makes it difficult to assess its scientific quality and the claims that it makes.

      Strengths:

      The intentions of the paper are clear and the design of the experiment itself is simple to follow. The paper presents some evidence for normalization in V1, LO, PFs, EBA, and PPA. The presented study has laid the foundation for a piece of work that could have importance for the field once it is fleshed out.

      Weakness:

      The paper claims that it provides compelling evidence for normalization in the human brain. Very broadly, the presented data support this conclusion; for the most part, the normalization model is better than the weighted sum model and a weighted average model. However, the paper is limited in how it works its way up to this conclusion. There is no interpretation of how the data should look based on expectations, just how it does look, and how/why the normalization model is most similar to the data. The paper shows a bias in focusing on visualization of the 'best' data/areas that support the conclusions whereas the data that are not as clear are minimized, yet the conclusions seem to lump all the areas in together and any nuanced differences are not recognized. It is surprising that the manuscript does not present illustrative examples of BOLD series from voxel responses across conditions given that it is stated that it is modeling responses to single voxels; these responses need to be provided for the readers to get some sense of data quality. There are also issues regarding the statistics; the statistics in the paper are not explicitly stated, and from what information is provided (multiple t-tests?), they seem to be incorrect. Last, but not least, there is no report of behavior, so it is not possible to assess the success of the attentional manipulation.

      We appreciate the reviewer’s feedback on providing more information so that the scientific quality of our work can be assessed. We have now added a new figure including BOLD responses in different conditions, as well as how we expected the data to look and the interpretations. To provide extra evidence for data quality and reliability, we have included BOLD responses of different conditions for odd and even runs separately in the supplementary information.

      In order to avoid any bias in presentation, we have now visualized the results from all areas with the same size and in a more logical order. However, we have also modified all results to include only those voxels in each ROI that were active for the stimuli presented in the main task based on the comment of one of the reviewers. According to the current results, there is no difference in the efficiency of the normalization model in different regions, which we have reported in the results section.

      Regarding the statistics, we have corrected the problem. We have performed ANOVA tests, have corrected all results for multiple comparisons, and have added a statistics subsection in the methods section to explicitly explain the statistics.

      Finally, we have added the report of the reaction time and accuracy in the results section and the supplementary information. As stated, average performance was above 86% in all conditions, confirming that the participants correctly followed the instructions and that the attentional manipulation was successful.

      We hope that the reviewer would find the manuscript improved and that the new analyses, figures, and discussions would address the reviewer’s concerns.

      Reviewer #2 (Public Review):

      My main concern regarding the interpretation of these results has to do with the sparseness of data available to fit with the models. The authors pit two linear models against a nonlinear (normalization) model. The weighted average and summed models are both linear models doomed to poorly match the fMRI data, particularly in contrast to the nonlinear model. So, while I appreciate the verification that responses to multiple stimuli don't add up or average each other, the model comparisons seem less interesting in this light. This is a particularly salient issue because the model testing endeavor seems rather unconstrained. A 'true' test of the model would likely need a whole range of contrasts tested for one (or both) of the stimuli. Otherwise, as it stands we simply have a parameter (sigma) that instantly gives more wiggle room than the other models. It would be fairer to pit this normalization model against other nonlinear models. Indeed, this has already been done in previous work by Kendrick Kay, Jon Winawer and Serge Dumoulin's groups. So far, my concern above has only been in regards to the "unattended" data. But the same issue of course extends to the attended conditions. I think the authors need to either acknowledge the limits of this approach to testing the model or introduce some other frameworks.

      We thank the reviewer for their feedback. We have taken two approaches to answer this concern. First, we have included simulations of neural population responses to attended and unattended stimuli. The results demonstrate that with our cross-validation approach, the normalization model is only a better fit if the computation performed at the neural level for multiple-stimulus responses is divisive normalization. Otherwise, the weighted sum or the weighted average models are better fits to the population response when the neurons respectively sum or average responses. These results suggest that the normalization model provides a better fit to the data because the underlying computation performed by the neurons is divisive normalization, not because of the model’s non-linearity.

      In a second approach, we tested a nonlinear model, which was a generalization of the weighted sum and the weighted average models with an extra saturation parameter (with even more parameters than the normalization model). The results demonstrated that this model was also a worse fit than the normalization model.
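      To make the three candidate combination rules concrete, here is a minimal sketch using common textbook forms of each rule. These are assumptions for illustration only: the response does not give the authors' exact parameterization, and the function names, weights, contrasts (c1, c2), and semi-saturation constant (sigma) below are hypothetical.

```python
def weighted_sum(r1, r2, w1=1.0, w2=1.0):
    """Pair response as a weighted sum of the isolated responses."""
    return w1 * r1 + w2 * r2


def weighted_average(r1, r2, w=0.5):
    """Pair response as a weighted average of the isolated responses."""
    return w * r1 + (1.0 - w) * r2


def normalization(l1, l2, c1=1.0, c2=1.0, sigma=0.0):
    """Divisive normalization: summed drive divided by pooled contrast plus sigma."""
    return (c1 * l1 + c2 * l2) / (c1 + c2 + sigma)


# Hypothetical drives for a preferred and a null stimulus.
preferred, null = 2.0, 1.0
for sigma in (0.0, 1.0, 10.0):
    print(sigma, normalization(preferred, null, sigma=sigma))
print(weighted_average(preferred, null), weighted_sum(preferred, null))
```

      In this generic form, equal contrasts with sigma set to zero reduce the normalization rule to a plain average, while a large sigma makes it behave like a scaled sum. That is why the reviewer regards sigma as extra flexibility, and why the simulations described above are needed to show that the better fit is not a by-product of that flexibility.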

      Regarding the reviewer’s comment on testing for a range of contrasts, as we have emphasized now in the discussion, here, we have used single-, multiple-, attended- and unattended-stimulus conditions to explore the change in response and how the normalization model accounts for the observed changes in different conditions. While testing for a range of contrasts would also be interesting, it would need a multi-session fMRI experiment to test for a range of contrasts with isolated and paired stimulus conditions in the presence and absence of attention. Moreover, the role of contrast in normalization has been investigated in previous studies, and here we added to the existing literature by exploring responses to multiple objects, and investigating the role of attention. Finally, since the design of our experiment includes presenting superimposed stimuli, the range of contrasts we can use is limited. Low-contrast superimposed stimuli cannot be easily distinguished, and high-contrast stimuli block each other.

      We hope that the reviewer would find the manuscript improved and that the new models, simulations, analyses, and discussions would address the reviewer’s concerns.

      Reviewer #3 (Public Review):

      In this paper, the authors model brain responses for visual objects and the effect of attention on these brain responses. The authors compare three models that have been studied in the literature to account for the effect of attention on brain responses to multiple stimuli: a normalization model, a weighted average model, and a weighted sum model.

      The authors presented human volunteers with images of houses and bodies, presented in isolation or together, and measured fMRI brain activity. The authors fit the fMRI data to the predictions of these three models, and argue that the normalization model best accounts for the data.

      The strengths of this study include a relatively large number of participants (N=19), and data collected in a variety of different visual brain regions. The blocked design paradigm and the large number of fMRI runs enhance the quality of the dataset.

      Regarding the interpretation of the findings, there are a few points that should be considered: 1) The different models that are being studied have different numbers of free parameters. The normalization model has the highest number of free parameters, and it turns out to fit the data the best. Thus, the main finding could be due to the larger number of parameters in the model. The more parameters a model has, the higher "capacity" it has to potentially fit a dataset. 2) In the abstract, the authors claim that the normalization model best fits the data. However, on closer inspection, this does not appear to be the case systematically in all conditions, but rather more so in the attended conditions. In some of the other conditions, the weighted average model also appears to provide a reasonable fit, suggesting that the normalization model may be particularly relevant to modeling the effects of attention. 3) In the primary results, the data are collapsed across five different conditions (isolated/attended for preferred and null stimuli), making it difficult to determine how each model fares in each condition. It would be helpful to provide data separately for the different conditions.

      We thank the reviewer for their feedback.

      Regarding the reviewer’s concern about the number of free parameters, we have introduced a simulation approach, demonstrating that with our cross-validation approach, a model with a higher number of parameters is not a good fit when the underlying neural computation does not match the computation performed by the model. Moreover, we have now included another nonlinear model with 5 parameters that performs worse than the normalization model. Besides, we have used the AIC measure in addition to cross-validation for model comparison, and the AIC measure confirms the previous results.
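      As an illustration of how AIC penalizes additional parameters, the sketch below uses the standard least-squares form of the criterion under Gaussian errors, AIC = n ln(RSS/n) + 2k, which holds up to an additive constant that is identical across models fitted to the same data. The residual values and parameter counts are hypothetical and are not taken from the study.

```python
import math


def aic_least_squares(residuals, n_params):
    """AIC for a least-squares fit with Gaussian errors (up to a constant)."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params


# Hypothetical residuals from two models fitted to the same six data points.
simple_model = [0.30, -0.25, 0.20, -0.35, 0.28, -0.22]   # 3 free parameters
complex_model = [0.29, -0.24, 0.20, -0.34, 0.27, -0.22]  # 5 free parameters

print(aic_least_squares(simple_model, 3))
print(aic_least_squares(complex_model, 5))
```

      The lower value is preferred. Each extra parameter adds 2 to the score, so a model with more parameters is favoured only if it reduces the residual error by enough to offset that penalty.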

      Regarding the difference in the efficiency of the normalization model across conditions, after selecting the voxels that were active during the main task in each ROI (done according to the suggestion of one of the reviewers to compensate for the difference in size of localizer and task stimuli), we observed that the normalization model was a better fit for both attended and unattended conditions. However, since the weighted average model results were also close to the data in unattended conditions, we have discussed the unattended condition separately and have discussed the relevance of our results to previous reports of multiple-stimulus responses in the absence of attention.

      Finally, concerning model comparison for different conditions, we have calculated the models’ goodness of fit across conditions for each voxel. The reason for calculating the goodness of fit in this manner was to evaluate model fits based on their ability in predicting response changes with the addition of a second stimulus and with the shifts of attention. Since correlation is blind to a systematic error in prediction for all voxels in a condition, calculating the goodness of fit across voxels would lead to misinterpretation. We have now included a figure in the supplementary information illustrating the method we used for calculating the goodness of fit.
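      The point about correlation being blind to systematic error can be shown with a small numerical sketch. The values below are hypothetical voxel responses within a single condition, and the helper functions are illustrative rather than the authors' analysis code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rmse(x, y):
    """Root-mean-square error between observations and predictions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical responses of five voxels in one condition.
observed = [1.0, 1.4, 0.8, 1.9, 1.2]
# A model that overestimates every voxel by the same amount.
predicted = [value + 0.5 for value in observed]

print(round(pearson_r(observed, predicted), 6))
print(round(rmse(observed, predicted), 6))
```

      The correlation across voxels stays at its maximum even though every prediction is off by the same amount, which is the kind of misinterpretation the across-condition calculation is meant to avoid.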

      We hope that the reviewer would find the manuscript improved and that the new analyses, simulations, figures, and discussions would address the reviewer’s concerns.

    2. Reviewer #2 (Public Review):

      My main concern regarding the interpretation of these results has to do with the sparseness of data available to fit with the models. The authors pit two linear models against a nonlinear (normalization) model. The weighted average and summed models are both linear models doomed to poorly match the fMRI data, particularly in contrast to the nonlinear model. So, while I appreciate the verification that responses to multiple stimuli don't add up or average each other, the model comparisons seem less interesting in this light. This is a particularly salient issue because the model testing endeavor seems rather unconstrained. A 'true' test of the model would likely need a whole range of contrasts tested for one (or both) of the stimuli. Otherwise, as it stands we simply have a parameter (sigma) that instantly gives more wiggle room than the other models. It would be fairer to pit this normalization model against other nonlinear models. Indeed, this has already been done in previous work by Kendrick Kay, Jon Winawer and Serge Dumoulin's groups. So far, my concern above has only been in regards to the "unattended" data. But the same issue of course extends to the attended conditions. I think the authors need to either acknowledge the limits of this approach to testing the model or introduce some other frameworks.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We are very grateful to the reviewers for their constructive comments. In response to their critiques, we have made extensive modifications to the manuscript, including documenting new experiments and analyses, and improving data presentation. Here we provide a point-by-point response to the reviewers’ comments.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      It is well established that localization of oskar (osk) RNA in the Drosophila ovary proceeds in multiple steps. The first step depends upon dynein and results in delivery of osk into the oocyte. The second step involves kinesin-driven transport of osk to the oocyte posterior pole. The manuscript by Gáspár et al brings together several lines of evidence that support an antagonistic relationship with respect to motor binding between two osk-interacting proteins, Egalitarian (Egl) and Staufen (Stau). As staufen RNA and protein accumulate in the oocyte, Egl dissociates from osk, down-regulating dynein and enabling the second stage of osk transport to begin.

      Major comments:

      In general the experimental results support the conclusions drawn, and the paper includes a strong mix of in vitro and in vivo approaches. Nevertheless I have a few concerns.

      (1)In Fig 1D it is apparent that stau KD increases the speed of both plus-end and minus-end runs to a highly significant degree, not just minus-end runs. The stimulating effect of loss of Stau on speed of plus-end runs is not mentioned in the text, and it perhaps muddies the argument that Stau is simply a negative regulator of dynein-dependent minus-end directed transport. This result needs to be explicitly discussed in the text.

      We thank the reviewer for this important comment. Indeed, our previous analysis of the overall population of oskar RNPs showed that plus-end-directed runs had increased velocity in the absence of Staufen (although the magnitude of the effect was considerably smaller than observed for minus-end-directed runs). The reviewer’s comment prompted us to analyze the effects on motility in more detail. In particular, we have now stratified the data based on the RNA content of the RNPs to control for effects of Staufen depletion on RNA copy number of the motile oskar RNPs. These analyses, which are documented in Fig 1B-F of the revised manuscript and discussed between lines 96-143, indicate that the previous velocity and run length data were somewhat confounded by the Staufen-depleted condition having a lower fraction of moving complexes with a large RNA content, which generally move more slowly. Accounting for this effect shows that impairing Staufen has no significant effect on plus-end-directed run lengths, whereas minus-end-directed run lengths are substantially increased. The velocity of runs is also specifically increased in the minus-end direction in the Staufen-depleted background for RNPs that have a relative RNA content of 1 or 2 units, which represent the majority of the RNP population in that genotype. Whilst RNPs with larger RNA content (2 relative units) do have significantly higher plus-end-directed velocity compared to the same category in the control, the effect is of much smaller magnitude than observed for minus-end-directed movements by this population. To help clarify these results, magnitudes of the effects are now shown in the new Fig. 1 E and F.

      These data strengthen the case that Staufen predominantly affects minus-end-directed motion. Given many documented examples of the interdependence of dynein and kinesin on bidirectional cargoes (Hancock et al. 2014), it is conceivable that the modest effects on plus-end-directed velocity for a subset of RNPs arise indirectly from the influence of Staufen on dynein activity. However, we agree with the reviewer that we should not rule out the alternative possibility that Staufen has additional roles in regulating oskar transport, including potentially modulating kinesin-1 directly. We have therefore added a section to the Discussion that covers this issue (lines 496-514).

      (2) I recognize the importance of quantitative imaging to rigorously measure small differences in localization patterns. Nevertheless I find the data in Fig 3 extremely difficult to interpret. Presumably there is standard deviation everywhere there is green signal, but the magenta signal that corresponds to SD is not visible in most places that are green. I suggest adding to Fig 3 a single representative image for each genotype to illustrate each localization pattern, as well as a much clearer explanation of the quantitative imaging data. Perhaps the quantitative images could be moved to a supplemental figure.

      Reviewer 2 also suggested that we include representative images in addition to the quantitative readout. We have now replaced the old Figure 3 with a new one showing representative examples of oskar distribution in the different genotypes and moved the quantitative images to the supplement (Figure S4). We have also improved the legends and labeling of this supplementary figure to add clarity.

      **Minor comments:**

      (1) Color/density scales should be added to Figs 1A and S1A, otherwise the yellow/white signal at the posterior could be interpreted as something other than high abundance.

      We thank the reviewer for spotting this. We have now added a color scale to the relevant figures.

      (2) In Fig 4A and 4C, I find it odd to have different halves of images photographed under different intensity settings and would prefer duplicate whole images.

      We used this layout to illustrate in the most compact way possible the (co)localization of the two RBPs and oskar RNA in the nurse cell and oocyte compartments, where signal intensities can differ dramatically. Following the reviewer’s comment, we now show whole images with different intensity settings (Figure 4 A, A’, C, C’).

      (3) The references to Fig 3G on page 13 should be corrected to Fig 4G.

      We thank the reviewer for spotting this error, which has now been corrected.

      Reviewer #1 (Significance (Required)):

      The paper represents a substantial advance over existing knowledge and it extends our understanding about how RNAs can shuttle between different motor proteins to achieve a localized pattern. However, the Mohr et al 2021 PLoS Genetics paper covers some of the same ground. As that paper has now been published for several months, I believe a revised version of this paper should discuss that other work more prominently, making it apparent where the two studies concur and where this study extends the conclusions of the other one. If there are any contradictions between the two, those should be made explicit as well.

      In our manuscript, we had discussed the Mohr et al. study, which came out while our work was in preparation. Following the reviewer’s comment, we now address explicitly how our study differs from theirs and how our work extends their findings. The relevant paragraphs in the Discussion begin on lines 437 and 496. Briefly, a key point of difference is that Mohr et al. focused on the Transport and Anchoring Sequence (TAS) (including its ability to associate with Egl) and other Staufen recognition sites (SRSs) in oskar mRNA. Their study also includes an experiment examining the effect of Egl overexpression on oskar localization (as described in our original submission). In contrast, our study directly examines the interplay between the RBPs Staufen and Egl on oskar RNPs. We are the first to show that Staufen directly antagonizes dynein-based transport and that this is associated, at least in part, with an ability to impair Egl association with RNPs. Moreover, we provide insights into the in vivo role of Egl/BicD in recruitment vs activation of dynein on RNPs and how the activity of Staufen is coordinated in space and time via Egl-mediated delivery of stau mRNA, which constitutes a novel type of feed-forward mechanism. We do not believe there are any contradictions between the two studies.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Gáspár et al. investigated the molecular mechanisms underlying the switching of motors for osk mRNA transport in the Drosophila ovary: from dynein in the nurse cells to kinesin-1 in the oocyte. They demonstrated that it requires two RNA-binding proteins, Egalitarian (Egl) and Staufen (Stau) to achieve the posterior localization of osk mRNA in the oocyte. Their data show that Egl is responsible for the stau mRNA transport into the oocyte, while Stau protein inhibits Egl-dependent dynein transport in the oocyte. Thus, they proposed a feed-forward mechanism in which Egl transports mRNA encoding its own antagonist Stau into the oocyte and thus achieves the switch of the osk mRNA transport from dynein to kinesin-1.

      The antagonistic interaction between Egl and Staufen is well documented both in vitro and in vivo. All the results are carefully analyzed, but the data presentation is not reader-friendly. Overall, our main concern is about the role of Staufen in osk mRNA transport.

      **Here are specific points:**

      (1) According to the model, lack of Stau should result in failure to displace Egl from the RNP complex and thus more dynein-driven transport in the oocyte. However, the increase of minus-end run length in stau-RNAi is very small (Figure 1E). It makes us wonder whether Stau is not a dominant inhibitor of Egl/dynein transport of osk RNPs. On the other hand, the speed increase of minus-end runs in stau-RNAi is more dramatic than the increase in run length (Figure 1D-1E). Does it mean that in stau-RNAi dynein-driven osk transport has a shorter run duration? Additionally, in Figure 1D, there is a statistically significant increase of plus-end-directed transport velocity in stau-RNAi. While the authors did mention in the results that "analysis of the speed and length of oskar RNP runs in ooplasmic extracts indicated that Khc activity was not compromised upon staufen knock-down", this does not explain the increased velocity towards the plus-end.

      We thank the reviewer for these insightful comments.

      We and others (Zimyanin et al. 2008; Gaspar et al., 2014) have shown that there is only a small posterior-directed bias in oskar RNP transport in the wild-type ooplasm at mid-oogenesis. Thus, small increases in minus-end-directed transport parameters are expected to be sufficient for anterior mislocalization of a subset of RNPs, as is seen in stau mutants (note that we would not expect a dramatic increase in minus-end-directed motile properties in the stau RNAi condition, as a significant fraction of oskar RNA is targeted posteriorly). To allow the readers to better judge the magnitude of the effects, we now include the percentage change in mean velocity and run length values on the graphs (new Figure 1E and F).

      Regarding the reviewer’s question about the run duration, indeed it is shorter for the minus-end directed runs in the absence of Staufen. In the motor field, it is typical to present velocity and run length only because duration is dependent on these two parameters.
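
      As a minimal sketch of this dependence (assuming a run proceeds at roughly constant velocity; the function names below and the numbers in the usage note are illustrative and are not taken from the manuscript), duration can be computed as run length divided by velocity, so a velocity increase that outpaces the run-length increase yields a shorter run:

```python
def run_duration(run_length_um: float, velocity_um_per_s: float) -> float:
    """Duration of a single run in seconds, assuming constant velocity."""
    if velocity_um_per_s <= 0:
        raise ValueError("velocity must be positive")
    return run_length_um / velocity_um_per_s


def duration_fold_change(length_fold: float, velocity_fold: float) -> float:
    """Fold change in duration implied by fold changes in run length and velocity.

    Both inputs are ratios (treated condition / control condition).
    A result below 1 means the run becomes shorter in time.
    """
    if velocity_fold <= 0:
        raise ValueError("velocity fold change must be positive")
    return length_fold / velocity_fold
```

      For example, with hypothetical fold changes of 1.1 in run length and 1.5 in velocity, `duration_fold_change(1.1, 1.5)` is below 1, i.e. the run takes less time even though it covers more distance.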

      Reviewer 1 also made a similar comment about plus-end directed velocity of RNPs. As we wrote in response to their comment, we have now stratified the data based on the RNA content of the RNPs to control for effects of Staufen depletion on RNA copy number of the motile oskar RNPs. These analyses, which are documented in Fig 1 B-F of the revised manuscript and discussed between lines 96-143, indicate that the previous velocity and run length data were somewhat confounded by the Staufen-depleted condition having a lower fraction of moving complexes with a large RNA content, which generally move more slowly. Accounting for this effect shows that impairing Staufen has no significant effect on plus-end-directed run lengths, whereas minus-end-directed run lengths are substantially increased. The velocity of runs is also increased only in the minus-end direction in the Staufen-depleted background for RNPs that have an RNA content of 1 or 2 relative units, which represent the majority of the RNP population in that genotype. Whilst RNPs with larger RNA content (2 relative units) do have significantly higher plus-end-directed velocity compared to the same category in the control, the effect is of much smaller magnitude than observed for minus-end-directed movement for this population.

      These data strengthen the case that Staufen predominantly affects minus-end-directed motion. Given many documented examples of the interdependence of dynein and kinesin on cargoes (Hancock et al., 2014), it is conceivable that the modest effects on plus-end-directed velocity arise indirectly due to the influence of Staufen on dynein activity. However, we agree with the reviewer that we should not rule out the alternative possibility that Staufen has additional roles in regulating oskar transport, including potentially modulating kinesin-1 activity directly. We have therefore added a section to the Discussion that covers this issue (lines 496-514).

      (2) What happens to osk mRNP transport in nurse cells with Staufen overexpression? The authors briefly mentioned that "GFP-Staufen overexpression has no major effect on the localization of oskar (Fig S1F-I)" on page 10. This is quite puzzling, as the authors propose that Staufen antagonizes the Egl/dynein-driven transport. If the model holds true, we would expect to see that overexpression of Staufen causes less osk transport in nurse cells and thus less osk accumulated in the oocyte. Can the authors examine the osk mRNP transport in nurse cells in control and in GFP-Staufen overexpressing mutant and quantify the total amount of osk mRNA in the oocyte in control and after GFP-Staufen overexpression?

      We showed in the initial submission that strong overexpression of GFP-Staufen in early oogenesis (e.g. with osk-Gal4) disrupts oskar localization, including causing ectopic accumulation in the nurse cells (Fig S7F and G, now marked with arrowheads). Fig S1F-I, to which the reviewer refers, documents an experiment in which the expression of GFP-Staufen was directly driven by the maternal tubulin promoter (i.e. not through the UAS-Gal4 system; now indicated in Fig. S1F). We had assumed that the difference in behavior of the different GFP-Staufen transgenes was caused by the timing and the amount of overexpression – maternal Gal4 drivers are capable of very strong and, in the case of osk-Gal4, early expression of UAS transgenes. Prompted by the reviewer, we have now examined GFP-Staufen expression in these lines in more detail. This confirmed our previous assumptions about timing and levels of ectopic expression. We have now included a new panel (Fig S7I) to document the expression of maternal tubulin promoter-driven GFP-Staufen and have updated the manuscript to include details about the mode of Staufen overexpression used in different experiments (lines 205, 408-417).

      (3) Is osk mRNP transport in the nurse cells affected by stau-RNAi? The authors showed the Khc association with oskar mRNPs in the nurse cells in Figure 1C. We hope they could quantify the velocity and run length of the osk mRNP particles in nurse cells and compare control with stau-RNAi.

      We have never succeeded in making squashes of nurse cells that maintain oskMS2 RNA transport. Therefore, we are unable to evaluate directional transport of oskar in these cells. However, Staufen does not accumulate to appreciable levels in the nurse cells, as shown by Little et al., 2015 and also Figure 4A and A’ (left panels). Moreover, we did not detect significant colocalization between Staufen and oskar in the nurse cells (Fig. 4B). Therefore, depletion of Staufen with RNAi is not expected to influence motility of oskar in this part of the egg chamber.

      (4) The kymographs of in vitro motility assays (Figure 2A and Figure S2) clearly showed two different moving populations, fast and slow. Did the authors include both types of events in their quantifications? What are the N numbers for each quantification? What do the dots mean in Figure 2B-2G? Does each dot represent a single track in the kymograph? If so, we believe that the sample sizes are too small for an in vitro motility assay.

      For completeness, we did not exclude particles from our analysis based on their speed of movement. We have now made this point clear in an updated section of the Methods (lines 799-802), which provides additional information on particle inclusion criteria.

      We did document in the legends what the dots represent (values for single microtubules). We have now also included information on the number of complexes analyzed, which is 586-1341 single RNA particles or 1247-2207 single dynein particles per condition. These sample sizes are considerably larger than those used in most in vitro motility studies.

      (5) The in vitro motility assay showed that Staufen impairs dynein-driven transport of osk 5'-UTR (Figure 2). Based on these data, it is unclear whether the effect of Staufen is osk mRNA-dependent or Egl-dependent. We suggest performing the motility assay in the absence of osk 5'-UTR and Egl. Dynein, dynactin, and BicD should be sufficient to constitute the processive dynein complex in vitro. The addition of Staufen to the dynein complex will help to understand whether Staufen could directly affect dynein activity. We bring up this point because we noticed that the Staufen displacement of Egl in osk RNPs does not alter the amount of dynein complex associated (Figure 6), implying that Staufen inactivates dynein activity on the RNP complex, independently of Egl-driven dynein recruitment.

      We cannot look at transport of dynein in the presence of only dynactin and full-length BicD as BicD is not activated (and thus unable to effectively bind dynein and dynactin) without Egl and RNA (McClintock et al. 2018, Sladewski et al. 2018). However, the reviewer’s comment prompted us to investigate the effect of Staufen on dynein-dynactin motility that is stimulated by the constitutively active truncated mammalian BicD2, so-called BicD2N (Schlager et al. 2014, McKenney et al. 2014). We find that Staufen partially inhibits motility of these dynein-dynactin-BicD2N (‘DDB’) complexes but not to the extent seen with the full-length BicD in the presence of Egl and RNA (new main figure panels 2H and I, and Figure S3). As stated between lines 187-188, these data suggest that Staufen inhibits both the activation of dynein-dynactin motility by BicD proteins, as well as stimulation of this event by Egl and RNA. This finding is also incorporated in a new section of the Discussion that covers possible roles of Staufen in addition to competing for Egl’s binding to RNA (between lines 496-514). We are very grateful to the reviewer for suggesting this approach, as it has provided significant new insight into Staufen’s function.

      (6) In Figure 4, it is hard to see any colocalization between GFP and osk mRNA. And the authors compared overexpressed Egl-GFP (driven by mat αtub-Gal4 in mid-oogenesis) with Staufen-GFP under its endogenous promoter. An endogenous promoter-driven Egl-GFP would be much more appropriate for the comparison.

      Colocalization between GFP and oskar signals is seen as white in Fig. 4A and C. We have now added arrows to highlight a few examples of colocalization. The degree of colocalization was quantified in an unbiased fashion (shown in panels Fig 4B and D).

      Regarding the expression of Egl-GFP: it was driven directly by the αTub84B promoter and not by matTub-Gal4. Western blot analysis performed in response to the reviewer’s comment shows that Egl-GFP is expressed at similar levels to endogenous Egl in this line (new Fig. S5I).

      (7) In a recent publication (Mohr et al., 2021), a different model was proposed, in which Egl mediates transport, and Staufen facilitates the dissociation from the transport machinery for posterior anchoring. Although the authors referred to that paper in the discussion, they should acknowledge the differences and try to reconcile them (at least in the discussion).

      We now further discuss our work in the light of the findings by Mohr et al. (a request also made by Reviewer 1) (in paragraphs starting on lines 436 and 496). In our opinion, the data of Mohr et al. in fixed material cannot discriminate between effects of Staufen (or the TAS) on transport vs anchorage. In contrast, our dynamic imaging in vitro and ex vivo shows unambiguously that Staufen can modulate transport processes. As accumulation of RNA at the cortex is dependent on directional transport, we do not think it necessary to invoke a separate anchorage role of Staufen. We have now raised the possibility that transport and cortical localization are two facets of the same underlying process in the hope that this will stimulate further investigation (lines 455-459).

      (8) In the feed-forward model, Egl is required for the staufen mRNA transport from the nurse cells to the oocyte. Are Egl-GFP dots colocalized with staufen mRNAs in the nurse cells?

      We showed in Fig 7I of the original submission that Egl-GFP puncta are colocalized with stau mRNAs in nurse cells. Indeed, this is a key piece of evidence for our model. These data are now in Figure 7F.

      Furthermore, to our understanding, in this model, the translation of the staufen mRNA would be critical for switching motors between dynein and kinesin-1. In this sense, staufen mRNA translation is either suppressed in the nurse cells or only activated in the oocytes. I think the authors should at least address this point in the discussion.

      This is another excellent suggestion. We have now included in the Discussion (from line 525) the point that Staufen translation may be suppressed during transit to the oocyte or that the protein may be translated en route but only build up to meaningful levels where the RNA is concentrated in the oocyte.

      **Minor points:**

      1) I hope the authors will show the osk mRNA localization in egl mutant in individual stage 9 egg chambers. I can only find the osk mRNA in egl-RNAi early stage egg chambers (Figure 7E), in which osk mRNA still shows an accumulation in the oocyte, although to a much lesser extent compared to control. In another publication (Sanghavi et al., 2016), it seems that the knockdown of Egl by RNAi causes some retention of osk mRNA in the nurse cells; but there is still a noticeable amount of osk mRNA in the oocyte (Figure 3A-B). We wonder whether the authors could quantify the amount of osk mRNA both in the nurse cells and in the oocyte of control and egl-RNAi. Also I wonder whether the authors could comment on the fact that some osk mRNA is transported into the oocyte. Could it be due to an egl-independent transport mechanism?

      egl null mutants do not reach stage 9 due to a defect in retention of oocyte fate, hence the use of egl RNAi in our study and the one by Sanghavi et al. Whilst we can’t rule out a (minor) Egl-independent mechanism for localizing oskar RNA in the oocyte, to date no other pathway has been implicated in the delivery of this or any other mRNA from the nurse cells. We favor a scenario in which residual oskar accumulation in the oocyte in egl RNAi egg chambers is due to incomplete depletion of Egl protein in the knockdown condition. We have noted this in the relevant figure legend and also clarify that the RNAi is a tool for knockdown in line 383 of the Results section.

      The plot below shows the quantification of oskar mRNA localization in egl and control RNAi egg chambers that the reviewer requested.

      In the egl RNAi egg-chambers, there is a significant increase in the mean signal intensity of oskar mRNA in the nurse cells, while oskar mRNA levels are substantially reduced in the oocyte, in line with the findings of Sanghavi et al., 2016.

      2) It is always nice to show the average distribution of osk mRNA (e.g., Figure 3, Figure S1, and Figure S3). But we recommend having a representative image of each genotype (a single egg) next to the average distribution. It will help the readers to better appreciate the differences among these genotypes.

      This suggestion was also made by Reviewer 1. We have added representative images to Figure 3 and moved the images depicting average distributions to the supplement (Fig S4). We have also improved the legend and labeling for Fig S4.

      3) The figure legends are overall hard to read, and sometimes it is impossible to extract information about the experiments from them (for example, Figure 4 legend). Can the authors improve their figure legends to make them more reader-friendly?

      We have edited the legends to make them clearer, including an extensive reworking of those for Figure 4. We thank the reviewer for encouraging us to do this.

      4) For moderate overexpression, the authors used P{matα4-GAL-VP16} (FBtp0009293). However, there are two different transgenic lines associated with FBtp0009293 (V2H and V37), which have slightly different expression levels. The authors should specify which line they used in the experiments.

      The matTub-Gal4 transgene we used in our study is inserted in the 2nd chromosome. We now mention this in the Methods section (line 567). We received this line from another lab many years ago, with no additional information provided.

      5) On page 13 "PCR on egg-chambers co-expressing Egl-GFP and either staufen RNAi or a control RNAi (white) in the germline (Fig 3G)", it should be Figure 4G.

      We apologize for this mistake, which has now been fixed.

      Reviewer #2 (Significance (Required)):

      see above

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Some additional experimental evidence is needed to solidify the conclusions and provide definitive support for this model, as discussed below.

      Biochemical experiments using UV crosslinking and GFP immunoprecipitation followed by quantitative PCR were performed to show that Staufen antagonizes the association of Egl with oskar mRNA in vivo. -The authors need to show the quantitative analysis, which was not present in the figure, specifically the effects of Staufen RNAi compared to control.

      These quantitative data, which are key for our model, were shown in the original submission (Fig 4G in the original and revised manuscript). We mistakenly called out the panel as 3G in the original submission. We apologize for this error, which has now been dealt with.

      Is the ability of Staufen to antagonize and displace Egl dependent on Staufen binding to oskar RNA? Will a Staufen mutant that can't bind to RNA also displace Egl? Alternatively, the mechanism may be independent of RNA binding and perhaps due to protein-protein interactions.

      While the details of how Staufen displaces Egl are certainly an interesting topic for future research, we consider that addressing this goes well beyond the scope of this study, which already covers a lot of ground. Staufen contains four double stranded RNA-binding domains, and deleting or mutating all of these domains is likely to interfere with overall folding of Staufen, thus confounding the interpretation of the results.

      As an alternative approach to elucidating RNA-dependent vs RNA-independent roles of Staufen, we have now assessed the effect of the protein on in vitro motility of dynein-dynactin complexes formed in the presence of a constitutively active truncation of mammalian BicD2 (BicD2N). We find that Staufen partially inhibits motility of these ‘DDB’ complexes but not to the extent seen with the full length BicD in the presence of Egl and RNA (new Fig 2H, I and S3). As stated in the manuscript (lines 187-188) these data suggest that Staufen inhibits both the activation of dynein-dynactin motility by BicD proteins, as well as stimulation of this event by Egl and RNA. We believe these experiments provide significant new insight into Staufen’s function. This finding is also incorporated into a new section of the Discussion dealing with potential roles of Staufen in addition to displacing Egl from RNPs.

      A key question addressed is how Staufen plays a role in directing oskar RNA localization to the posterior pole. The spatiotemporal control of Staufen at stage 9 seems to be a critical step. A number of experiments are performed to show that Staufen RNA enters the oocyte and accumulates at the anterior pole through a process dependent on Egl (Fig. 7).

      -Definitive evidence is needed to show the role of 3'UTR of Stau and Egl binding. As it stands now, no evidence is presented to prove that delivery of staufen RNA via Egl, rather than dumping of Staufen protein into oocytes, is the necessary trigger for the switch. It is well known that Staufen protein is also transported through ring canals to deliver Staufen into oocytes. There is no need to invoke an additional mechanism of Egl mediated staufen mRNA delivery. A key experiment is to perturb the Egl interaction with staufen 3'UTR and show this is a necessary component to impact oskar. Related to this comment, they should first perform biochemistry IP and PCR to demonstrate association of Egl with staufen RNA, and then somehow perturb this interaction to assess effects on oskar RNA localization. For example, is the 3'UTR of staufen RNA necessary for this mechanism? What if staufen RNA was ectopically localized in some inappropriate manner, for example localized to posterior pole? Would this prevent the switch of oskar RNA to move to posterior pole? The key question is: is it necessary that translation of Stau be coupled to Egl in order to drive the switch?

      Mapping of the Egl-binding site in stau mRNA is a major undertaking requiring the production and evaluation of multiple new transgenic fly lines. We feel that this would constitute an entirely new study. Moreover, multiple lines of evidence already support a functional interaction between Egl and stau mRNA, notably the presence of Egl on stau RNPs (previously Fig. 7I, now Fig. 7F), the strongly impaired accumulation of stau mRNA in the oocyte of egl RNAi egg chambers, and the ability of Egl overexpression to reposition a subset of the stau mRNA population at the anterior cortex.

      We have now performed new experiments and analyses to test the alternative hypothesis that Staufen protein is transported into the oocyte in the absence of stau mRNA transport. We find that disrupting Egl function with RNAi impairs localisation of both stau mRNA and protein in the proto-oocyte (new Figure 7A-D). As Egl has no known function in protein transport, these data argue against an RNA-independent mechanism for Staufen protein delivery. Moreover, we showed that both stau mRNA and Staufen are enriched in early oocytes lacking oskar mRNA, the main target of Staufen protein in the female germline. This result shows that Staufen protein is not appreciably transported from the nurse cells to the oocyte by hitchhiking on its RNA targets.

      Whilst Mhlanga et al. 2009 did report transport of large GFP-Staufen particles through ring canals, the line used (matTub4>GFP-Staufen from the St Johnston lab, which was also used for our rescue experiments) is known to form protein aggregates, which is not the case for the endogenous protein (Zimyanin et al., 2008 and our new Figures 7B and S7E-I). These particles are therefore likely to be artefactual. Neither we, nor previous studies (Little et al., NCB, 2015), detected endogenous Staufen protein in nurse cells.

      Finally, the reviewer asks if coupling Staufen translation to Egl-mediated enrichment of stau mRNA in the oocyte is important: we showed in the original submission that strong overexpression of GFP-Staufen by Gal4 drivers leads to mislocalization of Staufen in the nurse cells of early egg-chambers, presumably due to saturation of the Egl-based transport machinery. In these egg-chambers, we observed defects in RNA enrichment in the primordial oocyte and defects in oogenesis, consistent with the need to exclude Staufen protein from the nurse cells.

      These findings are now presented in new panels of the updated Figures 7 and S7, with the corresponding section of the manuscript revised accordingly (lines 408-417). We think that altogether these lines of evidence strongly support our model that Egl transports stau mRNA into the developing oocyte and that this process is pivotal for oskar RNA localization.

      **Minor comments**

      "Substantially more oskar mRNA was co-immunoprecipitated with Egl-GFP from extracts of egg-chambers expressing staufen RNAi compared to the control (Fig 3G)." - These data are not shown in 3G, but only in Fig. S4H, which needs a quantitative analysis.

      This point stems from us calling out the wrong panel in the first submission; this has now been addressed, as described above. We apologize for the error.

      "Addition of recombinant Staufen to the Egl, BicD, dynein and dynactin assembly mix significantly reduced the number of oskar mRNA transport events (Fig. 2A and B)."

      -In Fig. 2A, the Y axis shows velocity not number of transport events

      Fig 2A is a kymograph that is representative of the overall effect, where the Y-axis represents time. The reviewer may be referring to Fig 2B but this shows the frequency of processive oskar RNA movements (expressed as ‘number / micron / minute’), not velocity (micron/minute).

      Fig. 3. - This is very unclear figure as to what is being shown. More details are needed to explain the figure, and add arrows to help reader note what is being described.

      We have changed this figure to show representative images of individual egg chambers, as requested by the other two reviewers. The original Fig 3 is now moved to the Supplement as Fig S4. We have added arrows to the figure to indicate the anterior mislocalization of oskar mRNA and edited the legend for clarity.

      Staufen may also be required for the efficient release of the mRNA from the anterior cortex. This may reflect a role of Staufen in the coupling of the mRNA to the kinesin-dependent posterior transport pathway. This could be discussed as another aspect of the inhibition of dynein and handoff to kinesin.

      This is an interesting idea but it does not fit with our observation that Staufen depletion does not alter the association of oskar RNPs with kinesin-1 (originally Fig. 1C, now Fig. 1D). We do, however, now include in the Discussion a section on other ways, in addition to promoting Egl disassociation, that Staufen might orchestrate oskar mRNA transport.

      Reviewer #3 (Significance (Required)):

      This elegant manuscript by Gaspar et al provides new insight into the spatiotemporal regulation of Staufen-mediated localization of oskar mRNA to the posterior pole in Drosophila oocytes. Here the authors demonstrate the competitive displacement of the RNA binding protein Egalitarian, which antagonizes dynein-dependent localization at the anterior pole. This work, done in this well characterized model of mRNA localization in Drosophila oocytes, has broader implications for how the bidirectional transport of mRNAs is regulated in other polarized and highly differentiated cells, where very little is known about how mRNA transport direction might be regulated by opposing activities of kinesin and dynein motors. The strengths of this study are the integration of microscopy, biochemistry and genetic mutants to provide very nice experimental support for the two major aspects of the proposed model: 1) the competition between Staufen and Egl on oskar RNA, which affects localization, 2) evidence for Egl-mediated localization of staufen RNA into the oocyte as a key trigger for competitive displacement to bias localization of oskar RNA via kinesin. However, some additional experimental evidence is needed to solidify the conclusions and provide definitive support for this model, as discussed in the other section.

    1. They will convey the degree to which they disagree and the respect or contempt they feel for this opposing view.

       Common Phrases That Introduce Counterarguments (attitude to the counterargument, followed by phrases)

       Negative: The writer thinks the counterargument is completely wrong.
       Phrases: It is a popular misconception that_____________. Some have fallen for the idea that_____________. Many people mistakenly believe that_____________.

       Neutral: The writer is about to describe a counterargument without giving their opinion yet.
       Phrases: Many people think _____________. Some, on the other hand, will argue that _____________. Some might disagree, claiming that _____________. Of course, many have claimed that _____________. Some will take issue with _____________, arguing that _____________. Some will object that _____________. Some will dispute the idea that _____________, claiming that _____________. One criticism of this way of thinking is that _____________.
       Note that these neutral examples don’t tell us whether the writer thinks the counterargument has any validity. Usually, the writer will want to follow them with a sentence that does reveal their opinion.

       Positive: The writer sees some merit in the counterargument. They agree with it even though it hurts their argument. This is called concession.
       Phrases: It is true that ___________. I do concede_____________. We should grant that_____________. We must admit that_____________. I acknowledge that _____________. X has a point that _____________. Admittedly, _____________. Of course, _____________. To be sure, _____________. There may be something to the idea that _____________.

      I haven't thought about the attitude an author can have toward a counterargument. I rarely focus on how they present views opposing their argument; I'm usually trying to dissect their argument or disprove it myself.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Response to Reviewer Comments

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In developing systems, morphogen gradients pattern tissues such that cells along the patterning length sense varying levels of the morphogen. This process has a low positional error even in the presence of biological noise in numerous tissues, including the early embryo of Drosophila melanogaster. The authors of this manuscript developed a mathematical model to test the effect of noise and mean cell diameter on gradient variability and the positional error they convey.

      They solved the 1D reaction-diffusion equation for N cells with diameters and kinetic parameters sampled from a physiologically relevant mean and coefficient of variation (CV). They fit the resulting morphogen gradients to a hyperbolic cosine profile and determined the decay length (DL) and amplitude (A) for a thousand independent runs and reported the CV in DL and A.

      The authors found that the CV in DL and A increases with increasing mean cell diameter. They propose that the CV in DL scales as the inverse square root of N, whereas the CV in DL and A is only a weak function of the CV of cell surface area (CVa) if CVa … They further looked at the shift in readout boundaries and compared four different readout metrics: spatial averaging, centroid readout, random readout and readout along the length of the cilium. Their results show that spatial averaging and centroid readout have a high readout precision.

      They finally showed that the positional error (PE) increases along the patterning length of the tissue and increases with increasing mean cell diameter.

      The authors also supported their theoretical and simulated results by looking at mean cell areas reported for patterning tissues in the literature, which also have a higher readout precision with smaller cell diameters.

      Major comments:

      Most of the key conclusions are convincing. However, there are four major points that should be addressed. First, the authors conclude the section titled, "The positional error scales with the square root of the average cell diameter," by saying that morphogen systems with small cells can have high precision in absolute length scales, but not on the scale of one cell diameter. They state this would result in salt and pepper patterns in the transition zones. The authors should either support this with biological examples or explain why this is not observed experimentally.

      We thank the referee for pointing out this imprecise comment, which we have removed. The exact nature of transition zones between patterning domains is a subject of ongoing research in our group, and goes beyond the scope of the present work. We will be sharing our results on this aspect in a separate forthcoming publication.

      Second, perhaps the main conclusion of the paper is that morphogen gradients pattern best when the average cell diameter is small. The authors support this by reviewing the apical cell area of epithelial systems that are known to be patterned by morphogens and those that are not (presumably taking apical cell area as a proxy for cell diameter). However, the key parameter is not absolute cell diameter, but the cell diameter relative to the morphogen length scale. The authors should report the ratio of these two quantities in their literature analysis.

      Since cell areas and cell diameters are monotonically increasing functions of one another for reasonably regular cell shapes, we indeed consider apical cell areas as proxies for the cell diameter, as the referee correctly noted. Cell areas are more frequently reported in the literature than cell diameters, which is why we compiled these in our analysis. We have now revised our analysis of the effect of the cell diameter on patterning precision to include further length scales relevant in the patterning process. We show by example of the Drosophila wing disc how the parallel changes in cell diameter and morphogen source size compensate for the increase in gradient length and domain size, which would otherwise reduce patterning precision over time as the readout positions shift away from the source to maintain the same relative position in the growing wing disc.

      Lamentably, accurate measurements of morphogen gradients in epithelial tissues are still rare. In fact, among the listed tissues that are patterned by gradients, we are only aware of measurements of the SHH and BMP gradients in the mouse NT (lambda = 20 µm) and of the Dpp gradients in the Drosophila wing and eye discs [Wartlick, et al., Science, 2011 & Wartlick et al., Development, 2014]. We agree that it would be great if experimental groups would measure this in more tissues. In this revised and extended analysis, we show that the positional error increases with the cell diameter in absolute terms, not only relative to any reference length, be it the gradient length or cell diameter.

      Third, as part of their literature analysis, the authors state that in the Drosophila syncytium, there are morphogen gradients, but they imply that because these gradients operate prior to cellularization, one cannot use the large distances between nuclei as counter evidence to their main conclusion. Rather than simply dismissing the case of the Drosophila syncytium, the authors should explain why this case does not apply, using reasoning based on their model assumptions.

      Our paper is concerned with patterning of epithelia (which we now make clearer in the manuscript), and we would not want to stretch our paper to other tissue types, as the reaction-diffusion process in them differs. But we do not share the referee’s sentiment that the syncytium would present a counter-example. Since our model explicitly represents kinetic variability between spatial regions bounded by cell membranes, which are absent in the syncytium, our model is not directly applicable to it. We now provide this argument in the discussion, as requested by the referee.

      At 100 µm [Gregor et al., Cell, 2007], the Bicoid gradient is 5 times longer than the SHH/BMP gradients in the mouse neural tube and more than 10 times the reported length of the WNT gradient in the Drosophila wing disc [Kicheva et al., Science, 2007]. The nuclei become smaller as they divide because the anterior-posterior length of the Drosophila embryo remains about 500 µm [Gregor et al., Cell, 2007], but even at the earliest patterning stage their diameter will not be larger than 10 µm at midinterphase 12 [Gregor et al., Cell, 2007, Fig. 3A].

      Fourth, related to the above: the authors then state that there are no morphogen gradients known during cellularization. Unless I am misunderstanding their point, this is untrue. The Dpp gradient acts during the process of cellularization and specifies at least three distinct spatial domains of gene expression. Furthermore, not long after gastrulation, EGFR signaling patterns the ventral ectoderm into at least two distinct domains of gene expression. What are the cell areas in that case?

      Unfortunately, the referee does not provide literature references, and we were not able to find anything in the literature ourselves. We have now rephrased the statement to “we are not aware of morphogen gradient readout during cellularisation”.

      Minor comments:

      Figs 1cd:

      The way the system is set up (DL = 20 micron, Patterning Length (LP) = 250 micron, Nominal cell diameter (D) = 5 micron), DL/L ~ 0.08, which makes the exponential profile fall to a small value around 100 micron. This means that in all these simulations the effective LP was only around 100 micron; cells beyond that saw nearly zero concentration.

      Because of this, when diameters were varied from 0.2 - 40 micron, there could be as few as 2.5 cells in the "patterning region" which could be responsible for higher variability in DL and A.

      Patterning in the neural tube works across several 100 µm. At x=100µm, there is still exp(-5)=0.0067 of the signal left, which likely still translates into appreciable numbers of morphogen molecules (see [Vetter & Iber, 2022] for a discussion of concentration ranges cells might sense). Unfortunately, very little is known about absolute morphogen numbers in the different patterning systems; experimental data is available only on relative scales, not in absolute numbers. While more quantitative experiments are still outstanding, modeling work needs to be based on reasonable assumptions. The seemingly quick decay of exponential profiles (when plotted on a linear scale) can be deceiving. In fact, exponential profiles describe the same fold-change over repeated equal distances, which makes them biologically very useful for different readout mechanisms operating on different levels of morphogen abundance. Our simulations are not limited to a patterning length of 100µm. Our work merely shows that variable exponential gradients stay precise over a long distance. We draw no conclusion on whether cells are able to interpret the low morphogen concentrations that arise far from the source in the patterning domain; this aspect certainly deserves further research.
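
      The two numerical points in this reply (the residual signal at 100 µm and the constant fold-change over equal distances) can be checked with a minimal sketch. It assumes a pure exponential profile C(x) = C0 exp(-x/lambda) with the decay length of 20 µm quoted by the referee; the function names are hypothetical and are not taken from the authors' code.

```python
import math

# Assumption: mean decay length of 20 um, as quoted in the referee's comment.
DECAY_LENGTH_UM = 20.0


def relative_concentration(x_um, decay_length_um=DECAY_LENGTH_UM):
    """Return C(x)/C0 for a pure exponential gradient, exp(-x/lambda)."""
    return math.exp(-x_um / decay_length_um)


def fold_change(x_um, step_um, decay_length_um=DECAY_LENGTH_UM):
    """Return C(x + step)/C(x); for an exponential this does not depend on x."""
    return (relative_concentration(x_um + step_um, decay_length_um)
            / relative_concentration(x_um, decay_length_um))


if __name__ == "__main__":
    print(f"C(100 um)/C0 = {relative_concentration(100.0):.4f}")
    for x in (0.0, 50.0, 100.0, 200.0):
        print(f"x = {x:5.1f} um: fold-change over 20 um = {fold_change(x, 20.0):.4f}")
```

      The fold-change over one decay length is exp(-1) at every position, which is the property the reply refers to when it says that a linear-scale plot of the profile can be deceiving.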

      The referee’s observation is correct in that for a cell diameter of up to 40 µm, there are only a few cells in the patterning domain (namely down to about six, for a length of 250µm, as used in the simulations). It is also correct that this is the reason why gradients in such a tissue have greater variability in lambda and C0. This is precisely the main point we are making in this study: The narrower the cells in a tissue of given size, the less variable the morphogen gradients, and the more accurate the positional information they carry. Conversely, the wider the cells in the x direction, the more variable the gradients.
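
      As a rough illustration of this argument, the sketch below counts the cells that fit into the 250 µm patterning domain and applies the inverse-square-root scaling with cell number described in the referee's summary. Only the proportionality is taken from the text; the prefactor is not given, so the function returns variability relative to a reference diameter (5 µm, the nominal value quoted by the referee). The names are hypothetical.

```python
import math

# Patterning domain length used in the simulations, per the reply above.
DOMAIN_LENGTH_UM = 250.0


def cells_in_domain(cell_diameter_um, domain_length_um=DOMAIN_LENGTH_UM):
    """Number of cells of a given mean diameter spanning the domain."""
    return domain_length_um / cell_diameter_um


def relative_variability(cell_diameter_um, reference_diameter_um=5.0):
    """Gradient variability relative to the reference diameter.

    Assumes CV ~ 1/sqrt(N) at fixed domain length, so the ratio of two
    CVs equals sqrt(diameter / reference_diameter). Prefactors cancel.
    """
    return math.sqrt(cell_diameter_um / reference_diameter_um)


if __name__ == "__main__":
    for diameter in (5.0, 10.0, 20.0, 40.0):
        print(f"diameter {diameter:4.1f} um: "
              f"N = {cells_in_domain(diameter):5.2f}, "
              f"relative CV = {relative_variability(diameter):.2f}")
```

      With 40 µm cells the domain holds 6.25 cells, consistent with the "about six" stated above.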

      Would any of the results change if DL/L was higher, around 0.2?

      As we consider steady state gradients, nothing changes if we fix the (mean) gradient decay length and only shorten the patterning domain, except for a small boundary effect at the far end of the tissue due to zero-flux conditions applied there. At a fixed gradient length, the steady-state gradients just extend further if DL/L is increased (for example to 0.2), reaching lower concentrations, but the shape remains unchanged, and so does the morphogen concentration at a given absolute readout position.

      To demonstrate what happens at DL/L = 0.2, as requested by the referee, we repeated simulations with an increased gradient decay length of DL=50 micrometers; the length of the patterning domain remained unchanged at L=250 micrometers. As it is not possible to include image files in this response, we have made the plots available at https://git.bsse.ethz.ch/iber/Publications/2022_adelmann_vetter_cell_size/-/blob/main/revision_increased_dl.pdf for the time of the reviewing process. The plots show the resulting gradient variability, which is analogous to Fig 1c,d in the original manuscript. For both gradient parameters, we still recover the identical scaling laws.
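
      The boundary effect mentioned above can be illustrated with the standard steady-state solution of D C'' - d C = 0 on a domain of length L with a zero-flux condition at x = L, which is a hyperbolic cosine. The sketch is a simplification, not the authors' simulation: it normalizes the concentration to 1 at x = 0, ignores the source domain and all cell-to-cell variability, and uses hypothetical function names.

```python
import math


def steady_state_profile(x_um, decay_length_um, domain_length_um):
    """C(x)/C(0) for D*C'' - d*C = 0 on [0, L] with zero flux at x = L.

    The solution is cosh((L - x)/lambda) / cosh(L/lambda),
    with lambda = sqrt(D/d).
    """
    return (math.cosh((domain_length_um - x_um) / decay_length_um)
            / math.cosh(domain_length_um / decay_length_um))


def pure_exponential(x_um, decay_length_um):
    """C(x)/C(0) on an infinitely long domain."""
    return math.exp(-x_um / decay_length_um)


if __name__ == "__main__":
    decay_length = 20.0
    for domain_length in (250.0, 100.0):  # DL/L = 0.08 and 0.2
        print(f"L = {domain_length:.0f} um")
        for x in (25.0, 50.0, domain_length):
            ratio = (steady_state_profile(x, decay_length, domain_length)
                     / pure_exponential(x, decay_length))
            print(f"  x = {x:5.1f} um: cosh profile / exponential = {ratio:.4f}")
```

      Under these assumptions, shortening the domain leaves the concentration at a fixed readout position almost unchanged in the interior, while the zero-flux condition raises it towards twice the exponential value right at the far boundary.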

      The source region is 25 microns in length and all cell diameters above 25 micron get defaulted back to 25 micron, which explains the flat lines in the region beyond mu_delta/mu_DL > 1.

      Thanks for pointing this out. We now mention this in the manuscript. Note that it’s the ratio mu_delta/L_s that matters, not mu_delta/mu_lambda. It just so happens in this case, that both are nearly equal, because L_s=5*mu_lambda/4 in our simulations.
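
      For reference, the arithmetic behind this remark can be written out, assuming the mean gradient length of 20 µm quoted in the referee's comment:

```latex
\[
L_s = \tfrac{5}{4}\,\mu_\lambda = \tfrac{5}{4}\times 20\,\mu\mathrm{m} = 25\,\mu\mathrm{m},
\qquad
\frac{\mu_\delta}{L_s} = \frac{4}{5}\,\frac{\mu_\delta}{\mu_\lambda}.
\]
```

      The two ratios therefore differ by a constant factor of 4/5, so the plateau sets in at mu_delta/L_s = 1, which corresponds to mu_delta/mu_lambda = 5/4.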

      Results:

      Pg 2 (bottom left): In the git repository code, the morphogen gradients are fit to a hyperbolic cosine function (described in reference 19), which is not described in the main text. Having this in the main text would help readers understand why fig 1c has variation in d only, D only and all k parameters whereas fig 1d has variation with all individual parameters p, d and D and all k.

      The reason why the impact of CV_p alone on CV_lambda is not plotted in Fig 1c is that it is minuscule. We now mention this in the figure legend. This follows from the fact that the gradient length lambda is determined in the patterning domain, whereas the production rate p sets the morphogen concentration in the source domain, and thus, the gradient amplitude, but not its characteristic length. This is unrelated to the functional form used to fit the shape of the gradients, be it exponential or a hyperbolic cosine. We mention that we fit hyperbolic cosines to the numerical gradients in section Gradient parameter extraction in the Methods section, and we refer the interested reader to the original reference [Vetter & Iber, 2022], which contains all mathematical details, should they be needed.

      Figure 3b:

      In figures where markers are overlapping perhaps the authors can use a "dot" to identify one set of simulations and a "o" to identify the ones under it. The way the plots are set up currently makes it hard for the reader to understand where certain points on the plot are.

      We use a color code to represent the readout strategy and different symbols to represent the cell diameter in Fig 3b. We agree that for the smallest of the cell diameters, the diamond-shaped data points lie so close that they are not easy to tell apart at first sight. For this reason, we chose different symbol sizes. We would like to keep the symbols as they are to maintain visual consistency with the other figures, which we think is an important feature of our presentation that facilitates the interpretation. Note that all our figures are vector graphics, which allow the reader to zoom in arbitrarily deep, and to easily distinguish the data points. Note also that in this particular case, telling the data points apart is not necessary; recognizing that they are nearly identical is sufficient for the interpretation of our results.

      Methods:

      The Methods can be more descriptive to include certain aspects of the simulations such as adjusted lambda which is only described in the code and not the main text or supplementary.

      We apologize for this omitted detail. As shown in Fig. 8g in [Vetter & Iber, 2022], the mean fitted value of lambda drifts away from the prescribed value, depending on which of the kinetic parameters are varied, and by how much. To report the true observed mean gradient length in our results, we corrected for this drift in our implementation, as the referee correctly noticed. We now describe this in the methods section, and we have extended the methods also on other aspects.

      Git code:

      The git code function handles do not represent figure numbers and should be updated to make it easier for readers to find the right code

      Thank you for pointing this out — it was an oversight from an earlier preprint version. The function names now correspond to the figure numbers.

      Reviewer #1 (Significance (Required)):

      This manuscript contributes certain key aspects to the patterning domain. The three most important contributions of this work to the current literature are: (1) the scaling relationships developed here are important, (2) the idea that PE increases at the tail-end of the morphogen profile is nicely shown and (3) Comparison of various readout strategies.

      Thank you for the positive assessment.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      How morphogen gradients yield precise patterning outputs is an important problem in developmental biology. In this manuscript, Adelmann et al. study the impact of cell size on the precision of morphogen gradients and use a theoretical framework to show that positional error is proportional to the square root of cell diameter, suggesting that the smaller the cells in a patterning field, the more precise patterns can be established against morphogen gradient variability. This result remains true even when cells average the morphogen signal across their surface or spatial correlations between cells are introduced. Thus, the authors suggest that epithelial tissues patterned by morphogen gradients buffer morphogen variability by reducing apical cell areas and support their hypothesis by examining several experimental examples of gradient-based vs. non-gradient-based patterning systems.

      Major comments:

      While the idea that smaller cells yield more precise morphogen gradient outputs is attractive, it is unclear whether patterning systems use this strategy to make patterns more precise, as there are several mechanisms that could achieve precision. Do actual developmental systems use it as a mechanism to increase precision? Or is precision achieved through other mechanisms (for example, cell sorting as in the zebrafish neural tube; Xiong et al. Cell, 2013)? Indeed, classical patterning work on the Drosophila embryo suggests that segmentation patterns are defined by an absolute size rather than by an absolute number of cells (Sullivan, Nature, 1987). According to the authors, the patterning stripes should be more precise when embryos have higher cell densities than in the wild-type, but stripes are remarkably precise in wild-type embryos. This is likely due to other precision-ensuring mechanisms (such as downstream transcriptional repressors, in this case).

      We want to emphasize that our predictions concern the precision of the gradients, not the precision of their readout, which can be strongly affected by readout noise, as we will show in a forthcoming paper. Cell sorting can sharpen boundaries in the transition zone, but this would not address errors in target domain sizes and is thus different from gradient precision as we discuss it here. Also, cell sorting as observed in the zebrafish neural tube requires higher cell motility than what is observed in most epithelial tissues. The work by Sullivan, Nature, 1987, is concerned with patterning of the early Drosophila embryo, and the stripes are defined already before cellularisation. We are unfortunately not aware of any work that quantified gradient precision at different cell densities in epithelia. This would, of course, be highly interesting data and would indeed put our predictions to a test. We are, to the best of our knowledge, the first to propose this principle with the present work. We have now made these points and distinctions clearer in the revised manuscript. Thank you for bringing this up.

      Their modeling approach is based on exponential gradients formed by diffusion and linear degradation, but in reality, actual morphogen gradients are affected by receptor and proteoglycan binding and are likely not simply exponential and/or interpreted at the steady state. Do the main results of the manuscript hold even for non-exponential gradients or before they reach a steady state?

      We can confirm that our results also hold for non-exponential gradients, as they emerge for example when morphogen degradation is self-enhanced (i.e., non-linear). This result will be published in a follow-up study [BioRxiv: 10.1101/2022.11.04.514993], which we now cite in the concluding remarks in the revised manuscript.

      The analysis of pre-steady-state gradients lies outside of the scope of the present work, and so the question as to whether our results are applicable to them as well, remains to be answered in future research. We have added a comment on this to the discussion.

      In their Discussion section, the authors note that several patterning systems, such as the Drosophila wing and eye discs, show smaller cells near the morphogen source relative to other regions in the tissue. This observation suggests a prediction of the authors' hypothesis that can be tested experimentally. In the Drosophila wing and eye discs genetic mosaics of ectopic morphogen sources (such as Dpp) can (and have) been made. Therefore, one could predict that the patterning outputs in a region of larger cross-sectional areas will be more imprecise than in the endogenous source. Since this is a theoretical paper, it is understandable that authors are not going to make this experiment themselves, but I wonder if they can use published data to test this prediction or at least mention it in the manuscript to offer the experimental biology reader an idea of how their hypothesis can be tested experimentally.

      We appreciate that the referee would like to help us inspire the experimental community. Unfortunately, the problem with the proposal is that Dpp has been shown to result in a lengthening of the cells (and thus a smaller cell width) [Widmann & Dahman, J Cell Sci, 2009]. The Dpp gradient thus ensures a small cell width close to its source, which makes it virtually impossible to test this proposal experimentally in the suggested way. Nevertheless, we have added brief comments on potential experimental testing of our predictions to the discussion.

      Other comments:

      The Methods section should be expanded and should include more details about how authors consider cell size in their simulations. As presented, I believe that experimental biologists will not be able to grasp how the analysis was done.

      We have expanded on the technical details of our model in the methods section, in particular in relation to the cell size, as requested. To avoid being overly redundant with existing published descriptions of the modeling details [Vetter & Iber, 2022], we focus here on a description of what has not been covered already, and refer the interested reader to our previous publication. It is inevitable for any kind of work, be it theoretical or experimental, to be less accessible to experts in other disciplines, but we believe that the presentation of our results is independent enough of modeling aspects to be accessible to experimental biologists, too.

      Authors use adjectives such as 'little' or 'small' without a comparative reference. For example, in the abstract, the authors say that apical areas "are indeed small in developmental tissues." What does "small" mean? This should be avoided throughout the text.

      We thank the referee for raising this point. Where appropriate, we changed the phrasing accordingly to clarify what the comparative reference is. We leave all sentences unchanged where the statement holds in absolute terms. Note that in the substantially revised analysis on the impact of the different length scales involved in the patterning process, we now explicitly show with simulation data and theory that the absolute positional error increases with increasing absolute cell diameter.

      Reviewer #2 (Significance (Required)):

      Overall, I believe that the manuscript is well written and deserves consideration for publication. However, authors should consider the points outlined above in order to make their manuscript more accessible and relevant to the developmental biology community.

      Thank you for the positive assessment.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In their manuscript "Impact of cell size on morphogen gradient precision" the authors Adelmann, Vetter and Iber numerically analyse a one-dimensional PDE-based model of morphogen gradient formation in tissues in which the cell sizes and cell-specific parameters locally affecting the gradient properties are varied according to predefined distributions. They find that the average cell size has the largest impact on the variance of the gradient shape and the read-out precision downstream, while other factors such as details of the readout mechanism have markedly less influence on these properties. In addition, they demonstrate that averaging gradient concentrations over typical cell areas induces a shift of the readout position, which however appears to be insignificant (~1% of the cell diameter) for typical parameters.

      Overall this manuscript is in very good shape already and tackles an interesting topic. I still would like the authors to address the comments below before I would recommend any publication. My main criticism pertains to some of the authors' derivations which, as I find, partly do deserve more detail, and to their conclusions about gradient readout precision.

      Thank you for the positive assessment.

      MAJOR COMMENTS

      p. 1, left column: The positional error of the readout position does not only depend on the variation of the gradient parameters, as suggested by the first part of the introduction. A very important factor is also the fluctuations due to random arrival of molecules to the promoters that perform the readout due to the limited (and typically low) molecule number. In fact, for positions very distant to the source of the gradient, this noise source is expected to be dominant over gradient shape fluctuations. Importantly, these fluctuations also arise for non-fluctuating, "perfect" gradient inputs if copy numbers of the morphogen molecules are limited (which they always are). This important contribution to the noise is neglected in the work of the authors. This is OK if the purpose is focusing on the origin and influence of the gradient shape fluctuations, but that focus should be clearly highlighted in the introduction, saying explicitly that noise due to diffusive arrival of transcription factors is not taken into account in the given work (see, e.g., Tkacik, Gregor, Bialek, PLoS ONE 3, 2008)

      In the present work, only precision of the gradients, but not the readout itself is studied. We have now mentioned this more explicitly in the introduction. We also acknowledge the fact that the readout itself introduces additional noise into the system. We are currently finishing up work that addresses exactly this subject, which is outside of the scope of the present paper.

      What may have led to misinterpretation of the scope of our work is that we called x_theta the readout position. x_theta defines the location where cells sense (i.e., read out) a certain concentration threshold, and is not meant to be interpreted as the location of a certain readout (a downstream transcription factor) of the morphogen. We have made this distinction clearer in the revised manuscript.

      p.1, right column: Why exactly are the parameters p, d, D assumed to follow a log-normal distribution? Such a distribution has been verified for cell size, but the rationale behind choosing it also for the named parameters should be explained, in particular for D. Why would D depend on local properties of the cell? Which diffusion / transport mechanism precisely is assumed here?

      The motivations for the used log-normal distributions for the kinetic parameters are the following:

      The morphogen production rates, degradation rates and diffusivities must be strictly positive. This rules out a normal distribution. The probability density of near-zero kinetic parameters must vanish quickly, as otherwise no successful patterning can occur. For example, a tiny diffusion coefficient would not enable morphogen transport over biologically useful distances within useful timeframes. This rules out a normal distribution truncated at zero, because very low diffusivities would occur rather frequently for such a distribution. Given the absence of reports on distributions for p, d, D from the literature, we chose a plausible probability distribution that fulfills the above two criteria and possesses just two parameters, such that it is fully defined by a mean value and a coefficient of variation. This is given by a lognormal distribution. Our results are largely independent of the exact choice of probability distribution assumed for the kinetic parameters, under the constraints mentioned above. To demonstrate this, we have repeated a set of simulations with a gamma distribution with the same mean and variance as used for the lognormal distribution. Below are some simulation results for a gamma distribution with shape parameter a = 1/CV^2 and scale parameter b = mu*CV^2, with CV = 0.3 as used in the results shown in the paper. As can be appreciated from these plots, the results do not change substantially, and our conclusions still hold. As we believe this information is potentially relevant for the readership of our paper, we have added this result and discussion to the supplement and to the conclusion in the main text.
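
      The matching of the two distributions can be illustrated with a minimal sketch. It assumes that b = mu*CV^2 acts as a scale parameter (the reading under which the gamma mean a*b equals mu) and uses only the Python standard library. The function names and the sample size are illustrative assumptions, not part of the simulation code behind the paper.

```python
import math
import random
import statistics


def lognormal_params(mean, cv):
    """Return (mu_log, sigma_log) of a lognormal with the given mean and CV."""
    sigma2 = math.log(1.0 + cv ** 2)
    return math.log(mean) - 0.5 * sigma2, math.sqrt(sigma2)


def gamma_params(mean, cv):
    """Return (shape, scale) of a gamma distribution with the given mean and CV."""
    return 1.0 / cv ** 2, mean * cv ** 2


def sample(dist, mean, cv, n, rng):
    """Draw n strictly positive kinetic parameters from the chosen distribution."""
    if dist == "lognormal":
        mu_log, sigma_log = lognormal_params(mean, cv)
        return [rng.lognormvariate(mu_log, sigma_log) for _ in range(n)]
    if dist == "gamma":
        shape, scale = gamma_params(mean, cv)
        # random.gammavariate takes (shape, scale), so the mean is shape * scale.
        return [rng.gammavariate(shape, scale) for _ in range(n)]
    raise ValueError(f"unknown distribution: {dist}")


def empirical_cv(xs):
    """Coefficient of variation (standard deviation divided by mean)."""
    return statistics.pstdev(xs) / statistics.fmean(xs)


if __name__ == "__main__":
    rng = random.Random(0)
    for dist in ("lognormal", "gamma"):
        xs = sample(dist, mean=1.0, cv=0.3, n=100_000, rng=rng)
        print(dist, statistics.fmean(xs), empirical_cv(xs))
```

      Both samplers return only positive values and reproduce the requested mean and CV up to sampling noise, so a comparison between them isolates the effect of the distribution shape.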

      We assume extracellular, Fickian morphogen diffusion with effective diffusivity D along the epithelial cells, as specified by Eq. 2. We now state this more explicitly just below Eq. 2 in the revised manuscript. Cell-to-cell variability in the effective diffusivity may arise from effects that alter the effective diffusion path and dynamics along the surface of cells, which we do not model explicitly, but lump into the effective values of D. Such effects may include different diffusion paths (different tortuosities) or transient binding, among others.

      Moreover, is there any relationship between A_i and p_i, d_i and D_i, or are these parameters varied completely independently? If yes, is there a justification for that?

      The parameters are all varied independently, as written in the paragraph below Eq. 2 on the first page (“drawn for each cell independently”). To our knowledge there is no reported evidence for correlations between cell areas, morphogen production rates, degradation rates, or transport rates across epithelia, that we could base our model on. The choice of independent cell parameters therefore represents a plausible model of least assumptions made. Note that we explore the effect of potential spatial correlations in the kinetic parameters between neighboring cells in the section “The effect of spatial correlation”, finding that such correlations, if at all present, are unlikely to significantly alter our results.

      p. 2, right column, section on "Spatial averaging": First of all, how is "averaging" exactly defined here? Do the authors assume that the cells can perfectly integrate over their surface in the dimensions perpendicular to their height? If yes, then this should be briefly mentioned here. Secondly, the shift \Delta x calculated by the authors ultimately seems to trace back to the fact that the cells average over an exponential gradient, whose derivative also is exponential, such that levels further to the anterior from the cell center are higher (on average) than levels to the posterior of it. I suppose, therefore, that a similar calculation for linear gradients would not lead to any shift. If these things are true they deserve being mentioned in this part of the manuscript because they provide an intuitive explanation for the shift. Thirdly, in Fig. 2A the cell sizes seem exaggerated with respect to the gradient length. This seems fine for illustrative purposes, but if it is the case it should be mentioned. Also, I believe that this figure panel would benefit from showing another readout case where the average concentration e.g. in cell 1 maps to its corresponding readout position, in order to show that this process repeats in every cell. Moreover, it could be indicated that in the shown case C_\theta matches the average concentration in cell 2 at the indicated position.

      Spatial averaging is defined as perfect integration along the spatial coordinate over a length of 2r (which can generally be equal to, smaller than, or larger than one cell diameter), as detailed in the supplementary material. In simulations, we use the trapezoid method for numerical integration to get the average concentration a cell experiences along its surface area perpendicular to its height.

      The reviewer is correct that the shift is a consequence of averaging over an exponential gradient. The average of an exponential gradient is higher than the concentration at the centroid of the cell, hence the small shift. This is mentioned e.g. in the caption of Fig. S1, but also in the main text (“spatial averaging of an exponential gradient results in a higher average concentration than centroid readout”). We have now added this information also to the caption of Fig. 2. As pointed out correctly by the referee, linear gradients would not result in such a shift. A brief comment on this has been added to the revised manuscript.
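
      The origin of the shift can be checked with a short numerical sketch. It assumes a noise-free gradient C(x) = exp(-x/lambda) averaged over [x0 - r, x0 + r], for which the average equals C(x0) * sinh(r/lambda) / (r/lambda), so the averaged value corresponds to a point concentration found closer to the source by lambda * ln(sinh(r/lambda) / (r/lambda)). The numerical values (lambda = 20, r = 2.5, x0 = 100) and all function names are illustrative assumptions.

```python
import math


def trapezoid_average(f, x0, r, n=1000):
    """Average of f over [x0 - r, x0 + r] using the composite trapezoid rule."""
    a = x0 - r
    h = 2.0 * r / n
    total = 0.5 * (f(a) + f(x0 + r))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h / (2.0 * r)


def exponential_shift(r, lam):
    """Closed-form shift for an exponential gradient averaged over radius r."""
    z = r / lam
    return lam * math.log(math.sinh(z) / z)


def numerical_shift(x0, r, lam):
    """Distance between x0 and the point whose concentration equals the average."""
    avg = trapezoid_average(lambda x: math.exp(-x / lam), x0, r)
    x_equivalent = -lam * math.log(avg)
    return x0 - x_equivalent


if __name__ == "__main__":
    lam, r, x0 = 20.0, 2.5, 100.0
    print("exponential, numerical:", numerical_shift(x0, r, lam))
    print("exponential, closed form:", exponential_shift(r, lam))
    # A linear profile is averaged to exactly its centroid value, so no shift arises.
    linear = lambda x: 1.0 - x / 200.0
    print("linear, average minus centroid:", trapezoid_average(linear, x0, r) - linear(x0))
```

      For r much smaller than lambda the shift is approximately r^2 / (6 * lambda), which is a small fraction of the averaging radius itself.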

      We now mention that the cell size is exaggerated in comparison to the gradient decay length for illustration purposes in the schematic of Fig. 2a, as requested.

      Unfortunately, we had a hard time following the reviewer’s final point. We show a specific readout threshold concentration, C_theta, in Fig. 2a. A cell determines its fate based on whether its sensed (possibly averaged) concentration is greater or smaller than C_theta. In the illustration, cells 1 and 2 sense a concentration greater than C_theta, and all further cells sense a concentration smaller than C_theta. Cell fate boundaries necessarily develop at cell boundaries (here, between cells 2 and 3, red). Additionally, the readout position for a continuous domain, where morphogen sensing can occur at an arbitrary point along the patterning axis, is shown (blue). This position can be different from the one restricted to cell borders. Thus, different readout positions in the patterning domain result from the two scenarios, which is what the schematic illustrates. Given that our illustration seems to work well for the other referees, we are unsure in what way it could be improved.

      As for the significance of the magnitude of the shift for typical parameters as calculated by the authors: I believe that it could be said more explicitly and clearly that under biological conditions the calculated shift overall seems insignificant, as it amounts to a small fraction of the cell diameter.

      We have made this more explicit in the text.

      Finally, and most importantly: The term "spatial averaging" can have a different meaning in developmental biology than the one employed by the authors. While the authors mean by it that individual cells average the gradient concentration over their area, in other works "spatial averaging" typically means that individual cells sense "their" gradient value (by whatever mechanism) and then exchange molecules activated by it, which encode the read-out gradient value downstream, between neighboring cells, in order to average out the gradient values "measured" under noisy conditions. The noise reduction effect of such spatial averaging can be very significant, as evidenced by this (incomplete) list of works which the authors can refer to:

      - Erdmann, Howard, ten Wolde, PRL 103, 2009

      - Sokolowski & Tkacik, PRE 91, 2015

      - Ellison et al., PNAS 113, 2016

      - Mugler, Levchenko, Nemenman, PNAS 113, 2016

      The main point, however, is that this is a different mechanism from the one described by the authors, and this should be clearly mentioned in order to distinguish them. I would therefore also advise the authors to make the section title more precise here, by changing "Spatial averaging barely affects ..." to "Spatial averaging across the cell area barely affects ..." for clarity.

      Most theory development has previously indeed been done with the syncytium of the early Drosophila embryo in mind. However, most patterning in development happens in epithelial (or mesenchymal) tissues, where spatial averaging via translated proteins is not as straightforward and natural as in a syncytium. In fact, a bucket transport of a produced protein from cell to cell would be difficult to arrange (as upon internalization, degradation would have to be prevented), be subject to much molecular noise, and be rather slow. Our paper is concerned with patterning in epithelia, which we have now stated more clearly in the manuscript.

      Regarding the section title: Our analysis does not only cover spatial morphogen averaging over the cell area, but it also includes averaging radii below (in the theory) and far above (in the theory and in the new Fig. 4c, previously 3c) half a cell diameter. With cilia of sufficient length r, epithelial cells could potentially average over spatial regions extending further than their own cell area, without need for inter-cellular molecular exchange between neighboring cells. This is the kind of spatial averaging we explored here. Restricting the section title to the cell area only would therefore be misleading. However, we agree with the referee that the distinction between different meanings of “spatial averaging” is important, and we now emphasize our interpretation and the scope of our work more in the revised text.

      p. 3, Figure 3: It would be good to highlight the fact that the colours in panel A correspond to the bullet colors in the other panels also in the main text.

      We have now added this to the main text as well.

      As to the comparison of different readout strategies: How exactly were the different readout mechanisms compared on the mathematical side? More precisely: How was the readout by the whole area matched (in terms of fluxes) to the readout at a single point, be it in the center of the cell or a randomly chosen point? How was it ensured that the comparison is done at equal footing?

      Our model considers that a cell can sense a single concentration even if it is exposed to a gradient of concentrations. Assuming the French flag model is correct, a cell must make a binary decision based on a sensed concentration in order to determine its fate. The different readout strategies are hypothetical and simplified mechanisms for how a cell could, in principle, detect a local morphogen signal. It is unclear to us what the referee is referring to when mentioning “matching in terms of fluxes”, as there are no fluxes involved in the modeled readout strategies. We make no assumption about the underlying biochemical mechanism that would allow cells to implement one of the strategies. The main goal of this analysis was to determine whether different sensing strategies have a significant effect on the precision of morphogen gradients experienced by cells. To ensure that the different mechanisms are compared on an equal footing, we simulated gradients and then calculated, from each gradient, the readout concentration in each cell for each of the methods.

      p. 3, right column: "... similar gradient variabilities, and thus readout precision": Linking to comment 1 above, this is strictly speaking only the case when the only source of fluctuations in the readout is the gradient fluctuations. I would therefore leave this statement out.

      To avoid confusion, we have removed parts of the sentence. Thank you for pointing this out.

      p. 3, section on positional error (right column): In this part I had most troubles following the thoughts of the authors.

      First of all, the measure that the authors use for the positional error is sigma_x / mu_lambda, i.e. the standard deviation of the readout position relative to the gradient length. The question is whether this is the correct measure. It should be specified what the motivation for normalizing by mu_lambda is. In the end, one could argue, what the cells really do care about would be that the developmental process can assign cell fates with single cell precision, for which the other measure shown in Eq. (6) is the representative one. Now in contrast to the former measure, the latter actually increases with decreasing cell diameter.

      We thank the referee for raising this point, and acknowledge that we have not presented this aspect well enough. We have rewritten the entire section and the discussion about biological implications. Instead of normalizing with a constant mean gradient length in the formulas and figures, which has left room for misinterpretation, we now instead varied all relevant length scales in the patterning system, to determine the impact of each of them independently on the positional error. We now show that the positional error increases (to leading order) proportionally to the mean gradient length, the square root of the cell diameter, the square root of the location in the patterned tissue, and inversely proportional to the length of the source domain. We support these new aspects with new simulation data (Fig. 2E-2H, Fig. 3D-G, Fig. S5, Fig. S6). As the positional error is now reported in absolute terms, rather than relative to a particular length scale, the question of the relevant scale is addressed. We now show that the absolute positional error increases with increasing absolute cell diameter.

      We believe that this extension provides additional important insight into what affects the patterning precision. We thank the referee very much for motivating us to expand our analysis.

      Secondly, even when the former measure (sigma_x / mu_lambda) is employed, Fig. 3(D) shows that while it decreases with decreasing cell diameters, in the regime of small diameters the std. dev. of the readout position becomes larger than the average cell diameter, which actually would mean that cell fates cannot be assigned with single-cell precision. While the authors later report both quantities for specific gradients, it should be clarified beforehand which of the measures is the relevant one.

      This has now been addressed by considering absolute length scales as discussed at length in our answer to the previous point.

      Moreover, in the following derivations, mu_x is not properly introduced. What exactly is the definition of that quantity? Is it the mean readout position? If yes, it is not clear why exactly it would be interesting and relevant to the cell. This should be properly explained in a way that does not require the reader to look up further details in another publication.

      The referee is correct in that mu_x is the mean readout position. We apologize for not being clear enough on this, and have now defined this in the introduction together with the definition of sigma_x.

      At the end of this section the authors come back to the sigma_x / mu_delta measure again and indeed point out that it increases with decreasing mu_delta, which causes a bit of confusion because the initial part of the section only talks about the increase of the pos. error with mu_delta. Overall I find that this section should be rewritten more clearly. Right now it leaves the reader with the "take home message" that small cells are good because they lead to smaller pos. error, but when the--in my opinion--relevant measure (sigma_x/mu_delta) is employed the opposite is the case. This is confusing and unclear about the authors' intentions in that part.

      See the answer above. The “take-home message” is now reformulated in absolute terms regarding the effect of cell diameter, rather than relative to a certain choice of reference scale. Our new analysis revealed a new relative ratio that determines the positional error, mu_lambda/L_s. We now discuss this relative measure also regarding its biological significance. Once again, we thank the referee for pointing us at this source of confusion, the elimination of which allowed us to improve our analysis.

      Finally, the authors could also supplement the numbers that they name for the FGF8 and SHH gradients by the known numbers for the Bcd gradient in Drosophila, which has been studied extensively and constitutes a paradigm of developmental biology. Here mu_delta ~= 6.5 um, while mu_lambda ~= 100 um, such that mu_delta/mu_lambda

      While we appreciate that most theoretical work has been done for syncytia, this paper is concerned with patterning of epithelia, which have different patterning constraints, as also explained in a reply further above. We now make the scope of our work clearer in the revised manuscript. But as the referee points out, the diameter of the nucleus relative to the gradient length is such that gradients can be expected to be sufficiently precise.

      p. 4, section on the effect of spatial correlation: Here the authors chose to order the kinetic parameters in ascending or descending order. Is there any biological motivation for that particular choice? Other types of correlations seem possible, e.g. imposing the rule that successive parameter values are sampled starting from the previous value, p_i+1 = p_i +- delta_i+1, where delta_i+1 are random numbers with a defined variance.

      In the simulations we go from zero correlation (every cell has independent kinetic parameters) to maximal correlation (every cell has the same parameters, resulting effectively in a patterning domain that consists of a single effective “cell”), see Fig. S3. Biologically plausible correlations in between these extremes should retain the same kinetic variability levels (same CVs) which we took from the measured range reported in the literature. We accomplish this by ordering the parameters after independently sampling the parameters for each cell from probability distributions with the desired CV. The motivation for this approach is that this produces a type of maximal correlation that still reflects the measured biological cell-to-cell variability, to demonstrate in Fig. S3, that even such a maximal degree of spatial correlation does not qualitatively alter our results. The kind of correlation that the referee suggests introduces a spatial correlation length that lies in between the extremes that we simulated. Since even for maximal correlation using the ordering approach, we find our conclusions to still apply, we have no reason to expect that intermediate levels of correlation would behave any differently.
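
      The ordering approach can be illustrated with a minimal sketch. Sorting independently drawn parameters leaves the set of values, and therefore the mean and CV, unchanged, while neighbouring cells become strongly correlated. The lognormal parameters, the number of cells and the function names below are illustrative assumptions, not the settings used for Fig. S3.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def neighbour_correlation(values):
    """Correlation between each cell's parameter and that of its right-hand neighbour."""
    return pearson(values[:-1], values[1:])


def coefficient_of_variation(values):
    """Standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)


if __name__ == "__main__":
    rng = random.Random(1)
    # One kinetic parameter per cell, drawn independently (zero spatial correlation).
    independent = [rng.lognormvariate(0.0, 0.3) for _ in range(200)]
    # Ordering the same values along the patterning axis (maximal spatial correlation).
    ordered = sorted(independent)
    for label, values in (("independent", independent), ("ordered", ordered)):
        print(label, coefficient_of_variation(values), neighbour_correlation(values))
```

      A random-walk construction, by contrast, would need an additional correction step to keep the values positive and to recover the prescribed mean and CV.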

      The idea brought forward by the referee effectively introduces a correlation length scale. We discuss this case in the paper, noting that the positional error will scale as sigma_x ~ sqrt(N), where N is the number of cells sharing the same kinetic parameters. A correlation length scale will be proportional to N and will therefore simply uniformly scale the positional error accordingly, but will likely not reveal any new insight beyond that.

      Moreover, using the idea of the referee as an additional way to introduce correlation is difficult to realise in practice, as we need to recover the mean and variance of the kinetic parameters, while ensuring strict positivity for each of them. A simple random walk, as proposed, would not lend itself easily to achieve this without introducing a bias in the distribution, because negative values need to be prevented. As explained in a reply further above, an important feature of the kinetic parameters is that they are not too small to prevent the formation of a meaningful gradient, which is not straightforward to ensure with the proposed method.

      We acknowledge that there are different types of correlations conceivable, but we expect these correlations to lie between the two extremes that we present in the paper, which show no qualitative difference in the results.

      p.5, Discussion: "..., but with nuclei much wider than the average cell diameter". To be honest, I could not completely imagine what is meant with this sentence. Intuitively, it seems that the nuclei cannot be larger than the cells, but I suppose that some kind of special anisotropy is considered here? In any case, this should be made precise.

      The main tissues that are patterned by gradients are epithelia. Our paper focuses on such tissues. It is a well-known feature of pseudostratified epithelia that nuclei are on average wider than the cell width averaged over the apical-basal axis. Nature solves this problem by stacking nuclei above each other along the apical-basal axis, resulting in a single-layered tissue that appears to be a multi-layered stratified tissue when only looking at nuclei. For a schematic illustration of this, see Fig. 1 in [DOI: 10.1016/j.gde.2022.101916]. An image search for “pseudostratified epithelia” on Google yields a plethora of microscopy images. Right at the end of the quote recited by the referee, we also cite our own study [Gomez et al, 2021], which quantifies this in Fig. 5.

      Moreover, I find that the conclusion that morphogen gradients "provide precise positional information even far away from the morphogen source" goes too far based on the authors' work, precisely because input fluctuations due to limited morphogen copy number, which can become detrimentally low far away from the source, are not considered, nor are the timescales needed to both establish and sample such low concentrations far away from the source. While thus, according to the work of the authors, the fluctuations in the morphogen signal may be favorably small, these other factors are supposed to exert a strong limit on positional information. This conclusion therefore seems unjustified and should be toned down, or even better taken out and replaced by a more accurate one, which only focuses on the gradient shape fluctuations, not on the conveyed positional information.

      There is no evidence so far that morphogen gradient concentrations become too low to be sensed by epithelial cells, to the best of our knowledge. What we show is that the gradient variability between embryos remains low enough that precise patterning remains possible. Whether the morphogen concentration remains high enough to be read out reliably by cells is a subject that requires future research. Genetic evidence from the mouse neural tube demonstrates that the SHH gradient is still sensed at a distance beyond 15 lambda (SHH signalling represses PAX7 expression at the dorsal end of the neural tube) [Dessaud et al., Nature, 2007], where an exponential concentration has dropped more than 3-million-fold.
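
      The fold change quoted above follows directly from the exponential form. As a minimal sketch, assuming an ideal gradient C(x) = C0 * exp(-x/lambda) (the function name is an illustrative choice):

```python
import math


def fold_drop(distance_in_decay_lengths):
    """Ratio C(0) / C(x) for an exponential gradient at x = distance * lambda."""
    return math.exp(distance_in_decay_lengths)


if __name__ == "__main__":
    # At 15 decay lengths the ratio is exp(15), i.e. a bit more than three million.
    print(fold_drop(15))
```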

      As the referee correctly recites, we state that “morphogen gradients remain highly accurate over very long distances, providing precise positional information even far away from the morphogen source”. This statement is restricted to the positional information that the gradients convey, and does not touch potentially precision-enhancing or -deteriorating readout effects, nor does it concern the absolute number of morphogen molecules.

      Positional information goes through several steps. The gradients themselves convey a first level of positional information, by being variable in patterning direction, as quantified by the positional error. This is what we draw our conclusion about. This positional information from the gradients can then be translated into positional information further downstream, by specific readout mechanisms, inter-cellular processes, temporal averaging, etc. About these further levels of positional information, we make no statement.

      We therefore disagree that our conclusion is unjustified. In fact, we have phrased it exactly having the limited scope of our study in mind, making sure that we restrict the conclusion to the gradients themselves.

      MINOR COMMENTS

      - p. 1: "and find that positional accuracy is the higher, the narrower the cells".

      (This sentence, however, should be anyhow revised in view of major comment 5 above.)

      We have added “the”.

      - p. 4: "... with an even slightly smaller prefactor."

      We have removed “even”.

      Reviewer #3 (Significance (Required)):

      I believe that this work is significant to the community working on the theoretical foundations of morphogen gradient precision in developmental systems. The main interesting findings are that small cell diameters lead to smaller positional error (although the relevant measure should be clarified according to my comment no. 5), and that the gradient shape fluctuations are surprisingly robust with respect to the readout mechanism.

      Its limitations consist of the fact that the impact of small copy numbers on the readout and associated timescales are neglected, such that the findings of the authors on gradient robustness cannot be simply transferred by simple conversion formulas to readout robustness / positional information. Comment 5 goes hand in hand with this, as a different conclusion may emerge depending on how the relevant positional error measure is defined. This should be fixed by the authors as indicated in the main part of the report.

      Thank you for your assessment.

    1. Author Response

      Reviewer #1 (Public Review):

      Major points:

      1) How STC1 controls changes in MSCs' ability for hampering CAR-T cell-mediated anti-tumor responses is unclear.

      In this study, we demonstrated that the presence of STC1 is critical for MSCs to exert their immunosuppressive role by inhibiting cytotoxic T cell subsets, activating key immune suppressive/escape related molecules such as IDO and PD-L1, and crosstalking with macrophages in the TME. These immunosuppressive functions of MSCs were significantly hampered when the STC1 gene was knocked down. Considering that stanniocalcin-1 is a glycoprotein hormone that is secreted into the extracellular matrix in a paracrine manner, we would conclude that the role of STC1 is not to alter the function of MSCs intracellularly. Rather, it facilitates the immunosuppressive capabilities of MSCs through extracellular secretion into the TME as a pleiotropic factor, thus impacting the functioning of T cells, cancer cells and other immune cells.

      The reviewer's question is well taken, and we have added the points mentioned above to the Discussion section to ensure a more comprehensive conclusion. Moreover, a recent study published in Cancer Cell, which was suggested by the other reviewer, is consistent with our results. It has provided further mechanistic information on how stanniocalcin-1 impacts immunotherapy efficacy and T cell activation. The reference has been cited and discussed as shown below.

      "In this model, activated macrophages or stress signals during CAR-T therapy may prompt MSCs to secrete stanniocalcin-1 into the extracellular matrix of TME, serving as a pleiotropic factor to negatively impact the function of T cells and stimulate the expression of molecules that inactivate immune responses, ultimately providing an immunosuppressive effect of MSC." (page 22, highlighted). "In line with our study, it was recently reported that stanniocalcin-1 negatively correlates with immunotherapy efficacy and T cell activation by trapping calreticulin, which abrogates membrane calreticulin-directed antigen presentation function and phagocytosis [50]." (Page 20, highlighted)

      2) Is ROS important? It is not tested directly.

      ROS, which are released by neutrophils and macrophages, play an important role during the immune response. Not only do they act as key mediators of the adaptive immune response, but they also have the ability to modulate the activation of B cells and T cells. In our study, we suggest that ROS may be involved in NLRP3 inflammasome activation and the expression and secretion of STC1. Although we did not pursue this line of inquiry further as it was beyond the scope of our paper, we have included additional relevant research in the Discussion, and a reference is provided.

      "It has been proved that the expression and secretion of STC1 in multiple cell lines can be stimulated by external stimuli, including cytokines and oxidative stress [26]." (Page 21, highlighted)

      3) The changes in CD8 and Treg are not convincing. Moreover, it is not tested how these changes can be elicited by the presence of MSCs.

      We have included additional in vivo data to assess the levels of Treg cells and CD8+ T cells in this revised manuscript. This not only confirms the alterations of CD8 and Treg, but also offers an additional line of evidence to further analyze the influence of MSCs on CAR-T in vivo. The findings are presented in Figure 4B, and the corresponding discussion can be found on Page 17 (highlighted).

      Reviewer #2 (Public Review):

      Major points:

      1) STC-1 is expressed and secreted by many human cancer cells. This should be discussed in the introduction or discussion with more inter-related background info on both its regulation in cancer cells and secretion pattern into TME. It is important because you state that the STC-1 secreted by MSC has such strong functions, then how about those produced and secreted by cancer cells? Are those also stimulated by macrophages or other components in TME? Do they have possible functions in helping cancer cell to escape the immune surveillance mechanisms?

      Thanks for the suggestion. We have added more details about the regulation and secretion of STC-1 in cancer cells (see below). The information is added to both the introduction and discussion (highlighted on pages 4 and 21), and all the above questions are addressed.

      "It was proved that STC1 is involved in several oxidative and cancer-related signaling pathways such as NF-κB, ERK, and JNK pathways [26,27]. The expression and secretion of STC1 in cancer tissue can be stimulated by external stimuli including external cytokines and oxidative stress [26]. Under hypoxia conditions, STC1 could be modulated by HIF-1 to facilitate the reprogramming of tumor metabolism from oxidative to glycolytic metabolism [28]. STC1 was also reported to participate in the process of epithelial-to-mesenchymal transition (EMT), which is associated with tumor invasion and the reshaping of the tumor microenvironment, as well as increasing therapy resistance [29]." (Page 4)

      "It has been proved that the expression and secretion of STC1 in multiple cell lines can be stimulated by external stimuli including cytokines and oxidative stress [26]." (Page 21)

      2) In Figure 4B, using a single marker of IL-1β to show the immune suppressive capability of MSC in vivo is not sufficient, staining for CD4+ and CD8+ should also be included to demonstrate whether MSC could modulate T cell compositions, which can give more direct evidence about MSC's impacts on CAR-T cell.

      The above experiments were done as suggested, and the data are presented in Figure 4B. Explanations of the results are given in the Results section on page 17 and the Discussion section on page 21 (highlighted).

      3) One of the major risks associated with CAR-T therapy is an excessive immune response that causes cytokine release syndrome. MSCs have been used in clinics as a way to suppress immune response including post-CAR-T. What does the author think about using MSC with STC-1 knockout? Can it still help reduce toxicity while maintaining CAR-T efficacy? This might be a potential application.

      This is definitely an interesting idea. Based on the data presented in the current study, it is clear that knockdown of STC-1 would abrogate the immune-suppressive impact of MSC, and therefore affect CAR-T efficacy. However, whether the presence of MSC can help reduce cytokine release syndrome when losing the function of STC-1 requires further study. We agree with the reviewer, and we have briefly discussed this possibility at the very end of the discussion as shown below (Page 22, highlighted).

      "… the findings we presented here are no doubt that would have potential clinical applications toward improving the efficiency of CAR-T therapy as well as reducing the excessive toxicity by modulating the level of STC1 in TME".

      4) There was a recent study published in Cancer Cell (Lin et al. Stanniocalcin 1 is a phagocytosis checkpoint driving tumor immune resistance. 2021), and they also reported that STC1 negatively correlates with immunotherapy efficacy and patient survival. It should be cited, and in fact, it provided support to the authors' present study with completely different experimental settings.

      Thanks for providing this important information. It is an excellent study and consistent with our findings. The reference was added and discussed on page 20 (highlighted) as shown below.

      "In line with our study, it was recently reported that stanniocalcin-1 negatively correlates with immunotherapy efficacy and T cell activation by trapping calreticulin, which abrogates membrane calreticulin-directed antigen presentation function and phagocytosis [50]"

    1. Author Response

      Reviewer #1 (Public Review):

      This theoretical (computational modelling) study explores a mechanism that may underlie beta (13-30Hz) oscillations in the primate motor cortex. The authors conjecture that traveling beta oscillation bursts emerge following dephasing of intracortical dynamics by extracortical inputs. This is a well written and illustrated manuscript that addressed issues that are both of fundamental and translational importance.

      We are pleased by the reviewer’s judgement about the importance of the question that we consider and about the presentation of our manuscript.

      Unfortunately, existing work in the field is not well considered and related to the present work. The rationale of the model network follows closely the description in Sherman et al (2016). The relation (difference/advance) to this published and available model needs to be explicitly made clear. Does the Sherman model lack emerging physiological features that the new proposed model exhibits?

      We view the work of Sherman et al (2016) and ours as complementary. Sherman et al propose a model of a single E-I module, using the terminology of our manuscript, that is much more detailed than ours since it approximately accounts for the layered structure of the cortex using two layers of multi-compartment spiking neurons, each comprising 100 excitatory neurons and 35 inhibitory neurons. This allows a detailed comparison of the model with local MEG signals. We used a much simpler description and only describe the population behavior of the local E and I neuron populations in each module. However, contrary to Sherman's model, this allows us to address the spatial aspect of beta oscillations, which is the main target of our work. Our simple description of a local E-I module allows us to consider several hundred E-I modules with a spatially structured connectivity and to analyze the spatio-temporal characteristics of beta activity. We have now described the relation of our work with Sherman et al (2016) in the discussion section (lines 540-547).

      The authors may also note the stability analysis in: Yaqian Chen et al., “Emergence of Beta Oscillations of a Resonance Model for Parkinson’s Disease”, Neural Plasticity, vol. 2020, https://doi.org/10.1155/2020/8824760

      We thank the reviewer for pointing out this paper that had escaped our notice. It presents the stability analysis of a single E-I module with propagation delay (and instantaneous synapses). At the mathematical level, the analysis brings little as compared to the much older article of Geisler et al., J Neurophys (2005) that we cite. However, the model specifically proposes to describe beta oscillations in the motor cortex as arising from the interaction between excitatory and inhibitory neurons, as we do. Therefore, we included this reference as well as a reference to the previous work of Pavlides et al., PLoS Comp Biol (2015) where the model was developed.

      The model-based analysis of the traveling nature of the beta frequency bursts appears to be the most original component of the manuscript. Unfortunately, this is also the least worked out component. The phase velocity analysis is limited by the small number (10 x 10) of modeled (and experimentally recorded) sites and this needs to be acknowledged. How were border effects treated in the model, and what are they?

      We thank the reviewer for these points, which gave us the opportunity to clarify them and improve our manuscript. As described in Methods: Simulations (line 847 and seq.) and shown in Fig. S2 (Fig. S10 in the original submission), we actually simulated our model on a 24×24 grid and did all our measurements in a central 10×10 grid to take into account that the electrode covers only part of the motor cortex. In addition, to minimize border effects, we added on each side of the 24×24 grid two rows of E-I modules kept at their (non-oscillating) fixed points of stationary activity, as depicted in Fig. S2. In order to address the concern of the reviewer, and to check that border effects indeed had a minimal impact on our results, we have performed a new set of simulations on a 24×24 grid with periodic boundary conditions. The results are shown in the new supplementary Fig. S9 and are indistinguishable from those reported in the main text and figures. In particular, the proportion of the different wave types and the wave speeds are unaffected by this change of boundary conditions. A paragraph has been added in the revised version (lines 371-378) to discuss this point.
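      As an illustration only, and not the code used in the study, the two geometric ingredients mentioned in this reply can be sketched in a few lines of Python: a distance between modules that wraps around the edges of the 24×24 grid (periodic boundary conditions), and the selection of a central 10×10 measurement window. The function names, the use of a Euclidean distance, and the exact centering of the window are assumptions made for this sketch.

```python
GRID = 24    # side of the simulated grid, as stated in the reply
WINDOW = 10  # side of the central measurement window, as stated in the reply


def periodic_distance(a, b, size=GRID):
    """Euclidean distance between grid sites a and b on a torus.

    With periodic boundary conditions, each coordinate difference is
    taken as the shorter of the direct and the wrapped-around path.
    """
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    dx = min(dx, size - dx)
    dy = min(dy, size - dy)
    return (dx * dx + dy * dy) ** 0.5


def central_window(size=GRID, window=WINDOW):
    """Sites of a window-by-window block centered in the grid (assumed placement)."""
    start = (size - window) // 2
    return [(i, j)
            for i in range(start, start + window)
            for j in range(start, start + window)]


if __name__ == "__main__":
    # Opposite edges of the grid are neighbors under periodic boundaries.
    print(periodic_distance((0, 0), (23, 0)))
    print(len(central_window()))
```

      With periodic boundaries no module sits at an edge, which is why such a simulation can serve as a check on border effects.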

      How much of the phase velocities are due to unsynchronized random fluctuations? At least an analysis of shuffled LFPs needs to be performed.

      The phase velocities are indeed due to unsynchronized random fluctuations (coming from the finite number of neurons in each of our modules as well as, and more importantly, from the uncorrelated local external inputs). In order to check that the spatial structure of connectivity was important, we followed the suggestion of the reviewer and performed a new set of simulations to provide a further test. As proposed by the reviewer, after performing the simulations we shuffled in space the signal of the different electrodes, and we did a parallel analysis where we shuffled the signal from different electrodes in the recording. We then reclassified the shuffled simulations/recordings in exactly the same way as the original ones. As shown in the new additional Fig. S16, this resulted in the full elimination of time frames classified as “planar waves” both in the model and in the experimental recordings. Additionally, it barely modified the proportion of “synchronized” or “random” episodes, which is intuitively understandable since shuffling does not change the nature of these states. In order to further assess the impact of connections between modules, we also decided to suppress them, namely to set their range l to zero. In order to avoid modifying the working point of a local module by this manipulation, we focused on the case without propagation delay. Without long-range connections, the local dynamics of each module is barely modified. However, as shown in the new Fig. S18a, synchronization between neighboring modules is strongly decreased and the proportion of the different wave types is entirely changed: synchronized states and planar waves disappear and are replaced by random states. These results are described in two new paragraphs (lines 401-414 and lines 431-435).
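      The spatial shuffling control described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code of the study: signals are assumed to be stored as a mapping from electrode position to a time series, and the function name and seed handling are hypothetical. Each time series is kept intact and only its position on the array is reassigned at random, so temporal structure is preserved while spatial structure is destroyed.

```python
import random


def shuffle_in_space(signals, seed=0):
    """Reassign whole time series to random electrode positions.

    signals: dict mapping an electrode position (row, col) to a list of samples.
    Returns a new dict with the same positions; the time series themselves
    are unchanged, only their spatial arrangement is permuted.
    """
    rng = random.Random(seed)          # local generator, for reproducibility
    positions = list(signals)
    series = [signals[p] for p in positions]
    rng.shuffle(series)                # permute series across positions
    return dict(zip(positions, series))


if __name__ == "__main__":
    # Toy 2x2 array with three samples per electrode (made-up values).
    toy = {(0, 0): [1, 2, 3], (0, 1): [4, 5, 6],
           (1, 0): [7, 8, 9], (1, 1): [10, 11, 12]}
    print(shuffle_in_space(toy, seed=1))
```

      The shuffled data would then be passed through the same wave classification as the original data.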

      Is there a relationship between the localizations of the non-global external input and the starting sites of the traveling waves?

      This is also an interesting question that parallels some asked by the other reviewers and which we did our best to address. As described in the “Essential revisions” point 5) above, we aligned all “planar wave events” in space and time with the help of the spatio-temporal phase maps of the oscillations. We did find that planar waves were preceded by an increase in the global synchronization index σp, both in simulations and in experiments. In simulations this increase also corresponded to a shift of the global inputs away from their mean, as depicted in the new Fig. 4 in the main manuscript. However, no significant average spatio-temporal profile of the local inputs emerged when we used these temporal alignments. This is presumably due to the large variability of local inputs that can give rise to planar waves. We have described these results in the new section “Properties of planar waves and characteristics of their inputs”.

      In summary, this work could benefit from a widening of its scope to eventually inspire new experimental research questions. While the model is constructed well, there is insufficient evidence to conclude that the presented model advances over another published model (e.g. Sherman et al., 2016).

      As described in the “Essential revisions” and the discussion section of the manuscript, our work highlights a number of questions that can (and hopefully will) inspire new experimental research. We also hope that we have clarified above that our model complements Sherman et al.’s model and advances it as far as the spatial aspects of beta oscillations in motor cortex are concerned.

      Reviewer #2 (Public Review):

      Kang et al. model the cortical dynamics, specifically the distributions of beta burst durations and the proportions of different kinds of spatial waves, using a firing rate model with local E-I connections and long-range, distance-dependent excitatory connections. The model also predicts that the observed cortical activity may be a result of non-stationary external input (correlated at short time scales) and a combination of two sources of input, global and local. Overall, the manuscript is very clear, concise and well written. The modeling work is comprehensive and makes interesting and testable predictions about the mechanism of beta bursts and waves in the cortical activity. There are just a few minor typos and some curiosities that the model may be able to address. Notwithstanding, the study is a valuable contribution towards developing data-driven firing rate models.

      We really appreciate the positive comments of the reviewer and thank her/him for them. We have done our best to correct the typos and to address the questions raised by the reviewer.

      1) The model beautifully reproduces the proportion of the different kinds of waves that can be seen in the data (Fig 3). However, the manuscript does not comment, from a mechanistic point of view, on when a planar/random wave would appear for a given set of parameters (e.g. fixed v ext, tau ext, c). If these spatio-temporal activities are functional in nature, their occurrence is unlikely to be just stochastic, and a strong computational model like this one would be a perfect substrate to ask this question. Is it possible to characterize what aspects of the global/local input fluctuations, or of the interaction of input fluctuations with the network, lead to a specific kind of spatio-temporal activity, even if just empirically?

      This is an important question that parallels some asked by the other reviewers and which we did our best to address. As described in the “Essential revisions” paragraph above, we aligned all “planar wave events” either in phase or at their starting time points. We did find that planar waves were preceded by an increase in the global synchronization index σp, both in simulations and in experiments. In simulations this increase also corresponded to a shift of the global inputs away from their mean, as depicted in the new Fig. 4 in the main manuscript. When we used the same alignment to average spatio-temporal local inputs, we did not see the emergence of any significant patterns. This presumably reflects the high variability of local inputs able to produce a planar wave.

      Do different waves appear in the same trial simulation or does the same wave type persist over the whole trial? If the former, are the transition probabilities between the different wave types uniform, i.e. is the probability of a planar wave transiting into a synchronized wave equal to the probability of a random wave transiting into a synchronized wave?

      In the same trial simulation, different types of waves indeed successively appear. The curiosity of the reviewer led us to investigate this interesting point. Since time frames classified as random or synchronized are much more numerous than the planar (and radial) wave ones, it is much more probable that a planar wave transits into a synchronized or a random pattern than the reverse process (i.e., synchronized and random patterns preferentially transit into each other). Nonetheless, we considered questions related to the one of the reviewer. What are the states preceding a planar wave event? Given that a planar wave episode is preceded by a random (or synchronous) episode, is it more likely to be followed by a random or by a synchronous event? We actually find that the entry state is prominently a synchronized state. Furthermore, when the entry state is synchronized, the exit state is also synchronized much more often than would be expected by chance. This shows that most often, planar waves are created from an underlying synchronized persistent state. This has been described in the revised manuscript (lines 443-451).
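      The entry/exit analysis described in this reply can be illustrated with a short Python sketch. It is not the code used in the study: it assumes that each time frame has already been assigned a wave-type label, and the label strings and function name are hypothetical. For every contiguous run of frames labeled as planar, it records the label of the frame just before the run (entry state) and just after it (exit state).

```python
from collections import Counter


def planar_entry_exit(labels, target="planar"):
    """Count (entry state, exit state) pairs around each run of `target` frames.

    labels: sequence of per-frame wave-type labels.
    Runs that touch the start or end of the sequence are skipped because
    their entry or exit state is unknown.
    """
    counts = Counter()
    i, n = 0, len(labels)
    while i < n:
        if labels[i] != target:
            i += 1
            continue
        start = i
        while i < n and labels[i] == target:
            i += 1                     # i ends on the first frame after the run
        if start > 0 and i < n:
            counts[(labels[start - 1], labels[i])] += 1
    return counts


if __name__ == "__main__":
    frames = ["sync", "planar", "planar", "sync",
              "random", "planar", "random", "planar"]
    print(planar_entry_exit(frames))
```

      Comparing such counts with those expected from the overall frequencies of synchronized and random frames is one way to ask whether a given entry/exit combination occurs more often than chance.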

      2) Denker et al 2018 also report a strong relationship between the spatial wave category, the beta burst amplitude, the beta burst duration and the velocity (Fig 6E in Denker et al.), e.g. synchronized waves are fastest, with the highest beta amplitude and duration. Was this also observed in the model?

      We had long exchanges with Michael Denker about his analysis since there are some differences between his code and what is described in Denker et al. (2017), possibly because of several typos in the Method section of Denker et al (2017). We have checked that the results of our code agree with his, but there are some differences between the results obtained on the available datasets and those reported in Denker et al from other data sets. We have now provided the detailed statistics of the different wave types as obtained by our analysis in the simulation of model SN (Fig. S9) and SN’ (Fig. S11) and in the recordings for monkey L (Fig. S10) and monkey N (Fig. S12). In the recording data, the amplitude and speed of the synchronized and planar waves are comparable and higher than in the radial and random wave types. The duration of synchronized events is longer than that of planar waves and of the other wave types. Comparable results are obtained in the simulations, with nonetheless a few differences: the mean amplitude of planar waves is somewhat larger than that of synchronized states, and the hierarchy of durations in the different states is respected but the durations themselves are longer in the simulations than in the recordings (about 40 % for the planar waves and almost two times longer for the synchronized states). We attribute these differences to the fact that synchronization is slightly less effective in the recordings than in the model. Long synchronization episodes in the recordings are often cut off by a few time frames where the synchronization index goes below the threshold value for a synchronized pattern. This happens rarely enough not to affect much the global statistics of the different states, but it has a much more visible effect on the measured duration of the synchronized states.
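      The effect mentioned at the end of this reply, where a few sub-threshold frames cut a long synchronized episode into shorter ones, can be made concrete with a small Python sketch. It is illustrative only: the threshold value, the sample values, and the choice of a "greater than or equal" criterion are assumptions and are not taken from the manuscript.

```python
def sync_durations(sigma, threshold):
    """Lengths (in frames) of consecutive runs where sigma >= threshold."""
    durations, run = [], 0
    for value in sigma:
        if value >= threshold:
            run += 1
        else:
            if run:
                durations.append(run)  # a sub-threshold frame closes the run
            run = 0
    if run:
        durations.append(run)          # close a run that reaches the last frame
    return durations


if __name__ == "__main__":
    threshold = 0.8                              # hypothetical threshold
    steady = [0.9] * 10                          # uninterrupted episode
    dipped = [0.9] * 5 + [0.7] + [0.9] * 4       # same episode with one brief dip
    print(sync_durations(steady, threshold))
    print(sync_durations(dipped, threshold))
```

      A single dip changes the proportion of synchronized frames only slightly (9 of 10 instead of 10 of 10 in this toy case) but roughly halves the measured episode duration, in line with the explanation given above.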

      Reviewer #3 (Public Review):

      In this manuscript, the authors consider a rate model with recurrently connected excitatory-inhibitory (E-I) modules coupled by distance-dependent excitatory connections. The rate-based formulation with adaptive threshold has been previously shown to agree well with simulations of spiking neurons, and simplifies both analytical analysis and simulations of the model. The cycles of beta oscillations are driven by fluctuating external inputs, and traveling waves emerge from the dephasing by external inputs. The authors constrain the parameters of external inputs so that the model reproduces the power spectral density of LFPs, the correlation of LFPs from different channels and the velocity of propagation of traveling waves. They propose that external inputs are a combination of spatially homogeneous inputs and more localized ones. A very interesting finding is that wave propagation speed is on the order of 30 cm/s in their model, which is consistent with the data but does not depend on propagation delays across E-I modules, which may suggest that propagation speed is not a consequence of unmyelinated axons as has been suggested by others. Overall, the analysis looks solid, and we found no inconsistency in their mathematical analysis.

      We thank the reviewer for his comments and for his expert review.

      However, we think that the authors should discuss more thoroughly how their modeling assumptions affect their result, especially because they use a simple rate-based model for both theory and simulations, and a very simplified proxy for the LFPs.

      In the revised manuscript, we have performed additional simulations to test different modeling assumptions as suggested by the reviewer and discussed further below.

      The authors introduce anisotropy in the connectivity to explain the findings of Rubino et al. (2006), showing that motor cortical traveling waves propagate preferentially along a specific axis. They introduce anisotropy in the connectivity by imposing that the long range excitatory connections be twice as long along a given axis, and they observe waves propagating along the orthogonal axis, where the connectivity is shorter range. Referring specifically to the direction of propagation found by Rubino et al, could the authors argue why we should expect longer range connections along the orthogonal axis? In fact, Gatter and Powell (1978, Brain) documented a preponderance of horizontal axons in layers 2/3 and 5 of motor cortex in non-human primates that were more spatially extensive along the rostro-caudal dimension as compared with the medio-lateral dimension, and Rubino et al. (2006) showed the dominant propagation direction was along the rostro-caudal axis. This is inconsistent with the modeling work presented in the current manuscript.

      This is an important comment and we thank the reviewer for pointing out these data in Gatter and Powell (1978). Since the experimental data show that planar wave propagation directions are anisotropically distributed, we have tried to investigate what the underlying mechanism of this anisotropy could be in the framework of our model. Anisotropy in connectivity is an obvious possibility. Given our result, and the data of Gatter and Powell, it appears however that it is not the underlying cause of the observed anisotropy direction in the motor cortex (in the framework of our model). We have thus investigated another possibility, namely that the local external inputs target the motor cortex anisotropically, being more spread out along a given axis (lines 510-529 and new Fig. 5g-l). We find that planar waves propagate preferentially along the orthogonal axis. This leads us to conclude that the observed propagation anisotropy could be a consequence of the external input being more spread out along the medio-lateral axis. Data addressing this issue could be obtained using retroviral tracing techniques.

      The clarity and significance of the work would greatly improve if the authors discussed more thoroughly how their modeling assumptions affect their result. In particular, the prediction that external inputs are a combination of local and global ones relies on fitting the model to the correlation between LFPs at distant channels. The authors note that when the model parameter c=1, LFPs from distant channels are much more correlated than in the data, and thus have to include the presence of local inputs. We wonder whether the strong correlation between distant LFPs would be lower in a more biologically realistic model, for example a spiking model with sparse connectivity and a spiking external population, where all connections are distance-dependent. While the analysis of such a model is beyond the scope of the present work, it would be helpful if the authors discussed whether their prediction on the structure of external inputs would still hold in a more realistic model.

      This is a legitimate question that we indeed asked ourselves. In a previous work with a simpler chain model, we only considered finite-size fluctuations. We found good agreement between our simplified description of finite-size fluctuations and simulations of a spiking network with fully connected modules and sparse distance-dependent connectivity. This leads us to believe that our description of finite-size fluctuations is reliable in this setting. Assuming that this is the case, we find that with 10^4 neurons or more per module, finite-size noise is not strong enough to replace our local external inputs. Even with 2000 neurons per module, the network with intrinsic fluctuations alone is very synchronized (new Fig. S15e-g). With 200 neurons per module, the intrinsic fluctuations are strong enough to replace the fluctuating local inputs (Fig. S15a-d), but this is quite a low number. Our description of local noise would have to underestimate the fluctuations in a more sparsely connected network by a significant amount for agreement with the data to be obtained without local inputs. Moreover, it seems to us quite plausible that different regions of motor cortex receive different inputs but, of course, this can only be settled by further experiments. Together with the new Fig. S15, we have added a paragraph to address this question in the manuscript (lines 379-400).

    2. Reviewer #3 (Public Review):

      In this manuscript, the authors consider a rate model with recurrently connected excitatory-inhibitory (E-I) modules coupled by distance-dependent excitatory connections. The rate-based formulation with adaptive threshold has been previously shown to agree well with simulations of spiking neurons, and simplifies both analytical analysis and simulations of the model. The cycles of beta oscillations are driven by fluctuating external inputs, and traveling waves emerge from the dephasing by external inputs. The authors constrain the parameters of external inputs so that the model reproduces the power spectral density of LFPs, the correlation of LFPs from different channels and the velocity of propagation of traveling waves. They propose that external inputs are a combination of spatially homogeneous inputs and more localized ones. A very interesting finding is that wave propagation speed is on the order of 30 cm/s in their model, which is consistent with the data but does not depend on propagation delays across E-I modules, which may suggest that propagation speed is not a consequence of unmyelinated axons as has been suggested by others. Overall, the analysis looks solid, and we found no inconsistency in their mathematical analysis. However, we think that the authors should discuss more thoroughly how their modeling assumptions affect their result, especially because they use a simple rate-based model for both theory and simulations, and a very simplified proxy for the LFPs.

      The authors introduce anisotropy in the connectivity to explain the findings of Rubino et al. (2006), showing that motor cortical traveling waves propagate preferentially along a specific axis. They introduce anisotropy in the connectivity by imposing that the long range excitatory connections be twice as long along a given axis, and they observe waves propagating along the orthogonal axis, where the connectivity is shorter range. Referring specifically to the direction of propagation found by Rubino et al, could the authors argue why we should expect longer range connections along the orthogonal axis? In fact, Gatter and Powell (1978, Brain) documented a preponderance of horizontal axons in layers 2/3 and 5 of motor cortex in non-human primates that were more spatially extensive along the rostro-caudal dimension as compared with the medio-lateral dimension, and Rubino et al. (2006) showed the dominant propagation direction was along the rostro-caudal axis. This is inconsistent with the modeling work presented in the current manuscript.

      The clarity and significance of the work would greatly improve if the authors discussed more thoroughly how their modeling assumptions affect their result. In particular, the prediction that external inputs are a combination of local and global ones relies on fitting the model to the correlation between LFPs at distant channels. The authors note that when the model parameter c=1, LFPs from distant channels are much more correlated than in the data, and thus have to include the presence of local inputs. We wonder whether the strong correlation between distant LFPs would be lower in a more biologically realistic model, for example a spiking model with sparse connectivity and a spiking external population, where all connections are distance-dependent. While the analysis of such a model is beyond the scope of the present work, it would be helpful if the authors discussed whether their prediction on the structure of external inputs would still hold in a more realistic model.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      Summary:
      The authors report on the synthesis and characterization of a class of small molecules, the tanshinone mimics (TMs), which interfere with binding of the RNA binding protein (RBP) HuR to its mRNA targets. HuR is an important regulator of mRNA stability and translation of genes involved in key homeostatic (cell cycle, stress response) and pathologic processes (inflammation, carcinogenesis). In particular, the first part of the study describes the compounds' chemical synthesis and some pharmacokinetic parameters (i.e., definition of molecular binding, solubility, bioavailability, prodrug generation etc). The second part undertakes, in in vitro and ex vivo models of LPS-induced mouse macrophage activation, the identification of HuR-bound mRNA targets, which is then evaluated within the global LPS-induced transcriptome; finally, the study evaluates the ability of TMs to inhibit HuR-mediated proinflammatory gene regulation, indicating their use and potential value as a therapeutic anti-inflammatory strategy.

      Major comments:
      The manuscript contains a wealth of data generated from different experimental systems, spanning from synthetic chemistry to preclinical models of gene regulation, requiring backgrounds in both chemistry and biology. The key conclusions are well supported by the data, but it takes a great effort to get to the core results and thus critically read and evaluate their interpretation. Although the complexity and sheer size of the data sets generated lend themselves to a hard read, this is further complicated by the data presentation, which especially in the second part needs to be significantly improved to gain clarity and focus. For ease of referral, specific comments will be addressed related to Figures whenever possible.

      1.1 • Page 15: To measure the ability of TM7nox to disrupt the HuR:mRNA complex in the HTRF assay (Figure 2G) and in the biotin pull-down assay (Figure 5C), a biotinylated probe containing the AU-rich elements of TNFα, a known HuR target, was chosen. Please comment on the rationale, and on whether it could be relevant to reevaluate these parameters post hoc, based on the sequences identified in the HuR targets more susceptible to modulation by the TM compound (listed in Table 1, Figure 5 A/B) and based on the absence of regulation of TNFα (Figures 3D, 4D, 7A) found in the tested systems.

      R1.1 - We thank the reviewer for this observation. We have been using the biotinylated probe containing the AU-rich elements of TNFα as a representative probe for HuR for biochemical assays in several articles (PMID: 29313684, PMID: 26553968, PMID: 23951323). As the reviewer suggests, a posteriori, it is worth reevaluating the representative probe to be used for evaluating the disrupting ability of TMs based on the data we present here. Indeed, we will tackle this problem in our following efforts, as it is a meaningful although time-consuming task which is outside of the scope of this manuscript.

      1.2 • Page 16-18: The description of the RNAseq data shown in Figure 3 should be more centered around the main findings regarding the effect of TMnox that are further pursued in the study: that is (Figure 3B), the 249 downregulated DEGs found to be modulated by TM7nox in the presence of LPS, among which a strong enrichment was observed for categories related to the inflammatory response: cytokines (Il1b, Cxcl10, Il10, Il19, Il33), immune cell chemotaxis (Ccl12, Ccl22, Ccl17, Ccl6) and innate immune response.

      The description of the GO for the remaining data should be shortened to the main points, perhaps reporting what is described in the results for each section of the Venn diagram in a table, while referring to the whole list in the supplements as already done. This could replace Figures 3D, E, which do not add substantially to what is provided in supplementary table 2 and to which they can be added as further visualization.

      R1.2 - We thank the reviewer for this suggestion. Accordingly, we simplified the text, keeping only the description of the genes modulated by TM7nox during LPS treatment. The other information originally there was moved to Supplementary table 2. Revised figures 3E and 3F now focus only on the 249 downregulated genes of this group.

      1.3 • Page 18-19: Description of the results of the RIP-seq shown in Figure 4 set is very confusing: onward from the line "477 HuR-bound transcripts (log2 FC > 3) were also upregulated by LPS at the transcriptional level..." the numbers do not match or reconcile with those shown in the Venn diagram (Fig. 4B) nor with those listed in the figure legend of Figure S8.

      R1.3 - We agree with the reviewer and apologize for having reported the wrong numbers. We clarified this point by thoroughly revising the text. A more precise explanation of the selection procedure for the genes of interest is now reported, and it is further clarified by an added scheme (Fig 4D in the revised manuscript).

      1.4 Moreover, as previously remarked for Figure 3 (and even more for this dataset in which initial description of Venn in 4B is unclear), panel 4E does not add as much to the info provided in Table 1/supplementary Table 1, where they can eventually be added as further data visualization; Instead, Figure S8 displays very informative data merging together the results obtained in RNAseq (Fig. 3) and RIP-Seq (Fig.4) and should be displayed in Figure 4, as in the result section they are indeed presented together.

      R1.4 - We agree with this remark, thus we have removed the old panels 3E in S8C and 4E in S9B, and we now provide the information previously contained in old S8 in the main figure 4E of the revised manuscript.

      1.5 • Page 19-20: Description of the modulation by TM7nox of HuR binding to specific consensus sequences is summarized at the end of the relative paragraph as follows: "TM7nox reshapes HuR binding to target genes in presence of LPS by disrupting the binding of HuR towards target genes containing a lower number of HuR consensus sequences than the average observed in the HuR-bound transcripts". Understanding of these data through the provided text and the Supplementary Figure 9 is very laborious and referring of an entire dataset to a supplementary figure makes it even harder. It would be best to show this as main figure, not supplemental, either adding a Venn diagram as in 3B/4B showing to which dataset each part of the analysis refers, or even more efficaciously, extrapolate a representative gene set for the main analyses showing TM7nox activity in target genes with higher vs lower consensus sequences; same approach for the analysis in Figure 9B, where the effect on a gene with sequence #1 or #10 could be compared with one bearing sequence #3 for example.

      R1.5 - We agree with the reviewer, and we have moved the information from the old S9 to Figure 4C of the revised manuscript. We thoroughly revised this information, also taking into account the request to compare this experiment to the one in Lal et al. NAR 2017 (please see also R2.4). We made an effort to identify a subset of genes that follow a coherent modulation, identifying 82 genes highlighted in Supplementary Table 1. All such genes show increased expression during LPS or LPS/TMnox vs DMSO conditions, and decreased association with HuR during LPS/TMnox vs LPS. As 47 of these, i.e. more than 50%, contain fewer AU-rich sequences than the average (highlighted in Supplementary Table 1), we can consider them a representative gene ensemble modulated in accordance with the presence of AU-rich sequences.

      1.6 • Page 21: Description of the effect of three TMs (TM6, TM7nox and TM7nred) on LPS response in macrophages at the single gene level (Figure 5 and Figure 6): TM6 and TM7nox were used in exps in Fig. 5 A and E, while only TM7nred was used for CXCL10 secretion analysis (fig.5 D and F): please describe the compound choices' rationale (as done for experiments in Figure 6).

      R1.6 – Following the reviewer's suggestion, we now explain our rationale in choosing the small molecules, namely the use of the ones bearing the active quinone species. We have performed additional experiments, and we now report the activity of TM6n, TM7nox, and the control DHTS in decreasing the secretion of Cxcl10 (Figure 5E in the revised manuscript). All compounds behave similarly in this experiment. TM7nred is now used to show its equivalence to TM7nox in Figure 5E and in Figure 6 of the revised manuscript.

      1.7 • Page 21-22: The effect on HuR expression of siRNA silencing and, more importantly, of TMs shown in Figure 6A, first panel, should be visualized at protein level by western blot. This is an important point as for CXCL10 and iL1b there seems to be an additive effect between decreased HuR levels and pharmacological blocking.

      R1.7 - Following the reviewer's suggestion, we now show the protein level as measured by intracellular ELISA, as we were not able to detect the proteins by western blot. The protein level is in general agreement with the gene expression level. We do not observe an additive effect of pharmacological inhibition during HuR silencing; rather, we observe a slight increase in the protein level during HuR silencing. We do not have an explanation for this effect, which may have several causes, for example a nonspecific effect of the TMs when their molecular target HuR is absent.

      1.8 • Page 24: please rephrase the statement 'These observations suggest the utilization of TMs in autoinflammatory and autoimmune diseases' as 'These observations suggest the evaluation of TMs in specific preclinical models for autoinflammatory and autoimmune diseases'.

      R1.8 - We fully agree with the reviewer, and we changed the text in the revised manuscript accordingly.

      1.9 • In the discussion, please include a paragraph with study limitation and possible biases (for example, the choice of RNP-IP without crosslinking has pros and cons).

      R1.9 – Thank you for the good suggestion, we added a paragraph in the discussion which describes study limitations due to the utilization of RNP-IP vs crosslinking.

      The data and the methods are correctly presented for reproducibility, replicates and statistical analysis are adequate.

      Minor comments:

      1.10 • At least in the single gene validation experiments (Fig.5), a negative control (such as recombinant HuR with mutated RRMs in trans-, or ARE-less/non-HuR targetable sequence in cis, or inactive TM) would be advisable.

      R1.10 - We thank the reviewer for the suggestion. Accordingly, we used an ARE-less/non-HuR-targetable gene, RPLP0, for validation.

      1.11 • Figure 6B/C: for immunofluorescence panels, zooming on a smaller number of cells will render more visible HuR and NFkB nucleocytoplasmic shuttling, given that quantification and statistics are provided by imaging software. Negative control stainings (secondary Abs only) should be included.

      R1.11 – In accordance with this suggestion, we now report a higher magnification of the immunofluorescence images. We also report the standard DHTS effect, showing a difference vs TMnox activity which may suggest its impact on NFkB shuttling.

      1.12 • Figure 7A: in the X axis LPS+8n is indicated: is it a typo for LPS+6n or was compound TM8n indeed used?

      R1.12 – Thank you for spotting our mistake; the prodrug 8 described in Figure 1 was used.

      1.13 • In the Methods section please include protocols and materials for immunofluorescence (results shown in Fig. 6B/C).

      R1.13 – As per your suggestion, protocols and materials for immunofluorescence were added to the methods.

      1.14 • There are some typos and repetition in figure legends (legend Figure S9).

      R1.14- Thank you for this, we revised all the figure legends.

      Prior studies are referenced appropriately.

      Review Cross-commenting

      I fully agree with the Reviewer's remarks. I would add that a general concern expressed is that this manuscript in its present form has a readership issue: the first part is for chemistry/pharmacology audience, the second is biology-based. Splitting the work has been suggested; or, the Authors may decide which part is more impactful and present the other in a streamlined version.

      Reviewer #1 (Significance):

      This is a large study reporting progress in the development of synthetic antagonists of HuR function, which is the Authors' well-established line of research. The TM compounds are small molecules with anti-inflammatory effects with strong potential for therapeutic use due to selected inhibition of HuR-mediated upregulation of proinflammatory molecules. The physicochemical and early biological characterization done in this study will allow further testing of their efficacy and of the overall role of HuR-mediated regulation as targetable mechanism in several preclinical human disease models. Targeting of the RNA-binding protein HuR has been tackled as therapeutic approach in cancer, less in chronic immune and inflammatory diseases despite many common mechanisms and mediators. This study could be well received by researchers involved in basic science and drug development (chemistry, biochemistry/biophysics, pharmacology, computational modeling) and biologists/physician scientists interested in testing these compounds in translational research settings where HuR-driven functions can be relevant (cancer, chronic inflammation), though the chemical part would be less accessible to the latter audience. Reviewer's background is in preclinical human models of chronic inflammation with interest in posttranscriptional gene regulation with familiarity with RNAseq and RIPseq dataset and analysis. For the part of the manuscript regarding the synthesis and physicochemical characterization of the TM compounds I requested assistance from a faculty member of the chemistry department with expertise in that field, who did not request any specific clarification or addendum.

      Reviewer #2 (Evidence, reproducibility and clarity):

      In the manuscript entitled "HuR modulation with tanshinone mimics impairs LPS response in murine macrophages" the authors have described the synthesis and application of small molecule mimics of the naturally occurring compound tanshinone, which is known to inhibit the binding of the RBP HuR to a class of its mRNA targets. The authors have shown that the tanshinone mimics (TMs) used by them block the binding of RRM1-2 of HuR to ARE-containing RNA in vitro, and reduce the interaction of HuR with a set of ARE-containing mRNAs in LPS-treated mouse macrophage cells. This reduction of interaction of HuR with some of these mRNAs correlates with the reduction in their level in the cells treated with the TMs, and in the secreted level of their proteins in the serum of animals with LPS-induced peritonitis. Together, the study demonstrates the role of these TMs as modulators of the LPS-induced inflammatory response by blocking the binding of HuR to a subset of LPS-induced inflammatory mRNAs and thereby downregulating their mRNA and protein levels in inflammatory cells. The manuscript describes a comprehensive study, ranging from chemical synthesis of TMs, MD simulations to demonstrate the binding site of the TMs to the cleft formed by the RRM1-linker-RRM2 domains of HuR, which has been shown in crystal structure to be the main binding site of A/U-rich RNA molecules, in vitro studies showing the ability of the TMs to hinder ARE-containing RNA binding to HuR RRM1-2, whole transcriptome analysis to show the effect of the TMs on LPS-induced differential gene expression in murine macrophages, and on HuR binding to target mRNAs, and animal studies to show the effect of the TMs on the level of some inflammatory mediators in the serum of mice with LPS-induced peritonitis. 
      The results are quite convincing and are in line with what is generally known about the effect of HuR on the expression of a large number of genes encoding pro-inflammatory proteins, and the ability of tanshinone derivatives/mimics in inhibiting HuR binding to target mRNAs. The authors put these two pieces of information together in this study and the results are on expected lines. While the results are convincing and quite comprehensive, I would suggest the following in order to substantiate and strengthen the results:

      2.1. The experiments do not have any "positive control", such that the performance of the TMs can be compared with that of a molecule with known HuR binding inhibition activity, such as DHTS. It would be good to have such a comparison, to understand whether the TMs work similar to DHTS or differently, both qualitatively in terms of the mRNA targets which they affect and the extent of their anti-inflammatory activity.

      R2.1- We added DHTS as a comparison to TMs, following the reviewer’s comment. In this model, the net effect of DHTS is partially overlapping with TMs, at least for the parameters that we checked (see Figure 5, 6 and 7), showing some differences in the modulation of NF-kB shuttling upon LPS stimulation. Therefore, we suggest that DHTS and TMs show partially different effects on mRNA targets and in terms of anti-inflammatory activities.

      2.2. It is not clear to me whether the mRNAs which show differential expression in the RNAseq analysis of cells treated with LPS and TMs are exactly the ones which show difference in binding with HuR in the RIPseq analysis in presence of the TMs. This analysis is important for a number of reasons: all the mRNA binding targets of HuR are not affected by HuR at the level of mRNA stability, many are affected at the level of translation, without change in mRNA level. These mRNAs should therefore show change in binding of HuR in the RIPseq assay in presence of TM, but not show change in expression. Secondly, there may be mRNAs which show a change in expression in presence of TMs, but do not show binding of HuR, suggesting pleiotropic roles of the TMs. Therefore, instead of an overall correlation between differential expression and change in HuR binding of mRNAs, a table comparing the RIPseq status of individual mRNAs with that of their differential expression status, in presence and absence of LPS/TMs would be useful, further designating the different groups of mRNAs based on these differential status (change in HuR binding/change in expression, change in HuR binding/no change in expression etc.).

      R2.2 – We tried to rationalize the data following the reviewer’s suggestion; however, we could not fully adopt this strategy due to the complexity of the experimental design. Indeed, we have focused our attention on the effect of TMs during LPS stimulus, which induces a strong transcriptional response, rather than in steady state conditions. This is why we reported the overall correlation of LPS vs DMSO and TM7nox/LPS vs DMSO. Then, we evaluated whether the observed difference in the correlation may be reflected in a change of HuR binding, and we checked the RIPseq status during co-treatment vs LPS. This was the case for a subset of genes that are reported in Supplementary Table 1. Nevertheless, to be fully compliant with the reviewer’s request we now report a Supplementary Table 1 containing the entire gene list, so that the reader can immediately filter out the subsets according only to the comparison TM7nox/LPS vs LPS.
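
      The grouping requested in 2.2 (change in HuR binding/change in expression, change in HuR binding/no change in expression, and so on) could be derived from a full gene list with a short script. The following is only a minimal sketch under stated assumptions, not the authors' pipeline: the field names (`rip_lfc`, `rip_padj`, `rna_lfc`, `rna_padj`), the cut-offs, and the example rows are all hypothetical, and the actual layout of Supplementary Table 1 may differ.

```python
from collections import defaultdict

# Hypothetical cut-offs; the thresholds used in the manuscript are not stated here.
LFC_CUTOFF = 1.0    # minimum absolute log2 fold change
PADJ_CUTOFF = 0.05  # maximum adjusted p-value


def changed(log2fc, padj):
    """Return True if a gene passes both the effect-size and significance cut-offs."""
    return abs(log2fc) >= LFC_CUTOFF and padj <= PADJ_CUTOFF


def classify(rows):
    """Group gene names by (HuR binding changed?, expression changed?).

    Each row is a dict with the hypothetical keys gene, rip_lfc, rip_padj,
    rna_lfc and rna_padj, all referring to one comparison
    (for example TM7nox/LPS vs LPS).
    """
    groups = defaultdict(list)
    for row in rows:
        binding = changed(row["rip_lfc"], row["rip_padj"])
        expression = changed(row["rna_lfc"], row["rna_padj"])
        groups[(binding, expression)].append(row["gene"])
    return dict(groups)


# Made-up rows for illustration only; these are not data from the study.
example = [
    {"gene": "geneA", "rip_lfc": -1.8, "rip_padj": 0.01, "rna_lfc": -1.2, "rna_padj": 0.02},
    {"gene": "geneB", "rip_lfc": -1.5, "rip_padj": 0.03, "rna_lfc": 0.1, "rna_padj": 0.90},
    {"gene": "geneC", "rip_lfc": 0.2, "rip_padj": 0.70, "rna_lfc": -2.0, "rna_padj": 0.001},
    {"gene": "geneD", "rip_lfc": 0.0, "rip_padj": 0.95, "rna_lfc": 0.3, "rna_padj": 0.60},
]

groups = classify(example)
for (binding, expression), genes in sorted(groups.items(), reverse=True):
    print(f"binding changed={binding}, expression changed={expression}: {genes}")
```

      In this sketch, the (True, False) group would correspond to candidate targets regulated at the level of translation rather than mRNA stability, and the (False, True) group to possible HuR-independent effects of the TMs, which are the two cases raised in comment 2.2.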

      2.3. Nuclear/cytoplasmic localization of HuR and NFkb is impossible to discern at the magnification of the immunofluorescence images in Fig 6 B and C. Higher magnification images are required to understand changes in localization.

      R2.3 – In accordance with this suggestion, we now report higher magnification, please see also R1.11. We do not observe any change in nuclear/cytoplasmic localization of HuR and NFkb due to TMs treatment. We rather observe LPS-induced NFkB nuclear accumulation, ActD-induced HuR cytoplasmic shuttling and inhibition of NFkB translocation, during LPS and DHTS treatment.

      2.4. It has been shown that DHTS-I increases the binding of HuR to the mRNAs with longer 3'UTR and with higher density of U/AU-rich elements, whereas it reduces the interaction of HuR with the mRNAs having shorter 3'UTR and with low density of U/AU-rich elements (Lal et al., NAR, 2017). It is not clear if the same is observed in case of the TMs or not, and such a comparative analysis would be useful to address this point.

      R2.4 – We re-analysed the data, checking the density of U/AU rich elements and the length of the 3’UTR of the displaced mRNA as in Lal et al. NAR 2017. Although we could not compare DHTS and TMs within the same biological system, it appears that the rules dictating their mechanism of action are similar.

      I think that the above suggested points are feasible as most of them really involve re-analysis of existing data. Only the suggestion to add DHTS or tanshinone as a positive/comparison control will require experimentation and addition of new data.

      Review Cross-commenting

      I think most of the reviewers' comments coincide in the evaluation of the manuscript. I would especially like to draw attention to the fact that all three reviewers found that the content and form of data presented in the paper is very dense and bogs down the reader and distracts from the overall focus of the manuscript.

      Reviewer #2 (Significance):

      The work described in the manuscript is comprehensive as it ranges from chemical synthesis and in vitro evaluation of the TMs to the characterization of their effects in vivo. Although the effect of tanshinone derivatives on HuR mRNA target binding is known, and the effect of HuR on inflammatory gene expression is also known, the manuscript is significant as it brings these two pieces of information together and tests the effect of these TMs on HuR-mediated regulation of inflammatory gene expression.

      However, the extensiveness of the work also makes it quite dense, and I feel that the focus of the paper is often lost in the details. Also, the text of the manuscript is dense and verbose and uses many irregular grammatical and phraseological usages, e.g. "their modulation or mis-localization lead to the insurgence of complex phenotypes and diseases". It appears to me that it would be ideal if the chemical synthesis, MD simulation studies and in vitro studies are presented in an independent manuscript. Also, that would allow a more exhaustive referencing of the known studies in literature where tanshinone derivatives, and other small molecules, have been used to modulate HuR binding to mRNA targets.

      This work would be of interest to molecular cell biologists in general and RNA biologists in particular, especially those who are studying RNA-protein interactions, and scientists who are interested in drug development using RNA-protein interactions as drug targets.

      My interest in the work lies in my expertise in studying RNA-protein interactions, especially of RNA-binding proteins such as HuR involved in regulating the translation of mRNAs encoded by inflammatory genes. I do not have expertise in chemical synthesis and am therefore not qualified to evaluate the first set of results describing the chemical synthesis of TMs.

      Reviewer #3 (Evidence, reproducibility and clarity):

      In this study, the authors investigated the modulation of HuR by tanshinone mimics and how it mitigates LPS response in murine macrophages. This represents a nice integration of synthetic chemistry, molecular simulations, and in vitro as well as in vivo experimental validations. Overall, this is an interesting study, and will add to the growing interest in HuR in inflammatory-mediated disease. The paper contains a lot of data (actually several papers in one) which may bog down the reader and distract from the overall message. It is suggested that they condense the data and simplify the figures and use more supplemental figures.

      Major Comments:

      3.1. The authors have shown the dose response and cytotoxicity effect of tanshinone mimics; the authors show that TMs affect the overall HuR mRNA but they don't show protein levels.

      R3.1 – In accordance with the reviewer’s comment, we now show also protein levels, as we performed intracellular ELISA (Figure 6 in the revised manuscript); please see also R1.7.

      3.2. It is unclear the timing of certain experiments for LPS vs TMs (whether macrophages were pre-treated with TMs before LPS)-e.g fig 5. The authors should clarify for all experiments as the long-term clinical paradigm would be treatment after inflammation has been established.

      R3.2 – In most experiments TMs are co-administered with LPS. Only in one of the two protocols used for Actinomycin D chase experiment TMs are added after LPS with Act D, as we wanted to discriminate between transcriptional and post-transcriptional effects of TMs (see also R3.3).

      3.3. They have also identified differentially expressed genes which are RNA binding ligands of HuR by RIP-Seq. However, it would be necessary to check whether TM7nox affects the stability of those targets before conclusions can be made that TMs don't inhibit the primary transcriptional response (as mentioned in the Discussion section). Transcriptional effects of HUR chemical inhibition or genetic silencing has been reported previously in other cell systems.

      R3.3 – The reviewer is entirely correct, and we accordingly amended our conclusions. Indeed, TMs have an impact on gene transcription during co-administration with LPS as now suggested by Actinomycin D chase experiments reported in Figure 6C in the revised data and discussion in the manuscript.

      3.4. HuR competes with many RBPs (e.g. TTP and KSRP) as well as microRNAs (including miR-21 and miR-122) to regulate the stability/translational efficiency of several AU-rich transcripts. Does TM binding to HuR lead to increase access of these RBPs/microRNA to the transcripts? This could be addressed by RNA IP with antibodies to TTP or KSRP.

      R3.4 – The reviewer is suggesting an important experiment that requires multiple controls and significant efforts. Indeed, we are planning to study the specificity of TMs, and we prefer to tackle and report this point in a later publication.

      3.5. Another aspect of HuR functioning is the dimerization of HuR. HuR dimerization has been linked with many pathophysiologic conditions. The authors should show the effect of TM7nox on HuR dimerization. In figure 2, for example, there is a suggestion of this in the representative EMSAs where an intermediate shifted band is seen with the addition of TMs. Also, the legend should make clear which ligand is being tested in the modeling (purple structure) versus the RNA probe in the EMSAs. It would help the reader to identify the RNA probe used, e.g. "5′-DY681-labeled ARE RNA probe".

      R3.5 – We agree with the reviewer’s suggestion, and we investigated whether TM7nox influences HuR dimerization in the absence of RNA as performed in PMID 17632515 (Meisner et al 2007). We used MS-444 as a positive control, and we did not observe inhibition of dimerization by TMs at least at the used dosages. Data are reported in Supplementary Figure S6B of the revised manuscript.

      3.6. HuR does alter M2-associated targets like IL-10 and this should be addressed more directly. Fig. 3 suggests that IL-10 is reduced by TM7nox but the variance is so high that the statistics show NS. HuR regulates IL-10 in other cellular contexts and this would be important to determine for TM7 in the long run.

      R3.6 – Although we acknowledge its relevance, we did not investigate this gene directly. The variance becomes significant in the RIP-seq experiment (Supplementary Figure 9D). Therefore, we confirm that Il10 is among the 47/82 genes that show the same behavior as Cxcl10, Il1b and many other cytokines such as Ccl12, Ccl7, Fas, Il1a, Il33; in conclusion, it is among the restricted list of genes modulated by TM7nox according to the presence of fewer AU-rich sequences than average.

      3.7. Fig. 5: 10 µM of the TM used here produces significant toxicity to BMDM according to fig. S7. This may distort the ELISA/qPCR results as the RNA levels may be lower due to toxicity. The authors should address this or use a lower dose that is not toxic.

      R3.7 – The viability curves mentioned by the reviewer are run at 24-48 hours, and no toxic effects have been observed using TMs after 6 hours of treatment.

      3.8. In Fig 6 the immunocytochemistry is difficult to interpret as the magnification is too small to appreciate the N/C ratio. The investigators should provide higher magnification. A nuclear/cytoplasmic western blot is recommended as well to confirm that TM does not impair HuR shuttling (or NFkb shifts). This is an important area as there is a suggestion that TM blocks dimerization (Fig. 2) which does impair shuttling. Also, the modeling data suggest that TMs appear to sit in a similar groove between RRM1 and 2 as other HuR inhibitors that block shuttling.

      R3.8 – This point has also been raised by other reviewers, and we replied in R2.3 and R1.11. We understand the reviewer’s points, and we agree with the observation. However, we do not observe a change in HuR nuclear/cytoplasmic shuttling by immunofluorescence, nor do we see an effect on HuR dimerization.

      3.9. IL-6 does not appear to be affected by TM treatment after LPS stimulation in vivo or in vitro -either mRNA or protein. However, DHTS did suppress this cytokine. The authors should address this discrepancy. Likewise, TNFa data here show no change and possibly a trend upward (Fig 3,4 and 7). This is in contrast to the effect of DHTS on TNF-a reported by the authors in a prior publication (D'Agnistino et al). The authors should address this discrepancy. There are reports suggesting that HuR is a translational inhibitor of TNFa in macrophages--Katsanou V, Papadaki O, Milatos S, Blackshear PJ, Anderson P, Kollias G, Kontoyiannis DL. HuR as a negative posttranscriptional modulator in inflammation (PMID 16168373)

      R3.9 – The reviewer’s comments are correct, but we do not have an explanation for this. In theory, there could be several possibilities such as 1) a DHTS effect on NFkB, 2) the fact that previously mentioned experiments with DHTS were not run with the same cells, or at the same doses and timing, as our current TM experiments, and 3) that HuR silencing only partially overlaps with TM treatment, also in our recent experiments. Irrespective of specific transcripts, we think we have shown that TMs’ mechanism of action involves the modulation of HuR binding at the transcriptional level in our experimental condition.

      Review Cross-commenting

      I think the other reviewers' comments are pertinent and well thought out. I have no further suggestions.

      Reviewer #3 (Significance):

      The characterization of novel HuR inhibitors derived from tanshinones is an important advancement to the field which is rapidly growing. This complements other work with small molecule inhibitors and will allow the field to better understand the role of HuR in different disease contexts (cancer, neuroinflammatory etc) and cell types (e.g. macrophages, microglia, astrocytes). The ultimate significance is the clinical application of the inhibitors and the more options the better, particularly if there are toxic effects of some. My expertise is in post-transcriptional regulation of cytokines and we have already characterized some potent effects in cancer.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      In this study, the authors investigated the modulation of HuR by tanshinone mimics and how it mitigates LPS response in murine macrophages. This represents a nice integration of synthetic chemistry, molecular simulations, and in vitro as well as in vivo experimental validations. Overall, this is an interesting study, and will add to the growing interest in HuR in inflammatory-mediated disease. The paper contains a lot of data (actually several papers in one) which may bog down the reader and distract from the overall message. It is suggested that they condense the data and simplify the figures and use more supplemental figures.

      Major Comments:

      1. The authors have shown the dose response and cytotoxicity effect of tanshinone mimics; The authors show that TMs affect the overall HuR mRNA but they don't show protein levels.
      2. It is unclear the timing of certain experiments for LPS vs TMs (whether macrophages were pre-treated with TMs before LPS)-e.g fig 5. The authors should clarify for all experiments as the long-term clinical paradigm would be treatment after inflammation has been established.
      3. They have also identified differentially expressed genes which are RNA binding ligands of HuR by RIP-Seq. However, it would be necessary to check whether TM7nox affects the stability of those targets before conclusions can be made that TMs don't inhibit the primary transcriptional response (as mentioned in the Discussion section). Transcriptional effects of HuR chemical inhibition or genetic silencing have been reported previously in other cell systems.
      4. HuR competes with many RBPs (e.g. TTP and KSRP) as well as microRNAs (including miR-21 and miR-122) to regulate the stability/translational efficiency of several AU-rich transcripts. Does TM binding to HuR lead to increase access of these RBPs/microRNA to the transcripts? This could be addressed by RNA IP with antibodies to TTP or KSRP.
      5. Another aspect of HuR functioning is the dimerization of HuR. HuR dimerization has been linked with many pathophysiologic conditions. The authors should show the effect of TM7nox on HuR dimerization. In figure 2, for example, there is a suggestion of this in the representative EMSAs where an intermediate shifted band is seen with the addition of TMs. Also, the legend should make clear which ligand is being tested in the modeling (purple structure) versus the RNA probe in the EMSAs. It would help the reader to identify the RNA probe used, e.g. "5′-DY681-labeled ARE RNA probe".
      6. HuR does alter M2-associated targets like IL-10 and this should be addressed more directly. Fig. 3 suggests that IL-10 is reduced by TM7nox but the variance is so high that the statistics show NS. HuR regulates IL-10 in other cellular contexts and this would be important to determine for TM7 in the long run.
      7. Fig. 5: 10 µM of the TM used here produces significant toxicity to BMDM according to fig. S7. This may distort the ELISA/qPCR results as the RNA levels may be lower due to toxicity. The authors should address this or use a lower dose that is not toxic.
      8. In Fig 6 the immunocytochemistry is difficult to interpret as the magnification is too small to appreciate the N/C ratio. The investigators should provide higher magnification and provide examples of ActD, LPS and LPS + drug. A nuclear/cytoplasmic western blot is recommended as well to confirm that TM does not impair HuR shuttling (or NFkb shifts). This is an important area as there is a suggestion that TM blocks dimerization (Fig. 2) which does impair shuttling. Also, the modeling data suggest that TMs appear to sit in a similar groove between RRM1 and 2 as other HuR inhibitors that block shuttling.
      9. IL-6 does not appear to be affected by TM treatment after LPS stimulation in vivo or in vitro -either mRNA or protein. However, DHTS did suppress this cytokine. The authors should address this discrepancy. Likewise, TNFa data here show no change and possibly a trend upward (Fig 3,4 and 7). This is in contrast to the effect of DHTS on TNF-a reported by the authors in a prior publication (D'Agnistino et al). The authors should address this discrepancy. There are reports suggesting that HuR is a translational inhibitor of TNFa in macrophages--Katsanou V, Papadaki O, Milatos S, Blackshear PJ, Anderson P, Kollias G, Kontoyiannis DL. HuR as a negative posttranscriptional modulator in inflammation (PMID 16168373)

      Review Cross-commenting

      I think the other reviewers' comments are pertinent and well thought out. I have no further suggestions.

      Significance

      The characterization of novel HuR inhibitors derived from tanshinones is an important advancement to the field which is rapidly growing. This complements other work with small molecule inhibitors and will allow the field to better understand the role of HuR in different disease contexts (cancer, neuroinflammatory etc) and cell types (e.g. macrophages, microglia, astrocytes). The ultimate significance is the clinical application of the inhibitors and the more options the better, particularly if there are toxic effects of some. My expertise is in post-transcriptional regulation of cytokines and we have already characterized some potent effects in cancer.

    1. As citizens, we need not think that the conflict between ideas and ideologies is bad.

      This sentence supports the main argument. We are often pitted against one another when we have different political views. If leaders respect and accept their opponents' ideas, we may also extend that same respect to fellow community members and leaders.

    1. Notice that attempting to summarize each claim can actually take more space than the original text itself if we are summarizing in detail and trying to be very precise about what the text claims and implies. Of course, we won’t want to or need to do this in such detail for every paragraph of every reading we are assigned to write about. We can resort to it when the argument gets harder to follow or when it’s especially important to be precise.

      Would it be every assumption that we may think is a fact, but ending it with a question in the sentence that we gave? Such as the example that is provided. It gives a fact, but then asks questions about it. Either it could be a claim of policy, fact, or value.

    1. Holidays moved from the realm of private affairs to commercial affairs, and such a move may have contributed to the depersonalization of these occasions.

      This makes me think about stores. The seasonal time change of consumption marked by holiday products. The target collectible birds for every season. The fastidious change of these products as soon as one holiday is over. We have to gear up and buy buy buy for the next small break. Churning out new products and accumulating them as the days and years go by. A cycle of consumption.

    1. It was as if there was a switch in my brain that regulated how I saw the world: It was off when I walked around alone, and it was on when I was out in the world with my parents. It was only when I visited that medical center with my mom that I realized the switch was the same one I flipped on whenever I stepped on my skateboard.

      Why now? I don't know much about disability and design, but this may be a metaphor for how we think about people with opposing viewpoints. Often people with different opinions are looking at the same thing in the same way, but with a different base experience.

    1. Author Response

      Reviewer #1 (Public Review):

      This study investigates the psychological and neurochemical mechanisms of pain relief. To this end, 30 healthy human volunteers participated in an experiment in which tonic heat pain was applied. Three different trial types were applied. In test trials, the volunteers played a wheel of fortune game in which wins and losses resulted in decreases and increases of the stimulation temperature, respectively. In control trials, the same stimuli were applied but the volunteers did not play the game so that stimulation decreases and increases were passively perceived. In neutral trials, no changes of stimulation temperature occurred. The experiment was performed in three conditions in which either a placebo, or a dopamine agonist or an opioid antagonist was applied before stimulations. The results show that controllability, surprise, and novelty-seeking modulate the perception of pain relief. Moreover, these modulations are influenced by the dopaminergic but not the opioidergic manipulation.

      Strengths

      • The mechanisms of pain relief are a timely and relevant basic science topic with potential clinical implications.

      • The experimental paradigm is innovative and well-designed.

      • The analysis includes advanced assessments of reinforcement learning.

      Weaknesses

      • There is no direct evidence that the opioidergic manipulation has been effective. This weakens the negative findings in the opioid condition and should be directly demonstrated or at least critically discussed.

      We agree that we cannot provide direct evidence on the effectiveness of the opioidergic manipulation in our study. However, previous literature strongly suggests that a dose of 50 mg naltrexone (p.o.) is effective in blocking 𝜇-opioid receptors in humans. Using positron emission tomography, Weerts et al. (2013) found a blockage of 𝜇-opioid receptors of more than 90% with 50 mg naltrexone (p.o.), albeit when given repeatedly 4 days in a row. In addition, convincing effects on behavioral functions have been reported with comparable doses that support the efficacy of the opioidergic manipulation. For example, Chelnokova et al. (2014) found attenuating effects of 50 mg naltrexone (p.o.) on wanting as well as liking of social rewards, implicating the involvement of endogenous opioids in the processing of rewarding stimuli. The same dose was also found to attenuate reward directed effort exerted in a value-based decision-making task (Eikemo et al., 2017). Moreover, 50 mg of naltrexone (p.o.) has been shown to reduce endogenous pain inhibition induced by conditioned pain modulation (King et al., 2013) and to reduce the perceived pleasantness of pain relief (Sirucek et al., 2021). Thus, based on the available literature we assume the effectiveness of our opioidergic manipulation. A corresponding reasoning including a note of caution on the lack of a direct manipulation check of the opioidergic manipulation can be found in the manuscript in the Discussion:

      “The doses and methods used here are comparable to those used in other contexts which have identified opioidergic effects. Using positron emission tomography, Weerts et al. (2013) found a blockage of opioid receptors of more than 90% by 50 mg of naltrexone (p.o.) in humans given repeatedly over 4 days. In addition, effects on behavioral functions have been reported with comparable doses that support the efficacy of the opioidergic manipulation. Chelnokova et al. (2014) found attenuating effects of 50 mg naltrexone (p.o.) on wanting as well as liking of social rewards, implicating the involvement of endogenous opioids in the processing of rewarding stimuli. The same dose was also found to attenuate reward directed effort exerted in a value-based decision-making task (Eikemo et al., 2017). Moreover, 50 mg of naltrexone (p.o.) have been shown to reduce endogenous pain inhibition induced by conditioned pain modulation (King et al., 2013). Thus, based on the literature we assume that the opioidergic manipulation was effective in this study, although we do not have a direct manipulation check of this pharmacological manipulation. Despite its effectiveness in blocking endogenous opioid receptors, the effect of naltrexone on reward responses was found to be small (Rabiner et al., 2011). Hence, a lack of power may have limited our chances to find such effects in the present study.”

      • The negative findings are exclusively based on the absence of positive findings using frequentist statistics. Bayesian statistics could strengthen the negative findings which are essential for the key message of the paper.

      We agree with the reviewers that the power may not have been sufficient to detect potentially small effects of the pharmacological manipulations. The power calculation was based on the design and the medium effect size found in a previous study using a comparable experimental procedure for assessing pain-reward interactions (Becker et al., 2015). To acknowledge this weakness, we clarified in the manuscript the description of the a priori sample size calculation as follows:

      “The power estimation was based on the design and the finding of a medium effect size in a previous study using a comparable version of the wheel of fortune game without pharmacological interventions (Becker et al., 2015). The a priori sample size calculation for an 80% chance to detect such an effect at a significance level of 𝛼=0.05 yielded a sample size of 28 participants (estimation performed using G*Power version 3.1 (Faul et al., 2007) for a repeated-measures ANOVA with a three-level within-subject factor).”

      Further, we did not aim to claim that endogenous opioids do not affect the perception of pain relief. Our phrasing in describing the results was in several instances too bold. The aim of the pharmacological manipulations was to investigate effects of dopamine and endogenous opioids on endogenous modulation of perceived intensity of pain relief. Here, we expected dopamine to enhance such endogenous modulation and naltrexone to reduce this modulation. The higher average pain modulation under naltrexone compared to placebo found in VAS ratings (naltrexone: -10.09, placebo: -7.31, see Table 1) suggests an increase in pain modulation by naltrexone, which is the opposite of what we had expected (see comment #11), although this difference did not reach statistical significance. Therefore, we concluded that we have no evidence to support our hypothesis of reduced endogenous modulation of pain relief by naltrexone. We do not want to claim that there are no effects of endogenous opioids on pain modulation. Although Bayesian statistics might be used to support such an interpretation, we think this might be misleading in our context here due to the considerations on the lack of power (which also affects null-hypothesis testing in Bayesian statistics) and the lack of a direct manipulation check mentioned above. Since we expected opposite effects of levodopa and naltrexone on pain modulation, we did not intend to compare these effects directly to avoid a distortion of the results. According to our hypotheses, we expected to see increased modulation of pain relief with enhanced dopamine availability and decreased modulation of pain relief with blocking of opioid receptors (see also comment #11). However, we had no a priori assumptions on potential differences in the absolute changes induced by the drug manipulations. Based on these considerations, we did not include further direct comparisons of the effects of both drugs. Rather, we carefully went through the manuscript to tone down the descriptions and interpretations of our null findings and adjusted the respective section of the discussion to better reflect this interpretation.

      • The effects were found in one (pain intensity ratings) but not the other (behaviorally assessed pain perception) outcome measure. This weakens the findings and should at least be critically discussed.

      We thank the reviewers for highlighting this important aspect. We have considered the two outcome measures as indicative of two different aspects or dimensions of the pain experience, based also on previous results in the literature. Within our procedure, the ratings indicate the momentary perception of the stimulus intensity after phasic changes in nociceptive input (outcomes), while the behavioral measure indicates perceptual within-trial sensitization or habituation in response to the tonic stimulation within each trial. Supporting the assumption of such two different aspects, it has been shown before that pain intensity ratings and behavioral discrimination measures can dissociate (Hölzl et al., 2005). In line with the assumption that both outcome measures assess different aspects of the pain experience, a differential effect of controllability on these two outcome measures is conceivable. Similarly, Becker et al. (2015), using a very similar experimental paradigm, found endogenous pain facilitation in the losing condition of the wheel of fortune game only in pain ratings but not in the behavioral outcome measure, while they found endogenous inhibition in both measures. Compared to Becker et al. (2015), we implemented here smaller changes in stimulation intensity as outcomes in the wheel of fortune game (-3°C vs -7°C for win trials, +1°C vs +5°C for lose trials), potentially resulting in the differential effects here. Nevertheless, we agree that this reasoning needs a more explicit discussion in the manuscript and we added the following sentences to the Discussion section:

      “Although we did not assess the affective component of the relief experience, we implemented two outcome measures that are assumed to capture independent aspects of the pain experience: VAS ratings indicate perception of phasic changes (outcomes), while the behavioral measure indicates perceptual within-trial sensitization or habituation in response to the tonic stimulation within each trial. We found enhanced endogenous modulation by controllability and unpredictability in the VAS ratings, in line with the view that endogenous modulation enhances behaviorally relevant information. In contrast, the within-trial sensitization did not differ between the active and passive conditions under placebo. In contrast, in a previous study using a similar experimental paradigm Becker et al. (2015) found a reduction of within-trial sensitization after pain relief outcomes by controllability. Compared to this study, we implemented here smaller changes in stimulation intensity as outcomes in the wheel of fortune (-3 °C vs -7 °C for pain relief), potentially explaining the differential results.”

      • The instructions given to the participants should be specified. Moreover, it is essential to demonstrate that the instructions do not yield differences in other factors than controllability (e.g., arousal, distraction) between test and control trials. Otherwise, the main interpretation of a controllability effect is substantially weakened.

      Thanks for pointing out that specific information on instructions given to the participants was missing. We agree that factors other than controllability would confound the interpretation of differences between test and control trials. We aimed to minimize nonspecific effects of arousal and/or distraction while still giving all needed information with our instructions (see below). In addition, control and test trials were kept as similar as possible. In order to check for unspecific effects of arousal and/or distraction, we also included lose trials in the game as an additional control condition. To clarify the participants’ instructions, we added the following paragraph to the Materials and methods section: “The participants were instructed that there were two types of trials: trials in which they could choose a color to bet on the outcome of the wheel of fortune and trials in which they had no choice. Specifically, they were told that in the first type of trials they could use the left and right mouse button, respectively, to choose between the pink and blue section of the wheel of fortune. Participants were further instructed that if the wheel lands on the color they had chosen they will win, i.e. that the stimulation temperature will decrease, while if the wheel lands on the other color, they will lose, i.e. that the stimulation temperature will increase. For the second type of trials, participants were instructed that they could not choose a color, but were to press a black button, and that after the wheel stopped spinning the temperature would by chance either increase, decrease, or remain constant.”

      In general, both arousal and distraction can be assumed to affect pain perception. If the active condition in the wheel of fortune resulted in higher arousal and/or distraction this should result in comparable effects on intensity ratings in both the win and lose outcomes compared to the passive condition. In contrast, controllability is expected to have opposite effects on pain perception in win and lose trials (decreased pain perception after winning and increased pain perception after losing in the active compared to the passive condition). These opposite effects of controllability are tested by the interaction ‘outcome × trial type’ when fitting separate models for each drug condition, which should be zero if unspecific effects of arousal and/or distraction predominated. Instead, we found a significant interaction in these models, confirming opposing effects of controllability in win and lose outcomes and contradicting such unspecific effects. We added this reasoning, marked in red here, to the Results section to better highlight this line of reasoning, as follows:

      “To test whether playing the wheel of fortune induced endogenous pain inhibition by gaining pain relief during active (controllable) decision-making, a test condition in which participants actively engaged in the game and ‘won’ relief of a tonic thermal pain stimulus in the game was compared to a control condition with passive receipt of the same outcomes (Figure 1). As a further comparator the game included an opposite (‘lose’) condition in which participants received increases of the thermal stimulation as punishment. This active loss condition was also matched by a passive condition involving receipt of the same course of nociceptive input. Comparing the effects of active versus passive trials between the pain relief and the pain increase condition (interaction ‘outcome × trial type’) allowed us to test for unspecific effects such as arousal and/or distraction. If effects seen in the active compared to the passive condition were due to such unspecific effects, then actively engaging in the game should affect comparably pain in both win and lose trials. In contrast, if the effects were due to increased controllability, pain inhibition should occur in win trials and pain facilitation in lose trials.”

      • The blinding assessment does not rule out that the volunteers perceived the difference between placebo on the one hand and levodopa/naltrexone on the other hand. It is essential to directly show that the participants were not aware of this difference.

      We based our assessment of blinding on the fact that for none of the drug conditions the frequency of guessing correctly which drug was ingested was above chance (see Results section, page 8, lines 201ff). In addition, the frequency of side effects reported by the participants did not differ between the three drug conditions, supporting this notion indirectly. However, we agree with the reviewer that this does not rule out completely that participants may have perceived a difference between the placebo and the levodopa/naltrexone conditions. We ran additional analyses to test whether participants were more likely to answer correctly that they had ingested an active drug and whether they were more likely to report side effects in the active drug conditions compared to the placebo condition. In 7 out of 28 placebo sessions (25%) the participants assumed incorrectly that they had ingested one of the active drugs. In 12 out of 55 drug sessions (21.8%) the participants assumed correctly that they had ingested one of the active drugs. These frequencies did not differ between placebo sessions on the one hand and the levodopa and naltrexone active drug sessions on the other hand (𝜒²(1) = 0.11, p = 0.737). In 9 out of 28 placebo sessions (32.1%) and in 23 out of 55 drug sessions (41.8%) participants reported being tired at the end of the session. The frequency of reporting tiredness did not significantly differ between placebo sessions on the one hand and drug sessions on the other hand (𝜒²(1) = 1.06, p = 0.304). No other side effects were reported. We added the following information, marked in red here, to the Results section:

      “In 32 out of 83 experimental sessions subjects reported tiredness at the end of the session. However, the frequency did not significantly differ between the three drug conditions (𝜒²(2) = 2.17, p = 0.337) or between the placebo condition compared to the levodopa and naltrexone condition (𝜒²(1) = 1.06, p = 0.304). No other side effects were reported. To ensure that participants were kept blinded throughout the testing, they were asked to report at the end of each testing session whether they thought they received levodopa, naltrexone, placebo, or did not know. In 43 out of 83 sessions that were included in the analysis (52%), participants reported that they did not know which drug they received. In 12 out of 28 sessions (43%), participants were correct in assuming that they had ingested the placebo, in 6 out of 27 sessions (22%) levodopa, and in 2 out of 28 sessions (7%) naltrexone. The amount of correct assumptions differed between the drug conditions (𝜒²(2) = 7.70, p = 0.021). However, posthoc tests revealed that neither in the levodopa nor in the naltrexone condition participants guessed the correct pharmacological manipulation significantly above chance level (p’s > 0.997) and the amount of correct assumptions did not differ significantly between placebo compared to levodopa and naltrexone sessions (𝜒²(1) = 0.11, p = 0.737), suggesting that the blinding was successful.”
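
To make the blinding comparison above concrete, the following minimal sketch computes a Pearson chi-square statistic (df = 1, no continuity correction) for the 2 × 2 table of sessions in which participants believed they had taken an active drug. The response does not say which test procedure was used, so the choice of test, the function name `pearson_chi2_2x2`, and the use of 55 active-drug sessions as the denominator (the value implied by the reported 21.8% and used in the tiredness comparison) are assumptions.

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (df = 1, no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, p_value)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For df = 1 the chi-square survival function reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: placebo sessions, active-drug sessions.
# Columns: believed "active drug", did not believe "active drug".
placebo_yes, placebo_total = 7, 28
drug_yes, drug_total = 12, 55  # denominator of 55 is an assumption (see text)

chi2, p = pearson_chi2_2x2(placebo_yes, placebo_total - placebo_yes,
                           drug_yes, drug_total - drug_yes)
print(f"chi2(1) = {chi2:.2f}, p = {p:.3f}")
```

With these counts the statistic rounds to 0.11, in line with the reported value. The p-value from this simple test can differ slightly from the reported one, depending on the procedure the authors actually used.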

      • The effects of novelty seeking have been assessed in the placebo and the levodopa but not in the naltrexone conditions. This should be explained. Assessing novelty seeking effects also in the naltrexone condition might represent a helpful control condition supporting the specificity of the effects in the naltrexone condition.

      We thank the reviewer for this interesting suggestion. Indeed, we did not report the association of pain modulation with novelty seeking in the naltrexone condition, because we did not have an a priori hypothesis for this relationship. We now included correlations for all three drug conditions, testing if higher novelty seeking was associated with greater perceptual modulation in the active vs. passive condition. In line with comment 3, we applied a correction for multiple comparisons here (Bonferroni-Holm correction). This correction caused the correlation in the placebo condition to be no longer significant with an adjusted p-value of 0.073 (r = -0.412), while the correlation stays significant in the levodopa condition (r = -0.551, p = 0.013). Because of a reasonable effect size of the correlation under placebo (i.e. r = -0.412), we still report this correlation to highlight the increase under levodopa, while emphasizing that this correlation is not significant. We carefully toned down the interpretation of this correlation to reflect the change in significance with the correction for multiple testing.

      We added the following information, marked in red here, in the Results section:

      “Previous data suggest that endogenous pain inhibition induced by actively winning pain relief is associated with a novelty seeking personality trait: greater individual novelty seeking is associated with greater relief perception (pain inhibition) induced by winning pain relief (Becker et al., 2015). Similar to these results, we found here that endogenous pain modulation, assessed using self-reported pain intensity, induced by winning was associated with participants’ scores on novelty seeking in the NISS questionnaire (Need Inventory of Sensation Seeking; Roth & Hammelstein, 2012; subscale ‘need for stimulation’ (NS)), although this correlation failed to reach statistical significance after correction for multiple comparisons using Bonferroni-Holm method (r = -0.412, p = 0.073). A significant association between novelty seeking and endogenous pain modulation was found in the levodopa condition (r = -0.551, p = 0.013). More importantly, the higher a participant’s novelty seeking score in the NISS questionnaire, the greater the levodopa-related endogenous pain modulation when winning compared to placebo (NISS NS: r = -0.483, p = 0.034; Figure 7). In contrast, higher novelty seeking scores were not correlated with stronger pain modulation induced by winning in the naltrexone condition (r = 0.153, p = 0.381) and the naltrexone induced change in pain modulation showed no significant association with novelty seeking (r = 0.239, p = 0.499). Pain modulation after losing was not associated with novelty seeking in placebo (r = 0.083, p = 0.866), levodopa (r = -0.164, p = 0.783), or naltrexone (r = 0.405, p = 0.133).

      No significant correlations with NISS novelty seeking score were found for behaviorally assessed pain modulation in the placebo, levodopa and naltrexone conditions during pain relief or pain increase (|r|’s < 0.35, p’s > 0.238). Similarly, the difference in pain modulation during pain relief or pain increase between the levodopa and the placebo condition and between the naltrexone and the placebo condition did also not correlate with novelty seeking (|r|’s < 0.22, p’s > 0.576).”

      We also edited the interpretation of the correlation in the Discussion:

      “Overall, all three predictions were largely borne out by the data: relief perception as measured by VAS ratings was enhanced by controllability, unpredictability and showed a medium sized - although not significant - association with the individual novelty-seeking tendency,”
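
The Bonferroni-Holm correction mentioned in this response can be sketched as follows. This is a minimal illustration of the general step-down procedure, not the authors' analysis code. The function name `holm_adjust` and the raw p-values are hypothetical, since the response reports only adjusted values.

```python
def holm_adjust(p_values):
    """Return Holm (Bonferroni-Holm) adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw p-values for three correlations (NOT the study's values).
raw = [0.004, 0.03, 0.20]
adjusted = holm_adjust(raw)
print([round(p, 3) for p in adjusted])
```

The smallest p-value is multiplied by the number of tests and each subsequent one by a smaller factor. A correlation with a raw p-value just under 0.05 can therefore lose significance after correction, as described above for the placebo condition.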

      • The writing of the manuscript is sometimes difficult to follow and should be simplified for a general readership. Sections on the information-processing account of endogenous modulation in the introduction (lines 78-93), unpredictability and endogenous pain modulation in the results (lines 278-331) are quite extensive and add comparatively little to the main findings. These sections might be shortened and simplified substantially. Moreover, providing a clearer structure for the discussion by adding subheadings might be helpful.

      We have reworked the manuscript to make it easier to follow. Specifically, we reworked the Introduction section to simplify it and to make it more concise. Further, we also shortened the extensive descriptions of modeling procedures that are not central for understanding the main findings. We think that these changes make it easier to follow the manuscript and our line of arguments, and to understand the applied analysis strategies.

      • Effect sizes are generally small. This should be acknowledged and critically discussed. Moreover, effect sizes are given in the figures but not in the text. They should be included to the text or at least explicitly referred to in the text.

      We agree that the effect sizes we report appear generally small. Importantly, the effect sizes were calculated by dividing differences in marginal means by the pooled standard deviation of the residuals and the random effects to obtain an estimate of the effect size of the underlying population rather than only for our sample. This procedure was used for the purpose of achieving more generalizable estimates. Due to considerable variance between subjects in our sample, this procedure resulted in comparatively small effect sizes. Nevertheless, we think this calculation of effect sizes results in more informative values because they can be viewed as estimates of population effects. We added specific information on the calculation of the effect sizes and a brief explanation that this procedure results in comparatively small effect size estimates to the Materials and methods and to the Results section (see below). In addition, we included standardized effect sizes whenever we report the respective post-hoc comparisons in the Results section.

      “Effect sizes were calculated by dividing the difference in marginal means by the pooled standard deviation of the random effects and the residuals providing an estimate for the underlying population (Hedges, 2007).” (Materials and methods section)

      “We used post-hoc comparisons to test direction and significance of differences in either outcome condition and report standardized effect sizes for these differences. Note that all reported effect sizes account for random variation within the sample, providing an estimate for the underlying population; due to considerable variance between participants in the present study, this results in comparatively small effect sizes.” (Results section)
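
The effect size calculation described above can be illustrated with a short sketch. It assumes that the pooled standard deviation is the square root of the summed random-effect and residual variances, which is one common reading of this approach. The response does not give the exact formula, and the function name `standardized_effect` and all numbers are hypothetical.

```python
import math

def standardized_effect(mean_a, mean_b, random_effect_vars, residual_var):
    """Difference in marginal means divided by a pooled SD built from the
    random-effect variances and the residual variance of a mixed model."""
    pooled_sd = math.sqrt(sum(random_effect_vars) + residual_var)
    return (mean_a - mean_b) / pooled_sd

# Hypothetical values for illustration only (not taken from the study):
# marginal means in two conditions, a between-participant intercept
# variance, and the residual variance.
d_population = standardized_effect(40.0, 45.0, [300.0], 100.0)
# Same difference standardized by the residual SD alone, for comparison.
d_residual_only = standardized_effect(40.0, 45.0, [], 100.0)
print(d_population, d_residual_only)
```

In this toy case, including the between-participant variance in the denominator halves the standardized effect (-0.25 instead of -0.5). This is the mechanism behind the comparatively small effect sizes described above.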

      • The directions of dopamine and opioid effects on pain relief should be discussed.

      We amended our explanation of the hypothesis on the expected drug effects. As outlined there, we indeed expected opposite effects of levodopa and naltrexone on endogenous pain modulation in the active vs. the passive condition of the wheel of fortune.

      Reviewer #2 (Public Review):

      This study used the tonic heat stimulation combined with the probabilistic relief-seeking paradigm (which is a wheel of fortune gambling task) to manipulate the level of controllability and predictability of pain on 30 healthy participants. The authors focused on the influence of controllability and unpredictability on pain relief using pain reports and computational models and examined the involvement of dopamine and opioids in those effects. For that, the authors conducted the three-day experiments, which involved placebo, levodopa (dopamine precursor), and naltrexone (opioid receptor antagonist) administration on separate days. Lastly, the authors examined the relationship between dopamine-induced pain relief and novelty-seeking traits.

      This is a strong and well-performed study on an important topic. The paper is well-written. I really enjoyed reading the introduction and discussion and learned a lot. Below, I have a few minor comments.

      First, given that the Results section comes before the Methods section, it would be helpful to include some method and experimental design-related information crucial for the understanding of the results in the Results section. For example, how long was the thermal stimulus? What was the baseline temperature? etc. Maybe this information can be included in the caption of Figure 1.

      We thank the reviewer for this helpful suggestion. We agree that due to the order of the manuscript sections, more information on experimental design and the statistical analysis strategies should be included in the results section. Accordingly, we included more detailed information on the analysis strategies in the Results section (please see responses to comments #5 & #9). In addition, we added more detailed information on the experimental design and information such as the duration of the stimuli and the baseline temperature, marked in red below, to the caption of Figure 1 (Results section).

      “Figure 1: Time line of one trial with active decision-making (test trials) of the wheel of fortune game. Experimental pain was implemented using contact heat stimulation on capsaicin sensitized skin on the forearm. In each trial, the temperature increased from a baseline of 30 °C to a predetermined stimulation intensity perceived as moderately painful. In each testing session, one of the two colors (pink and blue) of the wheel was associated with a higher chance to win pain relief (counterbalanced across subjects and drug conditions). Pain relief (win) as outcome of the wheel of fortune game (depicted in green) and pain increase (loss; depicted in red) were implemented as phasic changes in stimulation intensity offsetting from the tonic painful stimulation. Based on a probabilistic reward schedule for these outcomes, participants could learn which color was associated with a better chance to win pain relief. In passive control trials and neutral trials participants did not play the game, but had to press a black button after which the wheel started spinning and landed on a random position with no pointer on the wheel. Trials with active decision-making were matched by passive control trials without decision making but the same nociceptive input (control trials), resulting in the same number of pain increase and pain decrease trials as in the active condition. In neutral trials the temperature did not change during the outcome interval of the wheel. Two outcome measures were implemented in all trial types: i) after the phasic changes during the outcome phase participants rated the perceived momentary intensity of the stimulation on a visual analogue scale (‘VAS intensity’); ii) after this rating, participants had to adjust the temperature to match the sensation they had memorized at the beginning of the trial, i.e. the initial perception of the tonic stimulation intensity (‘self-adjustment of temperature’). This perceptual discrimination task served as a behavioral assessment of pain sensitization and habituation across the course of one trial. One trial lasted approximately 30 s, phasic offsets occurred after approximately 10 s of tonic pain stimulation. Adapted from Becker et al. (2015).”

      Second, it would be helpful if the authors could provide their prior hypotheses on the drug effects. The goal of using these drugs could be a little confusing, given that levodopa is a precursor of dopamine, whereas naltrexone is an opioid antagonist, i.e., the effects on the target neurotransmitters seem to be opposite. I therefore wondered whether the authors expected to see opposite effects, e.g., levodopa enhances pain relief, while naltrexone inhibits pain relief, or similar effects, e.g., both enhance pain relief. Clarifying the expected direction of the effects would be helpful for novice readers.

      We thank the reviewer for pointing out that information on the expected drug effects should be explained in more detail. Indeed, we expected opposite effects of levodopa and naltrexone with respect to the effect of controllability on pain relief. Levodopa, as a precursor of dopamine, enhances dopamine availability and thus, phasic release of dopamine in response to events, for example, the reception of reward. Accordingly, we hypothesized that endogenous modulation by relief outcomes is increased in the active (reward) compared to the passive condition. In contrast, naltrexone blocks opioid receptors and as such it has been reported that naltrexone blocks placebo analgesia as a type of endogenous pain inhibition. Correspondingly, we hypothesized that naltrexone decreases endogenous pain modulation induced by actively winning pain relief compared to the passive condition. We expanded the explanation of these hypotheses in the Introduction section as follows:

      “We expected increased dopamine availability to enhance phasic release of dopamine in response to rewards, and hence, to increase the effect of active compared to passive reception of pain relief. In contrast, we expected the inhibition of endogenous opioid signaling to decrease the effect of active controllability on pain relief. The latter is based on the observation that blocking of opioid receptors attenuates other types of endogenous pain inhibition such as placebo analgesia (Benedetti, 1996; Eippert et al., 2009) or conditioned pain modulation (King et al., 2013).”

      Third, on the "Behaviorally assessed pain perception" results in Figs. 2D-F, I wonder why the results for the "pain increase" were still positive. Were the y values on the plots the temperature that participants adjusted (i.e., against the temperature right before the temperature adjustment)? or are the values showing the differences from the baseline (i.e., against the baseline temperature)?

      The behavioral measure was calculated as the temperature in the memorization interval at the beginning of the trial (i.e. the predetermined temperature perceived as moderately painful) minus the self-adjusted temperature at the end of the trial, so that positive values indicate sensitization (i.e. an increase in sensitivity) and negative values indicate habituation (i.e. a decrease in sensitivity) across the stimulation within one trial (i.e. approx. 30 seconds of stimulation). In general, for a stimulation of approximately 30 seconds with intensities perceived as painful, perceptual sensitization is expected to occur (Kleinböhl et al., 1999).

      The outcome of the wheel of fortune game, i.e. the phasic decrease (winning) or increase (losing) in stimulation intensity, should indeed have opposite effects on this sensitization. A decrease in nociceptive input negatively reinforces pain perception, as seen in stronger sensitization in win trials, while an increase in nociceptive input punishes pain perception, as seen in reduced perceptual sensitization in lose trials. Using a very similar task, Becker et al. (2015) found values indicating habituation within trials with temperature increases in lose outcomes. However, in this previous study, increases of +5 °C were used for lose outcomes (as compared to +1 °C in the present study). Thus, in the present study the comparatively small increase in absolute stimulation temperature may not have been sufficient to induce within-trial habituation to the tonic heat pain stimulation.

      Nevertheless, independent of the effect of the outcome (increase or decrease of the stimulation intensity) our focus was on the additional effect that controllability (active vs. passive condition) had on the perception of the underlying tonic stimulation within each outcome condition (i.e. on the same nociceptive input). Here we expected to see endogenous inhibition after winning and endogenous facilitation after losing in the active compared to the passive condition.

      We added more detailed information on the calculation of the behavioral measure and the expected perceptual modulation within each trial due to the stimulus duration in the Methods section as well as in the Results section.

      Methods section:

      “After this rating, participants had to adjust the stimulation temperature themselves to match the temperature they had memorized at the beginning of the trial. This self-adjustment operationalizes a behavioral assessment of perceptual sensitization and habituation within one trial (Becker et al., 2011, 2015; Kleinböhl et al., 1999). Participants adjusted the temperature using the left and right button of the mouse to increase and decrease the stimulation temperature. The behavioral measure was calculated as the temperature in the memorization interval at the beginning of each trial minus this self-adjusted temperature at the end of each trial. Positive values, i.e. self-adjusted temperatures lower than the stimulation intensity at the beginning of the trial, indicate perceptual sensitization, while negative values indicate habituation.”

      Results section:

      “Positive values (i.e. lower self-adjusted temperatures compared to the stimulation intensity at the beginning of the trial) indicate perceptual sensitization across the course of one trial of the game, negative values indicate habituation. For tonic stimulation at intensities that are perceived as painful, perceptual sensitization is expected to occur (Kleinböhl et al., 1999). Differences between the outcome conditions (win, lose) reflect the effect of the phasic changes on the perception of the underlying tonic stimulus. Differences between active and passive trials reflect the effect of controllability on this perceptual sensitization within each outcome condition.”
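The sign convention of the behavioral measure described in this response can be illustrated with a minimal sketch. The function names and the example temperatures below are hypothetical and chosen only for illustration; they are not taken from the study's analysis code or data.

```python
def behavioral_measure(memorized_temp_c: float, self_adjusted_temp_c: float) -> float:
    """Temperature at the start of the trial minus the self-adjusted temperature (deg C)."""
    return memorized_temp_c - self_adjusted_temp_c


def interpret(measure: float) -> str:
    """Map the sign of the measure to the interpretation given in the text."""
    if measure > 0:
        # Participant set a LOWER temperature than at trial start: sensitization.
        return "sensitization"
    if measure < 0:
        # Participant set a HIGHER temperature than at trial start: habituation.
        return "habituation"
    return "no change"


# Hypothetical trial: stimulation started at 45.0 deg C, participant adjusted to 44.5 deg C.
example = behavioral_measure(45.0, 44.5)
print(example, interpret(example))
```

Under this convention, the effect of controllability within one outcome condition would be the difference between the measure in active trials and the measure in the matched passive trials.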

      Lastly, I wonder if it is feasible or not, but examining the effects of dopamine antagonists will be helpful for obtaining a more definitive answer to the role of dopamine in information-related pain relief. This could be a good suggestion for future studies.

      We thank the reviewer for this suggestion. We agree that antagonistic manipulation of the dopaminergic system could provide further insights and confirm the role of dopamine in shaping pain related perception and behavior. Moreover, we think that bidirectional manipulations of opioidergic signaling could also provide valuable insights and should be used for future research. We added the following sentences to the Discussion section:

      “Because the mechanisms underlying learning from pain and pain relief and their recursive influence on pain perception may contribute to the development and maintenance of chronic pain, it is crucial to better understand the roles of dopamine and endogenous opioids in these mechanisms. Accordingly, bidirectional manipulations of both transmitter systems should be used in future studies to better characterize their respective roles in shaping behavior and perception.”

    1. They can act in ways that benefit a small percentage of the group and unknowingly destroy all of human life.

      This makes me think of where we are as a human race and what is happening to the environment. Industrialization, wealth, globalization, control of resources is driven by greed and profit for the few, and it is resulting in a crisis that the world may never recover from.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      We are obviously very pleased with the general support expressed by the referees, and appreciate their critical comments. We detail below how we propose to respond to their suggestions and queries.

      In view of the fact that my lab is no longer in existence, I will have to rely on the kind generosity of my colleagues at EMBL to host former team members (the two first authors) for a limited period to come back to Heidelberg to carry out any further experimental work that may be needed. This means we will have to limit the work we can do to those experiments with the highest priority. However, we are optimistic that we will be able to obtain indicative results.

      We will also follow most of the referees’ other suggestions and requests for additional data and quantifications, as outlined (or already included) below.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: ASC is the Pyrin/CARD-containing adapter protein that functions as a core component of inflammasome signaling complexes. ASC functions downstream of various NLR- and ALR-inflammasome initiator proteins and upstream of the inflammatory caspases that function as inflammasome effector enzymes. This study uses a novel chimeric construct (Opto-ASC) comprising the Arabidopsis photo-oligomerizable cryptochrome 2 (Cry2-olig) protein with zebrafish ASC to generate transgenic zebrafish larvae wherein ASC oligomerization can be rapidly, dynamically and spatially induced by blue light illumination of either the entire larva or single cells within discrete tissues of an intact larva. Induction of these "opto-inflammasome" complexes is coupled with state-of-the-art, live-cell optical imaging of multiple single cell and integrative tissue parameters to assay various modes of regulated cell death within the peridermal and basal cellular layers of the larval skin. This experimental model was further combined with genetic manipulation of the expression of various zebrafish inflammatory or apoptotic caspases, as well as the two zebrafish members of the gasdermin family of pore-forming proteins which can mediate disruption of plasma membrane permeability without (pre-lytic) or with (pyroptosis) progression to lytic cell death.

      The main results of the study are: 1) introduction of a novel experimental system for dynamic and spatially resolved ASC oligomerization and speck formation within the cells of intact epithelial tissues of a living organism; 2) the ability of these optically induced ASC oligomers/specks to drive multiple modes of regulated cell death which exhibit some (but not all) features of lytic pyroptosis or non-lytic apoptosis depending on cell type and tissue location; 3) the ability of the dying epithelial cells containing optically-induced ASC specks to coordinate rapid adaptive responses in adjacent non-dying cells to maintain integrity/ continuity of skin epithelial barrier; and 4) unexpectedly, no obvious role for either of the two zebrafish gasdermins in the regulated cell death responses.

      Major Comments:

      1. Are the claims and the conclusions supported by the data or do they require additional experiments or analyses to support them? The major goal of this MS is to present a new experimental model (optogenetic activation of ASC oligomerization in transgenic zebrafish) that has the potential to provide new insights regarding the multiple mechanisms by which ASC can regulate inflammasome/ cell death signaling responses in the context of an intact organism. As noted above, some of the observed results are unexpected (e.g., lytic cell death independent of the zebrafish gasdermins in particular epithelial cells) and may reflect mechanisms unique to zebrafish as a non-mammalian vertebrate model versus the mammalian experimental systems (murine and human) that have informed most of our current understanding of how ASC coordinates inflammasome and cell death responses. However, the authors have used rigorous genetic approaches to rule out trivial explanations for the unexpected observations. Thus, no major additional experiments are required to support the claims and conclusions presented in the MS.

      2. Are the suggested experiments realistic in terms of time and resources? Yes. It would help if you could add an estimated time investment for substantial experiments: A few weeks.

      3. Are the data and the methods presented in such a way that they can be reproduced? Are the experiments adequately replicated and statistical analysis adequate? Yes.

      4. Are the experiments adequately replicated and statistical analysis adequate? Yes.

      Minor comments

      1. Specific experimental issues that are easily addressable:

      There's a significant concern with the use of LDC7559 (line 387) as a putative small molecule inhibitor of gasdermin D function to test roles (or lack thereof) of the zebrafish gasdermins in the ASC-triggered lytic cell death responses. A recent study (Amara et al. 2021. Cell. PMID34320407) has reported that LDC7559 does not inhibit gasdermin D (and possibly other gasdermins) but rather acts as an allosteric activator of PFKL (phosphofructokinase-1 liver type) in neutrophils and thereby suppresses generation of the NADPH required for the phagocytic oxidative burst and consequent NETosis. Thus, use of LDC7559 as a presumed gasdermin inhibitor in the current MS is problematic and should be deleted. As an alternative pharmacological approach to suppress gasdermin function, the authors might consider the use of disulfiram (Hu et al. 2020. Nature Immunology. PMID32367036) and/or dimethylfumarate (Humphries et al. Science. 2020. PMID32820063). These would be straightforward additional experiments.

      We have ordered the reagents to do these experiments. We are optimistic that we will obtain data that will strengthen this part of the ms and be a pointer for future studies by others.

      We propose to keep the information on LDC7559 included, but to discuss the reservations the referee lists above; otherwise, others might ask why we did not even try this inhibitor.

      Are prior studies referenced appropriately? There are some problems; see below. 2a. One paper is cited twice in lines 724-726 and 727-729. 2b. Another paper is cited twice in lines 790-792 and 793-795. 2c. No journal is included for the referenced study by Shkarina et al in lines 827-828. 2d. No journal is included for the referenced study by Stein et al in lines 831-832. 2e. No journal is included for the referenced study by Masumoto et al in lines 793-795. 2f. No journal is included for the referenced study by Kuri et al in lines 774-775.

      We are embarrassed about these omissions and mistakes and have corrected them.

      Are the text and figures clear and accurate? Generally, yes but with a few exceptions noted below: 3a. line 28: "morphological distinct" should read "morphologically distinct" 3b. line 161: this sentence contains in parentheses "for how long?" I think this was a comment by one author that wasn't removed from the final submitted MS 3c. line 945: spelling "balck" > "black" 3d. line 268: "whereas showed a delayed speck formation": the authors need to specify what model/ condition showed a delayed speck formation 3e. line 262: spelling "egnerated" > "generated"

      Thank you, all corrected.

      CROSS-CONSULTATION COMMENTS I also agree with the comments of the other 2 reviewers. Between the 3 sets of comments and suggestions, the aggregate review will provide the authors with a suitable range of feasible recommendations that will improve an already strong MS.

      Reviewer #1 (Significance (Required)):

      1. General assessment: As noted above, the major goal of this MS is to present a new experimental model (optogenetic activation of ASC oligomerization in transgenic zebrafish) that has the potential to provide new insights regarding the multiple mechanisms by which ASC can regulate inflammasome/ cell death signaling responses in the context of an intact organism. The authors have used rigorous genetic approaches to rule out trivial explanations for the unexpected observations. In general, the MS describes an elegant model system that will provide a platform for identifying new mechanisms of ASC-dependent inflammasome signaling and regulated cell death.

      2. Advance: This MS describes a highly novel experimental model. Zebrafish are increasingly being used as a genetically tractable model for inflammasome signaling within integrated tissues of intact organisms. As noted above, the advances are technical but also conceptual. Future application of this novel model is likely to identify new mechanisms for ASC function in innate immunity and regulated cell death within the context of tissue homeostasis and host defense.

      3. Audience: Basic research and discovery.

      4. Please define your field of expertise with a few keywords to help the authors contextualize your point of view: My group investigates multiple aspects of inflammasome signaling biology at the cellular level with an emphasis on cell-type specific roles for gasdermins in coordinating downstream innate immune responses to inflammasome activation in various myeloid leukocytes (macrophages, dendritic cells, neutrophils, eosinophils, mast cells).

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Programmed cell death is critical for host defense and tissue homeostasis. How dead cells initiate cellular responses in the microenvironment with neighbouring cells in vivo is still largely unknown. The authors have chosen a zebrafish model to tackle this question, given that this model shows advantages for imaging and addresses these pathways in a complex in vivo setting. Their recent development of light-induced activation of caspases (published in JEM) enabled them to investigate cellular responses to a specific type of cell death in vivo at single-cell resolution. In this study, the authors further developed a light-induced activation of ASC to specifically look at inflammasome activation-mediated cell death in vivo. The authors have successfully established this system in zebrafish and also observed that Opto-Asc-induced cell death showed different phenotypes as compared to Opto-caspase-a/b-induced cell death. However, it is not really clear why. I have a few specific comments to be addressed or discussed.

      1. In Fig.3 and Fig.4, the majority of Opto-Asc localizes to the plasma membrane but not endogenous Asc. It seems that tagging affects its localization, which could potentially explain its slow kinetics in oligomerization.

      That is an interesting suggestion. The membrane enrichment is indeed reproducible and we have no full explanation for it. However, ASC itself seems to have some affinity for the cell cortex, as seen by its association with the apical actin ridges in keratinocytes in the resting state (see e.g. figure 3A). Affinity of ASC for actin is also documented in the literature (F-actin dampens NLRP3 inflammasome activity via flightless-1 and LRRFIP2; https://doi.org/10.1038/srep29834).

      Perhaps the fusion to the optogenetic module somehow enhances the affinity through the initial dimerization. But we can only speculate and have no further evidence that would allow reliable conclusions.

      In Fig.7, the authors showed that deletion of Caspb, but not Caspa, affected the apical extrusion, without affecting cell death. This may indicate that other caspases, like caspase-8 and/or caspase-3, were involved. This could be addressed through deletion of caspase-8 and/or caspase-3.

      These are experiments we had in fact done. Unfortunately, they did not allow us to address the question, because the deletions resulted in embryonic lethality. We have added this information to the text.

      It is very surprising that Opto-Asc-mediated cell death is not dependent on Gasdermins, at least in Caspb-dependent apically extruded dead cells.

      Indeed – but see comment by and our response to reviewer 1. We hope to be able to provide additional data.

      CROSS-CONSULTATION COMMENTS I agree with the other two reviewers and don't have further comments.

      Reviewer #2 (Significance (Required)):

      The Opto-Asc zebrafish model developed in this study will enable us to specifically look at inflammasome-mediated cell death in vivo. This model is more physiologically relevant than the Opto-caspase1 model.

      Audience interested in physiological function of inflammasome activation, but it is questionable whether such a tool will address mechanisms in mammalian cells. Eventually, more evidence for the latter could be provided.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this article, de Carvahlo and colleagues describe a novel optogenetic tool allowing single-cell and temporally controlled induction of ASC clusters in vivo (in zebrafish). ASC is a central adaptor protein of the inflammasome complex, which is involved in the induction of pyroptosis. This alternative mode of programmed cell death is involved in pathogen response and promotes cell swelling and the release of pro-inflammatory factors. Previous works have shown that inflammasome activation is associated with the formation of a large cluster of ASC protein (called a speck), which then promotes the recruitment and activation of caspase 1. Specks were previously characterised by the same group in vivo (in zebrafish larvae) and could be induced by the overexpression of ASC protein. This, however, was not compatible with fine spatio-temporal control of speck formation, thus preventing very refined characterisation of the dynamics and the distinction of the cell autonomous and non-cell autonomous effects.

      By fusing ASC to the blue-light sensitive oligomerising protein Cry2-olig under the control of a heat shock promoter, they could induce time-controlled formation of specks at the single-cell level, which is then followed by cell extrusion and cell death both in the periderm and the basal cells of the skin of zebrafish larvae. Doing so, they could characterise the dynamics of speck formation as well as key parameters affecting its dynamics and the subsequent extrusion. While ASC activation led to apical or basal extrusion in the periderm layer followed by non-apoptotic cell death, it triggered basal extrusion and apoptosis in the basal layer. Importantly, periderm cell elimination does not seem to strictly follow all the features of pyroptosis, as it does not require GSDM and relies on Caspb (not Caspa). It is also associated with strong calcium release both in the dying and neighbouring cells.

      The authors performed a very careful characterisation of the tools and optimisation of the conditions to form specks and eliminate cells. The experiments are very well performed with all the necessary controls. The results, while to some extent still hard to fully interpret for some aspects, illustrate the plasticity of cell death and cell extrusion, and include several very interesting observations on the direction of extrusion, putative compensatory modes of cell death upon Caspase1 perturbation and the difference of response to ASC clustering depending on the tissue layer. While it is not the main point of this study, the observation that the direction of extrusion can vary very significantly in different genetic backgrounds is also extremely interesting.

      The atypical cell elimination revealed in the system may require further characterisation in the future and suggests that the tools may not be the best to study bona fide pyroptosis. However, I don't believe there is always such strict separation between the modes of cell death and I am sure that it could lead to very interesting insights on inflammasome formation, extrusion and characterisation of downstream signalling in vivo, so overall this will be a very interesting resource for the community working on inflammasome, cell death and extrusion.

      I have some suggestions that could help to better characterise the mode of elimination as well as the mechanism of speck formation. I have also some suggestions for comparison with other published results as well as some text editing.

      Main points :

      1. So far, it remains a bit unclear how the authors define precisely speck versus any aggregate and the light induced clusters of Cry2 olig. Is it related to the timescale of formation and/or the lifetime of the aggregates? Is it related to their size?

      There is no ‘formal’ definition of an inflammatory speck apart from it being the unusually large aggregate that ASC forms once it is activated. Light-induced clusters of Cry2Olig alone, or of Cry2olig fusions with proteins that do not normally oligomerize, are much smaller (extensive documentation in the literature).

      A speck is thus a stable aggregate of ASC which is usually around 1 µm in size and is able to activate downstream caspases. But neither of these aspects alone is unique to ASC: prion-like structures can also be large aggregates (indeed ASC-specks have been compared to prions), and much smaller molecular assemblies can activate caspases. Thus ‘speck’ is more an operational definition, and ‘natural’ specks do have both of these properties, but as our experiments show, the properties can actually be separated. I would rather not try to narrow or change the definition, but leave any further discussion to the experts in the field.

      Figure 4E shows a number of variants of ‘speck’-like and other multimers: ASC-mKate and Opto-ASC form large single specks in the presence of endogenous ASC. Opto-ASC specks are only slightly smaller than those formed by endogenously tagged ASC-GFP (see also Supplementary Figure 2E). Opto-PYD recruits endogenous ASC and becomes incorporated into a speck of approximately the same size, while Opto-CARD does so less efficiently. All of these kill cells. In the absence of endogenous ASC, Opto-ASC forms much smaller specks, and very many in each cell, but these are still functional, as seen by the fact that they still kill cells (note that the large spot at t = 60 min in the right half of Fig. 4E is not a speck, but the contracted dying cell). Both Opto-PYD and Opto-CARD also form only the small aggregates (quantification will be included), with Opto-PYD still killing the cell by virtue of its ability to recruit caspases via their PYD, whereas Opto-CARD does not.

      Since the authors use most of the time constant blue light illumination, could they also assess how long the speck remains after stopping blue light exposure and quantify their lifetime (relative to the CRY2olig cluster lifetime)?

      Briefly, any speck that contains a functional ASC moiety remains stable and does not disassemble once the blue light is turned off. In skin cells it is not possible to make quantitative measurements because they are killed by the speck. Opto-ASC specks remain stable until they are taken up by macrophages, as originally reported for ASC-GFP specks in Kuri et al. 2017.

      Stability can best be assessed in muscle cells, which do not die upon speck formation. The figure below shows that specks begin to form within minutes of a short pulse of illumination and remain stable (and indeed grow further) for at least 60 min.

      Here is an example:

      Revisions Figure A:

      Stability of Opto-ASC specks in muscle cells after exposure to a single pulse of blue light

      Specks in muscle cells expressing Opto-Asc Tg(mCherry-Cry2olig-asc) are induced by a single illumination with blue light (488 nm) at t = 0 for 32 seconds. Multiple oligomers begin to form within 6 minutes, continue to gradually increase in number, and remain until the end of the movie (60 min).

      Cell outlines in the overlying epithelium labeled by AKT-PH-GFP are faintly visible in the first frame. Scale bar is 20 µm.

      Similarly could they provide some comparison of the size and localisation of CRY2 olig clusters compared to the speck.

      For size, see above. In addition, the size of the Cry2 oligomers as well as of Opto-ASC specks can vary with expression levels.

      For location, Cry2olig clusters are usually distributed throughout the cell, as seen in most of the right panels in Fig 4E, and in earlier work in cultured cells (e.g. Taslimi et al 2014). ASC specks can form anywhere in the cell, while Cry2olig-ASC has a preference for the cell cortex, but this is not absolute. In keratinocytes, but not in basal cells, the speck usually forms close to the lateral membrane. In the absence of endogenous ASC no real speck is formed, and Opto-ASC in this case shows no clear localisation to the membrane.

      In view of the variation we see, a strict quantification is difficult: what would be the ‘correct’ definition of classes to look at? To make statistically significant statements, we would need an enormous number of examples in which we could control for all of the variation of expression levels, cell size, day to day variation etc, and we currently don’t have these. We hope the qualitative evidence in the micrographs we show represents the differences well, and we will be happy to provide a larger number of images, if the referees feel this would be helpful.

      With the non-functional CRY2olig Asc fusion (Cter fusion), do they still see transient olig2 clustering which then reverses when blue light illumination is gone? I think it might be useful to clarify these points in the main text since most of the quantifications are based on speck localisation/numbering, so their characteristics have to be very well defined.

      That would be interesting to work out, but after our initial experiments with this construct, we did not pursue this further, since it was not a pressing issue at the time. If we can fit this into our planned experimental time table, we will re-assess it. However, while of interest, we feel these data would not add substantially to what we know at this point.

      1. In all the snapshots of speck formation, there seems to be a relative enrichment of the ASC signal at the cytoplasmic membrane (relative to the cytoplasm) prior to strong speck formation. This seems specific to optoASC, as it does not seem to happen for the endogenous ASC or upon overexpression of ASC-mKate (both in this study and in the previous study published by the same group). Is this apparent membrane enrichment something reproducible? (I see that on pretty much every example of this manuscript). If so, what could be the explanation? Is there an actual recruitment at the membrane, or is it because the membrane/cortical pool takes longer to be recruited in the speck (hence looking relatively more enriched at intermediate time points)?

      See our speculations in response to point 1 of the first referee.

      We too would really like to understand this, but see no easy and efficient way of testing it at this point.

      1. There is also a very distinctive ring accumulation that seems to match with apical constriction and/or a putative actomyosin ring (since this is perfectly round, it could match with a structure with high line tension) (see Figure 1E, Figure 3B, Figure 4D...). Is it something already known? Could the authors comment a bit more on this? This could suggest that ASC accumulates in actomyosin cortex, which would be a very interesting property.

      We see that we had failed to be clear about this.

      There are two types of actin-labelled rings that appear around dying cells. One is formed by the epithelial cells that surround the dying cell. This structure becomes visible as soon as the cell begins to shrink. That it is formed by the surrounding cells is clear from mosaics where the dying cell does not express the actin marker (e.g. suppl. Figure 4A) and the parts of the ring are seen only in the subset of surrounding cells that do express the marker. This ring is also not circular, but follows the polygonal shape of the shrinking cell. We believe that this is the contractile structure that closes the wound, as observed in many other cases of wound healing.

      The other is the one the referee describes here. It is formed within the dying cell, as shown by the fact that it is visible in labelled cells when all the surrounding cells are negative for the marker. The other difference is that it appears only once the dying cell has already contracted considerably and begins to round up and be extruded (most clearly seen in Fig. 1E). The third referee had raised a similar point in relation to the same structure seen in Fig. 6C, and we provide below the requested analysis. It relies on resolution in the y-axis, which is unsatisfactory, but nevertheless, it is clear that this ring is in a plane above the apical surface of the epithelium (marked by the red membrane marker), i.e. it is present in the detaching cell. It may well simply be actin appearing in the entire cortex of the cell as it rounds up and looking like a ring when seen from above. A completely different method for imaging would have to be set up to document this reliably, but we hope that these explanations help to clarify the confusion we may have created.

      We do not see this accumulation in cells that leave the epithelium towards the interior (see figure in the response to ‘minor points’ below).

      In the end, since cell death can also occur without visible speck formation, I am wondering if they are eventually the most relevant structure to be quantified. Is it because speck can be dissolved upon caspase activation and could it relates to the speed at which caspase are activated (which may not leave enough time for strong aggregation and visible speck formation)? I believe it would help to get more explanation/discussion on this point.

      As already mentioned above, it is indeed not obvious what the significance of the large speck is (and it is extremely puzzling why normally only a single one forms in each cell). We agree that it is not necessarily functionally relevant for the signalling outcome to quantify this property, but nor was this the purpose of this work. Regardless of what kind of aggregate is formed, the optogenetic tool allows the induction of ASC-dependent cell death, and therefore the study of the ensuing cellular events.

      The compensatory mechanisms that lead to cell death/extrusion despite depletion of caspb is very interesting. Could the authors use some pan caspase inhibitor (like zvad FMK) to confirm that this block opto-ASC cell death also in this context? Alternatively could they check the status of effector caspase activation using live probe (nucview) or immunostaining in the context of caspb depletion?

      Those would be interesting avenues to pursue. However, for the reason stated above (Leptin lab closing down, members of fish group no longer at EMBL), we are forced to restrict ourselves to the most important experiments, and think we should prioritize the ones mentioned above.

      1. If I understand well, Figure 7C on the right side suggest that the double KO cells don't extrude (if indeed "no change" mean no extrusion, by the way this nomenclature may deserve some clarification in the legend). I don't think these results are mentioned at any point in the main text, but it would be important to include them (since this is an important control).

      This interpretation is in fact correct, and we have changed the labelling in the figure to ‘no immediate death’.

      1. Waves of calcium following cell death and cell extrusion have been previously characterised (Takeushi et al. Curr Biol 2020, Y Fujita group). Interestingly, in this previous article they observed waves of calcium near Caspase8 induced death (in MDCK) as well as near laser induced death in zebrafish, while apparently the authors don't see such Calcium waves upon Caspase8 activation in the zebrafish here. I think it would be important to include a comparison of the authors results with this previous paper in the discussion

      We have included this in our discussion.

      There is also a previous study which characterised the impact of caspase1 on cell extrusion (Bonfim Melo et al. Cell Report 2022, A. Yap lab) which promotes apical extrusion in Caco2 cells. I think it would also be important to include this work in the discussion and to compare with the results obtain here in vivo.

      We have included this in our discussion.

      Other minor points:

      1. Line 439: are the numbers given in percentage? if these are absolute numbers, it is out of how many cells ? Same remark line 445: what are the number of cases representing? (percentage?)

      We have rephrased this to make it unambiguous.

      Figure 5: could the authors show periderm and basal cell extrusion with the same type of markers? (membrane or actin or ZO1)? This would help to really compare accurately the morphology and the remodellings associated.

      We used Utr-mNeonGreen to label actin in both periderm and basal cells. Actin labeling of extruded periderm cells is shown in figure 6C; actin labeling of a dying basal cell and the overlying periderm cells is shown in supplementary figure 5A.

      Is there any obvious differences in cell size or characteristic cell shape between the classic lab strains (golden, AB, AB2B2) and the WIK and experiment strain used here? I do acknowledge that this is clearly not the focus of this study, but given this striking difference (which is related to an important question in the field of extrusion), it would interesting to mention this if there is anything obvious.

      We will make these measurements and include the data.

      1. Figure 6C: what is exactly the localisation in Z of this strong actin accumulation observed during apical extrusion? Is it apical or is it rather on the basal side of the cell? A lateral view of actin could be useful in this figure for all the different conditions described.

      See response to ‘main point 3’ above.

      The images that show this are below. However, even from these images it is hard to appreciate the locations. They are in fact much easier to see by following the movies over time, and through the z-sections at any given time point. We will of course submit the movies with the manuscript.

      Revisions figure B:

      Localization of actin in the yz and xz planes in Opto-Asc-induced cell death and Opto-caspase-8-induced apoptosis

      Orthogonal projections of images of apically (A) and basally (B, C) extruded cells at four time points from time lapse recordings. Each time point shows the x-z plane and the orthogonal yz and xz planes, in which the apical side of the epithelium faces the x-z image.

      Actin is labeled with mNeonGreen-UtrCH (cyan), plasma membranes and internal membranes by lyn-tagRFP (magenta). Actin is initially concentrated in the apical cortical ridges of periderm cells.

      A. Apically extruded cell after death is induced by Opto-Asc. As the cell dies, actin is lost from the apical ridges and accumulates in the cell cortex in a plane above the original apical surface of the epithelium.
      B. Basally extruded cell after death is induced by Opto-Asc. Actin is retained in the apical ridges within the dying cell as the cell shrinks and moves below the epithelium.
      C. Basally extruded cell after death is induced by Opto-Caspase 8. The apical surface forms a transient dome in which the actin ridges remain intact before the dying cell is internalized.

      Figure S3B: could the authors show the utrophin-neonGreen channel separatly? Is there a ring of actin in the dying cell? Also are the membrane protrusion formed more basally? (I suspect this is a z projection, but this would need to be specified in the legend).

      1. Figure 4A legend: I guess the authors meant red arrowheads rather than frame?

      This has been corrected.

      2. I list below a number of typos I could find in the main text

      Thanks for noticing these, we have corrected all of these, as well as further typos we found.

      Line 29: in Line 30: but Line 151 : from the ...[...] (tissue ?) Line 161: there is most likely a text commenting that was not removed (for how long?) Line 262: generated (egnrtd) Line 268: whereas showed a delay (the subject is missing) Line 269: a point is missing Line 362: which the lack Line 368: a point is missing Line 400: a space is lacking "cellsdepending" Line 438: shrinkwe (space) Line 459 : or I infections Line 525: there is a point missing.

      CROSS-CONSULTATION COMMENTS I generally agree with all the comments raised by the other reviewers which partially overlap with comments I had (see for instance referee two for the role of other caspases and the membrane localisation of the probe).

      Reviewer #3 (Significance (Required)):

      In this article, de Carvahlo and colleagues describe a novel optogenetic tool allowing single cell and temporally controlled induction of ASC clusters in vivo (in zebrafish), a central adaptator protein of the inflammasome complex which is involved in the induction of pyroptosis. This alternative mode of programmed cell death is involved in pathogen response and promote cell swelling and the release of pro-inflammatory factors. Previous works have shown that the inflammasome activation is associated with the formation of a large cluster of ASC protein (called speck) which promotes then the recruitment and the activation of caspase 1. Specks were previously characterised by the same group in vivo (in zebrafish larvae) and could be induced by the overexpression of ASC protein. This however was not compatible with fine spatio-temporal control of speck formation, thus preventing very refined characterisation of the dynamics and the distinction of the cell autonomous and non-cell autonomous effects.

      By fusing ASC to the blue-light sensitive oligomerising protein Cry2-olig under the control of a heat shock promoter, they could induce time controlled induction of speck at the single cell level, which is then followed by cell extrusion and cell death both in the periderm and the basal cell of the skin of zebrafish larvae. Doing so, they could characterise the dynamics of speck formation as well as key paramters affecting its dynamics and the subsequent extrusion. While ASC activation led to apical or basal extrusion in the periderm layer followed by non-apoptotic cell death, it triggers basal extrusion and apoptosis in the basal layer. Importantly, periderm cell elimination does not seem to strictly follow all the features of pyroptosis as it does not require GSDM, and relies on Caspb (not Caspa). It is also associated with strong Calcium release both in the dying and neighbouring cells.

      The authors performed a very careful characterisation of the tools and the optimisation of the condition to form speck and eliminate cells. The experiments are very well performed with all the necessary controls. The results, while to some extend still hard to fully interpret for some aspect, illustrate the plasticity of cell death and cell extrusion, which include several very interesting observations on the direction of extrusion, putative compensatory modes of cell death upon Caspase1 perturbation and the difference of response to ASC clustering depending on the tissue layer. While it is not the main point of this study, the observation that the direction of extrusion can vary very significantly in different genetic backgrounds is also extremely interesting.

      The atypical cell elimination revealed in the system may require further characterisation in the future and suggest that the tools may not be the best to study bona fide pyroptosis. However, I don't believe there is always such strict separation between the modes of cell death and I am sure that it could lead to very interesting insights on inflammasome formation, extrusion and charcaterisation of downstream signalling in vivo, so overall this will be a very interesting resource for the community working on inflammasome, cell death and extrusion.

      My expertise are in cell extrusion, optogenetics, apoptosis and epithelial mechanics. I am not a specialist of the inflammasome and pyroptosis.

    1. Adam Marshall Dobrin, Technocrat Founder at XCALIBER DAO/ARKLOUD.XYZ. Writer. Coder. Futurologist. Aspiring dad.

      OPERATION JAZZORCIVILIZE

      Jazzercise is something my mommy did around the time I was born. It's literally just "jamboree" or some kind of popular women's ((predominantly)) exercise group. They met all over America in the 80's and they wore some funny socks ;) It is the word associated with "changing everybody up" to include the entirety of the capable group of humanity ever living on a rock with religion. It could be bigger than that, but here I've sort of defined it to literally link with significance only the Church of Rome and things that came after it. It is literally what it is, the A.D. timeline. It most likely includes a group of "less than all" who carried things like knowledge and Asimov's Foundation from the Pentagon Technocrat's "Torah guild" ... many thousands of years before the day Christ appears to have been born or died in history.

      This is a big deal. I have come to a place in Deseret I associate with a military group that is literally and ((I pray)) responsible for the colonization or the co-colonization of the known galaxy. I believe we have a number of coveted extra-galactic operations as well, and that they include Soviet and American as well as European operations outside of Deseret. I have come here to prove that Operations Gunsider, Holocaust and Y are "Information Operations" which is modern NSA-talk for "propaganda designed for a purpose." I do not believe the technologies are real, and it's important to understand I lived through the time others call "the Cold War" and saw with my own eyes videos of rockets traveling along United States Federal USHWY1 up and down the Eastern Seaboard ... rather than I-95 though it existed because of known and intentional fortifications on that road for equipment so heavy it would crumble bridges. We are in a place where ... London may be the only bridge left in existence after the move from NM to NV .. if you know what it means to lose pillars of Samson in a place like the Holy Temple's heart.

      I need this to be taken seriously. If we want to stop moving towards a point where we are going to be angrier with each other than we should be; I need someone in the world with a public company to hire me to build something ... "more public than companies." It starts with software and it ends with codification in the Constitution and beyond. It's a "big deal" this is a revolution bigger than the invention of voting and money; this is big. I need a pay check from a company with that kind of oversight at the very least.

      I am open to FTSE, CAC, ASX, DAX, or similar companies on those exchanges to ones listed on the S&P 500 or the DOW. The exchanges listed are not all inclusive, but it means something that I "know what they are" I studied them and we need something at least as big as an entity governed by laws to be listed on "those" ... a private company in Dubai, for instance; is not large enough to do this properly.

      I came to the particular city that I am in to prove that Operation Gunsider and Project Y were "ruce's" ... #informationoperations that were part of a grand design that literally includes the whole of "Majestic" which is another key word in the research path to where we are going.

      It includes more than that, much more--on this song and who you all are. Closer to God, than ... "most." Closer to me, too. It includes the entirety of the KJV and "all of religion as seen through the eyes of the Christ." It includes missions to teach Latin and English and "reading and writing" to the entirety of humanity; and at this point we have to pause and really understand what is going on.

      We have an "Adam code" that is something like ##305407; it's a word that includes research and development on what to do when the "everybody up?" generation fails or succeeds; it is a way to get "way more voters involved" in a place where we once had a world that could have saved its past, clearly due to the inability to see it at all, much less travel to it. Today I need basic computer knowledge and general concepts of things like terraforming and physics added to the list of things that are "required to vote" in the Constitutional realms considerably here perhaps somewhere between the third and the fifth Houses of the Capitol of the United States.

      I would like to make the entirety of the past, the entire A.D. timeline and perhaps something bigger than that "intelligent, omni-important, and oligarchical rulers of themselves." I would like to see Technocracy flourish as a word that literally involves the Halo of Cortana and its connection to "how we vote." I wrote for a brief time on how to engage an audience in something called "subconscious voting" and how to connect "checking your vote" to the only Labor the Party has to accomplish on it's WED/hour of "required work left once we are done with ... automation, roboticization and the revolution colloquially associated with Bolshevik and Ford.

      I need us to think today what kind of classes we would need to put together for ... "members of the medieval civilization of lore" ... people who coincided in cities with Cathedrals or Missions that match the architecture associated with the One True Church--whether it be the source of the Spanish Armada or the Eastern Orthodox Byzantine Fault.

      What kind of classes are required to understand things like "game theory" and "solar fusion" and also the inner workings of Heaven enough to intelligently vote on whether or not another group of people, for instance, is "educated enough to be considered a peer, or a citizen."

      UK Home Office U.S. Immigration and Customs Enforcement (ICE) Immigration And Nationality Services (IANS)

      It's interesting to "see this answer" INS has aided me here in assuming you understand that acronym has changed from the historical truth, as we consider "naturalization" and what kind of history/nation

      https://lnkd.in/dvUKdGZf

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to reviewers

      Reviewer #1

      Reviewer #1 (evidence, reproducibility and clarity (required)):

      Winter et al. present a study of Ebola virus fusion in the acidic environment of the late endosome. Based on cryo-ET of Ebola virions undergoing entry into cells, they note that the VP40 matrix is disassembled and dissociated from the viral membrane in virions seen in the endosome. Subsequent in vitro and computational analyses suggest that protons diffuse across the viral membrane and neutralize anionic lipids on the inner leaflet. They argue that this loss of negative charge reduces the affinity of VP40 for the viral membrane. They further suggest that VP40 dissociation from the viral membrane precedes GP-mediated membrane fusion and contributes to reduction in the energy barrier for membrane stalk formation. Whereas most studies have focused on the importance of acidic pH in triggering GP conformational changes during fusion, the present work contributes new appreciation for VP40-membrane interactions.

      We would like to thank the reviewer for all the insightful comments and appreciation of the novelty.

      In the cryo-ET experiments aimed at visualizing Ebola entry, do the authors see evidence of viral membrane fusion? There is no mention of this in the text. Knowing that the virions that show disassembly of the VP40 matrix are in fact the virions that productively enter cells would support the conclusions of the study. As is stands, one is forced to wonder whether the virions that show VP40 disassembly prior to fusion ultimately fuse.

      We first note that the EBOV virions shown in Figure 1 entering host cells were captured by cryo-ET at 48 hours post infection and resulted from 2-3 rounds of infection, thus the virions can productively enter the cells by macropinocytosis. Virions that are not able to undergo membrane fusion would be processed in the lysosomes and would not be detectable by cryo-ET at 48 hours post infection. In addition, the virions captured in late endosomes contain nucleocapsids, hence these virions are likely infectious. Together, this is good evidence that we really see events after successful membrane fusion.

      We fully agree with the reviewer that capturing a fusion event would provide further proof that fusion depends on prior disassembly of the VP40 matrix layer. To address this, we acquired additional data on cells infected at different time-points post-infection (15 cells imaged); regrettably, we have not been successful in capturing a membrane fusion event, presumably due to its fast kinetics. In this study we are technically limited in the amount of virus we can use for infection in BSL4. The current dataset was generated at an MOI of 0.1, and this makes capturing entry events difficult, as we would need an MOI of at least 100-1000 to increase the chances of capturing such a rare event.

      Considering the technical difficulties of performing the experiment under BSL4 conditions, we have in addition performed a similar experiment using EBOV VLPs at high concentration (estimated MOI > 100) composed of VP40 and GP (Fig. S5). Despite the high VLP concentration, we could only find 2 tomograms out of 18 showing VLP entry events. These clearly show that the VP40 matrix is disassembled in VLPs residing in endosomes. The same lamellae displayed sites of viral fusion, as evident from enlarged endosomal membrane surfaces studded with GPs facing endosomal lumina. Hence, these new data support our results that VLPs that undergo VP40 disassembly are able to fuse. We have included the new supplementary figure S5 and added the following sentence to the main text:

      Lines 96-102: “We were not able to capture virions residing in endosomes in the process of fusing with the endosomal membrane, presumably because virus membrane fusion is a rapid event. However, in a similar experiment using EBOV VLPs composed of VP40 and GP, we could confirm the absence of ordered VP40 matrix layers in VLPs inside endosomal compartments. Moreover, we were able to capture one fusion event and several intracellular membranes studded with luminal GPs, indicating that fusion had taken place (Fig. S5).”

      In the cryo-ET experiments that evaluate VP40 disassembly in vitro, why do the authors leave out NP from their VLP preparations? There is some evidence in the literature (Li et al., JVI 2016) that NP is necessary to form particles with native morphology. If the authors feel that NP is not necessary for their experiment, perhaps this could be noted.

      Thank you very much for this important comment. Throughout this study, we mainly focused on the fate of the VP40 matrix during entry and thus reduced the complexity of the VLPs used to the minimum, VP40 and GP, so indeed NP was left out before. To address the role of the nucleocapsid in Ebola VLP uncoating, we have now also included data on VLPs prepared by expression of nucleocapsid components (NP, VP24 and VP35) in addition to GP and VP40. Cryo-ET analysis of these VLPs showed that VLPs mainly contain loosely coiled nucleocapsid. This is consistent with a study by Bharat et al 2012, which shows that compared to virions, VLPs displayed heterogeneous nucleocapsid assembly states and reduced incorporation of nucleocapsids. It is important to note that VLPs containing nucleocapsid also displayed disassembled VP40 matrices at low pH (Fig. S7). Hence, nucleocapsid proteins do not influence the VP40 disassembly driven by low pH, and GP-VP40 VLPs can be used as a model to study VP40 uncoating.

      We included a statement shown on lines 150-153: “We further repeated the experiment using VLPs composed of VP40, GP and the nucleocapsid proteins NP, VP24 and VP35, and observed the same low pH-phenotype described above. These results show that nucleocapsid proteins do not influence the VP40 disassembly driven by low pH.”

      The authors argue that acidic pH neutralizes the charge of PS phospholipids, thereby removing the electrostatic interactions of basic residues in VP40 and PS. They also note in the Methods section that 7 amino acids in VP40 are predicted by PROPKA to be protonated at pH 4.5. If the authors feel that protonation of these 7 amino acids is not involved in the loss of affinity for PS, this could be stated explicitly and justified. Could the protonation of these 7 amino acids contribute to disassembly of the VP40 lattice, rather than dissociation from the membrane?

      Thank you for this interesting comment. We note that the amino acids predicted to be protonated (E76, E325, H61, H124, H210, H269, H315, see below) are far away from the interaction interface with the membrane and also away from the intra-dimerization domain. Hence, they are unlikely to contribute to the loss of affinity for PS but may contribute to conformational changes that facilitate the disassembly of the VP40 matrix. For clarification, we have added the following statement to the methods section:

      Lines 541-544: “Importantly, these residues are located away from the interaction interface of VP40 with the membrane and their protonation accordingly does not influence membrane-binding. However, protonation of these residues may contribute to conformational changes that facilitate the VP40 matrix disassembly.”

      Minor: Figure S5C is difficult to interpret. The red frame on the bars that indicates data acquired at low pH is nearly invisible. Better might be to indicate explicitly (ie, with words) the pH at which data were acquired.

      Thank you very much for this comment. We have changed the design of the graph accordingly. Please note that the figure numbering has changed and that Figure S5C is now Figure S6C.

      Reviewer #1 (significance (required)):

      The significance of the study stems from the idea that the VP40 lattice and its interaction with the viral membrane plays a direct role in facilitating viral fusion. To my knowledge, this has not been previously addressed. The significance would be considerably increased if the authors were able to demonstrate by cryo-ET that the virions with disassembled VP40 were in fact the virions that productively fused. Nonetheless, this work should be of broad interest to researchers studying viral fusion as it may represent a phenomenon relevant to numerous viruses that enter cells via the endocytic route.

      Reviewer #2

      Reviewer #2 (evidence, reproducibility and clarity (required)):

      The manuscript by Winter et al., entitled "The Ebola virus VP40 matrix undergoes endosomal disassembly essential for membrane fusion" describes the structural aspects of the events that precede Ebola virus (EBOV) membrane fusion in late endosome and virion uncoating in the cytosol. By combining state-of-the-art cryo-electron tomography (cryo-ET) with biophysical and computational techniques, they have elucidated the pivotal role of the ebolaviral matrix virion protein 40 (VP40) in modulating the fusion process, in particular discovering that disassembly of the VP40 ordered lattice is low pH-driven, occurs despite the absence of a viral ion channel within the filovirus envelope and takes place through the weakening of VP40 interactions with lipids at the interface between the ebolaviral envelope and matrix. Overall, the manuscript is well written and the research work is very well conceived, with solid orthogonal experimental approaches that mutually validate their respective results. It is opinion of this reviewer that the paper contributes to the elucidation of a key step in the EBOV infection cycle and that it will be of great interest for the readership of Review Commons and for the community of structural biologists. Therefore, I recommend the publication of this paper, however after some minor revision to the text, the figures and the figure legends, which show inconsistencies in the terminology used, the acronyms and could be easily improved by some little graphical editing.

      Thank you very much for your positive feedback and your comments.

      Comments:

      • By starting their abstract and introduction sessions with the term "Ebola viruses" the authors are (on purpose?) preparing the reader to the implicit statement that their findings could be a paradigm model for the other members of the Ebolavirus genus. This is an exciting picture, especially in perspective of VP40-targeting drugs development. Therefore, although conclusions in this sense would probably require further studies, I encourage the authors to implement their figure 3 (or related supplementary figure) with a multiple-sequence alignment, and the relative text in the manuscript, by showing if and how much the basic patch at the C-terminus of VP40 is conserved within the Ebolavirus genus, especially the residues Lys224, Lys225, Lys274 and Lys275.

      Thank you very much for this comment. We have added a corresponding sequence alignment highlighting the high conservation of the basic patch of amino acids across all Ebola virus species (Suppl. Fig. S6). In the text, we refer to the sequence conservation as follows:

      Lines 213-215: “These interactions are driven by basic patches of amino acids which are highly conserved across all EBOV species (Fig. S8 H), further emphasizing their importance in adaptable membrane binding.”

      • It is a bit inconvenient for the reader to follow how a story unfolds while jumping back and forth between figures, and this is why I would recommend moving the sentence at lines 88-91 to the section where figure 5 is discussed.

      We in fact refer to Figure 1 and have fixed the reference accordingly (line 95).

      • Please, avoid the use of the slang "Ebola" without the apposition "virus", and make the text consistent throughout the manuscript by only using the acronym of each term after it was introduced for the first time.

      Thank you for this comment. We have thoroughly revised the use of technical terms.

      Minor revisions:

      Line 1: "matrix protein undergoes"

      We refer here to the entire VP40 matrix layer composed of many VP40 proteins and not to single VP40 proteins (as the individual proteins do not disassemble, but their macromolecular assembly does). For clarification, we changed the title to “matrix layer undergoes”.

      Line 19: "the matrix viral protein 40 (VP40)"

      We have corrected the statement.

      Line 18: considering that a virus "exists" in the form of a virion while temporarily located outside the cell, and as a "molecular entity" consisting of viral proteins and nucleic acids organised in macromolecular complexes during its life cycle inside the infected cell, this reviewer encourages the authors to rephrase as follows: "Ebola viruses (EBOVs) virions are filamentous particles, ..."

      Thank you for your suggestion. We have rephrased it to: “Ebola viruses (EBOVs) assemble into filamentous virions” (line 18).

      Lines 35-36 and line 40: "that is determined by the matrix made up by the viral protein 40 (VP40), which drives ..." And then, directly use the acronym VP24 at line 40

      We have corrected the statement.

      Line 40: as VP24 and VP35 interact with NP but do not interact with the ssRNA genome, please rephrase as follows "the nucleoprotein (NP) which encapsidates the ssRNA genome, and the viral proteins VP24 and VP35 which, together with NP, form the nucleocapsid"

      We have corrected the statement.

      Lines 47-48: "...fusion glycoprotein (GP)...[...] the ebolaviral envelope"

      We have corrected the statement.

      Line 51: "...remarkably long virion of EBOVs undergoes..."

      We have rephrased the statement: line 55: “…remarkably long EBOV virions undergo…”

      Line 63: "... in vitro, and in endo-lysosomal compartments in situ, by cryo-electron..."

      We have corrected the statement.

      Lines 70-71: " to shed light on EBOVs ... [...] with EBOV (Zaire ebolavirus species, Mayinga strain) in biosafety level 4 (BSL4) containment"

      We have corrected the statement.

      Line 72: chemically fixed by? (PFA and GA acronyms have been annotated in figure 1, but should be first mentioned in their explicit form in the text)

      We have now mentioned annotations for GA and PFA both in the main text and in the figure legend in their explicit forms.

      Line 73 (cryo-FIB)

      We have corrected the acronym.

      Line 80: EBOV virions

      We have corrected the statement.

      Figure 1A and line 97: for consistency with the terminology used in the main text, should be perhaps in the second step preferred the term "vitrification" instead of cryofixation? Readers not familiar with the field could be confused by the use of the two synonyms

      We have replaced the term as suggested.

      Lines 92-93: "...these data indicate [...] and suggest..."

      We have corrected the statement.

      Figure 1C and line 100: in the color legend EBOV is annotated as dark teal, however in the segmentation of the reconstructed tomogram there are three objects, one of which in dark teal is evidently a portion of EBOV virion inside the endosome, and other two are in different shades of green. What are those? Please, could author specify their identity in the figure legend with their corresponding color code? The same applies to supplementary figure S2 (see comment below).

      Thank you very much for this comment. All three green objects are EBOV virions. For clarification, we have added numbers 1-3 to the figure and legend and adjusted the text in the legend accordingly (lines 109-110).

      Line 95: "...tomography of EBOV virions..."

      We have corrected the statement.

      Line 98: "...showing EBOV virions..." (This reviewer refers to the use of the term 'EBOVs' as for different species within the genus rather than for different EBOV particles within a dataset)

      We have corrected the statement.

      Line 105: "... a purified EBOV before..."

      We realized a mistake in our phrasing: the virion shown in Fig. 1 H is not purified, but a virion found adjacent to the plasma membrane of an infected cell. We have changed the phrasing accordingly (lines 117-118).

      Line 110 and 113: "...EBOV matrix..." And "EBOV virus-like particles (VLP)"

      We have corrected the statement.

      Line 140, 141, 145 and 147: "EBOV VLPs" and "EBOV VLP"; idem at lines 188-189, 209 and anywhere else in the manuscript (including figure 4A)

      We have corrected the use of “EBOV VLP(s)” as suggested.

      Line 235: "influenza virus ion channel..."

      We have corrected the statement.

      Line 249: please, use directly the above-introduced acronym for the detergent

      We have revised the use of acronyms.

      Figure 5F: in plot's X axis label: thermolysin (T)?

      Yes, this is correct and stated in the figure legend.

      Line 342: "EBOV have remarkably long..."

      We have corrected the statement.

      Line 420 "...matrix-specific"

      We have corrected the spelling error.

      Line 464: "grids"

      We have corrected the spelling error.

      Line 465: "for cryo-FIB milling"

      We have corrected the statement.

      Line 611: "influenza virus M2 ..." (Please, from which influenza virus strain does the gene come from? Alternatively, which is the NCBI Protein and/or UniProt database code?)

      We have added the information to the Methods (line 648): “….A/Udorn/307/1972 (subtype H3N2))…”

      Line 623: please, use the above-designated acronym for the detergent

      We have used the acronym as suggested.

      Line 716: "...based on cryo-ET..."

      We have corrected the statement.

      Line 718: "influenza virus"

      We have corrected the term.

      Line 734: "cryo-ET data"

      We have corrected the term.

      Fig. S8: for consistency with the main text, "thermolysin"

      We have corrected the spelling of thermolysin throughout the manuscript.

      Fig. S2, C and F: are these EBOV virions (as mentioned in the figure title) or EBOV VLPs (as the legends in the two panels of this figure seem to suggest)? Please, the authors should clarify

      Thank you very much for spotting this mistake! These are indeed EBOV virions and we have changed the legends within the figure accordingly.

      Line 1046: "malleable lipid envelope of the EBOV"; this adjective sounds confusing; the reviewer encourages the authors to rephrase for more clarity.

      We have removed the adjective “malleable”.

      Reviewer #2 (significance (required)): see above.

      Reviewer #3 (evidence, reproducibility and clarity (required)):

      Winter and colleagues describe the molecular architecture of Ebola virus during entry into host cells. The main claims of the paper are that VP40 is disassembled prior to fusion. Disassembly is driven by the low pH environment in the endosomes. pH-induced uncoating works via "passive equilibration" because the Ebola virus envelope does not contain an ion channel. The authors conclude that structural remodeling of VP40 acts as a molecular switch coupling uncoating to fusion. The main novel results of the manuscript are: In situ cryo-ET of endosomal compartments shows EBOV particles with intact condensed nucleocapsids and disordered protein densities that may relate to detached VP40. Five EBOV particles were imaged in the endosome and all had detached VP40 layers. Controls, budding virions and extracellular virions showed intact VP40 layers. Incubation of VP40-Gp VLPs with a pH 4.5 buffer leads to the disorder of the VP40 matrix in vitro, which is independent of Gp presence in the VLPs. MD simulation showed VP40 dimer binding to model membranes containing 30% PS at pH 7 and reduced binding at pH 4.5. Lipidomics revealed the lipid composition of VP40-Gp VLPs demonstrating 9% PS.

      VP40-PHluorin fusions were used to determine acidification of VLPs in vitro and to calculate a permeability coefficient of 1.2 Å s⁻¹, which is quite low compared to the permeability of the plasma membrane (345 Å s⁻¹). Next they modeled membrane fusion showing that fusion is more favorable after VP40 disassembly, especially favoring stalk formation. The authors propose further that fusion pore opening is more favorable in the presence of VP40. The authors claim that strong interactions of lipids with VP40 stabilize the hemifusion intermediate. VP40 Gp VLPs can enter host cells independent of pH once Gp has been activated by thermolysin.

      We thank the reviewer for these interesting comments and valuable suggestions.

      Some of the results are over interpreted and require appropriate modifications.

      Main points that need to be addressed: Imperfections of the membrane could be induced by proteins. Does acidification of the virion depend on GP and its transmembrane region? This can be tested with chimeric GP replacing its TM by unrelated trimeric TMs.

      We agree that this is important to consider. We have addressed this question in Fig. 2 K using VLPs composed of VP40 alone. These VLPs lack GP and still display luminal acidification as evident from the disassembled VP40 matrix when incubated at low pH. Therefore, acidification does not depend on GP. For clarification, we have adjusted the following sentence in the discussion:

      Lines 410-413: “Using VLPs of minimal protein composition (VP40 and GP, and VP40 alone), we show that VP40‑disassembly, i.e. the detachment of the matrix from the viral envelope is triggered by low endosomal pH (Fig. 2). This indicates that VP40 disassembly does not depend on structural changes of other viral proteins, including GP, and is driven solely by the acidic environment.”

      Virus entry assays, line 292. The low pH is not only used for Gp cleavage, but induces the conformational changes leading to the post fusion conformation of Gp2. The authors need to check what happens to Gp once it is cleaved by thermolysin. Is this sufficient to induce the conformational changes in Gp? And if so how does entry of such VLPs work, because once the conformational change is triggered, GP2 will adopt the post fusion conformation which is inactive in fusion. This requires further clarification.

      To our knowledge, there is only one study showing that EBOV GP2 changes conformation at low pH in the form of a re-arrangement of the fusion peptide from an extended loop to a kinked conformation (Gregory et al 2011). Importantly, low pH alone is not sufficient to trigger GP-mediated membrane fusion, and NPC1 is needed as a trigger for the membrane fusion process (Das et al, 2020). Hence, proteolytically processed GP requires NPC1 binding to change its conformation to the post-fusion state. We addressed this question by using pre-cleaved (= GP2) and low pH-treated VLPs in our entry assay (Fig. 5 F). Since low pH-treated VLPs enter host cells as efficiently as VLPs incubated at neutral pH, and low pH-treated and additionally pre-cleaved VLPs enter even more efficiently, it is highly unlikely that low pH triggers the post-fusion conformation as this should inhibit virus entry (as the reviewer pointed out). In conclusion, low pH does not induce the post-fusion conformation in GP2 and we have included a respective sentence for clarification:

      Lines 339-343: “Since thermolysin-treated EBOV VLPs efficiently enter untreated host cells at neutral and low pH, we further conclude that low pH alone does not induce the GP2 post-fusion conformation, which would inhibit virus entry. Together, this suggests a role of low endosomal pH beyond proteolytic processing of EBOV GP, likely for the disassembly of the VP40 matrix.”

      In the fusion model, the authors claim that VP40 disassembly is more favorable for stalk formation, which is likely true. However, they also claim that strong VP40 interaction, which I would interpret as VP40 filaments interacting with the membrane, favor fusion pore opening. The tomograms and the in vitro experiments with VLPs indicate that the complete VP40 matrix is detached from the membrane under low pH conditions.

      We would like to stress that the modelling results for hemifusion formation and pore opening are independently calculated but have to be interpreted together because they occur sequentially. Hemifusion precedes formation of the pore and hence even though the model shows that the fusion pore opening is favored in the presence of VP40 interaction, membrane fusion cannot proceed to this stage because hemifusion is blocked until the VP40 matrix layer disassembles from the membrane. We apologize for lack of clarity, and we have added the sentences:

      Lines 315-318: “However, it is important to note that hemifusion precedes pore formation in the membrane fusion pathway. Since the disassembly of the VP40 matrix is required for hemifusion and hence for the initiation of membrane fusion, it determines the outcome of the membrane fusion pathway.”

      VLPs are purified. Can the authors exclude the possibility that the purification protocol does not damage the VLP membrane leading to in vitro acidification in a low pH environment? Can some of the assays be repeated with non-purified VLPs?

      Thank you very much for this important comment. To address this question, we had performed the cryo-ET experiments using purified and unpurified VLPs and found that they are virtually indistinguishable. Importantly, unpurified VLPs also undergo VP40 disassembly. We now show images from unpurified VLPs in a supplementary figure (Fig. S7). Thereby, the manuscript contains data of purified VLPs while we also provide proof that the purification protocol does not influence the disassembly of the VP40 matrix. We added the following explanatory sentence to the main text:

      Lines 151-156: “We further repeated the experiment using VLPs composed of VP40, GP and the nucleocapsid proteins NP, VP24 and VP35, and observed the same low pH-phenotype described above (Fig. S5 C). Performing the experiments on unpurified VLPs harvested from the supernatant of transfected cells confirmed that the purification protocol applied did not influence the disassembly of the VP40 matrix (Fig. S7).”

      Does acidification only work at pH 4.5?

      We also attempted to verify the acidification of VLPs at higher pH (~5.5 and ~6.0) by cryo-ET; however, subtle structural differences were difficult to quantify. Considering the lower permeability of the VLP membrane compared to the plasma membrane, we think that acidification indeed also occurs at higher pH (as shown for cells), albeit with slower kinetics.

      Minor points

      Line 37: Ruigrok et al. 2000 J Mol Biol showed first that Ebola VP40 requires negatively charged lipids for interaction.

      Thank you for pointing out this reference. We have included it in the text.

      Fig. 1f: Is VP40 detaching as a filament?

      We have not observed that VP40 detaches as a filament or a linear segment of multiple VP40 dimers. Since the VP40 dimer is inherently flexible (Fig. 3, Fig. S8) and can rotate along the N- and C-terminal intra- and inter-dimer interfaces, we believe disassembly occurs in a non-ordered fashion (not as filaments, see also Figure 2 G-K).

      References 8 and 28 are the same.

      We have corrected the reference duplication.

      Lipidomics: The authors find only 9% PS in the VLPs. How do these results compare to the composition of other enveloped viruses that have been reported to assemble on negatively charged lipids?

      We compared the lipid composition of the EBOV VLPs to the lipid composition of influenza viruses and HIV, which both bud from the plasma membrane and require negatively charged lipids. When grown in eggs, the envelope of influenza viruses contains 22-25% PS (Ivanova et al 2015, Li et al 2011), and approximately 12% when produced from MDCK cells (Gerl et al 2012). The envelope of HIV virions produced from HeLa or MT4 cells contains 10-15% PS. These numbers suggest that the producing cell line strongly influences the lipid composition of the virus particles. Besides differences in the producing cell line, the lower amount of PS found in EBOV VLPs could have multiple implications: first, apart from PS, PIP2 has also been shown to interact specifically with VP40 at budding sites in the plasma membrane (Jeevan et al 2017, Johnson et al 2018) and thus also contributes to virion assembly (potentially allowing for a lower PS concentration); second, as recently shown for paramyxoviruses (Norris et al 2022), binding of PS to viral proteins is not based on charge alone but may include specific binding, in which case a high affinity of viral proteins to PS may allow for a lower PS concentration in the target membrane. Overall, the rather low PS content in Ebola VLPs might be important for VP40 interaction and low pH-driven disassembly.

      EBO virus was suggested to assemble at lipid rafts. Is this reflected by the lipid composition?

      Yes, that is correct. A hallmark of lipid rafts is the enrichment of cholesterol and sphingomyelin (~32 mol% cholesterol, ~14 mol% sphingomyelin) in the microdomains (Pike et al 2002). The lipid composition of the EBOV VLPs determined in our study (~39% cholesterol and ~10 mol% sphingomyelin) is consistent with the assembly at lipid rafts. Minor differences stem from the different cell lines and lipidomic approaches used to determine the lipid species.

      Reviewer #3 (significance (required)): In summary, the manuscript is of high technical quality and the observation that VP40 detaches from the viral membrane prior to membrane fusion is novel and interesting to the field of virus fusion. How acidification occurs in the absence of an ion channel remains to be determined. The authors provide little insight into how this might work. The strong part of the manuscript is the EM part, which shows convincing detachment of the VP40 matrix. I cannot comment too much on the modelling part, which, however, sounds solid.

    1. THE **MONO**LOGUE **C**ONTINUES, **UNDERSTAND ME**. It doesn’t take much “thought” to see these star charts–our Astrological road maps to ‘wisdom of the Ancients’ might actually be something closer to road maps than I could have previously fathomed–let alone imagined. I’m staring at “Monoceros” and seeing it’s definitely connected to “the kissing disease” and to Eros and to Cupid–and seeing … this one not for the first time that character linked to Orion and to the “Speare” of Sagittarius. I’ve commented … ‘on the show in my head’ that it seems the entirety of the Milky Way might be something like our world … it could be a microcosmic map to something much larger–it could be the seed of “galaxies” in this place that might very well be the “thing” that connects the end and the beginning; rather than the beginning and the end as I once … commented was the original “glyph” I read in the letter “H.”

      --

      someone commented on the site, they posted a picture from my earlier work ... the "WHY?" one that depicted starvation and crucifixion and no press.

      in related news the LA Times spoke, it echoed "and he's thinking about his own mortality" seconds after the event ... the self questioning of whether or not I have any "kind of divinity" in me at all. Dana too, has echoed back that there's a message I am missing.

      I forgot to mention the press junket's every day, that was a kind of speech that you can't really "feel" in the rest of the articles that talk about things like walls and "something missing." Acosta may have written more on the reason, but I wasn't able to find out exactly what it was they were saying.

      Lately science has started talking about things ... "going haywire" I'm here with "IGNITION" and LLNL on my mind, and also the power of star creation and destruction connecting to the Pentagon and Deuteronomy and ... deuterium and fusion and fission and the Vooshan Young.

      I imagine some people read through this feed, the one I'm posting to. I've just posted this:

      This is what I have to do to ensure things "aren't vanished upon death" or worse, while I'm still typing about them in the very same day. The post was "vanished" from github, and I mean; it's here for the protection of not just veracity and Americana, but Hypothesis itself.

      Edit: the Github posting didn't disappear it was just marked as closed; along with an explanation "about hearing and answering before."

      There's a fortune in building the thing; and putting it together; it's basically "the next big thing" a news and "what's popular on the web" aggregator that has advanced search and friendship capabilities. Integrating with LinkedIn and Facebook and Twitter and ... "most of all abstracting those things with an identity system that ties directly to IPFS and strong identity validation and authentication--

      Out of the Ether ..

      PS: noting that this link ties to the root directory of fromthemachine dot org and it's already an aged post about this very thing, building something with Ethereum that "is glaringly missing" .. including "find your friends" integration that doesn't require them to send you a long random string of letters and numbers--just being "already connected". Creating a Wallet/Address system that ties together the social networks and "crypto trading" is glaringly missing, and we can already see applications like Snapchat and Tik-Tok that have done it in a way that creates a significant growth factor ... I mean it almost instantly made Tik Tok as big as Instagram.

      This tool is a key; being able to see "what everyone is saying about the front page news on the outlets you read every day" in a news feed and interface similar to Facebook's ... "I think that's a game changer for me; I would use it."

    1. Note 9/8j says - "There is a note in the Zettelkasten that contains the argument that refutes the claims on every other note. But this note disappears as soon as one opens the Zettelkasten. I.e. it appropriates a different number, changes position (or: disguises itself) and is then not to be found. A joker." Is he talking about some hypothetical note? What did he mean by disappearing? Can someone please shed some light on what he really meant?

      On the Jokerzettel

      9/8j Im Zettelkasten ist ein Zettel, der das Argument enthält, das die Behauptungen auf allen anderen Zetteln widerlegt.

      Aber dieser Zettel verschwindet, sobald man den Zettelkasten aufzieht.

      D.h. er nimmt eine andere Nummer an, verstellt sich und ist dann nicht zu finden.

      Ein Joker.

      —Niklas Luhmann, ZK II: Zettel 9/8j

      Translation:

      9/8j In the slip box is a slip containing the argument that refutes the claims on all the other slips. But this slip disappears as soon as you open the slip box. That is, it assumes a different number, disguises itself and then cannot be found. A joker.

      Many have asked about the meaning of this jokerzettel over the past several years. Here's my slightly extended interpretation, based on my own practice with thousands of cards, about what Luhmann meant:

      Imagine you've spent your life making and collecting notes and ideas and placing them lovingly on index cards. You've made tens of thousands and they're a major part of your daily workflow and support your life's work. They define you and how you think. You agree with Friedrich Nietzsche's concession to Heinrich Köselitz that “You are right – our writing tools take part in the forming of our thoughts.” Your time is alive with McLuhan's idea that "The medium is the message," or, as his friend John Culkin said, "We shape our tools and thereafter they shape us."

      Eventually you're going to worry about accidentally throwing your cards away, people stealing or copying them, fires (oh! the fires), floods, or other natural disasters. You don't have the ability to do digital backups yet. You ask yourself, can I truly trust my spouse not to destroy them? What about accidents like dropping them all over the floor and needing to reorganize them or, worse, the ghost in the machine rearing its head?

      You'll fear the worst, but the worst only grows logarithmically in proportion to your collection.

      Eventually you pass on opportunities elsewhere because you're worried about moving your ever-growing collection. What if the war should obliterate your work? Maybe you should take them into the war with you, because you can't bear to be apart?

      If you grow up at a time when Schrödinger's cat is in the zeitgeist, you're definitely going to have nightmares that what's written on your cards could horrifyingly change every time you look at them. Worse, knowing about the Heisenberg Uncertainty Principle, you're deathly afraid that there might be cards, like electrons, which are always changing position in ways you'll never be able to know or predict.

      As a systems theorist, you view your own note taking system as an input/output machine. Then you see Claude Shannon's "useless machine" (based on an idea of Marvin Minsky) whose only function is to switch itself off. You become horrified with the idea that the knowledge machine you've painstakingly built, and whose ways of acting as an independent thought partner you've documented, may somehow become self-aware and shut itself off!?!

      https://www.youtube.com/watch?v=gNa9v8Z7Rac

      And worst of all, on top of all this, all your hard work, effort, and untold hours of sweat creating thousands of cards will be wiped away by a potential unknowable single bit of information on a lone, malicious card and your only recourse is suicide, the unfortunate victim of dataism.

      Of course, if you somehow manage to overcome the hurdle of suicidal thoughts, and your collection keeps growing without bound, then you're sure to die in a torrential whirlwind avalanche of information and cards, literally done in by information overload.

      But, not wishing to admit any of this, much less all of this, you imagine a simple trickster, a joker, something silly. You write it down on yet another card and you file it away into the box, linked only to the card in front of it, the end of a short line of cards with nothing following it, because what could follow it? Put it out of your mind and hope your fears disappear away with it, lost in your box like the jokerzettel you imagined. You do this with a self-assured confidence that this way of making sense of the world works well for you, and you settle back into the methodical work of reading and writing, intent on making your next thousands of cards.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2022-01756

      Corresponding author(s): Wenya, Hou

      1. General Responses

      Dear Editors and Reviewers,

      We deeply appreciate all critical comments and constructive suggestions from all Reviewers, which have inspired us to conceive at least 8 new important experiments and mathematical analyses/modeling (shown in dark red). In addition, we will include more repeats with quantification for spot assays (with more HU doses) and biochemical experiments, as well as language revision (shown in orange).

      Below we only list the general response to the Major Concerns raised by at least two Reviewers:

      (1) To perform mathematical analysis of the single-cell quantitative data (Fig 4, Fig 5 and Fig S4) (Analysis #1).

      | Strain and condition | 50% Sic1 degradation time from Sic1peak |
      | --- | --- |
      | WT SC | 7.62 min |
      | whi7 whi5 SC | 7.91 min |
      | WT HU | 36 min |
      | whi7 whi5 del HU | 7.49 min |

      | Strain and condition | 50% nuclear exit time of Whi5 |
      | --- | --- |
      | WT SC | 4.69 min |
      | rad53Δsml1Δ SC | 7.60 min |
      | WT HU | 22.33 min |
      | rad53Δsml1Δ HU | 13.41 min |

      Table R1. 50% Sic1 degradation time calculated from Sic1peak and 50% nuclear exit time of Whi5 based on the experimental data shown in Fig 5 and Fig 4, respectively.
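
      As an illustration of how a "50% time from peak" value like those in Table R1 could be extracted from a single-cell trace, here is a minimal Python sketch. It is not the authors' analysis pipeline: the function name `time_to_fraction_of_peak`, the linear interpolation between samples, and the example trace are all assumptions made for illustration only.

      ```python
      from typing import Optional, Sequence


      def time_to_fraction_of_peak(
          times: Sequence[float],
          signal: Sequence[float],
          fraction: float = 0.5,
      ) -> Optional[float]:
          """Return the time elapsed from the signal peak until the signal first
          drops to `fraction` of the peak value, or None if it never does."""
          if len(times) != len(signal) or len(times) == 0:
              raise ValueError("times and signal must be non-empty and equal in length")

          # Index of the maximum of the trace (e.g. Sic1peak).
          peak_idx = max(range(len(signal)), key=lambda i: signal[i])
          threshold = fraction * signal[peak_idx]

          for i in range(peak_idx + 1, len(signal)):
              if signal[i] <= threshold:
                  # Linear interpolation between the two samples bracketing the threshold.
                  t0, t1 = times[i - 1], times[i]
                  s0, s1 = signal[i - 1], signal[i]
                  crossing = t0 + (s0 - threshold) / (s0 - s1) * (t1 - t0)
                  return crossing - times[peak_idx]
          return None


      # Hypothetical trace in arbitrary units, sampled every 3 min; not data from the study.
      example_times = [0, 3, 6, 9, 12, 15]
      example_signal = [20, 60, 100, 70, 40, 10]
      half_time = time_to_fraction_of_peak(example_times, example_signal)
      ```

      The same function could in principle be applied to a decaying nuclear Whi5 signal, although the choice of reference point (peak versus some other baseline) would have to follow the definitions used in the actual analysis.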

      (2) To reinterpret the HU-induced extension of G1/S transition with an updated model (Analysis #2).

      (3) To predict, according to our model, that like WHI7/5 overexpression, CKS1 deletion (PMID: 7958905) or sic1 mutants with longer destruction timing (T2,5S-VLLPP or T2,5S-RXL reported in Fig. 6C, PMID: 32296067) can suppress the HU sensitivity of rad53 mutants. Moreover, their suppression effects should be epistatic to WHI7/5 overexpression. Alternatively, the dosage suppression of WHI7/5 might be reversed by CKS1 overexpression or sic1 mutants with shorter destruction timing (unfortunately no such mutant has been reported yet). We will perform this set of genetic experiments to test these predictions and thereby functionally reinforce the Whi7/5-Cks1-Sic1 axis (Experiment #1).

      (4) To perform DNA replication profiling to examine the number of origin firings or replication capacity (Experiment #2).

      (5) To address the suppression effect of phosphorylation in Fig 2E. We agree that the phenotypes of the A-mutants of Whi7 differ only weakly from WT, but the difference becomes much stronger (5-fold difference between two dilutions) when compared with the D-mutants. As shown later in Fig 3, phosphorylation solely facilitates protein stabilization/total levels, which can be masked by ectopic overexpression from an extra plasmid. Moreover, phosphorylation does NOT enhance Whi7’s interaction with Cks1. We should tone down the contribution of phosphorylation and focus more on the stability/protein level. Furthermore, we will do competition assays using A-/D- mutants with GFP and RFP labels (Experiment #3), and add back whi7 13A or 13D in its endogenous locus in the whi7Δwhi5Δ double mutant to test the effect on Sic1 turnover (Experiment #4).

      (6) To add more repeats with quantification for spot assays (with more HU doses) and biochemical experiments (shown in orange).

      Besides reinforcing the current model, these experiments, analyses and re-interpretations may help to clarify two concepts which remain elusive in the current version:

      • S-CDK activation can switch from an abrupt/all-or-none pattern under normal conditions to a gradually flattened one under replication stress.
      • Consequently, the Whi7/5-Cks1-S-CDKs axis may determine replication capacity and/or the number of origin firings.

      Thus, we did not include a preliminary revision this time due to the significant changes. We plan to request at least 6 months for an extensive full revision (e.g., from a short letter to a regular article) to improve this study to a higher level with more general significance. Therefore, we request a revision opportunity from The EMBO Journal.

      2. Point-to-point responses

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      SUMMARY

      This work begins with a heterologous screen, introducing human genes in double mec1,sml1 yeast deletants, which are alive, but sensitive to hydroxyurea. The readout was mec1,sml1 proliferation in the presence of hydroxyurea. They found that mec1,sml1 yeast mutants carrying the human RB1 gene (a G1/S transcriptional repressor) proliferated on hydroxyurea. Then, they test if known yeast G1/S transcriptional repressors (Whi5 and Whi7) could have similar effects if provided at higher than normal levels (they did). With this initial result, followed up by a variety of experiments, the authors then go on to propose that replication stress, which activates Mec1 and Rad53, triggers the phosphorylation of Whi7 (by Mec1) and Whi5 (by both Rad53 and Mec1) blocking their eviction from the nucleus, allowing them instead to bind and inhibit Cks1, a Cdk processivity factor, needed for the complete phosphorylation and degradation of a Cdk inhibitor, Sic1. This is different from published work a decade earlier in mammalian cells (ref. 37; Bertoli et al.), which showed that upon replication stress, Chk1 phosphorylates G1/S transcriptional repressors to maintain G1/S transcription, which could help genome stability. Here, the authors propose that replication stress could block the G1/S transition. While the model and some of the experiments are interesting, the rationale for some experiments was shaky, and the data do not fully support the conclusions.

      MAJOR POINTS

        • Any cell that undergoes DNA replication must have already destroyed Sic1. It has been known for 25+ years that targeting Sic1 is the only necessary function of G1/Cdk to enable DNA replication (PMID: 8755551). Sic1 does not reappear until the M/G1 transition. Hence, in the authors' model, where cells are already in the S phase, how can multisite phosphorylation and degradation of Sic1 be the critical and final output of the pathway they propose when there shouldn't be any Sic1 around, to begin with? Why would a cell that has already completed Start and the G1/S transition, is in the S phase and experiencing replication stress, care about going through the G1/S?

      A: Yes, S-CDK activation is regarded as an abrupt, so-called “all-or-none” transition because of the relatively short half-life of Sic1, which is controlled by a robust double-negative feedback loop (PMID: 24130459; 23230424). Sic1 degradation requires multiple phosphorylation events, including priming phosphorylation by G1-CDKs and two opposing multisite phosphorylation processes by the S-CDK complex (Clb5–Cdk1–Cks1), one that generates phosphodegrons and another that terminates the degron route (PMID: 32296067). The timing and the speed (or “sharpness”) of Sic1 degradation are determined by G1-CDKs and S-CDKs, respectively (PMID: 24130459 and PMID: 32296067). However, Sic1 degradation is not an instantaneous “all-or-none” event even under optimal growth conditions. The Sic1 destruction time calculated from Start (defined as 50% nuclear exit of Whi5) is about 14.2 min, whereas the time between Start and Sic1peak is about 5 min according to independent studies (Fig 4G, PMID: 24130459; Fig. 6C, PMID: 32296067; Fig. 7B, PMID: 32976810). Similarly, the 50% Sic1 degradation time calculated from Sic1peak (the time to reach 50% of Sic1peak) is about 8 min for WT and whi7Δwhi5Δ cells, in agreement with the results in Figure 2E, PMID: 24130459. However, in the presence of HU, this time remains constant (7.49 min) in whi7Δwhi5Δ cells but becomes greater than 36 min in WT. Meanwhile, the 50% nuclear exit time of Whi5 (Start) is about 22 min in WT compared to 13 min in rad53Δsml1Δ upon HU treatment.

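      | Measurement | Strain | Condition | Time |
      | --- | --- | --- | --- |
      | 50% Sic1 degradation time from Sic1peak | WT | SC | 7.62 min |
      | 50% Sic1 degradation time from Sic1peak | whi7Δwhi5Δ | SC | 7.91 min |
      | 50% Sic1 degradation time from Sic1peak | WT | HU | 36 min |
      | 50% Sic1 degradation time from Sic1peak | whi7Δwhi5Δ | HU | 7.49 min |
      | 50% nuclear exit time of Whi5 | WT | SC | 4.69 min |
      | 50% nuclear exit time of Whi5 | rad53Δsml1Δ | SC | 7.60 min |
      | 50% nuclear exit time of Whi5 | WT | HU | 22.33 min |
      | 50% nuclear exit time of Whi5 | rad53Δsml1Δ | HU | 13.41 min |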

      Table R1. 50% Sic1 degradation time calculated from Sic1peak and 50% nuclear exit time of Whi5 based on the experimental data shown in Fig 5 and Fig 4, respectively.
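
      As a minimal sketch of how a value such as those in Table R1 could be derived from a single-cell fluorescence trace, the Python function below finds the time from the signal peak to the first crossing of 50% of the peak, using linear interpolation between samples. The function name, the sampling interval and the example trace are hypothetical; they are not the authors' data or analysis pipeline.

```python
from typing import Optional, Sequence


def time_to_half_peak(times: Sequence[float], signal: Sequence[float]) -> Optional[float]:
    """Time from the signal peak until the signal first falls to 50% of the peak.

    Linear interpolation is used between the two samples that bracket the
    half-peak level. Returns None if the trace never falls that far.
    """
    if len(times) != len(signal) or len(times) < 2:
        raise ValueError("times and signal must have the same length (>= 2)")
    # Index of the first maximum of the trace (the analogue of Sic1peak).
    peak_idx = max(range(len(signal)), key=lambda i: signal[i])
    half = signal[peak_idx] / 2.0
    for i in range(peak_idx + 1, len(signal)):
        prev, curr = signal[i - 1], signal[i]
        if prev > half >= curr:
            # Fraction of the sampling interval elapsed at the crossing point.
            frac = (prev - half) / (prev - curr)
            crossing = times[i - 1] + frac * (times[i] - times[i - 1])
            return crossing - times[peak_idx]
    return None


# Hypothetical trace: time in minutes, fluorescence in arbitrary units.
example_times = [0, 2, 4, 6, 8, 10, 12]
example_signal = [20, 60, 100, 80, 60, 40, 20]
print(time_to_half_peak(example_times, example_signal))
```

      Measuring from the peak rather than from Start separates the speed of degradation from the timing of its onset, which is the distinction drawn in the response above.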

      Therefore, the G1/S transition is a “transition zone” (from Start to 50% of Sic1peak) rather than a single borderline. The key finding of this study is that, in the presence of HU, the speed/sharpness of Sic1 degradation is significantly reduced (Figure 5), mechanistically due to the inhibition of S-CDK-Cks1 by Whi7/5. This results in a flattened S-CDK activity curve, so that S-CDK activation is no longer “all-or-none” under replication stress. S-CDKs phosphorylate two essential targets (Sld2 and Sld3) to enable DNA replication. Therefore, Sic1 levels determine S-CDK activity, which in turn determines DNA replication capacity (the maximal amount of DNA a cell can synthesize per unit time). In sum, under optimal conditions, S-CDK activation appears as an abrupt/sharp transition and cells replicate DNA at maximum capacity (i.e., minimal S phase length). When cells encounter replication stress (HU), S-CDK is activated very slowly (very low Sic1 destruction speed) and cells replicate DNA at low capacity (slow fork speed and/or fewer origin firing events) to match the limited resources. Recently, the de Bruin group demonstrated that replication capacity can be tuned by E2F-dependent transcription (which includes S-cyclin genes) in mammalian cells (PMID: 32665547).

      Inspired by these questions, we plan to

      (1) perform mathematical analysis of the single-cell quantitative data (Fig. 5 and S4) (Analysis #1).

      (2) reinterpret the HU-induced extension of G1/S transition with an updated model (Analysis #2).

      (3) predict that, like WHI7/5 overexpression, CKS1 deletion (PMID: 7958905) or sic1 mutants with longer destruction timing (T2,5S-VLLPP or T2,5S-RXL, reported in Fig. 6C, PMID: 32296067) can suppress the HU sensitivity of rad53 mutants according to our model. Moreover, their suppression effects should be epistatic to WHI7/5 overexpression. Conversely, the dosage suppression by WHI7/5 might be reversed by CKS1 overexpression or by sic1 mutants with shorter destruction timing (unfortunately, no such mutant has been reported yet). We will perform this set of genetic experiments to test these predictions and thereby functionally reinforce the Whi7/5-Cks1-Sic1 axis (Experiment #1).

      (4) perform DNA replication profiling to examine the number of origin firing events or replication capacity (Experiment #2).

      • The results in Figure 2C are confusing and difficult to interpret. For example, comparing lane 8 (WT without hydroxyurea) to lane 7 (WT with hydroxyurea), it appears that there is more phosphorylated Whi7 in lane 7 (hydroxyurea treatment) than in lane 8 (no treatment). But, the ratio of phosphorylated/unphosphorylated Whi7 is not that different (there is very little unphosphorylated Whi7 in lane 8). Same problem when comparing lanes 3 and 4. I understand that they later show that Whi7 is stabilized by hydroxyurea, but from the data in this figure, what exactly can they conclude here?

      A: Yes, the description of phosphorylation is somewhat confusing in the current text. Without HU, Whi7 is phosphorylated by G1-CDKs and its total protein level is much lower. With HU, Whi7 is phosphorylated by Mec1 and Rad53, because Whi7-P largely disappears in the rad53 mutant (lane 1) and in 13A (with all putative Mec1-Rad53 sites mutated, lane 5). Lanes 3 and 4 are biological repeats of lanes 7-8 with less loading. We will clarify our statement.

      • Their data in Figure 2E show that phosphorylation of Whi7 is not required for suppressing the lethality of rad53,sml1 cells treated with hydroxyurea. Cells carrying Whi7-41A (lacking all possible phosphorylations) suppressed nearly as well as wild-type Whi7 did. The purported differences in the suppression are minuscule at best and not evident at the dilutions tested. It is not clear at all how they can conclude that phosphorylation of Whi7 has anything to do with the ability of Whi7 overexpression to suppress the lethality of rad53,sml1 cells.

      A: Yes, we agree that the A-mutants of Whi7 differ only weakly from WT in phenotype, but differ much more strongly from the D-mutants (a 5-fold difference between two dilutions). As shown later in Fig 3, phosphorylation solely promotes protein stabilization (total levels), an effect that can be masked by ectopic overexpression from an extra plasmid. Moreover, phosphorylation does NOT enhance Whi7’s interaction with Cks1.

      We should therefore tone down the contribution of phosphorylation and focus more on stability/protein level. Furthermore, we will perform competition assays using A-/D-mutants with GFP and RFP labels (Experiment #3) and add back whi7 13A or 13D at its endogenous locus in the whi7Δwhi5Δ double mutant to test the effect on Sic1 turnover (Experiment #4).

      • For all the arguments they make about this new role of Whi5 and Whi7 at Start, they do not examine size homeostasis or the kinetics of cell cycle progression in any of their experiments and their mutants, with or without hydroxyurea treatment.

      A: Good suggestion. We will examine size homeostasis, budding index or cell cycle progression in the related experiments (Experiment #5). In Fig. S5, we showed cell cycle progression profiles only for wild-type cells carrying an extra copy of Whi7 WIQ or Whi7 WIQ ΔC. The WIQ mutant (which lacks Swi6 binding activity) significantly slowed cell cycle progression under normal conditions.

      • The Sic1 stability experiments they show in Figure 5 are nice. They would need to be extended to their various mutants, including their Whi7 phosphomutants, to make a case for phosphorylation by Rad53 and Mec1 in this process.

      A: Thanks, very good suggestion, we will add back whi7 13A or 13D in its endogenous locus in the whi7Δwhi5Δ double mutant (Experiment #4), to avoid the effects of overexpression.

      MINOR POINTS

        • The language is awkward. Editing for style will be necessary.

      A: We will request language editing.

        • They use different hydroxyurea doses in the experiments they show, making it difficult to conclude much when comparing different figures. Why aren't they consistent from experiment to experiment?

      A: Sorry for the confusion. We used at least three HU concentrations in each experiment, but showed only one of them to save space in a short article. Notably, S. cerevisiae tolerates a much broader range of HU doses (up to 300 mM) than other species (less than 10 mM). We will add the other figures during revision.

      Referees cross-commenting

      Overall, all reviews are well-aligned. The points raised by the other reviewers are valid, and the reviews are thorough and detailed. I don't know whether the authors will be able to respond since the list is quite long. Even if they do, the manuscript will look very different. I do not have anything else to add.

      Reviewer #1 (Significance (Required)):

      The manuscript presents some interesting data, most notably the role of Whi7 and Whi5 in the stability of Sic1 in vivo and the various in vitro experiments the authors present. The advance is conceptual and mechanistic, offering a different and unanticipated model for the role of these proteins at Start, under replication stress. Unfortunately, the significance of the manuscript is limited. A convincing case for their model and its importance has not been made. For example, their data in Figure 2E, measuring the ability of phosphomutants to suppress the lethality of rad53,sml1 cells upon replication stress, is underwhelming and undermines the importance of the study, particularly to a wider audience.

      A: Thanks for the suggestion, we will improve the model as discussed above.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Jin et al demonstrate a novel type of regulation of the G1/S transition in response to hydroxyurea stress. They approach this by first screening a library of human proteins (cDNA on yeast plasmids) for repressors of the mec1 or rad53 HU sensitivity. HU inhibits ribonucleotide reductase and thus lowers dNTP pools needed for S-phase. This slows replication and leads to stalled replication forks, triggering a "replication stress" response, which is executed by the kinases Mec1 and Rad53. Deletions of mec1 or rad53 are viable in unstressed conditions (with additional sml1 deletion), but are lethal on even low doses of HU. One main hit that rescued this lethality was the human G1/S inhibitor RB. They then went on to confirm that also the yeast analogs Whi5 and Whi7 can rescue mec1 or rad53 lethality when overexpressed. To track down the mechanism, the authors do a variety of genetic and biochemical assays. The resulting model is that Mec1 and Rad53 phosphorylate and stabilize Whi7, which binds to and inhibits the S-phase-CDK complex via the processivity factor Cks1. So on top of acting as a transcriptional repressor, Whi7 (and probably also Whi5) is also a direct interactor and inhibitor of CDK. The binding of Whi7 to Cks1-Clb5/6-CDK prevents the hyperphosphorylation and degradation of the inhibitor Sic1, and thus slows the G1/S transition in response to HU.

      Major comments:

      - Are the key conclusions convincing?

      ->Overall I think the sum of the evidence supports the suggested model, individual claims though are on somewhat shaky grounds based often on single replicates, see below.

      My main conceptual issue may be somewhat just a "semantic" problem. In my understanding "replication stress" refers to stalled replications forks and/or large stretches of single-strand DNA which then triggers a checkpoint response. So how would slowing the G1/S transition help to deal with "replication stress", if replication is not yet happening in these cells? I am assuming Mec1 senses dNTP depletion also in the absence of replication and that is how Mec1 and Rad53 become active in G1. But then maybe the model and the arguments can be phrased differently? What exactly is slowing down Sic1 degradation doing for the cell? Replenishing dNTP pools before the first origins fire? Or is maybe Sic1 not the most important target of this regulation? Maybe also during S-phase, partially inhibiting CDK is beneficial, maybe to stretch out origin firing... or?

      A: Thank you, very good suggestion. This also helps to address Major Point 1 raised by Reviewer #1, and it reminds us of the work from Pasero’s group demonstrating that Mec1 is activated at the onset of normal S phase by low dNTPs (PMID: 32169162). We will revise the text and perform DNA replication profiling (Experiment #2) to examine the number of origin firing events or replication speed. Also see our response to Point 1 of Reviewer #1.

      - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      ->Most of the work is done on Whi7 and then some Whi5 in the end, I would tone down on the Whi5 claims a bit.

      A: Very good suggestion. We have to include Whi5 in the story because it plays a redundant role with Whi7.

      - Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      -> Since the authors are clearly able to do quantitative live cell imaging, I do not understand why they do not quantify Whi7 concentrations and localization in response to HU instead of using Western blots of synchronized cells. This would make the whole thing much more credible, especially given the current lack of replicates (see below). This would also allow correlating the timing and amount of the Whi7 response with the stabilizing of Sic1 in single cells.

      A: Yes, we tried, but could not detect Whi7-GFP clearly because of its very low protein abundance. As far as we know, fluorescence detection of Whi7 at endogenous levels has not been shown in the literature either; only overexpressed Whi7 has been detected by fluorescence (PMID: 33443080).

      ->The causality of phosphorylation being required for stabilization seems plausible from the genetics, but is far from clear in the western blots. Here, concentration increase seems to precede phosphorylation. Could this be due to induced Whi7 transcription?

      A: Good suggestion. We will detect Whi7 mRNA levels through qPCR (Experiment #6).

      ->Many if not most claims are based on single replicates. See below.

      - Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      -I am not suggesting any different types of experiments or new methods, so it should be doable within a few weeks.

      - Are the data and the methods presented in such a way that they can be reproduced?

      -I would suggest the authors spell out all of their experimental procedures instead of referring to "as described previously". I think everyone knows the pains of going on a wild goose chase of following references to the original method description.

      A: Good suggestion. We will describe all experimental procedures in full instead of referring to "as described previously".

      - Are the experiments adequately replicated and statistical analysis adequate?

      -The key weakness of this entire paper is imho that many claims are based on single experiments, that are neither replicated nor quantified. For example, all the co-IPs (such as 1E or 3F) should be replicated and the ratio of bait to target quantified and averaged.

      A: Good suggestion. We will show the biological repeats and quantification.

      -If a claim is made regarding increased phosphorylation in vivo, then again this should be replicated and the ratio of phosphorylated to unphosphorylated bands quantified. In many Whi7 gels it looks like it is mainly the total amount of the protein that is changing rather than the phosphorylation state. But again, by eye and from a single replicate, this is hard to tell.

      A: Good suggestion. We will add more repeats.

      -A similar thing holds true for the spot assays. Spot assays are great to show lethality and rescue as in the first figure. But making semi-quantitative claims of different degrees of "partial rescue" from a single spot assay is a bit speculative. This seems especially true since the authors are using different and seemingly random HU concentrations for every spot assay, which suggests that the effect is not very robust and can only be seen in very specific concentration ranges. If e.g. the degree of rescue between WT, A and D mutants or truncations matters for the model/the storyline, then more quantitative growth or competition assays should be added.

      A: Good suggestion, and sorry for the confusion. We used at least three HU concentrations in each experiment, but showed only one of them to save space in a short article. Notably, S. cerevisiae tolerates a much broader range of HU doses (up to 300 mM) than other species (less than 10 mM). We will add the other figures during revision, and perform competition assays using A-/D-mutants with GFP and RFP labels (Experiment #3).
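
      One way such competition assays might be quantified is sketched below, assuming the GFP- and RFP-labelled strains are co-cultured and counted at several time points. The slope of ln(mutant/reference) against the number of generations gives a per-generation selection coefficient. The function name, the counting method and all numbers are illustrative assumptions, not the authors' protocol or data.

```python
import math
from typing import Sequence


def selection_coefficient(
    generations: Sequence[float],
    mutant_counts: Sequence[float],
    reference_counts: Sequence[float],
) -> float:
    """Least-squares slope of ln(mutant/reference) versus generations.

    A negative value means the mutant is outcompeted by the reference strain.
    """
    n = len(generations)
    if n < 2 or len(mutant_counts) != n or len(reference_counts) != n:
        raise ValueError("need at least two time points of equal length")
    if any(c <= 0 for c in list(mutant_counts) + list(reference_counts)):
        raise ValueError("counts must be positive")
    log_ratios = [math.log(m / r) for m, r in zip(mutant_counts, reference_counts)]
    mean_x = sum(generations) / n
    mean_y = sum(log_ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in generations)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(generations, log_ratios))
    return sxy / sxx


# Hypothetical cell counts for a GFP-labelled mutant and an RFP-labelled reference.
gens = [0, 5, 10, 15]
gfp_mutant = [1000, 900, 820, 730]
rfp_reference = [1000, 1100, 1180, 1270]
print(selection_coefficient(gens, gfp_mutant, rfp_reference))
```

      Swapping the fluorescent labels between strains in a parallel culture would help to control for any fitness cost of the labels themselves.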

      Minor comments:

      - Specific experimental issues that are easily addressable.

      ->At least some of the alpha-factor release experiments should contain information on budding index and/or DNA content to understand the delay in timing caused by HU addition.

      A: Good suggestion. We will examine size homeostasis, budding index or the cell cycle progression in the related experiments (Experiment #5).

      - Are prior studies referenced appropriately?

      ->Seems fine from the G1/S side, but I don't know the Mec1/Rad53 literature well enough to judge.

      - Are the text and figures clear and accurate?

      ->The authors could do another round of proofing figures and legends. For example, Fig 5C contains scale bars that are not defined, blot 3E has an asterisk labeling that is not defined, the model in 5E has misspelled "degradation"...

      A: We will proofread and revise the full text again.

      - Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      -> The authors use a lot of different mutants (especially for Whi7). Even for someone who knows the proteins fairly well, it is hard to remember throughout the text which abbreviation is relating to which mutations and which function that is addressing. Maybe occasionally remind the reader of what the mutant is or use terms like Whi7non-binding rather than WIQ.

      A: Thank you for your suggestion. We will add (TF non-binding) after WIQ.

      ->The text could also use another round of proof-reading. The overall flow of the storyline is easily comprehensible, but sometimes there is a sudden switch of topics or new proteins come out of nowhere. Some expressions are used in a way that is not common English.

      A: We will request language editing.

      Reviewer #2 (Significance (Required)):

      - Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      ->This study is a major conceptual contribution to understanding G1/S regulation in perturbed conditions (assuming the results can be replicated and quantified as detailed above). That Whi7 (and maybe Whi5) directly inhibit Clb5/Clb6-CDK through Cks1 binding is an important addition/modification to the current model and may well be important beyond genotoxic stress.

      A: Thank you. We will reinforce it with more repeats and quantification.

      - Place the work in the context of the existing literature (provide references, where appropriate).

      ->The authors do this reasonably well.

      - State what audience might be interested in and influenced by the reported findings.

      -> Anyone in the yeast cell cycle/replication field should find this interesting. It should also have important implications for the mammalian cell cycle/replication/DNA damage field.

      - Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      ->I am well familiar with G1/S control and all the methods used in the study. I am not an expert on replication stress/DNA damage/ checkpoint signaling.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary

      In their manuscript "Transcription-independent hold of the G1/S transition is exploited to cope with DNA replication stress", Jin et al. intend to show that Retinoblastoma-like G1/S transcriptional repressors can also work as S-CDK-Cks1 inhibitors in response to DNA replication stress, hence prolonging the G1/S transition to enable cells to deal with replication stress. In particular, they aim to identify the mechanism by which Whi7/Srl3 (Suppressor of Rad53 Lethality) rescues the lethality of rad53 yeast mutants. Even though their very first experiment is performed using human RB1, the remainder of the work is performed in the yeast model organism. Experimental methods used include mostly immunoprecipitation experiments (Western blots), spot assays, and some single cell microscopy (not specified if widefield or confocal).

      Major comments

      1) The authors refer to a cross-species screen where they aim to detect human proteins that rescue, upon overexpression, the yeast mec1Dsml1D and rad53Dsml1D lethality (of note, not mec1D/rad53D: why?). They identify hRB1 this way. But the entire screen data is missing, as are the analysis pipeline and "hit selection thresholds" (if applicable). Then no more experiments are performed on human cells or using human proteins. In my opinion this cross-species approach is not necessary, or not developed enough.

      A: Yes, we have only performed a pilot screen based on growth on 4 mM HU. We are considering removing it. We used mec1Dsml1D for the genetic screen because mec1D/rad53D cells are dead even without HU, and dissection assays are not suitable for large-scale screening.

      2) Moreover, the interpretation of the data provided as a whole is strongly complicated by the variability in the HU doses used to trigger the Mec1/Rad53 response. While most immunoprecipitation experiments are performed with 200mM, spot assays are performed at various HU concentrations ranging from 3 to 21mM (and exploring the entire range). Sometimes HU concentrations differ on the same Figure panels. Downstream effects of such diverse HU concentrations might also be very diverse and due to this it is difficult to get an understanding of how the different experiments fit together.

      A: Sorry for the confusion. We used at least three HU concentrations in each experiment, but showed only one of them to save space in a short article. Notably, S. cerevisiae tolerates a much broader range of HU doses (up to 300 mM) than other species (less than 10 mM). Spot assays (in which HU is present throughout) are mostly done in the mec1Dsml1D and rad53Dsml1D backgrounds (sensitive to 4 mM HU), whereas the IP experiments (only 2-3 h of treatment followed by removal) are mainly performed in the WT background, or at least in comparison with it (resistant up to 250 mM HU). We will add the other figures during revision.

      3) Likewise, some experiments are performed only on rad53D backgrounds, or only on mec1D backgrounds (e.g. Fig1B and Fig1F, respectively), while results are claimed valid for the two gene deletion backgrounds.

      A: Thank you. We will add some of the data currently described as “not shown” and remove claims that are not supported by data.

      4) Finally, the experiments performed in this study and/or their quantitative analysis are insufficient to support several of the claims, and results are often "over-interpreted". Below I have listed some of such insufficient experiments/analyses, in regard of the interpretation that the authors make of each piece of data.

      - Fig1B could indeed show that Whi7 could rescue rad53D lethality but it is hard to judge from just one tetrad. Many tetrads should be shown to exclude "random sampling" effects.

      A: Thank you. We will add more repeats and remove overstatements. The experiment in Fig 1B was carried out on at least 12 tetrads, but the original picture has been unintentionally lost. We can repeat it if necessary, but the result was validated by the plasmid shuffling experiment (Fig 1C).

      - Fig1F indeed shows that the rescue effect of Whi7 overexpression on mec1Dsml1D lethality in HU does not require its G1/S transcription factor-binding motif (GTB); however, it does not prove that it is independent on any putative effects that Whi7 could have on transcription (it could affect other transcription factors, or even the same ones via other domains).

      A: Good suggestion. As far as we know, there are no reports proving that Whi7 binds to other transcription factors. To rule out this possibility, we will detect whether overexpression of WHI7 affects the transcription of representative G1/S genes (Experiment #7).

      - FigS2A does not really support the authors' claim that Whi7 is hyperphosphorylated upon HU-treatment: the first lane before HU treatment already shows the same hyperphosphorylated bands as the second lane (see "darker exposure"); however, the signal intensity is clearly lower so the overall levels of Whi7 are clearly increased by HU, rather than the relative fractions of phosphorylated species.

      A: Yes, we will modify the statement as suggested.

      - Fig2B shows that HU-dependent increase in Whi7 levels is partially abrogated in rad53Dsml1D and mec1Dsml1D mutant backgrounds, which demonstrates that Whi7 upregulation requires either Rad53 or Sml1, and Mec1 or Sml1, but not Rad53/Mec1 as claimed by the authors.

      A: Thank you, we will revise the statement. Sml1 is a small unstructured protein whose only known function is to inhibit Rnr1.

      - Likewise, Fig2B does not show any significant Whi7 phosphorylation following HU-treatment in the whi7-13AP mutant with all CDK consensus sites mutated to alanine. There is indeed a slightly slower migrating band appearing, as acknowledged by the authors, which also appears in the mec1Dsml1D and rad53Dsml1D backgrounds. Again here, higher Whi7 levels in the WT background make the comparison with mec1Dsml1D and rad53Dsml1D backgrounds almost impossible. Quantification of the blots, including normalization of the signals of each phosphorylated band to the total signal, could help. But overall this figure does not demonstrate any Mec1/Rad53-dependent Whi7 phosphorylation following HU treatment. The phostag gel Fig2C might show the same result, as the differences in phosTag signals between different conditions might just simply reflect the differences in total amount of Whi7 between those same conditions. However, I acknowledge that Figs 2D and S2C show Rad53- and Mec1-triggered Whi7 phosphorylation in vitro, but the conditions of these experiments likely differ a lot from the in vivo context (kinase levels, competing substrates, presence of co-factors...).

      A: Thank you, we will quantify the blots as suggested.
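
      The normalization the reviewer asks for can be sketched as follows: for each lane, the summed intensity of the phosphorylated bands is divided by the total Whi7 signal, and the resulting fractions are averaged across biological replicates. This is a minimal illustration that assumes background-subtracted band intensities are already available from densitometry; the function names and the numbers are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def phospho_fraction(phospho_bands: Sequence[float], unphospho_band: float) -> float:
    """Fraction of the total lane signal found in phosphorylated bands."""
    phospho = sum(phospho_bands)
    total = phospho + unphospho_band
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return phospho / total


def summarize_replicates(fractions: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation across biological replicates."""
    if len(fractions) < 2:
        raise ValueError("at least two replicates are required")
    return mean(fractions), stdev(fractions)


# Hypothetical intensities (arbitrary units) from three replicate blots:
# each entry is (phosphorylated band intensities, unphosphorylated band intensity).
replicates = [([30.0, 10.0], 60.0), ([28.0, 14.0], 58.0), ([35.0, 9.0], 66.0)]
fractions = [phospho_fraction(p, u) for p, u in replicates]
print(summarize_replicates(fractions))
```

      Because each fraction is computed relative to the total signal in its own lane, a change in overall Whi7 abundance alone leaves it unchanged, which is what distinguishes a shift in phosphorylation state from a shift in protein level.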

      - Along the same lines, Fig3E seems to show that truncation of Whi7 C terminus slightly reduces its efficiency in pulling down Cks1 (indicating reduced interaction). However, the total amount of WT Whi7 in the pull down seems to exceed the total amount of Whi7-DeltaC protein, which could in part explain the difference in Cks1 signal. Here again, quantification of the WB signals and adequate normalization would maybe make this figure more convincing.

      A: Good suggestion. We will show the biological repeats and quantification.

      - Fig4A-B (Whi5 GFP data): the cell representing the absence of HU shows Whi5 nuclear export and therefore likely passes through G1/S; the HU-treated cell shown as example does not export Whi5 from the nucleus, certainly because it does not pass G1/S. IMHO this demonstrates that the G1/S transition is delayed in HU-treated cells (as shown previously), irrespective of any role of Whi5 or Whi7 in this delay.

      - Likewise, Fig4C shows the absence of HU-induced delay in Whi5 nuclear export in rad53Dsml1D cells; however, while the authors claim this indicates "Rad53-dependent nuclear detention of Whi5", it is equally plausible that it indicates that rad53Dsml1D cells do not delay the G1/S transition under HU treatment.

      A: Good comments. We should present both possibilities at this stage. Previous studies mainly reported delays at the Start stage (e.g., down-regulation of SBF transcription). CLN1/2 deletion is known to delay DNA replication in a Sic1-dependent manner, although the mechanism at the S-CDK activation stage is unknown.

      - The same ambiguity holds for Fig5A,B (Sic1-GFP quantification in whi5Dwhi7D double deletion strain following release from alpha factor block): indeed Sic1 is degraded fast after release from alpha factor block both in presence of HU, while in WT cells Sic1 is not immediately degraded in presence of HU. While authors claim that "Whi7 and Whi5 significantly slow down the Sic1 degradation", this result could also likely reflect that whi5Dwhi7D cells pass G1/S even in this context, and therefore that whi5 or whi7 or both have a role in maintaining cells in G1, not showing any direct implication of Whi5/Whi7 in Sic1 degradation.

      A: Good comments. This experiment provides only indirect evidence. For instance, whi5Dwhi7D cells pass G1/S with the same timing as WT in the absence of HU (Fig. S4), indicating that the role of Whi5/7 in the G1/S delay reflects an additional checkpoint function rather than their normal function in maintaining G1. Moreover, this result should be considered together with other results, for example the dosage suppression effects in the presence of HU and the inhibitory effects in the absence of HU. Direct evidence for a role of Whi5/Whi7 in Sic1 degradation and Cks1 inhibition comes only from the biochemical experiments shown in Fig 3E-3H.

      - FigS5: the authors show here that overexpression of Whi7-WIQ (that does not bind SBF) slows down the G1/S transition following release from alpha factor blockade, but this data does not demonstrate anything related to the role of Whi7 in the DNA replication stress response. Indeed, since Whi7 sequesters Cln3 in the ER (independent of any putative role on transcription regulation), its overexpression could simply reflect an increased sequestering of Cln3 pool. What does this result become in a cln3D background?

      A: Very good suggestion. We will check whether cln3Δ affects the suppression effect of Whi7 (Experiment #8).

      Due to the fundamental concerns raised above in the interpretation of the data, it is difficult to predict the outcome of more controlled experiments that would aim to prove the same statements. This makes the estimation of the time and resources required to complete the study almost impossible.

      Minor comments

      Owing to the major comments above, an important re-structuration of the study is required, and minor comments I may have on this version are likely to be irrelevant to the revised manuscript.

      Reviewer #3 (Significance (Required)):

      The study aims to establish a molecular link between the progression through the G1/S transition and the DNA damage and DNA replication stress responses. Establishing molecular links between different phases of the cell cycle is an important question in basic research, and might be of interest for a broad range of cell biologists, even though the study is conducted in a model organism (budding yeast). The link proposed involves G1/S inhibitors Whi5 and Whi7, that would bind and inhibit the Cks1 subunit of S-CDK complexes, downstream of Rad53 and Mec1 signaling. The authors confirm some known results (e.g., Whi7 overexpression bypasses rad53 lethality in presence of HU) and gather new pieces of data using well-established methods (immunoprecipitation, spot assays, fluorescence microscopy). However, many experiments reported in this study are not sufficient to support the authors' claims, and therefore the novel mechanistic insight that this study aims to provide is not established.

      My scientific background being more in bio-imaging than in biochemistry, it is possible that I missed some hands-on experience to correctly interpret artefacts on western blots, however I do not feel like I missed sufficient expertise to evaluate any section of the manuscript.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors report dorsomedial hypothalamus-specific PR-domain containing protein 13-knockout (DMH-Prdm13-KO) mice recapitulated age-associated sleep alterations such as sleep fragmentation and increased sleep attempts during sleep deprivation (SD). These phenotypes were further exacerbated during aging, with increased adiposity and decreased physical activity, resulting in shortened lifespan. Moreover, overexpression of Prdm13 in the DMH ameliorated sleep fragmentation and excessive sleepiness during SD in old mice. They identified maintaining Prdm13 signaling in the DMH might play an important role to control sleep-wake patterns during aging. These findings are interesting and novel and the evidence they provided looks solid.*

      We deeply appreciate that this reviewer found our findings interesting and the evidence solid.

      *Major comments 1. The author spent a lot of words on Sirt1 in the introduction. Since Sirt1 regulates Prdm13, is there a link between the two in age-related sleep changes? If so, you can add some results and discussion. *

      Thank you very much for raising this important issue. Our previous study demonstrated that a mouse model with high hypothalamic Sirt1 activity displays a reduced number of transitions between wakefulness and NREM sleep (reference # 15), revealing that hypothalamic Sirt1, as well as Prdm13, is involved in the regulation of sleep fragmentation. However, sleep propensity was not altered in Sirt1-overexpressing transgenic mice (reference #13) and DMH-Prdm13-KO mice (Fig. 1). Based on these findings, we added the following sentence in the Results.

      On page 11, line 267-274

      "...... Similarly, a mouse model with high hypothalamic Sirt1 activity displays reduced number of transitions between wakefulness and NREM sleep15, revealing that hypothalamic Sirt1, as well as Prdm13, is involved in the regulation of sleep fragmentation. Sleep propensity was not altered in Sirt1-overexpressing transgenic mice13. Given that the level of hypothalamic Prdm13 and its function decline with age, age-associated sleep fragmentation could be promoted through the reduction of Prdm13/Sirt1 signaling in the DMH, but sleep propensity might be increased by other mechanisms. "

      • In Figure 2e, the author describes n=7-8 in the figure legend, but why do both groups on the column show eight data? Is there something wrong with the statistics? Please check the statistics in the article carefully. *

      We corrected n=7-8 to n=8 in the figure legend of Fig. 2e.

      • DMH is known as one of the major outputs of hypothalamus circadian system and is involved in the circadian regulation of sleep-wakefulness (J.Neurosci. 23, 10691-10702 ; Nat Neurosci 4:732-738). Does Prdm13 correlate with circadian rhythms? The author can add relevant content to the discussion *

      As per this reviewer's suggestion, we added the following sentence in the Discussion on page 20, line 500-508,

      "For instance, it would be of great interest to elucidate whether Prdm13 signaling in the DMH contributes to regulate the circadian system, since the DMH is known to be involved in the regulation of several circadian behaviors32,33. Although DMH-Prdm13-KO mice did not display abnormal period length compared with controls, further studies are needed to address this possibility."

      *Minor comments 1. The immunohistochemical diagram in the paper is not representative enough, as shown in FIG. 2b and c. *

      We apologize that our presentation in Figs. 2a-c was confusing. Although Fig. 2b shows the numbers of cFos cells in the entire region of the DMH (summed up from three DMH regions), the images in Fig. 2c are from one of DMH regions for each condition. To avoid confusion, we revised the legend of Figs. 2a-c and the manuscript in the Results as follows:

      -In the figure legend of Figs. 2a-c

      "a, Total numbers of cFos+ cells ......... b,c, Images of DMH sections at bregma -1.67 mm ......."

      -In the Results on page 7, line 180

      "...... the hypothalamus, the DMH (summed up from bregma -1.67 to -1.91mm) showed a greater number of cFos+ cells during SD compared to SD-Cont (Fig. 2a-c, Supplementary Fig. 2a)..... "

      • In FIG. 5h, the authors showed that the effect of overexpression of Prdm13 was very obvious, but the expression range of the virus after injection was lacking. Is there a fluorescent gene such as GFP on the virus to directly see the expression of the virus in the brain? *

      Unfortunately, we do not hold extra samples to check the distribution of the virus after injection. However, we have an established injection technique that reliably targets the DMH using the lentivirus system that we used in this study (Satoh et al Cell Metab 2013).

      • Were mice singly housed or housed in groups? *

      Most of the mice were housed in groups, except for the DR study. We added this information in the section Animal models of the Methods on page 41, line 935

      ".....RIKEN BRC. Most of the mice were housed in groups, except for the DR study. For the DR study ,..... "

      • The part of sleep analysis needs to be further refined. How can REM and NREM in mice be distinguished and according to what criteria? *

      We added the criteria to define NREM and REM in the section Sleep analysis of the Methods on page 42, line 995-998.

      ".......with visual examination. EEG periods dominated by higher amplitude delta wave activity with nuchal muscle atonia were scored as NREM sleep epochs. REM sleep consisted of periods of semi-uniform theta activity EEG with muscle atonia and/or muscle atonia with brief myoclonic twitches. Score was blinded ......"

      • The authors may consider adding more recent literature related to DMH and sleep, such as DOI: 10.1093/cercor/bhac258 *

      We incorporated this reference into the following sentence in the section Results on page 8, line 194.

      "........ Although DMH neurons are linked to sleep21, aging and longevity .... "

      *Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: In this study, Tsuji et al. demonstrate that Prdm13 signaling is involved in the regulation of sleep-wake pattern. They also identified Prdm13 as a transcription factor in the DMH neurons. Major comments: 1. The evidence presented in Fig. 1 of age-related sleep fragmentation is potentially problematic. Although many previous studies have demonstrated fragmented sleep, especially fragmentation of NREM sleep, in aged mice compared to young mice, the data here do not suggest NREM fragmentation, because no change in the NREM bout duration was found. REM, on the other hand, may indeed have fragmentation during the dark phase, but REM only takes a small portion of the total sleep. Therefore, the conclusion that sleep is fragmented in old mice is not fully supported by Fig. 1. I noticed that the authors used 4-6 months old mice as the young group. Mice of this age can hardly be called "young". The females even start to have lowered fertility. This might be one of the reasons for the discrepancy between this and other studies. Repeating these experiments (and others involving the young group) with mice of more appropriate age (usually 2-3 months old) is recommended. Nonetheless, aging-caused sleep change is not new knowledge and has been reported repeatedly. This part of the results should be in the supplementary figures. *

      We deeply appreciate this reviewer's comment. In accordance with this reviewer's suggestion, we carefully reconsidered the age of young mice. Most published studies used mice at 2 to 4 months of age as the young group [2 to 4-month-old (7 studies), 4.6-month-old (1 study), 6-month-old (1 study), 2 to 6-month-old (1 study)]. Thus, to strictly use mice at 3-4 months of age as the young group, we excluded data of one cohort using mice at 6 months of age (2 mice each age group). Consistent with many previous studies, our revised data demonstrated that sleep fragmentation during NREM sleep is predominantly observed in old mice compared with young mice, particularly during the dark period. Based on these new results, we revised Fig.1, Suppl Fig.1, and all descriptions related to Fig. 1 (manuscript on page 5-7, line 103-171). We would like to keep Fig. 1 as it is. Since most of the previous studies used males but not females, data from females are still lacking in the field (Campos-Beltran and Marshall, Pflugers. Arch., 473:841-851, 2021).

      • The sleep phenotypes in aged mice and in Prdm13-KO mice are clearly distinct from each other. In the old mice (Fig. 1), REM sleep is fragmented but the total amount remains unchanged, and NREM sleep is increased (both bout number and total amount), indicating there may be more REM-to-NREM transitions, which the authors should quantify. However, Fig. 3 shows in Prdm13-KO mice, there is no REM fragmentation. In fact, it even seems to stabilize REM. But NREM duration is shortened, with no change in the total NREM or REM sleep time. These results suggest that the sleep alterations caused by aging and Prdm13-KO might have some overlap but are mostly in parallel and likely through different mechanisms. Therefore, the rationale of connecting Prdm13 signaling to aging-caused sleep changes is questionable. Is there a developmental change of Prdm13 expression in DMH between young and old mice? The authors also showed that Prdm13-KO in old mice caused a decrease in NREM duration but has no effect on REM sleep, but in normal old mice, it is REM, but not NREM that has a defect. Prdm13 overexpression also only mildly decreased NREM bout number without affecting the episode duration of either NREM or REM, which can hardly be interpreted as "ameliorating sleep fragmentation". To me, all these results just suggest parallel actions of Prdm13 and aging on sleep, with Prdm13 mostly affecting NREM sleep but aging mostly impairing REM sleep. *

      We deeply appreciate this reviewer's keen eyes. We carefully reassessed REM sleep data in Fig. 3. The revised data showed that whereas the duration of NREM episodes in DMH-Prdm13-KO mice during the dark period was significantly shorter than in the control group, the duration of REM episodes in the KO mice was not significantly altered. Therefore, after revising Fig. 1 and 3, our results showed that both aging and Prdm13-KO similarly affect the duration of NREM sleep episodes. These results suggest that sleep fragmentation, in particular, during NREM sleep, is commonly observed in old mice and DMH-Prdm13-KO mice. In addition to sleep fragmentation during NREM sleep, excessive sleepiness during SD was also commonly observed in old mice and DMH-Prdm13-KO mice. On the other hand, the effects of aging and Prdm13-KO on sleep propensity were distinct from each other. We think that age-associated sleep fragmentation could be promoted through Prdm13 signaling in the DMH, but sleep propensity might be increased by other mechanisms. We described these results and possibilities in the Results, and revised the Abstract as follows:

      On page 11, line 264-274

      "activity in DMH-Prdm13-KO mice (Fig. 3h, Supplementary Fig. 3f-h). Together, sleep fragmentation during NREM sleep and excessive sleepiness during SD are commonly observed in old mice and DMH-Prdm13-KO mice, but the effects of aging and Prdm13-KO on sleep propensity were distinct from each other.............. Given that the level of hypothalamic Prdm13 and its function decline with age16, age-associated sleep fragmentation could be promoted through the reduction of Prdm13/Sirt1 signaling in the DMH, but sleep propensity might be increased by other mechanisms."

      On page 2, line 45-46

      "Dietary restriction (DR), a well-known anti-aging intervention in diverse organisms, ameliorated age-associated sleep fragmentation and increased sleep attempts during SD, whereas these effects of DR were abrogated in DMH-Prdm13-KO mice."

      As this reviewer pointed out, the effect of Prdm13 overexpression on NREM sleep fragmentation seems to be moderate, but we still observed effects on excessive sleepiness during SD. Thus, we revised the manuscript related to Prdm13-overexpression study in the Abstract and Results as follows:

      On page 2, line 47-48

      "Moreover, overexpression of Prdm13 in the DMH ameliorated sleep fragmentation and excessive sleepiness during SD in old mice."

      On page 16, line 387-401

      "Overexpression of Prdm13 in the DMH partially affects age-associated sleep alterations

      ...... (Fig. 5h). The number of wakefulness and NREM sleep episodes in old Prdm13-OE mice were significantly lower, whereas duration of wakefulness in old Prdm13-OE mice tended to be longer than old control mice during the dark period with no change in the duration of NREM episodes (Fig. 5i,j). Intriguingly, .... Thus, the restoration of Prdm13 signaling in the DMH partially rescue age-associated sleep alterations, but its effect on sleep fragmentation is moderate."

      • What is the control manipulation for sleep deprivation? The authors need to clarify this in the Methods. Also, sleep deprivation has confounding effects including but not limited to stress, food deprivation (since food was removed during SD), human experimenter (since a gentle-touch method was used). Without proper controls for these variables, the authors should avoid concluding that the changes they saw at cellular level are due to sleep loss. *

      Thank you very much for this suggestion. We added detailed description for AL-SD (the control manipulation for SD) in the section SD study of the Materials as follows:

      On page 42-43, line 1014-1020

      "Mice for control manipulation (AL-SD) were also individually housed prior to the experiment without SD and food removal. We checked the level of blood glucose in the SD study, and found that the level of blood glucose was indistinguishable between SD and AL-SD groups (126±6 and 131±4 mg/dL, respectively), revealing that nutritional status is equal between these two groups."

      Identification of Prdm13+ cells using neuronal markers should be performed in addition to electrophysiological characterizations.

      We performed immunofluorescence using anti-MAP2 antibody and confirmed that most Prdm13+ cells are neurons. We added this new result in Suppl Fig. 2g.

      • Figs. 6 and 7 seem very disconnected from the main story. Identification of Prdm13 as a transcription factor is potentially interesting, but how does it account for its role in affecting sleep? The criteria of picking Cck, Grp and Pmch out of other candidate genes potentially regulated by Prdm13 and the rationale to investigate these genes seem unclear. More importantly, no evidence was shown regarding how Cck/Grp *

      Based on RNA-sequencing using DMH samples from DMH-Prdm13-KO and control mice, we identified several candidate genes as downstream genes of Prdm13. After validating the candidate genes by qRT-PCR, Cck, Grp and Pmch were detected as top-hit genes. We thus further assessed these three genes in this study. Our result showed that Cck expression in the hypothalamus significantly declines with age. Based on other literature, hypothalamic Cck seems to be involved in sleep control. Therefore, it is conceivable that Prdm13 controls age-associated sleep alterations via modulating Cck expression. However, as this reviewer pointed out, we are still lacking the evidence showing the role of Prdm13/Cck axis in age-associated sleep alterations. We now clearly described the limitation of our study in the Discussion on page 23, line 560-562.

      "However, the detailed molecular mechanisms by which Prdm13 in the DMH regulates age-associated sleep fragmentation and excessive sleepiness during SD still need to be elucidated in future study. "

      *Minor comments: 1. Please note on the images of Fig. 2d what the green fluorescence was. It is very confusing as is, given that it's surrounded by quantifications of c-fos in the figure. *

      The label "Prdm13" was added in Fig. 2d.

      Please use a different color for Prdm13 in several figure images (e.g., Fig. 2f, g, 7a,d, and Supplementary 2c). Yellow usually means overlap of red and green.

      Since we have four-color images in Fig. 7, we consistently used yellow for Prdm13 throughout the main figures of the paper. At this moment, we would like to keep the current version of the images, but we will revise them if the editor of the affiliate journal requests this revision.

      • Please note the statistical test results on power spectrum graphs. *

      We added the statistical test results on power spectrum graphs in Figs. 1d, 4c, and 5d.

      • Inconsistency between the graphs in Fig. 3d and the description in the text. Fig. 3d suggests no change in Wake episode duration, significant decrease in Dark phase NREM and significant increase in Dark phase REM, whereas lines 224-227 in the main text state "The duration of wakefulness episodes ... was significantly shorter than control mice during the light period, and the duration of NREM sleep episodes ...was significantly longer ... during the dark period (Fig. 3d)". Which one is correct? Please check. *

      We apologize for this typo and unclear description. We revised the sentence regarding Fig. 3d as follows:

      On page 10, line 242-246

      "The duration of wakefulness episodes in DMH-Prdm13-KO mice was significantly shorter than control mice during the light period between ZT0 to ZT2. The duration of NREM sleep episodes in DMH-Prdm13-KO mice was significantly shorter than control mice during the dark period (Fig. 3d). These results indicate that DMH-Prdm13-KO mice showed mild sleep fragmentation compared with control mice."
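The episode number and episode duration measures discussed in this reply can be illustrated with a minimal sketch. It assumes a hypnogram stored as one stage label per scored epoch; the labels "W", "N" and "R" and the 10 s epoch length are hypothetical choices for illustration and are not taken from the manuscript's Methods.

```python
from itertools import groupby

def episode_stats(stages, epoch_seconds=10):
    """Return {stage: (episode_count, mean_episode_duration_in_seconds)}.

    An episode is a run of consecutive epochs with the same stage label.
    The default epoch length of 10 s is an assumption for illustration.
    """
    durations = {}
    for stage, run in groupby(stages):
        run_length = sum(1 for _ in run)  # number of epochs in this episode
        durations.setdefault(stage, []).append(run_length * epoch_seconds)
    return {
        stage: (len(values), sum(values) / len(values))
        for stage, values in durations.items()
    }

# Hypothetical hypnogram: W = wakefulness, N = NREM sleep, R = REM sleep
hypnogram = list("WWWNNNNWNNRRW")
for stage, (count, mean_duration) in sorted(episode_stats(hypnogram).items()):
    print(stage, count, round(mean_duration, 1))
```

With total sleep time held constant, more episodes of shorter mean duration is the pattern usually described as fragmentation, which is why both numbers are reported together.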

      • Fig. 5f, Y-axis title should be EEG SWA. *

      We corrected it.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      1. General Statements [optional]

      We are grateful to the reviewers for highlighting the novelty of the mechanism we describe for P2Y2 in driving RGD-binding integrin-dependent invasion, and acknowledging its potential in cancer therapy. We thank the reviewers for their valuable and detailed comments, which have allowed us to prepare a significantly stronger and clearer manuscript.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):


      Summary

      The study identifies P2Y2 as a purinergic receptor strongly associated with hypoxia, cancer expression and survival. A link is found between P2Y2-integrin interaction and cancer invasion, highlighting this as a novel therapeutic target. The mechanism is interesting and generally well explored.

      We thank the reviewer for acknowledging the novelty of the therapeutic target presented in this work.

      Minor comments

      As P2Y2 is highly expressed by other cell types found within tumours, including vascular endothelium and leukocytes, the authors should reflect on this as a confounding factor in the adenocarcinoma gene expression analysis. I appreciate the RNAscope work may resolve this issue to some extent.

      We agree that P2Y2 is known to be expressed in other cell types. RNAscope did not show convincing staining in PDAC normal adjacent tissue (was similar to negative staining), perhaps due to the challenging nature of pancreatic tissue with respect to RNA degradation. We have resolved this issue by including single cell RNA-seq of normal human pancreas for P2Y2 from Protein Atlas (Sup. Fig. 2B), which shows expression in several cell types, mainly endocrine cells, and macrophages. We now mention this in line 142 : “P2Y2 is known to be expressed at low levels in normal tissues but interestingly RNAscope did not detect this. This data suggest 1) the lower limits of the technique compounded by the challenge of RNA degradation in pancreatic tissue and 2) supports that in tumour tissue where it was detected there was indeed overexpression of P2Y2, in line with the bioinformatic data. Interrogating single cell P2Y2 RNA expression in normal PDAC from proteinatlas.org (Karlsson et al., 2021), expression was found at low levels in several cells types, for example in endocrine cells and macrophages (Sup. Fig. 2B).”

      Major comments

      The authors correctly identify that the level of ATP in the tumour microenvironment can be very high, typically 100uM or so. However, these concentrations are supramaximal for P2Y2 activation, at which ATP has an approximate EC50 of 100nM. Coupled with the fact that many cell types, including cancer cells, constitutively secrete ATP, there is an opportunity to explore the effects of lower ATP concentrations in some assays, or provide some concentration-response relationship to give more confidence of P2Y2-dependent effects.

      We thank the reviewer for raising this point and we agree that 100 uM can be a high concentration, albeit one that is frequently used throughout the literature. We have now included a concentration-response relationship (Sup. Fig. 2D) showing that ATP causes cytoskeletal changes that are P2Y2 dependent most prominently at 100 uM, the concentration that, as the reviewer has also corroborated, is similar to the concentration of ATP found in tumours.
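The reviewer's point about supramaximal concentrations can be made concrete with a minimal sketch based on the Hill equation. It assumes a simple one-site model with a Hill coefficient of 1 and uses the approximate EC50 of 100 nM quoted by the reviewer; it ignores ATP breakdown and local concentration gradients, and it is not the analysis behind Sup. Fig. 2D.

```python
def hill_response(conc_molar, ec50_molar, hill_coefficient=1.0):
    """Fractional response (0 to 1) predicted by the Hill equation."""
    if conc_molar <= 0:
        return 0.0
    ratio = (conc_molar / ec50_molar) ** hill_coefficient
    return ratio / (1.0 + ratio)

# Approximate EC50 quoted by the reviewer (about 100 nM); an assumption here.
EC50_MOLAR = 100e-9

# Concentrations from 10 nM up to the 100 uM used in the assays
for conc in (10e-9, 100e-9, 1e-6, 10e-6, 100e-6):
    print(f"{conc:.0e} M -> fractional response {hill_response(conc, EC50_MOLAR):.3f}")
```

Under these assumptions the predicted response is already close to maximal at 1 to 10 uM, which is why a concentration-response series is more informative than a single 100 uM condition.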

      Also, the authors describe the use of cancer cells where P2Y2 has been knocked out using CRISPR. Does this KO have an effect on cancer invasion? The effect of ARC should be absent in these cells and give confidence the effects of ARC are P2Y2-dependent, as some off-target effects of this antagonist have been reported. To explore the influence of constitutive P2Y2 activity, the authors should explore the effects of ARC alone in some assays.

      We agree that including more AR-C only experiments would be informative, so we have included a 3D sphere invasion assay with our CRISPR cell line treated with and without AR-C that shows no effect in invasion (p = 0.4413) (Sup. Fig. 3J). We have now also included images of AsPC-1 cells transfected with Lifeact, showing no changes in morphology with AR-C only (Sup. Fig. 2E). We apologise for missing a ‘+’ in one of the supplementary figures which shows AR-C only in AsPC-1 cells has no effect on its own.

      The effects of the CRISPR cell line in invasion are shown in Fig. 3F, showing a significant reduction (p = 0.0005) in invasion.

      The title of the manuscript implies extracellular ATP drives cancer invasion, though in my opinion this statement is not fully explored. Though ATP/UTP are applied at supramaximal concentrations for P2Y2 activation, the influence of ATP in the cell culture microenvironment without exogenous application is not explored. One would predict that scavenging extracellular ATP with apyrase would negatively impact invasiveness and the proximity of integrin and P2Y2 without ATP/UTP application if constitutively secreted ATP is involved. Pharmacological manipulation of ectonucleotidase activity is an alternative experimental route to explore this.

      We agree and have changed the title of our article to “Purinergic GPCR-integrin interactions drive pancreatic cancer cell invasion”. Our 3D sphere experiments with the CRISPR cell line show a reduction in invasion without exogenous application of ATP, which we also see to a lesser extent in our siRNA P2Y2 cell line. We have tested our sphere model with apyrase but unfortunately, the buffer used for apyrase to work is not compatible with our gel composition. Pharmacological manipulation would be a very good alternative if the cells used expressed high levels of CD39 or PANX1, which unfortunately they don’t. We hypothesise that most basal extracellular ATP in our 3D spheres comes from hypoxic areas that cause cell death, just as is postulated for tumours.

      Immunoprecipitation experiments of native proteins would be more convincing data that P2Y2 and integrin physically interact, as opposed to being in close proximity. This would also overcome artifacts of interaction that can be attributed to receptor overexpression.

      We attempted immunoprecipitation experiments but unfortunately ran into several technical difficulties, including the anti-aV antibody working poorly for Western blot. Immunoprecipitation of these proteins has been reported by others (PMID: 25908848), supporting the proposed interaction.

      DNA-PAINT super resolution microscopy allows for quantification of nanoscale distances, and we used this to calculate the distances where physical interaction occurs. The possibility of this close proximity being by chance is accounted for in the computational nearest neighbour distance calculation by also calculating the distances for randomly distributed points. This random distribution calculation also helps in overcoming artifacts of interaction due to overexpression, as the number of randomly distributed points equals the number of proteins detected in each condition for each region of interest. Importantly, we also performed DNA-PAINT using untransfected AsPC-1 cells, and thus endogenous expression levels (no receptor overexpression or alteration), and saw similar results (Sup. Fig.4A-D); thus we are confident of the interactions reported.
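The logic of comparing observed nearest neighbour distances with a random distribution of the same number of points can be sketched as follows. This is an illustrative sketch only, not the pipeline used in the manuscript: the rectangular region of interest, the distance threshold, the number of simulations and all function names are hypothetical.

```python
import math
import random

def nearest_neighbour_distances(points_a, points_b):
    """For each point in points_a, the distance to its closest point in points_b."""
    return [min(math.dist(a, b) for b in points_b) for a in points_a]

def random_points(n, width, height, rng):
    """n points placed uniformly at random in a width x height region."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]

def fraction_within(distances, threshold):
    """Fraction of distances at or below the threshold."""
    return sum(d <= threshold for d in distances) / len(distances)

def compare_to_random(points_a, points_b, width, height, threshold,
                      n_sim=200, seed=0):
    """Return (observed, expected_by_chance) fractions of close pairs.

    The chance level is estimated by redistributing the same numbers of
    points at random in the same region, so a higher density (for example
    from overexpression) raises the baseline as well as the observation.
    """
    rng = random.Random(seed)
    observed = fraction_within(
        nearest_neighbour_distances(points_a, points_b), threshold)
    simulated = []
    for _ in range(n_sim):
        rand_a = random_points(len(points_a), width, height, rng)
        rand_b = random_points(len(points_b), width, height, rng)
        simulated.append(fraction_within(
            nearest_neighbour_distances(rand_a, rand_b), threshold))
    return observed, sum(simulated) / n_sim
```

An observed fraction well above the simulated baseline suggests that proximity is not explained by density alone, although it does not by itself prove a physical interaction.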

      Finally, we alter the RGD motif, which underpins the physical interaction, and see significant changes that match observations in previous publications using the P2Y2 agonist UTP, mentioned in the discussion: Line 398 “Following ATP stimulation, the number of P2Y2 proteins at the plasma membrane decreased significantly after one hour, implying receptor internalisation, in line with previous work showing P2Y2 at the cell surface was reduced significantly after one hour of UTP stimulation (Tulapurkar et al., 2005).” and Line 408: “P2Y2 affecting cell surface redistribution of αV integrin has been reported, with αV integrin clusters observed after 5 min stimulation with UTP (Chorna et al., 2007)”

      It is currently not clear what the mechanistic relationship between P2Y2 activity, P2Y2-integrin proximity and RGD motif is. Do the authors suggest the RGD domain becomes exposed upon receptor activation? The mechanism is not fully articulated in the discussion.

      We apologise for any lack of clarity in our postulated mechanism; we have now included a more detailed explanation of the mechanism in the discussion: Line 417 “We speculate that by reducing the ability of integrins to bind to the RGD of P2Y2, through receptor internalisation, RGE mutation or through cRGDfV treatment, there is less RGD-triggered integrin endocytosis, hence less integrin recycling and an increase of integrins at the cell surface.”

      Reviewer #1 (Significance (Required)):


      General assessment: A novel mechanism is presented for therapeutic intervention of cancer. The study relies on supramaximal concentrations of agonist and overexpressed receptors. Role of endogenous P2Y2 not fully explored. The study lacks in vivo evidence of the importance of this mechanism. Cells developed in the study could be used in mouse models to explore effect on tumour growth.

      Advance: Integrin and P2Y2 interactions are already documented but not in context of cancer.

      Audience: basic research

      We thank the reviewer for crediting this work as a novel mechanism for therapeutic intervention of cancer. We trust that the new data provided (as discussed above) have resolved the concerns of the reviewer as we now have provided an explanation for the concentrations used. We do rely on overexpressed receptors for a small portion of our experiments; however, all experiments with overexpressed receptors were then tested in cells with endogenous expression of P2Y2 and used pharmacological means to show the same behaviour. We have now clarified this. We have also included in the discussion a sentence about the mouse experiment performed by Hu et al. with regards to reduced tumour growth when targeting P2Y2: Line 365: “Combination treatment of subcutaneous xenografts of AsPC-1 or BxPC-3 cells with the P2Y2 antagonist AR-C together with gemcitabine significantly decreased tumour weight and resulted in increased survival compared to placebo or gemcitabine monotherapy control (Hu et al., 2019).”

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Considering the fact that most PDAC are characterized by a high level of extracellular purines content, authors decided to study the expression of the 23 genes coding for membrane proteins involved in the binding or transport of purines in available PDAC transcriptomic cohorts. This approach led to the identification of P2Y2, a GPCR, as the best predictor for the worst survival of patients. Using in vitro models, they show that P2Y2 expression is associated with increased invasion capacity of pancreatic cancer cells and that this pro-invasive effect is dependent on the interaction of P2Y2 with αV integrin via the RGD motif.

      Major comments:

      • It is not clear to me why authors decided at one point to perform a GSEA comparing low and high mRNA expression of P2Y2 and why they decided to focus on the potential interaction of P2Y2 with integrin αV. As a GPCR, activation of P2Y2 leads to the activation of several downstream signaling pathways that may directly impact the adhesion, migration, and invasion properties of cells. Moreover, despite the presence of the RGD motif in P2Y2, it is not excluded that it may bind (maybe more efficiently) to other "cell adhesion" molecules.

      We apologise if the link between the GSEA figure and our focus on the potential integrin interaction was not clear. We have now performed GSEA using the PANTHER gene set library, which includes an “Integrin signalling pathway” gene set. This was the top-ranked gene set in both cohorts, and we have substituted it for the previous GSEA figure (Fig. 2D). We trust that the narrative of the manuscript and our rationale for pursuing the importance of the integrin interaction are now clear.

      We agree with the reviewer and believe that P2Y2 may bind to other molecules important in cell adhesion. We studied integrin interactions due to the clear relationship of P2Y2 and integrins in patient data, which was not as evident with other binding partners. Furthermore, this relationship is unexplored in cancer and offers novel therapeutic strategies.

      • Similarly, if αV can regulate P2Y2 signaling, what about the regulation of αV signaling pathways by P2Y2? αV integrin has to bind to a β subunit and, depending on the identity of the β subunit, may be regulated differently and so have a different impact on cell invasion. How can P2Y2 interfere with these α/β ratios?

      We thank the reviewer for this comment, and have now included western blots showing the impact of P2Y2 treatment on integrin signalling through FAK and ERK (Fig 5). We agree that the β subunit may have distinct regulation and outputs, but this is outwith the scope of our current study.

      • While it has been shown in other studies, in this work, there is no real proof of the interaction between P2Y2 and αV. Only in Figure 4I, where the authors look at the NND […]

      We thank the reviewer for raising this point as it has made us realise that our chosen NND of […]

      • Surprisingly, in the absence of ATP, the P2Y2 RGE mutant, which should no longer interact with αV, shows 2- to 3-fold more proximity to αV compared to WT P2Y2. How can the authors explain this?

      We agree that this is a surprising but robust finding. Altering the RGD motif may reduce RGD-triggered integrin endocytosis, leading to more integrins at the surface. We have included this hypothesis in the discussion in Line 417. The RGE mutant has reduced affinity for integrins but still retains some ability to bind them. Hence, by chance alone, a higher number of integrins at the surface will result in a higher number of interactions with the RGE mutant. We are planning to interrogate the internalisation dynamics in a future study.

      • For DNA-PAINT experiments, the authors only focus on membrane proteins whose amounts are balanced by internalization, recycling and export from internal compartments. As claimed, but not demonstrated by the authors, interaction of P2Y2 and αV may interfere with all these steps, thereby increasing or decreasing the cell surface expression of both proteins. Hence, it would be useful to 1) control protein levels by western blot, especially for the overexpressed P2Y2, to be sure that they are the same, 2) block internalization and/or export to decipher the important steps.

      • In fact, all these main questions are raised by the authors in the end of the discussion but so far, they only show that the RGD motif has an impact on the biological role of P2Y2 (cell invasion) and on the membrane dynamic of αV and itself.

      We thank the reviewer for the suggestions:

      • In the course of our attempts to perform co-IP for P2Y2 and αV, we could confirm that P2Y2 expression levels were equivalent (see Fig below – for reviewers only), but problems with the anti-αV antibodies prevented completion of the experiment. We also show IF staining showing similar levels of P2Y2 for both overexpressed conditions (Sup. Fig. 3K).

      Figure: Immunoprecipitation of P2Y2 showing similar P2Y2 levels in AsPC-1 P2Y2CRISPR cells transfected with P2Y2RGD or P2Y2RGE and treated with 100 µM of ATP or control for 1 hour. Antibody used: anti-P2Y2 (APR-010, Alomone Labs).

      • As the reviewer highlights, in this work we have focused on the role of P2Y2 in PDAC invasion and have looked at single-molecule resolution membrane dynamics of αV and P2Y2. The different steps of P2Y2 and integrin αV interactions in internalisation, recycling and export are certainly interesting to study but are beyond the scope of the current manuscript and form part of our future aims. We include these ideas in the discussion as suggestions for future research and as a possible explanation for the dynamics observed.

      • Fig 2A, authors use RNAscope in order to reveal P2Y2 mRNA expression and distribution in tumor versus normal tissue from 2 patients. They should rather show the protein expression, using the antibody they used in other experiments, by standard IHC and in a higher number of patients, including short and long survival, to confirm that the results they obtain by bioinformatics study of transcriptomic data are real.

      We now explicitly mention a paper (PMID: 30420446) that performed IHC of P2Y2 in 264 patients showing that P2Y2 was predominantly found in the tumour area, matching our bioinformatics study: Line 141 “matching our findings from larger publicly available cohorts, including P2Y2 IHC data from 264 patients in the Renji cohort (Hu et al., 2019).” and Line 359 “These observations were supported by published immunohistochemical staining of 264 human PDAC samples, showing that P2Y2 localised predominantly in cancer cells in human PDAC…”

      • Some figure legends are incorrectly numbered or described, such as the figure 4.

      We apologise for the incorrectly described legends in Figure 4; these have now been corrected.


      Minor comments:

      • Can we reasonably talk about OMIC while studying 23 genes? In fact, as described by Timothy A. J. Haystead in 2006 (PMID: 16842150), the purinome is constituted of about 2000 genes coding for proteins binding to purines (including all kinases, for example). Authors should redefine their pool of genes, perhaps as purine receptors/transporters?

      We agree with the reviewer and have redefined the pool of genes to ‘purinergic signalling genes’ or ‘(part of the) extracellular purinome’.

      • P2Y2 and ADORA2B associated with worse survival while P2Y11 and ADORA2A are associated with better survival (Figure 1B). Would it be more interesting to understand why proteins of the same family act in opposite ways?

      We have now included text exploring this idea in the discussion. Both P2Y2 and ADORA2B show increased expression with HIF-1α and/or hypoxia and the inverse happens with ADORA2A, for example. Line 352: “Adenosine A2B receptor requires larger agonist concentrations for activation compared to other receptors in the same family, such as adenosine A2A (Bruns, Lu and Pugsley, 1986; Xing et al., 2016), and receptor expression has been reported to increase when cells are subjected to hypoxia (Feoktistov et al., 2004). Moreover, HIF-1α has been shown to upregulate A2B and P2Y2 expression in liver cancer (Tak et al., 2016; Kwon et al., 2019).”

      • Figure 1C, any value for the correlation with survival? Because this is not so obvious in the figure.

      We agree this correlation needs strengthening with a numeric value; we have now included a Kaplan-Meier curve of high vs low Winter hypoxia score PDAC patients showing significantly lower survival with higher Winter hypoxia score (Sup. Fig. 1B).


      • Regarding the correlation of P2Y2 and ADORA2B with hypoxia scores, any HIF1 responsive element in promoter? What happens regarding the expression level of these genes when cells are transferred to low oxygen conditions?

      We thank the reviewer for these questions. The relationship of P2Y2 and ADORA2B with hypoxia and/or HIF-1α has been explored in other publications which are now cited in the discussion. Line 356: “Moreover, HIF-1α has been shown to upregulate A2B and P2Y2 expression in liver cancer (Tak et al., 2016; Kwon et al., 2019).” Of note, a HIF-1α-responsive element has been reported for A2B, but not as yet for P2Y2.

      • Figure 4 E to M are too small.

      We apologise and have now increased the size of the graphs and the figure.

      • In Supp Figure 4, what are the "Non-altered AsPC-1 cells"?

      We apologise for the confusion that may have arisen from calling normal AsPC-1 cells “Non-altered AsPC-1 cells”. We have changed this to ‘Normal AsPC-1 cells (untransfected and unchanged P2Y2 expression)’.


      Reviewer #2 (Significance (Required)):

      Strengths: All the data shown are experimentally and statistically strong.

      Limitations: This study remains largely descriptive with no real molecular mechanism that could at least partially explain the biological role of P2Y2 regarding cell invasion.

      Advance: Limited

      We thank the reviewer for noting the experimental strength of the paper.

      After the suggested changes, including integrin signalling experiments, and strengthening our DNA-PAINT results, the molecular mechanism presented in this work has been strengthened and clarified significantly. These changes have also helped greatly in the mechanistic explanation of the role of P2Y2 in cell invasion.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):


      The authors concentrate on the members of the purinome and attempt to identify members of the pathway that are especially relevant for PDAC biology, especially invasion and metastatic spread. Using in silico analysis of transcriptome data from publicly available PDAC patient cohorts, the authors identify P2Y2 as being the most prominent in terms of cancer cell expression and with the highest impact on patient survival. The authors then make an effort at functional characterization of P2Y2 and demonstrate, in a very reasonable and scientifically sound way, that downregulation/deletion of P2Y2 leads to abrogation of ATP-activated invasion in a hanging drop spheroid model system. Finally, the authors postulate that P2Y2 acts through interaction with integrin αV and modulation of the cellular cytoskeleton, and show via DNA-PAINT a direct interaction of the two molecules. The hypothesis is experimentally elaborated in a sound way, mostly using cell culture as a system.

      The study is solidly communicated, and the number of experiments seems to be fine. To my understanding, the study relies heavily on mRNA data (gene expression in cell lines and patient samples); I would suggest providing evidence at the protein level, which might have been challenging due to a potential lack of a specific antibody.

      We thank the reviewer for acknowledging our experimentally elaborated hypothesis and our solid communication of the study. As mentioned before, we now explicitly mention a paper (PMID: 30420446) that performed IHC of P2Y2 in 264 patients showing that P2Y2 was predominantly found in the tumour area, matching our bioinformatics study.

      Reviewer #3 (Significance (Required)):


      To strengthen the hypothesis experimentally, I would suggest the experiments listed below:

      Figure 1: The authors took a solid bioinformatic effort and analyzed expression of different genes of the purinome pathway in different PDAC patient and cell gene expression databases. In this part, the authors rely a lot on correlation of hypoxia and define high hypoxia scores and low hypoxia scores from previously published datasets. Although hypoxia surely plays an important biological condition in the PDAC, I am not sure I get the connection between purinome pathway and hypoxia. Few sentences give a broad introduction about hypoxia-purinome connection in the discussion part of the manuscript, but I think the readership would benefit from more specific statements (which drug, which hypoxic target, which system-mouse/human/cells, what was the exact discovery) and connect those specific statements to the work that has been done here.

      We agree with the reviewer that the study can benefit from more information about the hypoxia-purinergic signalling link. Hence, we have now included more detailed explanations of how hypoxia and purinergic signalling are related in the discussion, giving more information about the cell types and the exact discovery. Line 338: “Purinergic signalling has been associated classically with hypoxia and immune function in cancer (Di Virgilio et al., 2018). One of the first reports of hypoxia inducing ATP release in cells identified an increase of extracellular ATP in rat heart cells when kept in hypoxic conditions (Forrester and Williams, 1977). PDAC is a highly hypoxic cancer, with high levels of ATP reported in the tumour interstitial fluid of human and mouse PDAC tissues compared to healthy tissues (Hu et al., 2019).”

      Do the authors attempt to state here that hypoxic PDACs are those with worse prognosis and more aggressive and thus try to associate members of the purine pathway with those "worse" PDACs? Surprisingly, there is relatively little knowledge about hypoxia in PDAC and I would not suggest using it in this context as a predictor. Reports do suggest that hypoxia forces the emerging of resistant phenotypes but if the authors want to use hypoxic signatures, they have to fortify better (with literature) why do they choose hypoxia and what is the hypothesis that connects hypoxia to purinome, what makes this connection worth investigating.

      We thank the reviewer for raising the question of PDAC and worse prognosis with hypoxia. We have now included a Kaplan-Meier curve of high vs low Winter hypoxia score PDAC patients showing significantly lower survival with higher Winter hypoxia score (Sup. Fig. 1B). The significant link between hypoxia and poor survival, together with the more detailed explanation of the links between hypoxia and purinergic signalling proteins (mentioned above), now clarifies the reasoning for investigating this connection.

      I find the statement "hypoxia in tumor core" a bit tricky, acute and chronic hypoxia can occur anywhere in the tumor, to my knowledge there are no reports saying only the tumor core suffers from hypoxia in PDAC. PDAC being especially rich in stroma in all of its parts is probably more prone to overall hypoxia and not only in tumor core.

      We agree that “hypoxia in tumour core” can be a tricky statement. We have changed “tumour core” to tumour cell compartment and have cited data that demonstrate hypo-vascularisation found in the juxta-tumoural stroma, due to PDAC cells inhibiting angiogenesis (PMID: 27288147). This paper supports our hypothesis of distribution of oxygen being reduced in the tumour area. Hence why we hypothesise that purinergic genes would be preferentially expressed in the tumour area: Line 112 “We hypothesised that genes related to high hypoxia scores would be expressed preferentially in the tumour cell compartment, as PDAC cells inhibit angiogenesis, causing hypo-vascularisation in the juxta-tumoural stroma (Di Maggio et al., 2016).”

      We would like to clarify that we do not believe that only the tumour core suffers from hypoxia; rather, we hypothesise that there is more hypoxia in the tumour cell areas. Although there are no reports of only the tumour core suffering from hypoxia, there is evidence of the tumour epithelial region of the cancer having a greater range of hypoxia (1-39%) compared to the stromal region (1-13%) (PMID: 26325106). Moreover, all our analyses point to most purinergic genes differentially expressed in patients with high hypoxic scores being also related to cancer cells and the tumour region. These bioinformatic results linking certain genes like P2RY2 and ADORA2B with hypoxia are also supported by published work cited in the discussion (Line 354 and 356).

      I would suggest that the authors rely on published subtyping of PDAC patient cohorts (Collisson et al, 2010; Bailey et al; Moffit et al, 2015; Chan-Sen-Yue, 2020) and correlate the expression of purinome genes with the QM/basal-like PDAC subtype that has been confirmed multiple times as the "bad predictor" and use those subtypes for correlation with purinome pathway members. In figure 1E is also shown that P2RY2 is high in expression in basal-like subtype.

      We thank the reviewer for this suggestion and have included the subtyping of patients in the PAAD-TCGA cohort in Sup. Table 1 and added comments about the genes related to the different subtypes in the text: Line 88 “In the Bailey model, most genes were related to the Immunogenic subtype except for NT5E, ADORA2B, PANX1 and P2RY2, which related to Squamous (Bailey et al., 2016). Collisson molecular subtyping showed several purinergic genes associated mostly to quasimesenchymal and exocrine subtypes (Collisson et al., 2011). The Moffitt subtypes were not strongly associated with purinergic genes except for ADA, NT5E, P2RY6, P2RY2 and PANX1 associated with the Basal subtype (Moffitt et al., 2015).” and Line 345 “Expression of most purinergic genes was associated predominantly with immune cells, immunogenic PDAC subtype and low hypoxia scores (Fig. 1C, E). In contrast, expression of genes correlated with worse survival and hypoxia (PANX1, NT5E, ADORA2B and P2RY2) was associated with tumour cells and the squamous PDAC subtype, correlating with hypoxia, inflammation and worse prognosis (Bailey et al., 2016).”

      We did not include the subtyping of Chan-Sen-Yue, 2020, due to its similarities with the Moffitt subtyping and the lack of correlation of basal/classical types with purinergic signalling genes, as many of them are not expressed in cancer cells.

      Figure 2: In the further course of the paper the authors elaborate on possible functions of P2RY2 in PDAC. Although the mRNA data is pretty elaborate, the RNAscope ISH has been performed on only 3 (!) patient PDAC samples. To demonstrate that the mRNA is really found in tumor and not in normal adjacent tissue or stroma, I would strongly suggest increasing the number of samples here. The authors should perhaps try to co-localize ISH signals with IF/IHC for some other cancer cell marker, e.g. PanCK or GATA6/KRT81 in human samples to differentiate basal-like from classical samples. If possible, I would even suggest performing immunohistochemistry instead of RNAscope to confirm the presence of the receptor. If there is an issue with the antibody availability, please state so in the manuscript so that it is clear to the readers why mRNA expression is favored over protein.

      We thank the reviewer for these suggestions.

      RNAscope was used to verify our transcriptomic bioinformatic results on the location of P2Y2 expression in the tumour, obtained from publicly available data of 60 pairs of laser microdissection of PDAC epithelial and stromal tissue and the PAAD TCGA deconvolution of 177 patients. We have experienced issues with RNAscope due to RNA degradation in pancreatic tissue and other technical difficulties, which unfortunately led to only 3 samples showing staining with the positive control. All three successful samples showed P2Y2 expression located in cancer cells. The images presented show the location of P2Y2 RNA expression in the tumour region, which was the aim of the RNAscope experiment.

      RNAscope only captures mRNA expression above a specific threshold, and we are aware that P2Y2 will be expressed in other cell types in the normal adjacent tissue, as seen in the deconvolution. To address this issue, we have now included single-cell RNAseq data of normal PDAC tissue in the supplementary material (Sup. Fig. 2B).

      We also cite a publication that has performed P2Y2 IHC in 264 patients and showed that P2Y2 protein expression was predominantly shown in the epithelial tumour region (PMID: 30420446), hence staining of P2Y2 in a high number of patients has already been performed: Line 359 “These observations were supported by published immunohistochemical staining of 264 human PDAC samples, showing that P2Y2 localised predominantly in cancer cells in human PDAC”

      As shown in Fig. 1 E, P2Y2 is associated with basal and classical tumour cells, not just exclusively to basal, hence the staining to differentiate subtypes is not pertinent to the focus of this paper.

      The GSEA data indicated that high P2Y2 expression relates to processes of adhesion/ECM/cytoskeleton organization, from which the authors draw the conclusion (based also on published data mostly on neuronal/astrocyte work) that P2Y2 may interact with integrins over the RGD domain and thus contribute to invasion and migration. Since this is a very important assumption, I would strongly suggest expanding the experiments of figure 2E and 2G to at least 2 more PDAC cell lines, if possible including some with originally epithelial morphology (e.g. HPAFII, HPAC...). The visualization of filaments can be done with common IF staining, e.g. phalloidin; no need for stable expression.

      Perhaps the reviewer missed Sup. Fig. 2F, where data from Figure 2G are recapitulated in 3 different cell lines. We support the reviewer's idea of including cells with epithelial morphology, hence we added an extra cell line to have 2 cell lines with epithelial morphology, BxPC-3 and CAPAN-2.

      We have tried repeating the experiment in Fig. 2E in epithelial cells, but the way the epithelial cells grow in clusters (Sup. Fig. 2F) makes it very difficult to evaluate the morphology of individual cells and get quantifiable results. Nonetheless, we show phenotypic similarities of BxPC-3 to AsPC-1 cells in the invasion assays.

      I would also be in favor of investigating the expression of EMT markers upon ATP stimulation.

      We thank the reviewer for the suggestion, although we feel this is out of scope for our study. There have been recent controversies regarding EMT and cancer metastasis (PMID: 31666716), but more importantly we see changes in cell morphology 1 hour after ATP treatment, indicating that the effect is not, or not only, EMT.

      How was 100µM/5µM chosen as a working concentration?

      We have now included figures showing different concentrations of ATP (Sup. Fig. 2D) and AR-C (Sup. Fig. 2E) to illustrate how the concentrations were selected based on the greatest change in morphology for ATP and the full recovery of original cell morphology for AR-C.


      AsPC-1 is also known as the cell line that gladly migrates and invades, usually used in metastatic modeling of PDAC. Would be interesting to see if another cell line that is not that migrative (HPAF II) presents the same effect...

      This is an interesting point. Although we have not performed experiments with low-migrative cells, invasion assays later in the work with the epithelial cell line BxPC-3, which has a very different migrative nature, presented the same effect (Sup. Fig. 3G, F). We also performed invasion assays with PANC-1 cells, which also recapitulate an invasive phenotype when transfected with P2Y2.

      Is treatment with ATP inducing expression of P2RY2 maybe? What is happening with integrin expression upon ATP treatment? Since the hypothesis is that extracellular ATP is driving the invasion, I would certainly suggest investigating whether ATP treatment induces expression of P2RY2 in a time- and dose-dependent manner.

      We thank the reviewer for this suggestion. We have now changed the title to “Purinergic GPCR-integrin interactions drive pancreatic cancer cell invasion”, hence shifting from a focus on extracellular ATP and focusing on the effects of the RGD motif in invasion.

      Figure 3:

      The authors made very good efforts here to provide functional evidence that P2Y2 is really involved in and essential for ATP-induced invasion in PDAC cells. They performed a 3D hanging drop spheroid model for invasion in co-culture with stellate cells and show that ATP treatment leads to invasive behavior that is then blocked by addition of a P2Y2 antagonist or RGD blocking peptides. Although stellate cells are a nice add-on, keeping in mind the very complex tissue microenvironment of the PDAC, I don't rate the presence of stellate cells here as essential. Are the results the same when experiments are performed without stellate cells?

      We thank the reviewer for raising this point, as it has allowed us to clarify that the stellate cells are crucial for this assay to work: they are essential for the formation of the cancer spheres due to their matrix deposition. We have included the hanging drop with and without stellate cells to illustrate this point (Sup. Fig. 3A).

      EMT markers increase upon ATP stimulation, do not increase under siRNA downregulation of P2Y2?

      As mentioned above, we thank the reviewer for the comment, but we are not focusing on EMT, given the rapidity of the phenotype we observe.

      Furthermore, the authors downregulate P2Y2 using the siRNA/CRISPR-Cas9 approach and confirm that P2Y2 is really involved in the invasive spread, also using the specific RGD block. Experiments in figure 3 are fairly done and provide functional evidence for the hypothesis. I would suggest that, for clarity, each panel (A, B, C...) state which cell line is used (mostly AsPC-1). For the siRNA experiment I would suggest writing the time points (48h-72h post transfection) directly on the figure and briefly explaining in the text why mRNA, and not protein, was evaluated as the measure of siRNA efficacy. Probably the antibody problem, though western-blot applicable antibodies are available.

      We thank the reviewer for acknowledging that the experiments in figure 3 provide functional evidence for our hypothesis. We agree with the reviewer and for clarity have included the cell line in each panel and the time point post transfection. We now include a Western blot showing protein levels in the siRNA P2Y2 treatment (Sup. Fig. 3I).

      Furthermore, for providing higher impact, I would encourage the experiments to be performed (at least in part) in a PDAC cell line with epithelial morphology (eg. HPAF II or any other that expresses the P2Y2 to a reasonable level).

      We agree that performing this experiment with an epithelial morphology cell line provides higher impact, which is why we performed the experiment in BxPC-3 cells; this was perhaps missed in Sup. Fig. 3G and H. We now highlight in the text that they are epithelial-like.

      Figure 5: By using the DNA-PAINT method, the authors demonstrated that integrin αV and P2Y2 physically interact in the cell membrane over the RGD domain and that these interactions are essential for ATP-induced, P2Y2-mediated invasion in AsPC-1 cells. The performed work seems plausible; however, I leave the technical evaluation of this experiment to experts in the field.

      General suggestion:

      I believe the work would benefit from a clinical/patient perspective if the authors show by immunohistochemistry in PDAC tissue samples that P2Y2 is localized at the invasive front/or metastasis. Is there a surrogate marker that can be used to label ATP rich regions in the tumor, are those regions at the invasive front? Are the P2Y2 positive cells those cells at the invasive front?

      This is an interesting suggestion but immunostaining has already been performed on a large cohort of 264 PDAC patients (PMID: 30420446) and expression was consistent throughout the tumour cells.

    1. The Optimal Amount of Hassle May 19, 2021 by Morgan Housel Steven Pressfield wrote for 30 years before publishing The Legend of Bagger Vance. His career leading up to then was bleak, at one point living in a halfway house because it had cheap rent. He once spoke about the people he met living there: The people in this halfway house, we used to hang out in the kitchen and talk all night long, were among the smartest people that I ever met and the funniest and the most interesting. And what I concluded from hanging out with them and from others in a similar situation was that they weren’t crazy at all. They were actually the smart people who had seen through the bullshit. And because of that, they couldn’t function in the world. They couldn’t hold a job because they just couldn’t take the bullshit, and that was how they wound up in institutions. The greater society thought, “Well these people are absolute rejects. They can’t fit in.” But in fact they were actually the people that really saw through everything. This may not have been Pressfield’s point, but it reminds me of something I’ve long believed. If you recognize that BS is ubiquitous, then the question is not “How can I avoid all of it?” but, “What is the optimal amount to put up with so I can still function in a messy and imperfect world?” If your tolerance is zero – if you are allergic to differences in opinion, personal incentives, emotions, inefficiencies, miscommunication and such – your odds of succeeding in anything that requires other people round to zero. You can’t function in the world, as Pressfield says. The other end of the spectrum – fully accepting every incidence of nonsense and hassle – is just as bad. The world will eat you alive. The thing people miss is that there are bad things that become bigger problems when you try to eliminate them. 
I think the most successful people recognize when a certain amount of acceptance beats purity. Theft is a good example. A grocery store could eliminate theft by strip-searching every customer leaving the store. But then no one would shop there. So the optimal level of theft is never zero. You accept a certain level as an inevitable cost of progress. BS, in all its forms, is similar. A unique skill, an underrated skill, is identifying the optimal amount of hassle and nonsense you should put up with to get ahead while getting along. Franklin Roosevelt – the most powerful man in the world whose paralysis meant the aides often had to carry him to the bathroom – once said, “If you can’t use your legs and they bring you milk when you wanted orange juice, you learn to say ‘that’s all right,’ and drink it.” Every industry and career is different, but there’s universal value in that mentality, accepting hassle when reality demands it. Volatility. People having bad days. Office politics. Difficult personalities. Bureaucracy. All of them are bad. But all have to be endured to some degree if you want to get anything done.

      This reminds me of the "Work Somewhere Dysfunctional" article.

    1. Author Response

      Reviewer #2 (Public Review):

      In this manuscript, Polyák et al. report detailed and systematic functional, electrocardiographic, electrophysiologic (both in vivo and in vitro experiments) and histological analysis in a large animal (canine) model of exercise to assess risk of ventricular arrhythmia susceptibility. They find that exercise-trained dogs have a slower heart rate (not accounted by heightened vagal tone alone and consistent with recent work from Denmark), an increased ventricular mass and fibrosis, APD lengthening due to repolarisation abnormality, enhanced HCN4 expression and decreased outward potassium channel density together with increased ventricular ectopic beats and ventricular fibrillation susceptibility (open-chest burst pacing). The authors suggest these changes as underlying the risk of VA in athletes, and appropriately caution against consigning the beneficial effects of exercise. In general, this study is well done, reasonably well-written, with reasonable conclusions, supported by the data presented and is much needed. There are some methodological issues; however, given the paucity of experimental data in this area, I think it would still be additive to the literature.

      Strengths:

      1. This is an area with very limited experimental data; this is an area of need.

      2. The study, in general, seems to be well conducted, with two clear groups.

      3. The use of a large animal model is appropriate.

      4. The study findings, in general, support the authors' conclusions.

      5. The authors have shown some restraint in their conclusions and the limitations section is detailed and well written.

      Weaknesses:

      1. There are some methodological issues:

      a. Authors should explain what the conditioning protocol was and why it was necessary.

      In order to cause as little discomfort as possible to the animals, we selected animals that were naturally cooperative with the researchers and not afraid of the noise of the treadmill. This selection period lasted about three weeks, during which the animals were not exercised in a formal setting, but familiarized with the experimental setting and walked on the treadmills for a few minutes. During the conditioning period, both control and trained animals were equally handled.

      Following your remarks, the corresponding part of the text was extended to explain the training protocol in more detail.

      b. The rationale for the exercise parameters chosen needs to be presented.

      Experimental data on large animal models are very limited. Sled dogs are considered the highest elite of dog exercise. The distances they run were taken as a reference, although our protocol is not exactly the same because of the conditions of training, sledding, and weather. The most widely known races, the Norwegian Finnmarksløp and the Alaskan Iditarod, take place on snow and cover distances ranging from 500 to 1569 km in a continuous competition lasting up to 14 days (Calogiuri & Weydahl, 2017).

      Based on these data, preliminary experiments were conducted to determine the maximum running time and intensity that dogs can sustain without distress, injuries, or severe fatigue. We increased the intensity of exercise in line with the animals' performance. The detailed training protocol and the daily running distances applied are presented in Table 1. Now, a new figure, Figure 1, and a new table, Table 1, illustrate a detailed experimental timeline in the revised manuscript.

      Reference:

      Calogiuri, G., & Weydahl, A. (2017). Health challenges in long-distance dog sled racing: A systematic review of literature. Int J Circumpolar Health, 76(1), 1396147. https://doi.org/10.1080/22423982.2017.1396147

      c. Open chest VF induction was a limitation, and it was unnecessary.

      d. A more refined VT/VF induction protocol was required. This is a major limitation to this work.

      C, D: Thank you for the reviewer’s comment. For a detailed explanation of the VF induction procedures, please see our responses to question 11 of Reviewer #2.

      e. The concept of RV dysfunction has not been considered in the study and its analysis.

      Thank you for the suggestion. The complexity of our study and the capacity of our laboratory limited the work that could be carried out, but we are planning to perform additional studies involving the RV.

      f. The lack of a quantitative measure for fibrosis is a limitation.

      At the Department of Pathology, there was no opportunity to analyze myocardial fibrosis quantitatively. As described by Mustroph et al., quantitative analysis of fibrosis can be based on appropriate software measuring the amount of fibrotic area per total area on digitized slides. Such software was not available during the evaluation. This is a limitation of the study; however, the semi-quantitative assessment in histology reports is widely accepted in human pathology (Mustroph et al., 2021).

      Reference:

      Mustroph, J., Hupf, J., Baier, M. J., Evert, K., Brochhausen, C., Broeker, K., Meindl, C., Seither, B., Jungbauer, C., Evert, M., Maier, L. S., & Wagner, S. (2021). Cardiac Fibrosis Is a Risk Factor for Severe COVID-19. Front Immunol, 12, 740260. https://doi.org/10.3389/fimmu.2021.740260

      2. Statistical analysis requires further detail (checking of normality of the data/appropriate statistical test).

      Thank you for this comment. This question has been answered in response to question 12 of Reviewer #2 and the statistical part of the methodology in the manuscript has been updated.

      3. The use of Volders et al. study as a corollary in the discussion does not seem justified given that this study used AV block-induced changes as an acquired TdP model.

      We agree with the reviewer that the two models involve completely different mechanisms. Therefore, to avoid misunderstandings, we have deleted the part of the discussion that made the comparison with the studies by Volders et al. (Volders et al., 1998; Volders et al., 1999). Nevertheless, although the exercise-induced compensatory adaptive mechanisms of the athlete's heart have been considered a phenomenon completely distinct from pathological conditions, the electrical remodeling observed in our model shows important similarities with the experimental model of long-term complete AV block. For example, both resulted in profound bradycardia, compensated cardiac hypertrophy, prolonged QTc interval, APD prolongation, and increased spatial and temporal dispersion of repolarization. These changes were attributed to the downregulation of potassium currents and were associated with increased ventricular arrhythmia susceptibility. Therefore, we hypothesized that the mechanisms of increased propensity for ventricular fibrillation in this model may have a similar electrophysiological background to the compensated hypertrophy studies of Volders et al. However, the autonomic changes, the potential impairment of the conduction system of the athlete’s heart, and the electrophysiological background require further, more detailed investigations.

      References:

      Volders, P. G., Sipido, K. R., Vos, M. A., Kulcsar, A., Verduyn, S. C., & Wellens, H. J. (1998). Cellular basis of biventricular hypertrophy and arrhythmogenesis in dogs with chronic complete atrioventricular block and acquired torsade de pointes. Circulation, 98(11), 1136-1147. https://doi.org/10.1161/01.cir.98.11.1136

      Volders, P. G., Sipido, K. R., Vos, M. A., Spatjens, R. L., Leunissen, J. D., Carmeliet, E., & Wellens, H. J. (1999). Downregulation of delayed rectifier K(+) currents in dogs with chronic complete atrioventricular block and acquired torsades de pointes. Circulation, 100(24), 2455-2461. https://doi.org/10.1161/01.cir.100.24.2455

    1. Author Response

      Reviewer #1 (Public Review):

      This article is aimed at constructing a recurrent network model of the population dynamics observed in the monkey primary motor cortex before and during reaching. The authors approach the problem from a representational viewpoint, by (i) focusing on a simple center-out reaching task where each reach is predominantly characterised by its direction, and (ii) using the machinery of continuous attractor models to construct network dynamics capable of holding stable representations of that angle. Importantly, M1 activity in this task exhibits a number of peculiarities that have pushed the authors to develop important methodological innovations which, to me, give the paper most of its appeal. In particular, M1 neurons have dramatically different tuning to reach direction in the movement preparation and execution epochs, and that fact motivated the introduction of a continuous attractor model incorporating (i) two distinct maps of direction selectivity and (ii) distinct degrees of participation of each neuron in each map. I anticipate that such models will become highly relevant as neuroscientists increasingly appreciate the highly heterogeneous, and stable-yet-non-stationary nature of neural representations in the sensory and cognitive domains.

      As far as modelling M1 is concerned, however, the paper could be considerably strengthened by a more thorough comparison between the proposed attractor model and the (few) other existing models of M1 (even if these comparisons are not favourable they will be informative nonetheless). For example, the model of Kao et al (2021) seems to capture all that the present model captures (orthogonality between preparatory and movement-related subspaces, rotational dynamics, tuned thalamic inputs mostly during preparation) but also does well at matching the temporal structure of single-neuron and population responses (shown e.g. through canonical correlation analysis). In particular, it is not clear to me how the symmetric structure of connectivity within each map would enable the production of temporally rich responses as observed in M1. If it doesn't, the model remains interesting, as feedforward connectivity between more than two maps (reflecting the encoding of many more kinematic variables) or other mechanisms (such as proprioceptive feedback) could well explain away the observed temporal complexity of neural responses. Investigating such alternative explanations would of course be beyond the scope of this paper, but it is arguably important for the readers to know where the model stands in the current literature.

      Below is a summary of my view on the main strengths and weaknesses of the paper:

      1) From a theoretical perspective, this is a great paper that makes an interesting use of the multi-map attractor model of Romani & Tsodyks (2010), motivated by the change in angular tuning configuration from the preparatory epoch to the movement execution epoch. Continuous attractor models of angular tuning are often criticised for being implausibly homogeneous/symmetrical; here, the authors address this limitation by incorporating an extra dimension to each map, namely the degree of participation of each neuron (the distribution of which is directly extracted from data). This extension of the classical ring model seems long overdue! Another nice thing is the direct use of data for constraining the model's coupling parameters; specifically, the authors adjust the model's parameters in such a way as to match the temporal evolution of a number of "order parameters" that are explicitly manifested (i.e. observable) in the population recordings.

      I believe the main weakness of this continuous attractor approach is that it, perhaps unduly, binarises the configuration of angular tuning. Specifically, it assumes that while angular tuning switches at movement onset, it is otherwise constant within each epoch (preparation and execution). I commend the authors for carefully motivating this in Figure 2 (2e in particular), by showing that the circular variance of the distribution of preferred directions is higher across prep & move than within either prep or move. While this justifies a binary "two-map model" to first order, the analysis nevertheless shows that preferred directions do change, especially within the preparatory epoch. Perhaps the authors could do some bootstrapping to assess whether the observed dispersion of PDs within sub-periods of the delay epoch is within the noise floor imposed by the finite number of trials used to estimate tuning curves. If it is, then this considerably strengthens the model; otherwise, the authors should say that the binarisation reflects an approximation made for analytical tractability, and discuss any important implications.

      We thank the reviewer for the suggested analysis. We have included this new analysis in Fig. S1.

      First of all, in Fig 2e of the previous version of the manuscript, we considered three time windows during preparation and two time windows during movement execution. We now use a shorter time window of 160 ms, so that three time windows fit within either epoch. The results do not change qualitatively, and the results of the bootstrap analysis below do not depend on the definition of this time window.

      The bootstrap analysis is described in detail in the second paragraph of the Methods section (“Preparatory and movement-related epochs of motion”). The bootstrap distribution is generated by resampling trials with repetitions (keeping the number of trials per condition the same as in the data), while shuffling the temporal windows in time, within epochs. For example, for condition 1 we have 43 trials in the data. In one trial of the bootstrap distribution for condition 1, each of the 3 time windows of the delay period is chosen at random (with repetitions) from the 43*3 possible windows in the data. The analysis shows that the median variance of preferred directions from the data is significantly larger than that from the bootstrap samples.
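
      The resampling scheme described above can be sketched for a single neuron as follows. This is a minimal illustration under stated assumptions, not the authors' code: the population-vector estimate of the preferred direction, the use of circular variance across windows as the test statistic, and all function names are hypothetical choices made here.

```python
import math
import random

def preferred_direction(mean_rates, angles):
    """Population-vector estimate of the preferred direction (radians)."""
    x = sum(r * math.cos(a) for r, a in zip(mean_rates, angles))
    y = sum(r * math.sin(a) for r, a in zip(mean_rates, angles))
    return math.atan2(y, x)

def circular_variance(directions):
    """1 minus the length of the mean unit vector; 0 means all directions agree."""
    c = sum(math.cos(d) for d in directions) / len(directions)
    s = sum(math.sin(d) for d in directions) / len(directions)
    return 1.0 - math.hypot(c, s)

def pd_variance_across_windows(rates, angles):
    """rates[c][k][w] is the rate for condition c, trial k, time window w."""
    n_windows = len(rates[0][0])
    pds = []
    for w in range(n_windows):
        # Trial-averaged tuning curve in window w, then its preferred direction.
        means = [sum(trial[w] for trial in cond) / len(cond) for cond in rates]
        pds.append(preferred_direction(means, angles))
    return circular_variance(pds)

def surrogate(rates, rng):
    """Same trial counts as the data; each window is drawn with repetition
    from the pool of all (trial, window) values of that condition."""
    out = []
    for cond in rates:
        pool = [value for trial in cond for value in trial]
        n_windows = len(cond[0])
        out.append([[rng.choice(pool) for _ in range(n_windows)] for _ in cond])
    return out

def bootstrap_pd_variance(rates, angles, n_boot=200, seed=0):
    """Return the observed statistic, the surrogate distribution, and the
    fraction of surrogates at least as large as the observed value."""
    rng = random.Random(seed)
    observed = pd_variance_across_windows(rates, angles)
    null = [pd_variance_across_windows(surrogate(rates, rng), angles)
            for _ in range(n_boot)]
    p = sum(v >= observed for v in null) / n_boot
    return observed, null, p
```

      Pooling windows within an epoch removes any systematic drift of tuning across windows, so the surrogate distribution estimates the dispersion expected from finite trial counts alone.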

      This suggests that neurons do change their preferred direction within epochs, but these changes are smaller in magnitude than the changes that occur between epochs. We explicitly comment on this in the Methods, and in the main text we point out that considering only two epochs is a simplifying assumption; as such, it can be thought of as a first step towards building a more complete model that shows dynamics of tuning within both preparatory and execution epochs. Note, however, that this simple framework is enough for the model to recapitulate neuronal activity to a large extent, both at the level of single units and at the population level.

      2) While it is great to constrain the model parameters using the data, there is a glaring "issue" here which I believe is both a weakness and a strength of the approach. The model has a lot of freedom in the external inputs, which leads to relatively severe parameter degeneracies. The authors are entirely forthright about this: they even dedicate a whole section to explaining that depending on the way the cost function is set up, the fit can land the model in very different regimes, yielding very different conclusions. The problem is that I eventually could not decide what to make of the paper's main results about the inferred external inputs, and indeed what to make of the main claim of the abstract. It would be great if the authors could discuss these issues more thoroughly than they currently do, and in particular, argue more strongly about the reasons that might lead one to favour the solutions of Fig 6d/g over that of Fig 6a. On the other hand, I see the proposed model as an interesting playground that will probably enable a more thorough investigation of input degeneracies in RNN models. Several research groups are currently grappling with this; in particular, the authors of LFADS (Pandarinath et al, 2018) and other follow-up approaches (e.g. Schimel et al, 2022) make a big deal of being able to use data to simultaneously learn the dynamics of a neural circuit and infer any external inputs that drive those dynamics, but everyone knows that this is a generally ill-posed problem (see also discussion in Malonis et al 2021, which the authors cite). As far as I know, it is not yet clear what form of regularisation/prior might best improve identifiability. While Bachschmid-Romano et al. do not go very far in dissecting this problem, the model they propose is low-dimensional and more amenable to analytical calculations, such that it provides a valuable playground for future work on this topic.

      We agree with the reviewer that the problem of disambiguating between feedforward and recurrent connections from observation of the state of the recurrent units alone is a degenerate problem in general.

      By explicitly looking for solutions that minimize the role of external inputs in driving the dynamics, we argued that the solutions of Fig 4d/g are favorable over the one of Fig 4a because they are based on local computations implemented through shorter range connections compared to incoming connections from upstream areas; as such, they likely require less metabolic energy.

      In the new version of the paper, we discuss this issue more explicitly:

      Degeneracy of solutions. We considered the case where parameters are inferred by minimizing a cost function that equals the reconstruction error only (this corresponds to the case of very large values of the parameter α in the cost function). Figure 4—figure supplement 2 shows that after minimizing the reconstruction error, the cost function is flat in a large region of the order parameters. We also added Figure 5—figure supplement 5 to show that the dynamics of the feedforward network look almost indistinguishable from those of the recurrent network (Fig. 5), although the average canonical correlation coefficient is slightly lower for the purely feedforward case.

      Breaking the degeneracy of solutions. We added Figure 4—figure supplement 1 to show that for a wide range of the parameter α, all solutions cluster in a small region of parameter space. Solutions are found both above and below the bifurcation line. Note that in all solutions the parameters j_A and j_B are close to the bifurcation line that separates the region where tuned network activity requires tuned external input from the region where tuned network activity can be sustained autonomously. Furthermore, the weight of recurrent connections within map B (j_B) is much stronger than the corresponding weight for map A (j_A). Hence, we observe that external inputs play a stronger role in shaping the dynamics during motor preparation than during execution, while recurrent inputs dominate the total inputs during movement execution, for a broad range of values of α. This prediction needs to be tested experimentally, although it is in line with the results of ref. 39, as we explain in the last paragraph of the Discussion section “Interplay between external and recurrent currents”.
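
      For reference, the role of α in the two regimes discussed above can be written out explicitly. The block restates the cost as quoted by Reviewer #2 (E_tot = α E_rec + E_ext) together with the large-α limit; dividing by α is a step added here purely for illustration and is not taken from the manuscript. The `\xrightarrow` command assumes the `amsmath` package.

```latex
\[
E_{\mathrm{tot}}(\alpha) = \alpha\, E_{\mathrm{rec}} + E_{\mathrm{ext}},
\qquad
\frac{E_{\mathrm{tot}}(\alpha)}{\alpha}
  = E_{\mathrm{rec}} + \frac{1}{\alpha}\, E_{\mathrm{ext}}
  \;\xrightarrow[\alpha \to \infty]{}\; E_{\mathrm{rec}} .
\]
```

      For very large α the minimization is therefore dominated by the reconstruction error, which is the degenerate case; finite α lets the external-input term select among solutions that reconstruct the data comparably well.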

      3) As an addition to the motor control literature, this paper's main strengths lie in the model capturing orthogonality between preparatory and movement-related activity subspaces (Elsayed et al 2016), which few models do. However, one might argue that the model is in fact half hand-crafted for this purpose, and half-tuned to neural data, in such a way that it is almost bound to exhibit the phenomenon. Thus, some form of broader model cross-validation would be nice: what else does the model capture about the data that did not explicitly inspire/determine its construction? As a starting point, I would suggest that the authors apply the type of CCA-based analysis originally performed by Sussillo et al (2015), and compare qualitatively to both Sussillo et al. (2015) and Kao et al (2021). Also, as every recorded monkey M1 neuron can be characterized by its coordinates in the 4-dimensional space of angular tuning, it should be straightforward to identify the closest model neuron; it would be very compelling to show side-by-side comparisons of single-neuron response timecourses in model and monkey (i.e., extend the comparison of Fig S6 to the temporal domain).

      We thank the reviewer for these suggestions. We have added the following comparisons:

      ● A CCA-based analysis (Fig 5.a) shows that the performance of our model is qualitatively comparable to that of the models of Sussillo et al. (2015) and Kao et al. (2021) at generating realistic motor cortical activity (average canonical correlation ρ = 0.77 during movement preparation and 0.82 during movement execution).

      ● For each of the 141 neurons in the data, we selected the corresponding one in the model that is closest in the eta- and theta- parameters space:

      a) A side-by-side comparison of the time course of responses shows a good qualitative agreement (Fig 5.c).

      b) We successfully trained a linear decoder to read the responses of these 141 neurons from simulations and output trial-averaged EMG activity recorded from a monkey performing the same task (Fig 5.b).

      c) Figure 5—figure supplement 4 shows that simulated data presents sequential activity, as does the recorded data.

      In our simulations, the temporal variability in single-neuron responses is due to the temporal evolution of the inferred external inputs and to noise, implemented by an Ornstein-Uhlenbeck (OU) process that is added to the total inputs. Another source of variability could be introduced in the synaptic connectivity: for example, one could add a Gaussian random variable to each synaptic efficacy. We checked that this simple extension of our model is able to reproduce the dynamics of the order parameters seen in the data. A full characterization of this extended model is beyond the scope of our paper.
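
      As an illustration of the kind of noise term mentioned above, a zero-mean OU process can be sampled with the exact one-step update below. This is a generic sketch: the function name, the time step, the correlation time `tau`, and the stationary standard deviation `sigma` are assumptions chosen here, not values from the manuscript.

```python
import math
import random

def simulate_ou(n_steps, dt, tau, sigma, x0=0.0, seed=0):
    """Sample a zero-mean OU path with correlation time tau and
    stationary standard deviation sigma (exact discretization)."""
    rng = random.Random(seed)
    decay = math.exp(-dt / tau)                      # deterministic relaxation per step
    kick = sigma * math.sqrt(1.0 - decay * decay)    # keeps the stationary variance at sigma**2
    x = x0
    path = [x]
    for _ in range(n_steps):
        x = decay * x + kick * rng.gauss(0.0, 1.0)
        path.append(x)
    return path
```

      In a rate simulation, one such path per neuron would be added to that neuron's total input at each time step.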

      4) The paper's clarity could be improved.

      We thank the reviewer for his feedback. We have significantly rewritten most sections of the paper to improve clarity.

      Reviewer #2 (Public Review):

      The authors study M1 cortical recordings in two non-human primates performing straight delayed center-out reaches to one of 8 peripheral targets. They build a model for the data with the goal of investigating the interplay of inferred external inputs and recurrent synaptic connectivity and their contributions to the encoding of preferred movement direction during movement preparation and execution epochs. The model assumes neurons encode movement direction via a cosine tuning that can be different during preparation and execution epochs. As a result, each type of neuron in the model is described with four main properties: their preferred direction in the cosine tuning during preparation (denoted by θ_A) and execution (denoted by θ_B) epochs, and the strength of their encoding of the movement direction during the preparation (denoted by η_A) and execution (denoted by η_B) epochs. The authors assume that a recurrent network that can have different inputs during the preparation and execution epochs has generated the activity in the neurons. In the model, these inputs can both be internal to the network or external. The authors fit the model to real data by optimizing a loss that combines, via a hyperparameter α, the reconstruction of the cosine tunings with a cost to discourage/encourage the use of external inputs to explain the data. They study the solutions that would be obtained for various values of α. The authors conclude that during the preparatory epoch, external inputs seem to be more important for reproducing the neuron's cosine tunings to movement directions, whereas during movement execution external inputs seem to be untuned to movement direction, with the movement direction rather being encoded in the direction-specific recurrent connections in the network.

      Major:

      1) Fundamentally, without actually simultaneously recording the activity of upstream regions, it should not be possible to rule out that the seemingly recurrent connections in the M1 activity are actually due to external inputs to M1. I think it should be acknowledged in the discussion that inferred external inputs here are dependent on assumptions of the model and provide hypotheses to be validated in future experiments that actually record from upstream regions. To convey with an example why I think it is critical to simultaneously record from upstream regions to confirm these conclusions, consider two alternative scenarios: I) The recorded neurons in M1 have some recurrent connections that generate a pattern of activity that, based on the modeling, seems to be recurrent. II) The exact same activity has been recorded from the same M1 neurons, but these neurons have absolutely no recurrent connections themselves, and are rather activated via purely feed-forward connections from some upstream region; that upstream region has recurrent connections and is generating the recurrent-like activity that is later echoed in M1. These two scenarios can produce the exact same M1 data, so they should not be distinguishable purely based on the M1 data. To distinguish them, one would need to simultaneously record from upstream regions to see if the same recurrent-like patterns that are seen in M1 were already generated in an upstream region or not. I think acknowledging this major limitation and discussing the need to eventually confirm the conclusions of this modeling study with actual simultaneous recordings from upstream regions is critical.

      We agree with the reviewer that it is not possible to rule out the hypothesis that motor cortical activity is purely generated by feedforward connectivity.

      In the new version of the paper, we discuss more explicitly the fact that neural activity can be fully explained by feedforward inputs, and we added Figure 5—figure supplement 5 to show that the dynamics of the feedforward network look almost indistinguishable from those of the recurrent network (Fig. 5), provided their parameters are appropriately tuned. Notice, however, that a canonical correlation analysis comparing the activity from recordings with that from simulations shows that the average canonical correlation coefficient is slightly lower for the case of a purely feedforward network (Fig. 5.a vs Fig. S12.a).
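
      The degeneracy at issue can be made concrete with a toy example. The sketch below uses a single linear rate unit, dx/dt = -x + j·x + I(t), which is an assumption made here for illustration and far simpler than the network in the paper; function names are hypothetical. It shows that a unit with no recurrence (j = 0) reproduces the trace of a recurrent unit once the recurrent current is folded into its external drive.

```python
def simulate_rate(inputs, j, dt=0.01, x0=0.0):
    """Euler integration of dx/dt = -x + j*x + I(t) for one linear rate unit."""
    x = x0
    trace = [x]
    for i_t in inputs:
        x = x + dt * (-x + j * x + i_t)
        trace.append(x)
    return trace

def equivalent_feedforward_input(inputs, j, dt=0.01, x0=0.0):
    """External drive that lets a unit with j = 0 reproduce the recurrent trace:
    the recurrent current j*x(t) is simply supplied from outside."""
    recurrent_trace = simulate_rate(inputs, j, dt, x0)
    # zip stops at len(inputs), pairing each input with the state before that step.
    return [i_t + j * x for i_t, x in zip(inputs, recurrent_trace)]
```

      Observing only the unit's activity cannot separate the two cases, which is why an additional criterion (here, the cost on external inputs) or recordings from upstream regions are needed.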

      A summary of our approach is:

      • We observe that both a purely feedforward and a recurrent network can reproduce the temporal course of the recordings equally well (see also our answer to question 5 below);

      • We point out that a solution that would save metabolic energy consumption is one where the activity is generated by recurrent currents (with shorter range local connections) rather than by feedforward inputs from upstream regions (long-range connections).

      • We study the solution that best reproduces the recorded activity and minimizes inputs from upstream regions.

      In the Discussion, we included the Reviewer’s observation that our hypothesis needs to be tested by simultaneous recordings of M1 and upstream regions, as well as measures of synaptic strength between motor cortical neurons. See the second paragraph of page 14: “Our prediction (…) will be necessary to rule out alternative explanations”. Yet, we think that the results of reference [51] are consistent with our results.

      One last point we would like to stress is that external inputs drive the network's dynamics at all times, even in the solution that we argue would save metabolic energy consumption: untuned inputs are present throughout the whole course of the motor action, including movement execution, and they determine the precise temporal pattern of neurons' firing rates.

      2) The ring network model used in this work implicitly relies on the assumption that cosine-tuning models are good representations of the recorded M1 neuronal activity. However, this assumption is not quantitatively validated in the data. Given that all conclusions depend on this, it would be important to provide some goodness of fit measure for the cosine tuning models to quantify how well the neurons' directional preferences are explained by cosine tunings. For example, reporting a histogram of the cosine tuning fit error over all neurons in Fig 2 would be helpful (currently example fits are shown only for a few neurons in Fig. 2 (a), (b), and Figure S6(b)). This would help quantitatively justify the modeling choice.

      We thank the reviewer for this observation. Fig. S2.e-f shows the R^2 coefficient of the cosine fit; in particular, we show that the R^2 of the cosine fit strongly correlates with the variables η, which represent the degree of participation of single units in the recurrent currents. Units with higher η (the ones that contribute more to the recurrent currents) are the ones whose tuning curves better resemble a cosine. However, the plot also shows that the R^2 coefficient of the cosine fit is fairly low for many cells. To show that a model with cosine tuning can yield this result, we repeated the same analysis on the units in our simulated network. In our simulations, all neurons receive a stochastic input mimicking the large fluctuations around mean inputs that are expected to occur in vivo. We selected the 141 units whose activity most strongly resembled the activity of the 141 recorded neurons (see figure caption for details). We then looked at the tuning curves of these 141 units from simulations, and calculated the R^2 coefficient of the cosine fit. Figure 5—figure supplement 2.c shows that the result agrees well with the data: the R^2 coefficient is fairly low for many neurons, and correlates with the variable η. To summarize, a model that assumes cosine tuning, but also incorporates noise in the dynamics, reproduces well the R^2 coefficient of the cosine fit of tuning curves from data. We added the paragraph “Cosine tuning” in the Discussion to comment on this point.
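
      The goodness-of-fit measure discussed above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes reach directions that are evenly spaced around the circle (as with 8 targets), in which case the least-squares cosine fit has the closed form used below. Function and variable names are hypothetical.

```python
import math

def cosine_fit(rates, angles):
    """Least-squares fit of r(theta) = b0 + a*cos(theta - pd).
    Closed form valid only for three or more evenly spaced angles."""
    n = len(rates)
    b0 = sum(rates) / n
    bc = 2.0 / n * sum(r * math.cos(t) for r, t in zip(rates, angles))
    bs = 2.0 / n * sum(r * math.sin(t) for r, t in zip(rates, angles))
    amplitude = math.hypot(bc, bs)
    pd = math.atan2(bs, bc)                      # preferred direction in radians
    fitted = [b0 + bc * math.cos(t) + bs * math.sin(t) for t in angles]
    ss_res = sum((r - f) ** 2 for r, f in zip(rates, fitted))
    ss_tot = sum((r - b0) ** 2 for r in rates)
    # R^2 is undefined for a flat tuning curve.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return b0, amplitude, pd, r2
```

      A tuning curve that is an exact cosine yields R^2 = 1, whereas noise or higher harmonics in the trial-averaged rates lower it, which is the effect the response attributes to the stochastic input in the simulations.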

      3) The authors explain that the two-cylinder model that they use has "distinct but correlated" maps A and B during the preparation and movement. This is hard to see in the formulation. It would be helpful if the authors could expand in the Results on what they mean by "correlation" between the maps and which part of the model enforces the correlation.

      We thank the reviewer for this comment. By correlation, we meant the correlation between neural activity during the preparatory and movement-related temporal intervals. In the model, the correlation between the vectors θ_A and θ_B induces correlation in the preparatory and movement-related activity patterns. To make the paper easier to read, we no longer mention this concept in the Results; in the Discussion, we explicitly refer to it in the following two paragraphs:

      “A strong correlation between the selectivity properties of the preparatory and movement-related epochs will produce strongly correlated patterns of activity in these two intervals and a strong overlap between the respective PCA subspaces.” (Discussion, section Orthogonal spaces dedicated to movement preparation and execution)

      “The correlation between the vectors θ_A and θ_B (Discussion, section Interplay between external and recurrent currents)”

      4) The authors note that a key innovation in the model formulation here is the addition of participation strengths parameters (η_A, η_B) to prior two-cylinder models to represent the degree of each neuron's participation in the encoding of the circular variable in either map. The authors state that this is critical for explaining the cosine tunings well: "We have discussed how the presence of this dimension is key to having tuning curves whose shape resembles the one computed from data, and decreases the level of orthogonality between the subspaces dedicated to the preparatory and movement-related activity". However, I am not sure where this is discussed. To me, it seems like to show that an additional parameter is necessary to explain the data well, one would need to compare fit to data between the model with that parameter and a model without that parameter. I don't think such a comparison was provided in the paper. It is important to show such a comparison to quantitatively show the benefit of the novel element of the model.

      We thank the reviewer for this comment.

      ● The key observation is that without the parameters η_A, η_B, the temporal evolution of all neurons in the network is the same (only the noise term added to the dynamics is different). To show this, we compared the temporal evolution of the firing rates of single neurons in the model with data. Fig 5.c shows a comparison between the time course of single-neuron firing rates from data and simulations (good agreement), while Figure 6—figure supplement 2.a shows the same comparison for a model in which all neurons have the same value of the η_A, η_B parameters (worse agreement: the range of firing rates is the same for all neurons). In summary, the parameters η_A, η_B introduce the variability in the coupling strengths that is necessary to generate heterogeneity in neuronal responses.

      ● At the end of the section “PCA subspaces dedicated to movement preparation and execution”, we refer to Figure 6—figure supplement 2.c, which shows that a model with eta_A = eta_B = 1 for all neurons yields less orthogonal subspaces.

      5) The model parameters are fitted by minimizing a total cost that is a weighted average of two costs as E_tot = α E_rec + E_ext, with the hyperparameter α determining how the two costs are combined. The selection of α is key in determining how much the model relies on external inputs to explain the cosine tunings in the data. As such, the conclusions of the paper rely on a clear justification of the selection of α and a clear discussion of its effect. Otherwise, all conclusions can be arbitrary confounds of this selection and thus unreliable. Most importantly, I think there should be a quantitative fit to data measure that is reported for different scenarios to allow comparison between them (also see comment 2). For example, when arguing that α should be "chosen so that the two terms have equal magnitude after minimization", this would be convincing if somehow that selection results in a better fit to the neural data compared with other values of α. If all such selections of α have a similar fit to neural data, then how can the authors argue that some are more appropriate than others? This is critical since small changes in alpha can lead to completely different conclusions (Fig. 6, see my next two comments).

      All the points raised in questions 5 to 8 are interrelated, and we address them below, after Major issue 8.

      6) The authors seem to select alpha based on the following: "The hyperparameter α was chosen so that the two terms have equal magnitude after minimization (see Fig. S4 for details)". Why is this the appropriate choice? The authors explain that this will lead to the behavior of the model being close to the "bifurcation surface". But why is that the appropriate choice? Does it result in a better fit to neural data compared with other choices of α? It is critical to clarify and justify as again all conclusions hinge on this choice.

      7) Fig 6 shows example solutions for 2 close values of α, and how even slight changes in the selection of α can change the conclusions. In Fig. 6 (d-e-f), α is chosen as the default approach such that the two terms E_rec and E_ext have equal magnitude. Here, as the authors note, during movement execution tuned external inputs are zero. In contrast, in Fig. 6 (g-h-i), α is chosen so that the E_rec term has a "slightly larger weight" than the E_ext term so that there is less penalty for using large external inputs. This leads to a different conclusion whereby "a small input tuned to θ_B is present during movement execution". Is one value of α a better fit to neural data? Otherwise, how do the authors justify key conclusions such as the following, which seems to be based on the first choice of α shown in Fig. 6 (d-e-f): "...observed patterns of covariance are shaped by external inputs that are tuned to neurons' preferred directions during movement preparation, and they are dominated by strong direction-specific recurrent connectivity during movement execution".

      8) It would be informative to see the extreme case of very large and very small α. For example, if α is very large such that external inputs are practically not penalized, would the model rely purely on external inputs (rather than recurrent inputs) to explain the tuning curves? This would be an example of the hypothetical scenario mentioned in my first comment. Would this result in a worse fit to neural data?

      We agree with the reviewer that it is crucial to discuss how the choice of the parameter alpha affects the results, and we have strived to improve this discussion in the revised manuscript.

      I. When we looked for the coupling parameters that best explain the data, without introducing a metabolic cost, we found multiple solutions that were equally good (see Figure 4—figure supplement 2 and our answer to question (1) above). These included the solution with all couplings set to zero (j_s^B = j_s^A = j_a = 0), as well as many solutions with different values of the synaptic coupling parameters. The solution with the strongest couplings is close to the bifurcation line, in the area where j_s^B > j_s^A.

      II. We then introduced a metabolic cost to break the degeneracy between these different solutions. The cost function we minimized contains two terms; their relative strength is modulated by alpha. The case of very small alpha (i.e., only minimizing external input) yields a very poor reconstruction of neural dynamics and is not interesting. The case of very large alpha reduces to the case (I) above. We added Figure 4—figure supplement 1 to show the results for intermediate values of alpha - alpha is large enough to yield a good reconstruction of neural dynamics, yet small enough to ensure that we find a unique solution. For these intermediate values of alpha, the two terms of the cost function have comparable magnitudes. Although slight changes in the selection of alpha do change whether the solutions are above or below the bifurcation surface, Figure 4—figure supplement 1 shows that all solutions are close to the bifurcation surface. In particular, the value of j_s^B is close to its critical value, while we never find solutions where j_s^A is close to its critical value - we never find solutions in the lower-right region of the plot in Figure 4—figure supplement 1. The critical value for j_s^B is the one above which no tuned external inputs are necessary to sustain the observed activity during movement execution. For values of j_s^B close to the bifurcation line but below it (for example, Fig.4g) inferred tuned inputs are still much weaker than the untuned ones, during movement execution. Also, the inferred direction-specific couplings are strong and amplify the weak external inputs tuned to map B, therefore still playing a major role in shaping the observed dynamics during movement execution.

      We have rewritten the abstract, introduction and conclusions of the paper accordingly. Instead of focusing on only one solution for a particular value of alpha, we now discuss all solutions and their implications.

      9) The authors argue in the discussion that "the addition of an external input strength minimization constraint breaks the degeneracy of the space of solutions, leading to a solution where synaptic couplings depend on the tuning properties of the pre- and post-synaptic neurons, in such a way that in the absence of a tuned input, neural activity is localized in map B". In other words, the use of the E_ext term, apparently reduces "degeneracy" of the solution. This was not clear to me and I'm not sure where it is explained. This is also related to α because if alpha goes toward very large values, it would be like the E_ext term is removed, so it seems like the authors are saying that the solution becomes degenerate if alpha grows very large. This should be clarified.

      We thank the reviewer for pointing this out. By degeneracy of the solution, we mean that the model can explain the data equally well for different choices of the recurrent coupling parameters (j_s^A, j_s^B, j_a). In other words, if we look for the coupling parameters that best explain the data, there are many equivalent solutions. When we introduce the E_ext term in the cost function, we find one unique solution for each choice of alpha. So by “breaking the degeneracy”, we mean going from a scenario where there are many equally valid solutions to one with a single solution. We added this explanation to the paper, along with an explanation of how our conclusions depend on the choice of alpha.
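
      The idea of breaking the degeneracy can be illustrated with a minimal toy sketch. This is not the model of the paper: a single scalar coupling `j` and a linear fixed-point relation `r = j*r + I` stand in for the recurrent network, and all names (`infer_input`, `e_rec`, `e_ext`, `total_cost`) are hypothetical. For every coupling on the grid, the inner step picks the external input that reproduces the target activity, so the reconstruction error alone cannot distinguish the couplings; adding the input cost selects one of them.

```python
import numpy as np

R_TARGET = 2.0                           # toy "observed" activity level (assumed)
J_GRID = np.linspace(0.0, 0.99, 100)     # candidate couplings below the toy critical value j = 1


def infer_input(j):
    """Inner step: external input that makes R_TARGET a fixed point of r = j*r + I."""
    return (1.0 - j) * R_TARGET


def e_rec(j):
    """Squared error between the model fixed point and the target activity."""
    r_model = infer_input(j) / (1.0 - j)
    return (r_model - R_TARGET) ** 2


def e_ext(j):
    """Squared magnitude of the external input (toy stand-in for the input cost)."""
    return infer_input(j) ** 2


def total_cost(j, alpha):
    """Toy analogue of E_tot = alpha * E_rec + E_ext."""
    return alpha * e_rec(j) + e_ext(j)


rec_errors = np.array([e_rec(j) for j in J_GRID])
costs = np.array([total_cost(j, alpha=1.0) for j in J_GRID])
best_j = J_GRID[np.argmin(costs)]
print("max reconstruction error:", rec_errors.max())
print("coupling selected by the total cost:", best_j)
```

      In this toy the reconstruction error is numerically zero for every coupling, so the data alone leave `j` undetermined, while the input cost favors the strongest coupling on the grid, the one closest to the toy critical value. In the actual fit the two terms trade off against each other through alpha, which this sketch does not capture.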

      10) How do the authors justify setting Φ_A = Φ_B in equation (5)? In other words, how is the last assumption in the following sentence justified: "To model the data, we assumed that the neurons are responding both to recurrent inputs and to fluctuating external inputs that can be either homogeneous or tuned to θ_A; θ_B, with a peak at constant location Φ_A = Φ_B ≡ Φ". Does this mean that the preferred direction for a given neuron is the same during preparation and movement epochs? If so, how is this consistent with the not-so-high correlation between the preferred directions of the two epochs shown in Fig. 2 c, which is reported to have a circular correlation coefficient of 0.4?

      We would like to stress the important distinction between the parameters θ and the parameters Φ. While the parameters θ_A and θ_B represent the preferred direction of single neurons during preparatory and execution epochs, respectively, the parameters Φ_A, Φ_B represent the direction of motion that is encoded at the population level during these two epochs. The mean-field analysis shows that Φ_A = Φ_B, even though single neurons change their preferred direction from one epoch to the next. We added a more extensive explanation of the order parameters in the Results section.
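
      A small numerical sketch may help separate the two kinds of parameters. It uses a generic population-vector readout and idealized cosine tuning, which are assumptions made for illustration and not the mean-field order parameters defined in the paper; `population_direction` and the variable names are hypothetical. Each neuron is given a different preferred direction in the two epochs, yet the direction read out from the whole population is the same.

```python
import numpy as np


def population_direction(rates, preferred):
    """Angle of the population vector sum_i r_i * exp(1j * theta_i)."""
    return np.angle(np.sum(rates * np.exp(1j * preferred)))


N = 200
rng = np.random.default_rng(0)

# Single-neuron preferred directions: evenly spaced in the preparatory epoch,
# randomly reassigned across neurons in the execution epoch.
theta_A = 2 * np.pi * np.arange(N) / N
theta_B = rng.permutation(theta_A)

phi = 1.0  # reach direction in radians (assumed value)

# Idealized cosine tuning around the reach direction in each epoch.
rates_prep = 1.0 + np.cos(theta_A - phi)
rates_move = 1.0 + np.cos(theta_B - phi)

phi_A = population_direction(rates_prep, theta_A)
phi_B = population_direction(rates_move, theta_B)
print(phi_A, phi_B)
```

      Both readouts recover the same angle even though `theta_A` and `theta_B` differ neuron by neuron, which is the sense in which Φ_A = Φ_B can hold while single-neuron preferred directions change between epochs.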

      Reviewer #3 (Public Review):

      In this work, Bachschmid-Romano et al. propose a novel model of the motor cortex, in which the evolution of neural activity throughout movement preparation and execution is determined by the kinematic tuning of individual neurons. Using analytic methods and numerical simulations, the authors find that their networks share some of the features found in empirical neural data (e.g., orthogonal preparatory and execution-related activity). While the possibility of a simple connectivity rule that explains large features of empirical data is intriguing and would be highly relevant to the motor control field, I found it difficult to assess this work because of the modeling choices made by the authors and how the results were presented in the context of prior studies.

      Overall, it was not clear to me why Bachschmid-Romano et al. couched their models within a cosine-tuning framework and whether their results could apply more generally to more realistic models of the motor cortex. Under cosine-tuning models (or kinematic encoding models, more generally), the role of the motor cortex is to represent movement parameters so that they can presumably be read out by downstream structures. Within such a framework, the question of how the motor cortex maintains a stable representation of movement direction throughout movement preparation and execution when the tuning properties of individual neurons change dramatically between epochs is highly relevant. However, prior work has demonstrated that kinematic encoding models provide a poor fit for empirical data. Specifically, simple encoding models (and the more elaborate extensions [e.g., Inoue, et al., 2018]) cannot explain the complexity of single-neuron responses (Churchland and Shenoy, 2007), and do not readily produce the population-level signals observed in the motor cortex (Michaels, Dann, and Scherberger, 2016) and cannot be extended to more complex movements (Russo, et al., 2018).

      In both the Introduction and Discussion, the authors heavily cite an alternative to kinematic encoding models, the dynamical systems framework. Here, the correlations between kinematics and neural activity in the motor cortex are largely epiphenomenal. The motor cortex does not 'represent' anything; its role is to generate patterns of muscle activity. While the authors explicitly acknowledge the shortcomings of encoding models ('Extension to modeling richer movements', Discussion) and claim that their proposed model can be extended to 'more realistic scenarios', they neither demonstrate that their models can produce patterns of muscle activity nor that their model generates realistic patterns of neural activity. The authors should either fully characterize the activity in their networks and make the argument that their models provide a better fit to empirical data than alternative models or demonstrate that more realistic computations can be explained by the proposed framework.

      Major Comments

      1) In the present manuscript, it is unclear whether the authors are arguing that representing movement direction is a critical computation that the motor cortex performs, and the proposed models are accurate models of the motor cortex, or if directional coding is being used as a 'proof of concept' that demonstrates how specific, population-level computations can be explained by the tuning of individual neurons.

      If the authors are arguing the former, then they need to demonstrate that their models generate activity similar to what is observed in the motor cortex (e.g., realistic PSTHs and population-level signals). Presently, the manuscript only shows tuning curves for six example neurons (Fig. S6) and a single jPC plane (Fig. S8). Regarding the latter, the authors should note that Michaels et al. (2016) demonstrated that representational models can produce rotations that are superficially similar to empirical data, yet are not dependent on maintaining an underlying condition structure (unlike the rotations observed in the motor cortex).

      If the authors are arguing the latter - and they seem to be, based on the final section of the Discussion - then they need to demonstrate that their proposed framework can be extended to what they call 'more realistic scenarios'. For example, could this framework be extended to a network that produces patterns of muscle activity?

      We thank the reviewer for raising these issues.

      Is our model a kinematic encoding model or a dynamical system?

      Our model is a dynamical system, as can be seen by inspecting equations (1,2). The main difference between our model and recently proposed dynamical system models of motor cortex is that the synaptic connectivity matrix in our model is built from the tuning properties of neurons, instead of being trained using supervised learning techniques (we come back to this important difference below). Since the network’s connectivity and external input depend on the neurons’ tuning to the direction of motion (eq 5-6), kinematic parameters emerge from the dynamic interaction between recurrent and feedforward currents, as specified by equations (1-6). Thus, kinematic parameters can be decoded from population activity.

      While in kinematic encoding models neurons’ firing rates are a function of parameters of the movement, we constrained the parameters of our model by requiring the model to reproduce the dynamics of a few order parameters, which are low-dimensional measures of the activity of recorded neurons. Our model is fitted to neural data, not to the parameters of the movement.

      Although we observed that a linear decoder of the network’s activity can reproduce patterns of muscle activity without decoding any kinematic parameter (see below), discussing whether tuning in M1 plays a computational role in controlling muscle activity is outside the scope of our work. Rather, the scope of our paper is to discuss how a specific connectivity structure can generate the observed patterns of neural activity, and which connectivity structure requires minimum external inputs to sustain the dynamics. In our approach, the correlations between kinematics and neural activity in the motor cortex are not merely epiphenomenal, but emerge from a specific structure of the connectivity that has likely been shaped by Hebbian-like learning mechanisms.

      Can the model generate realistic PSTHs and patterns of muscle activity? Yes, it can. As suggested, we have added the following comparisons:

      ● A CCA-based analysis (Fig 5.a) shows that the performance of our model is qualitatively comparable to that of Sussillo et al. (2015) and Kao et al. (2021) at generating realistic motor cortical activity (average canonical correlation ρ = 0.77 for motor preparation, 0.82 for motor execution).

      ● For each of the 141 neurons in the data, we selected the most similar unit in the model (the closest neuron in the eta- and theta-parameter space, i.e. the one with the smallest Euclidean distance in the space defined by (θ_A, θ_B, η_A, η_B)). A side-by-side comparison of the time course of responses (Fig 5.c) shows a good qualitative agreement.

      ● We successfully trained a linear decoder to read the responses of these 141 units from simulations and output trial-averaged EMG activity recorded from a monkey performing the same task (Fig 5.b).

      ● The model displays sequential activity and rotational dynamics (Fig. S10) without the need to introduce neuron-specific latencies (Michaels, Dann, and Scherberger, 2016).
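
      The unit-matching step described in the second bullet above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `match_units`, the random parameter values and the model size of 2000 units are all hypothetical; only the count of 141 recorded neurons and the use of a plain Euclidean distance in (θ_A, θ_B, η_A, η_B) come from the text.

```python
import numpy as np


def match_units(data_params, model_params):
    """For each recorded neuron, return the index of the model unit that is
    closest in (theta_A, theta_B, eta_A, eta_B) space (plain Euclidean distance)."""
    diff = data_params[:, None, :] - model_params[None, :, :]
    dist = np.linalg.norm(diff, axis=2)      # shape: (n_data, n_model)
    return np.argmin(dist, axis=1)


rng = np.random.default_rng(1)


def random_params(n):
    """Hypothetical tuning parameters: two angles and two participation strengths."""
    theta = rng.uniform(-np.pi, np.pi, size=(n, 2))
    eta = rng.uniform(0.0, 1.0, size=(n, 2))
    return np.hstack([theta, eta])


data_params = random_params(141)     # 141 recorded neurons, as in the text
model_params = random_params(2000)   # model size is an assumption
matched = match_units(data_params, model_params)
print(matched[:10])
```

      Note that a plain Euclidean distance treats the angles θ_A and θ_B as ordinary numbers and so ignores wrap-around at ±π; whether the original analysis handled this differently is not stated in the response.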

      Can our model explain the complexity of single-neuron tuning?

      We have shown that our model captures the heterogeneity of neural responses. Yet, it has been shown that neurons’ tuning properties depend on many features of movement. For example, the current version of the model does not describe the dependence of tuning on speed (Churchland and Shenoy, 2007). However, our model could be extended to incorporate it. Preliminary results suggest that in a network model in which neurons differ by the degree of symmetry of their synaptic connectivity the speed of neural trajectories can be modulated by external inputs targeting preferentially neurons that are asymmetrically connected. In our model, all connections are a sum of a symmetric and an asymmetric term. We could extend our model to incorporate variability in the degree of symmetry in the connections, and speculate that in such a model tuning would depend on the speed of movement, for appropriate forms of external inputs. We leave this study to future work.

      Can our model explain neural activity underlying more complex trajectories?

      When limb trajectories are more complex than simple reaches (Russo, et al., 2018), a single neuron’s activity displays intricate response patterns. Our work could be extended to model more complex movement in several ways. A simplifying assumption we made is that the task can be clearly separated into a preparatory phase and one movement-related phase. A possible extension is one where the motor action is composed of a sequence of epochs, corresponding to a sequence of maps in our model. It will be interesting to study the role of asymmetric connections for storing a sequence of maps. Such a network model could be used to study the storing of motor motifs in the motor cortex (Logiaco et al, 2021); external inputs could then combine these building blocks to compose complex actions.

      In summary, we proposed a simple model that can explain recordings during a straight-reaching task. It provides a scaffold upon which we can build more sophisticated models to explain the activity underlying more complex tasks. We point out that a similar limitation is present in modeling approaches where a network is trained to perform specific neural or muscle activity. The question of whether/how trained recurrent networks can generalize is not yet solved, although currently under investigation (e.g., Dubreuil et al 2022; Driscoll et al 2022).

      What is the advantage of the present model, compared to an RNN trained to output specific neural/muscle activity?

      Its simplicity. Our model is a low-rank recurrent neural network: the structure of the connectivity matrix is simple enough to allow for analytical tractability of the dynamics. The model can be used to test specific hypotheses on the relationship between network connectivity, external inputs and neural dynamics, and to test hypotheses on the learning mechanisms that may lead to the emergence of a given connectivity structure. The model is also helpful to illustrate the problem of degeneracy of network models. An interesting future direction would be to compare the connectivity matrices of trained RNNs and our model.
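
      As a hedged illustration of why tuning-based connectivity is low rank, the sketch below builds a generic single-map coupling matrix of the form J_ij ∝ η_i η_j cos(θ_i − θ_j). This form is an assumption chosen for illustration: the model in the paper has two maps and an additional asymmetric term, and the variable names and numerical values here are hypothetical. Because cos(θ_i − θ_j) = cos θ_i cos θ_j + sin θ_i sin θ_j, the matrix is a sum of two outer products and therefore has rank 2 regardless of the number of neurons.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 300
theta = rng.uniform(-np.pi, np.pi, size=N)   # preferred directions (single map, assumed)
eta = rng.uniform(0.5, 1.5, size=N)          # participation strengths (assumed range)
j_s = 1.2                                    # symmetric coupling strength (assumed value)

# Full N x N coupling matrix built directly from the tuning parameters.
J = (j_s / N) * np.outer(eta, eta) * np.cos(theta[:, None] - theta[None, :])

# The same matrix written as a sum of two outer products.
u = eta * np.cos(theta)
v = eta * np.sin(theta)
J_factored = (j_s / N) * (np.outer(u, u) + np.outer(v, v))

rank = np.linalg.matrix_rank(J)
print("rank of J:", rank)
```

      With this structure the recurrent input to the network depends on the activity only through two projections, onto `u` and onto `v`, which is the kind of reduction that makes a low-dimensional analytical description possible.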

      We addressed these points in the Discussion, in sections: “Representational vs dynamical system approaches” and “Extension to modeling activity underlying more complex tasks.”

      2) Related to the above point, the authors claim in the Abstract that their models 'recapitulate the temporal evolution of single-unit activity', yet the only evidence they present is the tuning curves of six example units. Similarly, the authors should more fully characterize the population-level signals in their networks. The inferred inputs (Fig. 6) indeed seem reasonable, yet I'm not sure how surprising this result is. Weren't the authors guaranteed to infer a large, condition-invariant input during movement and condition-specific input during preparation simply because of the shape of the order parameters estimated from the data (Fig. 6c, thin traces)?

      We thank the reviewer for this comment. Regarding the first part of the question: we added new plots with more comparisons between the activity of our model and neural recordings (see the answer above referring to Fig 5).

      Regarding the second part: It is true that the shape of the latent variables that we measure from data constrains the solution that we find. However, a “condition-invariant input during movement and condition-specific input during preparation” is not the only scenario compatible with the data. Let’s take a step back and focus on the parameters that we are inferring from data. We are inferring both the strength of external inputs and the couplings parameters. This is done in a two-step inference procedure: we start from a random guess of the couplings parameters, then we infer the strength of the external inputs, and finally we compute the cost function, which depends on all parameters. This is done iteratively, by moving in the space of the coupling parameters; for each point in the space of the coupling parameters, there is one possible configuration of external inputs. The space of the coupling parameters is shown in Fig 4.a, for example (see also Fig. S4). The solutions that we find do not trivially follow from the shape of the latent variables. For example, one possible solution could be: large parameter j_s^A, small parameter j_s^B, which correspond to a point in the lower-right region of the parameter space in Fig 4.a (Fig. S4). The resulting external input would be a strong condition-specific external input during movement execution, but a condition-invariant input during movement preparation: the model is such that, for example, exciting for a short time-interval a few neurons whose preferred direction corresponds to the direction of motion would be enough to “set the direction of motion” for the network; the pattern of tuned activity could be sustained during the whole delay period thanks to the strong recurrent connections j_s^A. We could not rule out this solution by simply looking at the shape of the latent variables. However, it is a solution we have never observed. 
      We only found solutions in the region where j_s^B is large and close to its critical value. This implies the presence of condition-specific inputs during the whole delay period, and condition-invariant external inputs that dominate over condition-specific ones during movement execution.

      3) In the Abstract and Discussion (first paragraph), the authors highlight that the preparatory and execution-related spaces in the empirical data and their models are not completely orthogonal, suggesting that this near-orthogonality serves an important mechanistic purpose. However, networks have no problem transferring activity between completely orthogonal subspaces. For example, the generator model in Fig. 8 of Elsayed, et al. (2016) is constrained to use completely orthogonal preparatory and execution-related subspaces. As the authors point out in the Discussion, such a strategy only works because the motor cortex received a large input just before movement (Kaufman et al., 2016).

      We thank the reviewer for this observation. We would like to stress the fact that we are not claiming that having an overlap between subspaces is necessary to transfer activity. Instead, our model shows that a small overlap between the maps can be exploited by the network to transfer activity between subspaces without requiring direction-specific external inputs right before movement execution. A solution where activity is transferred through feedforward inputs is also possible. Indeed, one of the observations of our work (which we highlight more in the new version of the paper) is that by looking at motor cortical activity only, we are not able to distinguish between the activity generated by a feedforward network, and one generated by a recurrent one. However, we argue that a solution where external inputs are minimized can be favorable from a metabolic point of view, as it requires fewer signals to be transmitted through long-range connections. This informs our cost function, and yields a solution where activity is transferred through recurrent connections, by exploiting the small correlation between subspaces.

    1. Author Response

      Reviewer #1 (Public Review):

      This well-done platform trial identifies that ivermectin has no impact on SARS-CoV-2 viral clearance rate relative to no study drug while casirivimab led to more rapid clearance at 5 days. The figures are simple and appealing. The study design is appropriate and the analysis is sound. The conclusions are generally well supported by the analysis. Study novelty is somewhat limited by the fact that ivermectin has already been definitively assessed and is known to lack efficacy against SARS-CoV-2. Several issues warrant addressing:

      1) Use of viral load clearance is not unique to this study and was part of multiple key trials studying paxlovid, remdesivir, molnupiravir, and monoclonal antibodies. The authors neglect to describe a substantial literature on viral load surrogate endpoints of therapeutic efficacy which exist for HIV, hepatitis B and C, Ebola, HSV-2, and CMV. For SARS-CoV-2, the story is more complicated as several drugs with proven efficacy were associated with a decrease in nasal viral loads whereas a trial of early remdesivir showed no reduction in viral load despite a 90% reduction in hospitalization. In addition, viral load kinetics have not been formally identified as a true surrogate endpoint. For maximal value, a reduction in viral load would be linked with a reduction in a hard clinical endpoint in the study (reduction in hospitalization and/or death, decreased symptom duration, etc...). This literature should be discussed and data on the secondary outcome, and reduction in hospitalization should be included to see if there is any relationship between viral load reduction and clinical outcomes.

      This is an important point and we thank the reviewer for raising it. We agree that there is a rich literature on the use of viral load kinetics in optimizing treatment of viral infectious diseases, and we are clearly not the first to think of it! We have added the following sentence in the discussion.

      “The method of assessing antiviral activity in early COVID-19 reported here builds on extensive experience of antiviral pharmacodynamic assessments in other viral infections.”

      We agree that more information is needed to link viral clearance measures to clinical outcomes. We have addressed this in the discussion as follows:

      “Using less frequent nasopharyngeal sampling in larger numbers of patients, clinical trials of monoclonal antibodies, molnupiravir and ritonavir-boosted nirmatrelvir, have each shown that accelerated viral clearance is associated with improved clinical outcomes [1,4,5]. These data suggest reduction in viral load could be used as a surrogate of clinical outcome in COVID-19. In contrast the PINETREE study, which showed that remdesivir significantly reduced disease progression in COVID-19, did not find an association between viral clearance and therapeutic benefit. This seemed to refute the usefulness of viral clearance rates as a surrogate for rates of clinical recovery [16]. However, the infrequent sampling in all these studies substantially reduced the precision of the viral clearance estimates (and thus increased the risk of type 2 errors). Using the frequent sampling employed in the PLATCOV study, we have shown recently that remdesivir does accelerate SARS-CoV-2 viral clearance [17], as would be expected from an efficacious antiviral drug. This is consistent with therapeutic responses in other viral infections [18, 19]. Taken together the weight of evidence suggests that accelerated viral clearance does reflect therapeutic efficacy in early COVID-19, although more information will be required to characterize this relationship adequately.”

      2) The statement that oropharyngeal swabs are much better tolerated than nasal swabs is subjective. More detail needs to be paid to the relative yield of these approaches.

      The statement is empirical. We know of other studies in progress where there are high rates of discontinuation because of patient intolerance of repeated nasopharyngeal sampling. Not one of 750 patients enrolled to date in PLATCOV has refused sampling, which we believe is useful information for research involving multiple sampling. This is clearly a critical point for pharmacodynamic studies.

      We agree that the optimal site of swabbing for SARS-CoV-2 and relative yields for the given test requirements (sensitivity vs quantification) need to be considered, although the literature on this is large and sometimes contradictory.

      We have added the following line:

      Oropharyngeal viral loads have been shown to be both more and less sensitive for the detection of SARS-CoV-2 infection. Although rates of clearance are very likely to be similar from the two body sites, this should be established for comparison with other studies.

      3) The stopping rules as they relate to previously modeled serial viral loads are not described in sufficient detail.

      The initial stopping rules were chosen based on previously modelled data (reference 11). We have added details to the text (lines 199-219):

      “Under the linear model, for each intervention, the treatment effect β is encoded as a multiplicative term on the time since randomisation: e^{βT}, where T=1 if the patient was assigned the intervention, and zero otherwise. Under this specification β=0 implies no effect (no change in slope), and β>0 implies increase in slope relative to the population mean slope. Stopping rules are then defined with respect to the posterior distribution of β, with futility defined as Prob[β<λ]>0.9; and success defined as Prob[β>λ]>0.9, where λ≥0. Larger values of λ imply a smaller sample size to stop for futility but a larger sample size to stop for efficacy. λ was chosen so that it would result in reasonable sample size requirements, as was determined using a simulation approach based on previously modelled serial viral load data [11]. This modelling work suggested that a value of λ=log(1.05) [i.e. 5% increase] would require approximately 50 patients to demonstrate increases in the rate of viral clearance of ~50%, with control of both type 1 and type 2 errors at 10%. The first interim analysis (n=50) was prespecified as unblinded in order to review the methodology and the stopping rules (notably the value of λ). Following this, the stopping threshold was increased from 5% to 12.5% [λ=log(1.125)] because the treatment effect of casirivimab/imdevimab against the SARS-CoV-2 Delta variant was larger than expected and the estimated residual error was greater than previously estimated. Thereafter trial investigators were blinded to the virus clearance results. Interim analyses were planned every batch of additional 25 patients’ PCR data however, because of delays in setting up the PCR analysis pipeline, the second interim analysis was delayed until April 2022. By that time data from 145 patients were available (29 patients randomised to ivermectin and 26 patients randomized to no study drug).”
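
      The stopping rules quoted above can be written down as a short decision function. The sketch below is illustrative only: the function name `stopping_decision` is hypothetical, and the posterior samples are synthetic draws from normal distributions chosen for the example, not output from the trial's Bayesian model. Only the thresholds (probability 0.9, λ = log(1.125)) are taken from the text.

```python
import numpy as np


def stopping_decision(beta_samples, lam, threshold=0.9):
    """Apply the rule: success if Prob[beta > lam] > threshold,
    futility if Prob[beta < lam] > threshold, otherwise keep enrolling."""
    beta_samples = np.asarray(beta_samples)
    p_above = np.mean(beta_samples > lam)   # Monte Carlo estimate of Prob[beta > lam]
    p_below = np.mean(beta_samples < lam)   # Monte Carlo estimate of Prob[beta < lam]
    if p_above > threshold:
        return "success"
    if p_below > threshold:
        return "futility"
    return "continue"


lam = np.log(1.125)                 # revised threshold: 12.5% faster clearance
rng = np.random.default_rng(3)

# Synthetic posterior samples of beta (assumed shapes, for illustration only).
effective = rng.normal(loc=np.log(1.5), scale=0.05, size=10_000)   # about 50% faster clearance
ineffective = rng.normal(loc=0.0, scale=0.05, size=10_000)         # no change in slope
uncertain = rng.normal(loc=lam, scale=0.05, size=10_000)           # centred on the threshold

for name, samples in [("effective", effective),
                      ("ineffective", ineffective),
                      ("uncertain", uncertain)]:
    print(name, stopping_decision(samples, lam))
```

      Since β multiplies the slope through e^{βT}, a value of β = log(1.125) corresponds to a 12.5% increase in the clearance rate for treated patients. Raising λ moves the comparison point upward, which makes the futility criterion easier and the success criterion harder to meet for a given posterior.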

      4) The lack of blinding limits any analysis of symptomatic outcomes.

      We added this line to the discussion:

      “Finally, although not primarily a safety study, the lack of blinding compromises safety or tolerability assessments.”

      5) It is unclear whether all 4 swabs from 2 tonsils are aggregated. Are the swabs placed in a single tube and analyzed?

      The data are not aggregated but treated as independent and identically distributed under the linear model. 4 swabs were taken at randomization, followed by two at each follow-up visit. We have added line 183:

      “[..] (18 measurements per patient, each swab is treated as independent and identically distributed conditional on the model).”

      Swabs were stored separately and not aggregated.

      6) In supplementary Figure 7, both models do well in most circumstances but fail in the relatively common event of non-monotonic viral kinetics (multiple peaks, rebound events). Given the importance of viral rebound during paxlovid use, an exploratory secondary analysis of this outcome would be welcome.

      Thank you for the suggestion. We agree, although the primary goal is to estimate the mean change in slope. Rebound is a relatively rare event and tends to occur after the first seven days of illness in which we are assessing rate of clearance.

      Nevertheless, we agree that this is an important point. It remains unclear how to model viral rebound. In over 700 profiles now available from the study, only a few have strong evidence of viral rebound.

      Reviewer #2 (Public Review):

      This manuscript details the analytic methods and results of one arm of the PLATCOV study, an adaptive platform designed to evaluate low-cost COVID-19 therapeutics through enrollment of a comparatively smaller number of persons with acute COVID-19, with the goal of evaluating the rate of decrease in SARS-CoV-2 clearance compared to no treatment through frequent swabbing of the oropharynx and a Bayesian linear regression model, rather than clinical outcomes or the more routinely evaluated blunt virologic outcomes employed in larger trials. Presented here, is the in vivo virologic analysis of ivermectin, with a very small sample of participants who received the casirivimab/imdevimab, a drug shown to be highly effective at preventing COVID-19 progression and improving viral clearance (during circulation of variants to which it had activity) included for comparison for model evaluation.

      The manuscript is well-written and clear. It could benefit however from adding a few clarifications on methods and results to further strengthen the discussion of the model and accurately report the results, as detailed below.

      Strengths of this study design and its report include:

      1) Selection of participants with presumptive high viral loads or viral burden by antigen test, as prior studies have shown difficulty in detecting effect in those with a lower viral burden.

      2) Adaptive sample size based on modeling- something that fell short in other studies based on changing actuals compared to assumptions, depending on circulating variant and "risk" of patients (comorbidities, vaccine state, etc) over time. There have been many other negative studies because the a priori outcomes assumptions were different from the study design to the time of enrollment (or during the enrollment period). This highlight of the trial should be emphasized more fully in the discussion.

      3) Higher dose and longer course of ivermectin than TOGETHER trial and many other global trials: 600 µg/kg/day vs 400 µg/kg/day.

      4) Admission of trial participants for frequent oropharyngeal swabbing vs infrequent sampling and blunter analysis methods used in most reported clinical trials

      5) Linear mixed modeling allows for heterogeneity in participants and study sites, especially taking the number of vaccine doses, variant, age, and serostatus into account- all important variables that are not considered in more basic analyses.

      6) The novel outcome being the change in the rate of viral clearance, rather than time to the undetectable or unquantifiable virus, which is sensitive, despite a smaller sample size

      7) Discussion highlights the importance of frequent oral sampling and use of this modeled outcome for the design of both future COVID-19 studies and other respiratory viral studies, acknowledging that there are no accepted standards for measuring virologic or symptom outcomes, and many studies have failed to demonstrate such effects despite succeeding at preventing progression to severe clinical outcomes such as hospitalization or death. This study design and analyses are highly important for the design of future studies of respiratory viral infections or possibly early-phase hepatitis virus infections.

      Weaknesses or room for improvement:

      1) The methods do not clearly describe allocation to either ivermectin or casirivimab/imdevimab or both or neither. Yes, the full protocol is included, but the platform randomization could be briefly described more clearly in the methods section.

      We have added additional text to the Methods:

      “The no study drug arm comprised a minimum proportion of 20% and uniform randomization ratios were then applied across the treatment arms. For example, for 5 intervention arms and the no study drug arm, 20% of patients would be randomized to no study drug and 16% to each of the 5 interventions. Additional details on the randomization are provided in the Supplementary Materials. All patients received standard symptomatic treatment.”
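
      The allocation arithmetic in this passage can be sketched as follows. This is one reading of the quoted text, not the trial's randomisation code: it assumes that allocation is uniform across all arms unless that would give the no study drug arm less than 20%, in which case the remainder is split evenly across the interventions. The function name `allocation_ratios` is hypothetical.

```python
def allocation_ratios(n_interventions, min_control=0.20):
    """Return (control_share, share_per_intervention).

    Assumption: uniform allocation across all arms, with the
    no study drug arm never falling below `min_control`.
    """
    uniform = 1.0 / (n_interventions + 1)
    control = max(min_control, uniform)
    per_intervention = (1.0 - control) / n_interventions
    return control, per_intervention


# Example from the text: 5 interventions plus the no study drug arm.
control, per_arm = allocation_ratios(5)
print(round(control, 4), round(per_arm, 4))
```

      Under the same assumption, four interventions plus the no study drug arm give 20% per arm, which is consistent with the 10 patients per arm reported for the first 50 patients.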

      2) The handling of unquantifiable or undetectable viruses in the models is not clear in either the manuscript or supplemental statistical analysis information. Are these values imputed, or is data censored once below the limits of quantification or detection? How does the model handle censored data, if applicable?

      We have added lines 185-186:

      “Viral loads below the lower limit of quantification (CT values ≥40) were treated as left-censored under the model with a known censoring value.”
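
      As a minimal sketch of what left-censoring with a known censoring value means for the likelihood: an observed viral load contributes its density, while a censored one contributes the probability of lying at or below the limit. The normal error model, the function names, and the example values are assumptions for illustration; the trial's model may use a different error distribution.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of a normal distribution."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def normal_logcdf(x, mu, sigma):
    """Log of the normal cumulative distribution function."""
    return math.log(0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0)))))


def censored_loglik(observations, mu, sigma, lower_limit):
    """Log-likelihood with left-censoring at a known limit.

    `observations` is a list of (value, is_censored) pairs on the
    log10 viral load scale; `value` is ignored when censored.
    """
    total = 0.0
    for value, is_censored in observations:
        if is_censored:
            # Only known to be at or below the limit: use the CDF.
            total += normal_logcdf(lower_limit, mu, sigma)
        else:
            total += normal_logpdf(value, mu, sigma)
    return total


# Hypothetical data: two quantified swabs and one below the limit.
data = [(4.2, False), (3.1, False), (None, True)]
print(censored_loglik(data, mu=3.5, sigma=1.0, lower_limit=1.0))
```

      Compared with imputing a fixed value for unquantifiable samples, this keeps the information that the viral load was low without pretending to know its exact value.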

      3) Did the study need to be unblinded prior to the first interim analysis? Could the adaptive design with the first analysis have been done with only one or a subset of statisticians unblinded prior to the decision to stop enrolling in the ivermectin arm?

      The unblinded interim analysis was done on the first 50 patients enrolled in the study. The study at that time was enrolling into five arms including ivermectin, casirivimab-imdevimab, remdesivir, favipiravir, and a no study drug arm (there were exactly 10 per arm as a result of the block randomization).

      The main rationale for making this interim analysis unblinded was to determine the most reasonable value of λ (this defines stopping for futility/success), which is a trade-off between information gain, reasonable sample size expectations, and the balance between quickly identifying interventions which have antiviral activity versus the certainty of stopping for futility.

      Once the value of 12.5% was decided, the trial investigators remained blinded to the results until the stopping rules were met and the unblinded statistician discussed with the independent Data Safety and Management Board who agreed to unblind the ivermectin arm.

      4) Can the authors comment on why the interim analysis occurred prior to the enrollment of 50 persons in each of the ivermectin and comparison arms? Even though the sample sizes were close (41 and 45 persons), the trigger for interim analysis was pre-specified.

      After the first interim analysis, conducted once 50 patients had been enrolled into the study, further interim analyses were planned after every additional 25 patients (i.e. very frequently). The trigger for the first interim analysis was not 50 patients in a specific arm but 50 patients in total, and subsequent analyses were planned with every 25 new patients enrolled into the study. In practice there were backlogs in the data pipeline (which we explain), and interim analyses occurred less frequently than planned, the second one being in April 2022.

      5) The reporting of percent change for the intervention arms is overstated. All credible intervals cross zero: the clearance for ivermectin is stated to be 9% slower, but the CI includes + and - %, so it should be reported as "not different." Similarly, and more importantly for casirivimab/imdevimab, it was reported to be 52% faster, although the CI is -7.0 to +115%. This is likely a real difference, but with ten participants underpowered- and this is good to discuss. Instead, please report that the estimate was faster, but that it was not statistically significant. Similarly, the clearance half-life for ivermectin is not different, rather than "slower" as reported (CI was -2 to +6.6 hours). This result was however statistically significant for casirivimab/imdevimab.

      Thank you for your comments. The confidence interval for casirivimab/imdevimab did not cross zero and was +7.0 to +115.1%, and we thank the reviewer for picking up the error in the results section (it was correct in the abstract) where it was written -7.0 to +115.1%. We have made this correction. Elsewhere, we have provided more precise language to discriminate clinical significance from statistical significance, as per the essential revisions.

      6) While the use of oropharyngeal swabs is relatively novel for a clinical trial, and they have been validated for diagnostic purposes, the results of this study should discuss external validity, especially with respect to results from other studies that mainly use nasopharyngeal or nasal swab results. For example, oropharyngeal viral loads have been variably shown to be more sensitive for the detection of infection, or conversely to have 1-log lower viral loads compared to NP swabs. Because these models look for longitudinal change within a single sampling technique, they do not impact internal validity but may impact comparisons to other studies or future study designs.

      We have added the following sentence to the discussion:

      “Oropharyngeal viral loads have been shown to be both more and less sensitive for the detection of SARS-CoV-2 infection. Although rates of viral clearance are very likely to be similar from the two sites, this should be established for comparison with other studies.”

      7) Caution should be used around the term "clinically significant" for viral clearance. There is not an agreed-upon rate of clinically significant clearance, nor is there a log10 threshold that is agreed to be non-transmissible despite moderately strong correlations with the ability to culture virus or with antigen results at particular thresholds.

      We agree. We have addressed this partly in our response to Reviewer 1.

      8) Additional discussion could also clarify that certain drugs, such as remdesivir, have shown in vivo activity in the lungs of animal models and improvement in clinical outcomes in people, but without change in viral endpoints in nasopharyngeal samples (PINETREE study, Gottlieb, NEJM 2022). Therefore, this model must be interpreted as no evidence of antiviral activity in the pharyngeal compartment, rather than a complete lack of in vivo activity of agents given the limitations of accessible and feasible sampling. That said, strongly agree with the authors about the conclusion that ivermectin is also likely to lack activity in humans based on the results of this study and many other clinical studies combined.

      As above this has been addressed in our response to Reviewer 1.

      Reviewer #3 (Public Review):

      This is a well-conducted phase 2 randomized trial testing outpatient therapeutics for Covid-19. In this report of the platform trial, they test ivermectin, demonstrating no virologic effect in humans with Covid-19.

      Overall, the authors' conclusions are supported by the data.

      The major contribution is their implementation of a new model for Phase 2 trial design. Such designs would have been ideal earlier in the pandemic.

      We thank the reviewer for their encouraging comments.

    1. Author Response

      Reviewer #1 (Public Review):

      Auxin-induced degradation is a strong tool to deplete CHK-2 and PLK-2 in the C. elegans germ line. The authors strengthen their conclusions through multiple approaches, including rescuing mutant phenotypes and biochemical analyses of CHK-2 and PLK-2.

      The authors overcame a technical limitation that would hinder in vitro analysis (low quantity of CHK-2) through the clever approach of preventing its degradation via the proteasome. In vitro phosphorylation assays and mass spectrometry analysis that establishes that CHK-2 is a substrate of PLK-2 nicely complement the genetic data.

      The authors argue that the inactivation of CHK-2 by PLK-2 promotes crossover designation; however, the data only indicate that PLK-2 promotes proper timing of crossover designation.

      We thank the reviewer for this point of clarification. While we believe that PLK activity is essential to inactivate CHK-2 and trigger CO designation, we agree that this has not been firmly established with the tools available to us, as elaborated below. We have revised the text to avoid overstating the conclusions.

      It is not clear whether the loss of CHK-2 function with the S116A and T120A mutations is the direct result of the inability to phosphorylate these residues or whether it is caused by the apparent instability of these proteins, as their abundance was reduced in IPs compared to wild-type.

      Agreed. The instability of the mutant proteins was a source of significant frustration during the course of this work, and limits the strength of our conclusions.

      The mechanism of CHK-2 inactivation in the absence of PLK-2 remains unclear, though the authors were able to rule out multiple candidates that could have played this role.

      Reviewer #2 (Public Review):

      In this manuscript, Zhang et al., address the role of Polo-like kinase signaling in restricting the activity of Chk2 kinase and coordinating synapsis among homologous chromosomes with the progression of meiotic prophase in C. elegans. While individual activities of PLK-2 and CHK-2 have been demonstrated to promote chromosome pairing, and double-strand break formation necessary for homologous recombination, in this manuscript the authors attempt to link the function of these two essential kinases to assess the requirement of CHK-2 activity in controlling crossover assurance and thus chromosome segregation. The study reveals that CHK-2 acts at distinct regions of the C. elegans germline in a Polo-like kinase-dependent and independent manner.

      Strengths:

      The study reveals distinct mechanisms through which CHK-2 functions in different spatial regions of meiosis. For example, it appears that CHK-2 activity is not inhibited by PLK's (1 and 2) in the leptotene/zygotene meiotic nuclei where pairing occurs. This suggests that either CHK-2 is not phosphorylated by PLK-2 in the distal nuclei or that it has a kinase-independent function in this spatial region of the germline. These are interesting observations that further our understanding of how the processes of meiosis are orchestrated spatially for coordinated regulation of the temporal process.

      Weaknesses:

      While the possibilities stated above are interesting, they lack direct support from the data. A key missing element in the study is the actual role of PLK-2 signaling in controlling CHK-2 activity and thus function. I expand on this below.

      Throughout the manuscript, the authors test the role of each of the kinases (CHK-2 or PLK-1, or 2) using auxin-induced degradation, which would eliminate both phosphorylated and unphosphorylated pools of proteins. This experiment thus does not test the role of PLK-2 signaling in controlling CHK-2 function or the role of CHK-2 activation. To test the role of signaling from PLK-2 or CHK-2, the authors need to generate appropriate alleles such as phospho-mutants or kinase-dead mutants. The authors do generate unphosphorylatable and phosphomimetic versions of CHK-2, however, they find that the protein level for both these alleles is lower than wild-type CHK-2 (which the authors state is already low). The authors conclude that the lower level of protein in the CHK-2 phospho-mutants is because the mutations cause destabilization of the protein. I am sympathetic with the authors since clearly these results make interpretations of actual signaling activity more challenging. But there needs to be some evidence of this activity, for example through the generation of a phospho-specific antibody to phosphorylated CHK-2. While not functional, at least the phosphorylation status of CHK-2 would provide more information on its spatial pattern of activation and inactivation. In addition, it would still be of interest to the readership to present the data on these phospho-mutant alleles with crossover designation and COSA-1::GFP. Is the phenotype of the WT knockin, and each of the phosphomutant knock-ins similar to auxin-induced degradation of CHK-2?

      We thank the reviewer for these comments. We have made several attempts over the past decade that have failed to elicit a CHK-2 antibody that works for either immunofluorescence or western blots, likely due to the very low abundance of CHK-2. This has discouraged us from investing yet more resources to try to develop a phospho-specific antibody. Moreover, our evidence suggests that phosphorylation may promote CHK-2 degradation. Since the phosphomutants of CHK-2 are not stable, we do not think knock-in of these phosphomutants will provide new insights.

      Given that the CHK-2 phosphomutants did not pan out for assessing the signaling regulation of PLK-2 on CHK-2, to directly assess whether PLK-2 activity restricts CHK-2 function in mid-pachytene but not leptotene/zygotene, the authors should generate PLK-2 kinase dead alleles. These alleles will help decouple the signaling function of PLK-2 from a structural function.

      Similarly, to assess the potentially distinct roles of CHK-2 in leptotene/zygotene and mid-pachytene it would be important to assess CHK-2 kinase-dead mutant alleles. At this time, all of the analysis is based on removing both active CHK-2 and inactive CHK-2 (i.e. phosphorylated and unphosphorylated pool) using auxin-induced degradation. The kinase-dead alleles will help infer the role of the kinase more directly. The authors can then superimpose the auxin-induced degradation and assess the impact of complete removal of the protein vs only loss of its kinase function. These experiments may help clarify the role of signaling outcomes of these proteins, vs their complete loss. For example, what does kinase dead PLK-2 recruitment to the synapsed chromosomes appear like? Are there distinct activities for active and inactive PLK-2 that are spatially regulated? The same can be tested for CHK-2.

      A kinase-dead allele of plk-2 has been generated in previous work and we have used it for other purposes. However, the fact that CHK-2 and PLK-2 are required for homolog pairing and synapsis, which are prerequisites for crossover designation, precludes their use here.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      1. General Statements

      We thank the reviewers for their thorough and insightful evaluations of our manuscript and for their constructive feedback, which have significantly improved the quality of our manuscript. We were pleased to read that all three reviewers found our work novel, interesting, and relevant. In this revised manuscript, we have done our best to address all of the points raised by the reviewers by performing new experiments and revising sections of the text, as requested.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity):

      In this manuscript the authors show that extracellular Mtb aggregates can cause macrophage killing in a close contact dependent but phagocytosis independent manner. They showed Mtb aggregates can induce plasma membrane perturbations and cytoplasmic Ca2+ influx with live cell microscopy. Next, the authors show that the type of cell death initiated by extracellular aggregates is pyroptosis and they partially suppressed cell death with pyroptosis inhibitors. They also identified that PDIM, EsxA/EsxB and EspB all have a role in uptake-independent killing of macrophages even though their impact varies with respect to membrane perturbation and Ca2+ influx. Finally, they used a small molecule inhibitor BTP15 to inhibit the effect of ESX-1 during the contact of the extracellular Mtb aggregates with the macrophages and they observed a substantial decrease in membrane perturbation and macrophage killing.
      The work describes a very interesting mechanism by which Mtb can kill macrophages that is possibly relevant in the context of infection.

      1. In general, there are two main issues with the experiments and the interpretation: the lack of quantitative analysis showing that in a population of macrophages the ones that are in contact with the aggregates die whereas the ones that are not in contact remain alive. This is currently not shown, and it should be added in figure 1.

      All our data are based on the visual inspection and annotation of time-lapse microscopy image series, from which it is conclusive that death happens more often among cells in contact with Mtb aggregates (see movies S3 and S6 for representative examples). However, we acknowledge the reviewer’s suggestion that quantitative data supporting this observation might help to convey this conclusion more effectively. Therefore, we have quantified the percentage of dead cells in: I) macrophages in uninfected controls; II) macrophages that establish contact with an Mtb aggregate; III) bystander macrophages that never contact an Mtb aggregate despite being in the same sample as the infected cells, in experiments with (figure 1D) or without (figure 1Q) cytochalasin D treatment. These data have been incorporated as two additional plots in figure 1 in the revised manuscript. We find that uninfected and bystander cells have similar survival probabilities over the time-course of an experiment, whereas most of the cells that physically interact with Mtb aggregates die by the end of the experiment. To further validate these observations, we have also plotted the lifespans of infected cells vs. bystander cells without (figure S3A) and with (figure S3B) cytochalasin D treatment. In these plots, the lifespan of an individual cell is represented by a line; the fraction of the line coloured in black corresponds to the time spent as bystander and the fraction of the line in magenta corresponds to the time spent in contact with an Mtb aggregate. We hope that these new data convincingly show that bystander cells (black lines) survive longer compared to cells that interact with Mtb aggregates (black-magenta lines).

      1. The second is the cell death mode, as the markers used are very different and considering different outcomes (e.g., apoptosis vs. necrosis) are relevant for the infection it is unclear what is being measured here and the impact on bacterial replication.

      As the reviewer points out, it has previously been shown that different cell death pathways can affect viability and propagation of intracellular bacteria (1, 2). Since in our experiments we are specifically analyzing extracellular bacteria, we cannot directly comment on how cell death affects intracellular bacterial replication. However, to address the reviewer’s comment, we have included additional data in figure S13A of the revised manuscript showing that specific inhibitors of cell death do not affect the growth or replication of extracellular Mtb. These results suggest that while these molecules do not affect Mtb growth per se, the suppression of these specific death pathways also does not significantly affect the microenvironment to alter Mtb growth (i.e., access to nutrients or molecules released by dead cells). In addition, we have included new data in figure S12 demonstrating the responsiveness of our isolated macrophages to the various cocktails of molecules typically used to induce apoptosis, pyroptosis, or necroptosis.

      The authors are showing that infection with Mtb aggregates increase the rate of the macrophage killing but how does this impact infection dissemination and replication of the bacterial aggregates? Is it beneficial for the aggregates? Did the authors check the growth rate of Mtb along with cytochalasin D?

      A previous study has shown that phagocytosis of Mtb aggregates leads to macrophage death more efficiently than phagocytosis of a similar number of individual bacteria (3). It has also been shown that Mtb growing on the debris of dead host macrophages forms cytotoxic aggregates that kill the newly interacting macrophages (3). These observations suggest a model in which host cell death induced by Mtb aggregates supports faster extracellular growth and propagation of infection (3). This study was cited in the Introduction section of our manuscript, and our data support these observations. In the revised manuscript, we show that single Mtb bacilli or Mtb aggregates induce macrophage death in a dose-dependent manner (figure S7A,B); however, bacterial aggregates kill more efficiently when compared to similar numbers of non-aggregated bacilli (figure S7A,B). We also show that infection with Mtb aggregates leads to faster bacterial propagation compared to infection with similar numbers of individual bacteria (figure S7C,D). These observations, combined with our data showing that Mtb aggregation also enhances uptake-independent killing of macrophages (figure 2), suggest that Mtb aggregates induce rapid host cell death, allowing the bacteria to escape intracellular stresses, grow faster outside host cells (figure S1B), and propagate to other cells. To address the reviewer’s concern whether cytochalasin D affects Mtb growth, the revised manuscript includes additional data confirming that cytochalasin D does not affect the growth of Mtb aggregates (figure S6).

      1. How did the authors quantify the interactions of Mtb with macrophages in Figure 1D?

      The interactions of Mtb with macrophages were quantified through manual annotation of the time-lapse microscopy image series. If the Mtb aggregates disaggregated upon interaction with the macrophage, resulting in redistribution of smaller aggregates of bacteria, we categorized them as “fragmented”. On the other hand, if the aggregates remained clustered, we categorized them as “not fragmented”. Representative snapshots of these two patterns are presented in figure 1E and 1F and we have included additional representative examples in movies S4 and S5 of the revised manuscript. These interactions are quantified and plotted in figure 1N of the revised manuscript (figure 1D in the original version).

      1. Is it enough to conclude with one example of SEM that the mycobacteria with different fragmentation discriminates if the bacteria is intracellular or extracellularly localised? Can authors use an alternative quantitative method to confirm the localization of the bacteria by a quantification by 3D imaging of these two phenotypes with a cytoskeleton marker (or may be even with tdTomato-expressing BMDMs)?

      In the revised manuscript, we provide additional examples of correlative time-lapse microscopy and SEM images (supplementary figure S5). As suggested by the reviewer, in the revised manuscript we further validate these conclusions using an alternative approach based on correlative time-lapse microscopy followed by confocal 3D imaging. After time-lapse imaging, we fixed the samples and labelled the plasma membrane of the macrophages with a fluorescent anti-CD45 antibody to define the cell boundaries and identify bacteria that are intracellular vs. extracellular. Representative images obtained using this approach have been added to figure 1 and additional examples are shown in supplementary figure S4 of the revised manuscript. The acquisition, processing, and analysis of these 3D images are time-consuming and prevent us from performing an exhaustive quantitative analysis. However, we are confident in our conclusions, since in all of the cells that we analyzed we found that aggregates that are not fragmented within 6 hours of stable interaction with macrophages are visible on the outer side of the plasma membrane.

      1. How do we know if the cell is lysed at 30 h in Supplementary Figure 1, did the authors use a marker to detect the cell lysis or is it based on just the observation from the live cell imaging? Movies in supplementary are actually not very informative as there are many ongoing events and it is hard to visualise what the authors claim. A marker of cell death in the movies should be used.

      In this study, we used brightfield time-lapse microscopy images to identify cell death. Dying macrophages rapidly change shape, lose membrane integrity, and stop moving. Moreover, the intracellular structures and bacteria also stop moving at the time of death of the host cell. While these events can be difficult to distinguish by examining individual snapshots, they are readily identifiable by careful frame-by-frame examination of time-lapse microscopy image series. To exemplify this process, in the revised manuscript we show in supplementary figure S2A how we identify macrophage cell death events. We also include Draq7 (a live cell-impermeable dye commonly used to identify dead cells by flow cytometry and microscopy) in the growth medium during time-lapse imaging in order to label dead macrophages. The timing of staining validates and confirms our strategy of using brightfield time-lapse images to define the time-of-death of individual cells. To further assist readers, in the revised manuscript we provide the time-lapse microscopy movie used to generate this figure (movie S4). Similar images and movies have also been added for cells treated with cytochalasin D (figure S2B; movie S7). As suggested by the reviewer, we also replaced figure S1A with a new figure that shows a representative example of an Mtb intracellular microcolony that, upon death of the host macrophage, grows and forms a large extracellular aggregate on the debris of the dead cell (Draq7-positive). Movie S2 was used to generate this figure. Finally, we replaced figures 1E,F with new figures incorporating the Draq7 staining to label macrophage cell death and we include the time-lapse microscopy movies used to generate these figures (movies S4, S5).

      1. Total macrophage killing after contact in Figure 1L occurs within around 12 hours, whereas macrophage death after contact under cytochalasin D treatment in Figure 1M takes even longer than 24 hours. The viability at 12 hours in Figure 1M is similar to the survival with fragmented Mtb in Figure 1L. Why is there a difference in timing with respect to macrophage killing?

      We thank the reviewer for this interesting observation. Indeed, we find that macrophages treated with cytochalasin D do take longer to die upon establishing stable interaction with Mtb aggregates in comparison to untreated cells. Although we do not have a clear explanation for this difference in timing, we speculate that by inhibiting actin polymerization and consequently cell motility, cytochalasin D might slow the expansion of the macrophage plasma membrane and the establishment of a larger interface of contact between the cell and the bacterial aggregate, which could influence the timing of cell death.

      1. Did authors perform statistical tests for Figure 1D and Figure 1N? p-values should be added.

      Figure 1D (figure 1N in the revised manuscript) shows the percentage of interactions between macrophages and Mtb aggregates that do or do not lead to fragmentation of the aggregate. Each dot represents the percentage of these events in one experimental replicate. We included this plot to show that reproducibly in all our replicates approximately 20% of the interactions do not lead to fragmentation of the aggregate. Since the purpose of this plot is not to compare the “fragmented” and “non-fragmented” populations but rather to highlight the reproducibility of the phenomenon, we do not think it would be appropriate to add a p-value. However, figure 1N (figure 1Q in the revised manuscript) has been updated and modified to include statistical analysis and a p-value.

      1. In Figure 3, do the observations indicated in the Figure 3 happen in all the macrophages that are in contact with aggregates? This is unclear and critical to support the conclusions. Do all the macrophages that are in contact with Mtb aggregates become Annexin-V positive? In Supplementary Figure 2 there is some information regarding this question, but it will be important to show it as a percentage.

      In response to the reviewer’s suggestion, we have modified the figure to include quantitation of Annexin V staining. Approximately 75% of the macrophages that interact with an Mtb aggregate show detectable local Annexin V-positive membrane domains at the site of contact with the aggregate during a typical 60-hour-long experiment. Since most of the macrophages show local Annexin V-positive membrane domains within the first 12 hours upon contact with an Mtb aggregate (figure 3C), we used this criterion for comparison of different conditions or strains (for example, those shown in figure 6F). In addition, we added figure 3D, which shows the behaviour of 105 macrophages upon contact with Mtb aggregates in a typical experiment. In this plot, each line represents the lifespan of an individual cell; the fraction of the line in black represents the time spent as bystander, the fraction of the line in magenta represents the time spent interacting with an Mtb aggregate, and the fraction in green represents the time upon formation of local Annexin V-positive membrane domains at the site of contact with the Mtb aggregate. We believe that this additional information further supports our conclusions that most of the cells in contact with an Mtb aggregate show local Annexin V-positive membrane domains and that cells that show this pattern die faster than cells that do not develop local Annexin V-positive membrane domains.

      1. Did the authors try to stain Mtb aggregates alone with Annexin-V as a control over the duration of the imaging?

      We thank the reviewer for suggesting this control. In supplementary figure S8C of the revised manuscript, we include a representative example of a time-lapse microscopy image series showing Mtb aggregates that never interact with a live macrophage although they are adjacent to a dead cell. As observed in the Annexin V fluorescence images (yellow), these Mtb aggregates never become Annexin V-positive during the course of the experiment (60 hours).

      1. In Figure 4, did the authors continue to image the cells interacting with Mtb aggregates that do not die after Ca2+ accumulation in Supplementary Figure 3D? Do these cells recover from the plasma membrane perturbation? Did the authors consider using another marker for plasma membrane perturbation together with BAPTA?

      Unfortunately, we are not able to image macrophages stained with Oregon Green 488 Bapta-1 AM for more than 36 hours because they lose fluorescence over time, possibly due to partial dye degradation or secretion. Another issue is that macrophages do not establish synchronous interaction with Mtb aggregates (figure 3D; figure S3B). In order to pool together results from many cells, we analyze all the cells that interact with Mtb within the first 20 hours and we define as timepoint 0 the time at which each individual cell establishes interaction with the bacteria. To compare similar time windows for each cell, we use fluorescence values measured at 16 hours post-interaction with bacteria as a readout. This time window is sufficient to observe formation of local Annexin V plasma membrane domains and death in a relevant number of macrophages (figure 1P; figure 3D). Not all of the contacted cells die within the timeframe of our experiments; however, we believe that if we imaged cells that accumulate Ca2+ for longer durations, we would find that all such cells eventually die. This assumption is consistent with the observation that calcium chelation reduces inflammasome activation and death in macrophages in contact with Mtb aggregates (figure 5D; figure 4E).
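
      The pooling strategy described above (cells included only if they establish contact within the first 20 hours, a cell-specific time zero at contact, and a readout 16 hours later) can be illustrated with a minimal sketch. The function name, the trace layout, and the half-hour frame tolerance are hypothetical and are not taken from the manuscript; whether the 20-hour cutoff is inclusive is also an assumption.

```python
from typing import List, Optional, Tuple

# Hypothetical trace layout: a list of (time_h, fluorescence) frames for one cell.
Trace = List[Tuple[float, float]]


def readout_after_contact(
    contact_time_h: float,
    trace: Trace,
    max_contact_time_h: float = 20.0,   # inclusion window stated in the response
    readout_offset_h: float = 16.0,     # readout time after contact stated in the response
    tolerance_h: float = 0.5,           # assumed: accept the nearest frame within 30 min
) -> Optional[float]:
    """Return the fluorescence measured ~16 h after this cell's own time zero.

    Returns None if the cell made contact too late to be pooled, or if no
    frame lies close enough to the readout time.
    """
    if contact_time_h > max_contact_time_h:
        return None  # contact established after the inclusion window
    if not trace:
        return None
    target_h = contact_time_h + readout_offset_h
    # Pick the frame whose absolute time is closest to the cell-specific target.
    distance, value = min((abs(t - target_h), f) for t, f in trace)
    return value if distance <= tolerance_h else None


# Example with made-up numbers: contact at 5 h, so the readout is taken at 21 h.
example_trace = [(0.0, 1.0), (10.0, 2.0), (21.0, 5.0), (30.0, 9.0)]
value_at_readout = readout_after_contact(5.0, example_trace)
```

      Aligning each cell to its own contact time is what makes asynchronous interactions comparable; cells excluded by the window or lacking a suitable frame simply return None and drop out of the pooled analysis.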

      With respect to the reviewer’s query whether cells recover from plasma membrane perturbation, in our time-lapse microscopy experiments, we observe that when macrophages form local Annexin V-positive plasma membrane domains at the site of contact with Mtb aggregates, they never revert to an Annexin V-negative status afterwards (figure 3D; movie S7; movie S8). Our SEM data show that Mtb aggregates colocalizing with Annexin V-positive domains are not partially covered by intact membrane, in contrast to those associated with Annexin V-negative macrophages, although they do present vesicles and membrane debris on their surface (figure 3F,G). In the revised manuscript, we include additional fluorescence microscopy images showing that Annexin V-positive foci colocalize with markers for the macrophages’ plasma membrane (figure S8A,B) as well as with more distal areas of the bacterial aggregates, where we do not observe any positive plasma membrane staining (figure S8B). Similarly, although Mtb aggregates that are never in contact with macrophages never become Annexin V-positive (figure S8C), we see that upon macrophage death, aggregates in contact with dead cells retain some Annexin V-positive material on their surface (figure S8C; movie S8). Vesicle budding and shedding is a common ESCRT III-mediated membrane repair strategy that allows removal of damaged portions of the plasma membrane and wound resealing (4). Thus, we think that in our experiments the Annexin V-positive foci might represent both damaged membrane areas and released macrophage plasma membrane vesicles that stick to the hydrophobic surface of the bacterial aggregates. This means that the time of appearance of Annexin V-positive domains marks the time when the macrophage membrane experiences a damaging event. Interestingly, we do not observe a gradual increase in fluorescence intensity of the Annexin V-positive domains, but rather multiple single intensity peaks over time (movie S8). This might suggest that multiple discrete damaging events occur over time.

      1. In Figure 5D-G it will be important if the authors include dots for each macrophage events for the contact conditions as well, as it was done for the bystander condition.

      We apologize for using a too-pale shade of magenta in the earlier version of the manuscript, which apparently made the dots in these figures hard to visualize. In the revised manuscript, we use a darker shade of magenta to show the dots corresponding to the fluorescence values of the macrophages in contact with Mtb aggregates.

      1. How did the authors discriminate between the macrophages that are in contact or not with Mtb aggregates after the staining with Casp-1, pRIP3 and pMLKL? Do the aggregates stay in contact even after the staining procedures? Representative images of the labelling should be included in this figure.

      Before fixation, we make sure to remove the medium gently to avoid disrupting the interactions between cells and bacteria. This step most likely removes the floating bacterial aggregates that are not in stable contact with cells but apparently does not detach aggregates that stably interact with cells. Our correlative time-lapse microscopy and immunofluorescence images (figure 1; figure S4), as well as our correlative time-lapse microscopy and SEM images (figure S5; figure 3F,G), confirm that Mtb aggregates that interact with cells during time-lapse imaging are retained on the surface of those cells upon fixation and processing for immunofluorescence or electron microscopy. As we can observe in figure 5B (cell indicated by the white arrow), Mtb aggregates are retained on the debris of dead cells. In figure 5 we distinguish between “in contact” macrophages and “bystander” macrophages by inspecting brightfield images showing the cells and the respective fluorescence images corresponding to the bacteria. If the body of a macrophage identified in the brightfield image overlaps with a bacterial aggregate identified in the fluorescence channel, we define the macrophage as “in contact”; otherwise, it is annotated as “bystander”. We provide representative images in figure S12 and we clarify the definition of “in contact” and “bystander” in the figure legend of figure 5.
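
      The overlap criterion described above is applied by inspecting images. If segmentation masks were available, the same rule could be written as a short function. This is a sketch under that assumption only: the pixel-set representation and the function name are hypothetical, and it does not describe the authors' actual annotation workflow.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]  # (row, column) coordinate in a registered image pair


def classify_macrophage(cell_pixels: Set[Pixel], bacteria_pixels: Set[Pixel]) -> str:
    """Label a macrophage by whether its body overlaps a bacterial aggregate.

    cell_pixels: pixels of one macrophage outlined in the brightfield image.
    bacteria_pixels: pixels of all aggregates found in the fluorescence channel.
    """
    # Any shared pixel counts as overlap; no minimum overlap area is assumed.
    return "in contact" if cell_pixels & bacteria_pixels else "bystander"


# Made-up example: one cell shares a pixel with the aggregate, the other does not.
aggregate = {(5, 5), (5, 6), (6, 5)}
cell_a = {(4, 4), (4, 5), (5, 5)}
cell_b = {(10, 10), (10, 11)}
labels = [classify_macrophage(cell_a, aggregate), classify_macrophage(cell_b, aggregate)]
```

      A real pipeline would probably need a minimum overlap area or a distance threshold to tolerate segmentation noise; the single-pixel rule here is only the most literal reading of the definition.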

      1. The labelling of Figure 5H needs to be corrected both in the text and in the figure legend.

      We thank the reviewer for bringing our attention to this error, which has been corrected in the revised manuscript.

      1. Pyroptosis inhibitors did reduce the percentage of cell death, but did it also reduce the number of Annexin V-positive domains? This is important as Annexin V is a marker of apoptosis and the outcome for Mtb is very different.

      As pointed out by the reviewer, Annexin V staining is often used as a marker for apoptosis. Typically, apoptotic cells stain positive for Annexin V but negative for other membrane-impermeable markers such as propidium iodide, because they expose phosphatidylserine (bound by Annexin V) on the outer leaflet of the plasma membrane without losing plasma membrane integrity (5). Apoptotic cells often look round and their plasma membrane is stained homogeneously by fluorescently labelled Annexin V (5). In our experiments, we observe that macrophages in contact with Mtb aggregates become Annexin V-positive; however, this happens only at the site of contact with the bacteria (figure 3A; movie S7). Only when cells die and get stained by membrane-impermeable dyes such as Draq7 do they also get stained with Annexin V over the entire membrane debris. We thus use Annexin V staining as a marker for membrane perturbation rather than for cell death. If we were using Annexin V as a marker for cell death, we would expect to see a reduction in Annexin V-positive cells in samples treated with pyroptosis inhibitors. In these samples, we do observe a reduced percentage of cell death in comparison to untreated controls; however, we still observe a comparable percentage of macrophages that stain positive for Annexin V locally, i.e., at the site of contact with bacterial aggregates (supplementary figure S13B). In line with this observation, treated vs. untreated macrophages in contact with Mtb aggregates accumulate similar levels of intracellular calcium. These observations are consistent with our model suggesting that contact with Mtb aggregates induces membrane perturbation, calcium accumulation, inflammasome activation, and pyroptosis in contacted macrophages. Since the death inhibitors used in our study specifically target pyroptosis effectors, we do not expect them to affect upstream events such as membrane perturbation and calcium accumulation.

      1. In Figure 6, The sections for Figure 6 are well described but kept relatively long with too many details, it will be helpful to the reader if the authors can combine the sections in one header.

      We agree that the text linked to figure 6 is long. We tried to make these sections as concise as possible; however, we are concerned that combining all of the sections under a single header might be at the expense of clarity. Thus, unless the reviewer objects, we would prefer to maintain the use of multiple headers.

      1. Figure 6F does not have a statistical test and p-value, it will be important to include the statistical test in the legend and p-values in the

      As recommended by the reviewer, we have analyzed the results in figure 6F by using a one-way ANOVA test and we have added the calculated p-values to the figure.
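
      For readers unfamiliar with the test named above, the F statistic of a one-way ANOVA can be computed as in the following sketch. The group values are invented for illustration and are not data from figure 6F; the function name is hypothetical. Converting F to a p-value requires the F distribution, which a statistics library such as SciPy (`scipy.stats.f_oneway`) provides.

```python
from typing import Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and n_total > number of groups")

    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented example: three conditions with three replicates each.
f_stat, df_b, df_w = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

      A significant one-way ANOVA only indicates that at least one group differs; pairwise comparisons would need a post hoc test, which the response does not specify.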

      Reviewer #1 (Significance):

      Based on the literature, Mtb infection and replication can trigger different types of cell death and most of the studies have addressed cell death only as an outcome of intracellular replication. This study shows another form of host cell death, associated only to extracellular bacterial aggregates that are in contact with macrophages. Plasma membrane damage initiating pyroptosis has been defined in: "Plasma membrane damage causes NLRP3 activation and pyroptosis during Mycobacterium tuberculosis infection" by K.S. Beckwith et al. (2020). However, the effect of extracellular bacteria on plasma membrane damage was not addressed before and this paper is addressing an important observation with respect to Mtb evasion and dissemination. These observations represent a novel and interesting aspect in the induction of macrophage cell death by Mtb and potentially relevant for the disease. If the authors consider the comments listed above, this manuscript will be a novel and relevant addition to the field of host pathogen interactions in tuberculosis.

      We thank the reviewer for their perspective and their positive comments about our work.

      Reviewer #2 (Evidence, reproducibility and clarity):

      In this work, Toniolo and coworkers use single-cell time-lapse fluorescence microscopy to show that extracellular aggregates of Mycobacterium tuberculosis can evade phagocytosis by killing macrophages in a contact-dependent but uptake-independent manner. The authors further show that this process is dependent on the functionality of the ESX-1 type VII secretion system and the presence of mycobacterial phthiocerol dimycocerosate (PDIM). In essence the authors show that M. tuberculosis can induce macrophage death from the outside of the cell, and dissect the different players that are involved in the process.

      Major comments:

      1. I was intrigued by all the different findings of this work, which was done by using bone marrow-derived murine macrophages; however, my first question to the authors is how they imagine that this process will take place in an in vivo situation. Do they have evidence that these mycobacterial clumps may form during the initial infection process in the lungs? It would be important to provide more insights and discussion into this question in order to see how relevant the described details are inside the host organism.

      Formation of Mtb aggregates in tuberculosis lesions has been documented in several animal models (6, 7) and in humans (8–11). While it is unclear whether mycobacterial aggregates form during the earliest stages of infection, extracellular bacterial aggregates have been observed in animal models of infection within the first month post-infection, and they are often associated with necrotic foci. Moreover, masses of Mtb growing as pellicle-like aggregates are often observed on the surface of cavities in human tuberculosis patients. These observations confirm that Mtb aggregates can form during a tuberculosis infection and that a significant fraction of bacteria are extracellular during different stages of infection. As we observe that macrophages undergo contact-dependent uptake-independent death also in the absence of cytochalasin D in vitro, we assume that this may also happen in vivo when Mtb aggregates are formed or released outside host cells. This process may promote bacterial propagation at early stages of infection as well as at later stages when necrotic granulomas and cavities are formed.

      In the revised manuscript we present and discuss our observations in the context of the in vivo phenotypes reported in the scientific literature and we include additional references showing that extracellular Mtb aggregates are often observed in vivo. We also propose this concept already in the Introduction section to better link our observations to possible in vivo scenarios.

      Minor comments:

      Line 91: here the authors list the different forms of cell death that are induced by MTB infection, and it would be important to add apoptosis as a reported mechanism as well (References: PMID: 23848406, PMID: 28095608).

      As suggested by the reviewer, in the revised manuscript we have modified the Introduction section to include apoptosis as a Mtb-induced mechanism of macrophage death and we have cited the two publications recommended by the reviewer.

      1. Line 95: The secretion of EspE was mainly described in M. marinum while in members of the M. tuberculosis complex no virulence phenotype was reported to the best of my knowledge.

      In agreement with the reviewer’s comment, we have modified the sentence and removed EspE from the list of virulence factors.

      1. Line 98: In the cited papers it is described that PDIM is required for phagosomal damage/rupture; however, the methods used there do not allow one to report specifically on translocation. The wording should be adapted.

      We thank the reviewer for this insightful comment and we have modified the text accordingly.

      1. Line 206: Here it is described that in Figure 3A the BMDMs were expressing tdTomato fluorescence and the bacteria GFP, and the same is also repeated in the figure legend of Fig. 3A. However, on the images, BMDMs are shown green and bacterial clumps purple (as also indicated in the description directly on the images). Please check and explain/correct this discrepancy.

      We apologize that the color scheme in figure 3A is confusing. In this figure we used tdTomato-expressing BMDMs and GFP-expressing bacteria; however, we have pseudo-colored the fluorescence images for the sake of consistency with the other figures in the manuscript, which always show bacteria in magenta. We have clarified this point in the figure legend of the revised manuscript.

      1. Line 304: Here the authors could mention that this finding is similar to results found previously in reference PMID: 28095608 and opposite to the results reported previously in PMID: 28505176.

      As recommended by the reviewer, we have added a sentence comparing our results with previous studies and we have cited the two references suggested by the reviewer.

      1. Line 321: It should be mentioned that CFP10 (EsxB) can also be secreted without its EsxA partner (under certain circumstances, i.e. when the EspACD operon is not expressed due to a phoP regulatory mutation (PMID: 28706226)). However, in Figure S7 an EspA deletion mutant shows loss of EsxB secretion. The authors should check and discuss how the data here compare with data and strains published previously.

      We thank the reviewer for raising this interesting point. Our proteomics data revealed that secretion of both EsxA and EsxB is abolished in our esxA mutant and in our espA mutants, in line with previously published data (12–14). We do not know why the espA mutant behaves differently from the MTBVAC strain concerning secretion of EsxA and EsxB (although we note that regulatory mutations may have complex pleiotropic effects). In the revised manuscript, we have modified this section to include references highlighting that secretion of these proteins may be uncoupled in some circumstances.

      1. The finding that EspB can substitute the loss of virulence due to loss of EsxA/ESAT-6 secretion is astonishing and also is different to previous observations that strain H37Ra and MTBVAC (two attenuated strains that have no or very little EsxA secretion due to a regulation defect of the espACD operon PMID: 18282096; PMID: 28706226). How does the hypothesis put forward by the authors match with these previously published data ?

      We thank the reviewer for this interesting comment. We would like to clarify that we are not claiming that EspB and EsxA are in general redundant and that EspB can substitute EsxA as a virulence factor. In our experiments we show that EspB can induce contact-dependent uptake-independent death in macrophages in contact with Mtb aggregates in vitro even in the absence of EsxA; however, the precise role of EspB during infection in mice or humans remains to be elucidated and is outside the scope of this manuscript. A previous study comparing Mtb ESX-1 mutants with different secretion patterns linked EspB secretion to Mtb virulence in vivo (14); however, the behavior of an isogenic espB deletion strain in vivo was not reported. An M. marinum espB mutant was shown to have reduced virulence; however, in contrast to Mtb, deletion of espB also affects secretion of EsxA in this organism (15). As the reviewer points out, the Mtb strains H37Ra and MTBVAC do not secrete EsxA due to a mutated phoP gene. Previous literature has shown that espB expression is also dependent on PhoP (16). We thus speculate that these strains might behave similarly to our espA espB mutant strain in the context of contact-dependent uptake-independent induction of macrophage death, although we think that this point is outside the scope of our manuscript.

      1. In the same context, it is to notice that the authors report in the paragraph between lines 310-330 about EsxA/EsxB secretion, however, looking at the Western blots of figure S7, there is no blot showing results using an antibody against EsxA. Given the previously published results that EsxA/EsxB secretion may also be disconnected (PMID: 28706226), the wording of the text in this paragraph should be adapted or the results from Western Blots using EsxA antibodies be added.

      We agree with the reviewer’s comment. Unfortunately, we currently do not have access to a good antibody for EsxA. A commercial monoclonal antibody that was previously available for immunoblotting has been discontinued. We tried several other antibodies that were previously shown to work in M. marinum, but none of these antibodies were effective in M. tuberculosis. We agree that analysing secretion of EsxB alone might not be sufficient to support claims about EsxA secretion. For this reason, we performed quantitative proteome analysis of the secretome in all of the relevant mutant strains. In our revised manuscript, we are careful to make sure that whenever we refer to EsxA/EsxB secretion we always provide proteomics data to support our conclusions.

      1. Line 395: Here the authors write about BTP15, a small molecule that in a previous study was shown to inhibit EsxA secretion at higher concentrations (starting from 1.5 uM and higher). However, no effect on the expression of EsxA was described for that compound in reference PMID: 25299337. Thus the corresponding sentence in line 395 needs to be adapted to that situation.

      We thank the reviewer for noticing this error, which we have corrected in the revised manuscript.

      1. Moreover, most concentrations of the compounds used are reported in uM, except for BTP15. It would be easier for the reader if the concentration used for BTP15 could also be reported in uM.

      As suggested by the reviewer, in the revised manuscript we report the concentration of BTP15 in μM.

      1. Line 475: The comment on the pore-forming activity has to be made with caution, as recombinant EsxA produced from E. coli cultures has been shown to often retain detergent (PMID: 28119503) that may be responsible for the pore-forming activity of recombinant EsxA observed in quite a few studies, whereas EsxA purified from M. tuberculosis cultures did not show the detergent, but still retained membranolytic activity. This point should be clarified and discussed, and the wording adapted, as EsxA is not a classical pore-forming toxin, but exerts its membrane-lysing activity together with other partners (PDIM) in a yet unknown way upon cell contact.

      We thank the reviewer for this comment. In the revised manuscript, we have modified the text accordingly and included the suggested reference.

      Reviewer #2 (Significance):

      The findings in this work extend the current knowledge on cell infection by M. tuberculosis in a significant way and put extracellular M. tuberculosis clumps in a new context. These data obtained by single-cell time-lapse fluorescence microscopy also need to be discussed for predicting the relevance for an in vivo situation inside the host organism.

      As suggested by the reviewer, in the revised manuscript we discuss additional examples from the literature showing that Mtb aggregates can form during infection and that many bacteria are extracellular and associated with necrotic foci during different stages of the disease in animal models of infection and in human patients. We believe that these previously published observations support the in vivo relevance of the process we observe in vitro.

      Reviewer #3 (Evidence, reproducibility and clarity):

      This is an excellent study distinguished by the volume of observations, rigor of analysis and clarity of presentation. The results are novel, biologically interesting and pathophysiologically important. The ability of aggregated M. tuberculosis to kill macrophages has been reported, but the understanding was that proliferation of Mtb within macrophages killed them. Here, the authors observe that macrophages are susceptible to pyroptotic death triggered by contact with extracellular Mtb aggregates, and that this is not recapitulated by contact with a comparable number of Mtb as single bacilli. The authors go some way to tracing the mechanism and uncover a complex inter-dependence on PDIM and on components of the mycobacterial ESX-1 secretory system.

      The following comments will hopefully improve the study further.

      Major points

      1. The chief measurement in this study is death of individual macrophages as judged by the observer in videomicroscopy. However, the criteria for calling a macrophage "dead" are not defined with any morphological detail, beyond noting that the cell stops moving and lyses. Of course a cell will stop moving if it has lysed, but do not some if not most cells stop moving before they lyse? If so, lysis alone would seem to be the time-point marker for cell death. Yet from the images in Fig 1E and F, I cannot tell that the cells called "dead" have lysed. Watching the videos, the time of lysis is not clear to me. Eventually, shrunken cell bodies are obvious but it is not clear if these are residua of cells that had been said to "lyse" at an earlier time.

      In this study, we used brightfield time-lapse microscopy images to identify cell death. Dying macrophages rapidly change shape, lose membrane integrity, and stop moving. Moreover, the intracellular structures and bacteria also stop moving at the time of death of the host cell. While these events can be difficult to distinguish by examining individual snapshots, they are readily identifiable by careful frame-by-frame examination of time-lapse microscopy image series. To exemplify this process, in the revised manuscript we show in supplementary figure S2A how we identify macrophage cell death events. We also include Draq7 (a live cell-impermeable dye commonly used to identify dead cells by flow cytometry and microscopy) in the growth medium during time-lapse imaging in order to label dead macrophages. The timing of staining validates and confirms our strategy of using brightfield time-lapse images to define the time-of-death of individual cells. To further assist readers, in the revised manuscript we provide the time-lapse microscopy movie used to generate this figure (movie S4). Similar images and movies have also been added for cells treated with cytochalasin D (figure S2B; movie S7). As suggested by the reviewer, we also replaced figures 1E,F with new figures incorporating the Draq7 staining to label macrophage cell death and we include the time-lapse microscopy movies used to generate these figures (movies S4, S5).

      1. The use of BTP15 as a specific inhibitor of ESX-1 is problematic. The source of the compound is not stated.

      The BTP15 molecule was kindly provided by Prof. Stewart Cole, the corresponding author of the article describing the identification of this compound and its effect on Esx-1 secretion (17). We have included this information in the Materials and Methods section.

      1. The concentration used, 20 ug/mL, is well above the reported IC50 (1.2 uM) for its presumed target, a mycobacterial histidine kinase, and above the concentrations (0.3-0.6 uM) reported to inhibit Mtb's secretion of EsxA almost completely. It is concerning that the concentrations that were reported to work so well on the whole cell are lower than the IC50 for the presumed target, because uptake into Mtb and intrabacterial metabolism will typically lead to a lower potency for an inhibitor against the whole bacterium than against the isolated enzyme; and because 50% inhibition of an enzyme rarely gives a functional effect as complete as what is shown in the cited reference. In other words, it is not clear that the histidine kinase is the functionally relevant target of BTP15 in Mtb. The original report did not consider BTP15's possible effect on mammalian cells and the present authors likewise do not take that into consideration with respect to possible effects on the macrophages. No concentration-response or time course experiments with BTP15 are presented. Most important, unless I missed it, there is apparently no demonstration that the compound inhibited ESX-1-dependent secretion in the present authors' hands, no matter by what mechanism. Without this, I am reluctant to accept that the results with BTP15 demonstrate a dependence of extracellular-aggregate-induced macrophage death on ESX-1-mediated secretion from Mtb. I would recommend that the authors either provide a direct demonstration of BTP15's effect on ESX-1 dependent secretion at concentrations near those that worked on whole cells in the original report, or drop the BTP15 studies from the paper. That said, the genetic experiments remain unequivocal, so the paper's conclusions would not be affected.

      We agree with the reviewer that in the original version of our manuscript we did not provide direct evidence demonstrating that BTP15 inhibits ESX-1 secretion and that it does not affect the host cells. We addressed the first issue by quantifying (by Western blot) the secretion of EsxB and EspB in Mtb cultures treated with different concentrations of BTP15. We show that BTP15 reduces secretion of these two proteins in a dose-dependent manner. These data have been included in figures S21A-B of the revised manuscript. In line with this observation, we also show that BTP15 reduces uptake-independent killing of macrophages by Mtb aggregates in a dose-dependent manner (figure 6H). To show that the dose-dependent effect observed in macrophages does not depend on a direct effect of BTP15 on the host cells, we treated Mtb with different concentrations of BTP15 for 48 hours and removed the compound by washing the bacteria prior to infection. We observe that Mtb aggregates that have been treated with BTP15 show reduced uptake-independent killing of macrophages, even when bacteria have been pre-treated and the small molecule is not present during the incubation with the cells (figure S21C). We hope that these additional results provide clear evidence that BTP15 reduces Mtb-mediated contact-dependent uptake-independent killing of macrophages by inhibiting ESX-1 secretion, consistent with our genetic data. We think these results are important because they provide a chemical validation of our genetic data. To the best of our knowledge, BTP15 is the only available compound known to inhibit ESX-1 secretion, and in the revised manuscript we confirm that this compound has the previously described effect on Mtb also in our hands. Unfortunately, we had to use concentrations higher than those previously reported to inhibit ESX-1 secretion in order to achieve the observed effects. 
      As we had access only to prediluted aliquots that had been stored for a long time, we cannot rule out the possibility that the compound underwent partial degradation during storage.
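      The comparison in this exchange mixes units: the working concentration is given as 20 ug/mL, while the IC50 (1.2 uM) and the previously reported effective range (0.3-0.6 uM) are molar. A minimal conversion sketch makes the comparison explicit. BTP15's molecular weight is not stated in this exchange, so the value used below is a placeholder assumption chosen only for illustration, and the function names are hypothetical.

```python
def ug_per_ml_to_um(conc_ug_per_ml: float, mol_weight_g_per_mol: float) -> float:
    """Convert a mass concentration (ug/mL) to a molar concentration (uM)."""
    # ug/mL equals mg/L; dividing by g/mol gives mmol/L; times 1000 gives umol/L.
    return conc_ug_per_ml / mol_weight_g_per_mol * 1000.0


def max_mol_weight_for_excess(conc_ug_per_ml: float, threshold_um: float) -> float:
    """Largest molecular weight (g/mol) at which the mass concentration still exceeds threshold_um."""
    return conc_ug_per_ml * 1000.0 / threshold_um


WORKING_CONC_UG_PER_ML = 20.0  # from the review
IC50_UM = 1.2                  # from the review
PLACEHOLDER_MW = 400.0         # ASSUMPTION: illustrative only, not BTP15's actual molecular weight

molar = ug_per_ml_to_um(WORKING_CONC_UG_PER_ML, PLACEHOLDER_MW)
print(f"{WORKING_CONC_UG_PER_ML} ug/mL at MW {PLACEHOLDER_MW} g/mol is {molar:.1f} uM")
print(f"fold over IC50 under that assumption: {molar / IC50_UM:.1f}")
print(f"MW below which 20 ug/mL exceeds the IC50: "
      f"{max_mol_weight_for_excess(WORKING_CONC_UG_PER_ML, IC50_UM):.0f} g/mol")
```

      The second helper avoids depending on the placeholder: it reports the molecular weight below which the stated working concentration exceeds the stated IC50, which is how the reviewer's "well above" can be checked without knowing the compound's exact mass.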

      1. The experiments, or at least the discussion, could consider what may distinguish single Mtb cells from aggregated Mtb in some way relevant to the present observations. The authors seem to assume that all the Mtb cells in their preparations are biochemically equivalent and that their distribution into single-cell or aggregate subpopulations is stochastic. What if it is deterministic instead? For example, what if these two subpopulations are defined by differential expression of PDIM, so that the greater macrophage-killing effect of aggregates than single cells in equivalent numbers reflects a greater amount of PDIM in the aggregates, rather than some sort of valency-of-contact effect? The authors could compare the PDIM-to-DNA ratio in the single cell and aggregated subpopulations, or at least discuss this possibility.

      We thank the reviewer for proposing this extremely interesting idea. In the revised manuscript, we have added a discussion of this point (lines 487-489) and we have floated various possible explanations. However, we believe that experimental dissection of the underlying mechanism could be a very lengthy undertaking and we hope that the reviewer will agree that this is outside the scope of the current manuscript.

      Minor points

      1. Some of the experiments compare "low", "medium" and "high" numbers of Mtb, but I could not find a definition of these numbers.

      We apologize for this oversight. In the revised manuscript, we have clarified the definition of these gates in the figure 2 legend.

      1. There seem to be no positive or negative controls for any of the antibodies used for cell staining (anti-cleaved caspase 1, anti-phospho RIP3, anti-phospho MLKL).

      As recommended by the reviewer, the revised manuscript includes controls for all of the antibodies used for immunostaining. In figure S12 we provide representative immunostaining images and fluorescence quantification of uninfected untreated macrophages (negative controls) and of uninfected macrophages treated with cocktails of molecules typically used to induce apoptosis, pyroptosis, or necroptosis (positive controls).

      Reviewer #3 (Significance):

      The results are novel, biologically interesting and pathophysiologically important.

      We thank the reviewer for their appreciation of our findings.

      References

      1. H. Gan, et al., Mycobacterium tuberculosis blocks crosslinking of annexin-1 and apoptotic envelope formation on infected macrophages to maintain virulence. Nature Immunology 9, 1189–1197 (2008).
      2. M. Divangahi, et al., Mycobacterium tuberculosis evades macrophage defenses by inhibiting plasma membrane repair. Nature Immunology 10, 899–906 (2009).
      3. D. Mahamed, et al., Intracellular growth of Mycobacterium tuberculosis after macrophage cell death leads to serial killing of host cells. eLife 6, e22028 (2017).
      4. A. J. Jimenez, et al., ESCRT Machinery Is Required for Plasma Membrane Repair. Science 343, 1247136 (2014).
      5. M. van Engeland, L. J. W. Nieland, F. C. S. Ramaekers, B. Schutte, C. P. M. Reutelingsperger, Annexin V-Affinity assay: A review on an apoptosis detection system based on phosphatidylserine exposure. Cytometry 31, 1–9 (1998).
      6. D. R. Hoff, et al., Location of Intra- and Extracellular M. tuberculosis Populations in Lungs of Mice and Guinea Pigs during Disease Progression and after Drug Treatment. PLOS ONE 6, e17550 (2011).
      7. S. M. Irwin, et al., Presence of multiple lesion types with vastly different microenvironments in C3HeB/FeJ mice following aerosol infection with Mycobacterium tuberculosis. Disease Models & Mechanisms 8, 591–602 (2015).
      8. G. Kaplan, et al., Mycobacterium tuberculosis Growth at the Cavity Surface: a Microenvironment with Failed Immunity. Infection and Immunity 71, 7099–7108 (2003).
      9. J. Timm, et al., A Multidrug-Resistant, acr1-Deficient Clinical Isolate of Mycobacterium tuberculosis Is Unimpaired for Replication in Macrophages. The Journal of Infectious Diseases 193, 1703–1710 (2006).
      10. R. L. Hunter, Pathology of post primary tuberculosis of the lung: An illustrated critical review. Tuberculosis 91, 497–509 (2011).
      11. G. Wells, et al., Micro–Computed Tomography Analysis of the Human Tuberculous Lung Reveals Remarkable Heterogeneity in Three-dimensional Granuloma Morphology. Am J Respir Crit Care Med 204, 583–595 (2021).
      12. S. A. Stanley, S. Raghavan, W. W. Hwang, J. S. Cox, Acute infection and macrophage subversion by Mycobacterium tuberculosis require a specialized secretion system. Proc Natl Acad Sci USA 100, 13001 (2003).
      13. S. M. Fortune, et al., Mutually dependent secretion of proteins required for mycobacterial virulence. Proc Natl Acad Sci USA 102, 10676 (2005).
      14. J. M. Chen, et al., Mycobacterium tuberculosis EspB binds phospholipids and mediates EsxA-independent virulence. Mol Microbiol 89, 1154–1166 (2013).
      15. L.-Y. Gao, et al., A mycobacterial virulence gene cluster extending RD1 is required for cytolysis, bacterial spreading and ESAT-6 secretion. Mol Microbiol 53, 1677–1693 (2004).
      16. V. Anil Kumar, et al., EspR-dependent ESAT-6 Protein Secretion of Mycobacterium tuberculosis Requires the Presence of Virulence Regulator PhoP. Journal of Biological Chemistry 291, 19018–19030 (2016).
      17. J. Rybniker, et al., Anticytolytic Screen Identifies Inhibitors of Mycobacterial Virulence Protein Secretion. Cell Host & Microbe 16, 538–548 (2014).

    1. our futures entangled together. Imbler 4

      We are still rapidly discovering new species and beings that we did not know to exist before on our planet. Their evolutionary trajectory may have split but they are ultimately tied to us by ancestry. However, we often look at creatures, especially new ones, and label them as alien, unsightly, repulsive, gross, creepy, etc. This destructive rhetoric leads to an irreverence toward their being. We do not know what the blob may feel, think, or do so why do we create this disconnect between them and ourselves? Respect can still be present without a connection but may take more effort. We must put in this effort, however, as they are ultimately our neighbors (evolutionarily and physically).

    2. confuse or repulse us

      This abhorrence of the unknown is thought to be part of our psychology. We make patterns and molds into which we fit the world. It makes internal processing much easier, especially when we are confronted with a dangerous situation. For example, when things such as the sea-blob go against our preconceived notions of ocean animals, we are more cautious because we don't know whether to internally label them as "safe and cuddly" or "they will eat your arm off". (read: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4141622) I'm not a super big fan of this article, as it's very anthropocentric, but it points to some ideas about patterns that have been thought to help us survive. While the above article may refute this, animals do a similar "patternization" of their surroundings. It's a shortcut for taking in the senses, really, but I think it often leads to, as Juliette said, schismogenesis. In my classes I haven't found much literature contradicting the "pattern" notions of sensing in psychology, as much as I don't like them. I'd be interested to see if others have come across any literature.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Point by Point Description of Revisions

      We thank the reviewers for their time, effort and constructive input. Below, our responses are bolded with yellow highlighting, while the reviewers’ comments are italicized.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary

      The manuscript by Hays and colleagues describes the spectrum of mutations that drive adaptation in a nitrogen-limited environment during experimental evolution. The serial transfer approach (a fluctuating condition) allowed them to find that Ty insertion is the major mutation type underlying adaptive evolution. This was observed neither under nitrogen limitation when another experimental evolution approach, the chemostat (a non-fluctuating condition), was applied, nor under glucose limitation. The authors conclude that not only the selection pressure itself but also how selection is applied shapes the adaptive events.

      Major points

      Both serial transfer and chemostat are commonly used approaches in experimental evolution. In the manuscript, the authors refer to serial transfer as a "fluctuating" condition because the limited nitrogen source would be fully consumed during the interval between transfers. I am wondering whether the authors have estimated the nitrogen uptake (consumption) during the transfer intervals and whether the nitrogen was exhausted within 48 hours.

      We appreciate the reviewer’s question. Although we did not directly measure nitrogen consumption throughout this specific experiment, ammonium was the limiting nutrient in the defined medium, which has previously been used to achieve transient nitrogen starvation in other yeast experimental evolutions (Blundell et al. 2019). That previous work showed that adding ammonium above 0.04% (up to 0.15%) led to additional rounds of doubling, confirming that ammonium was in fact the limiting nutrient. Finally, we point out that the adaptive mutations recovered in this study predominantly impact genes known to affect nitrogen catabolism, as is expected under nitrogen-limited evolution conditions.

      We’ve updated the methods section to ensure the rationale for this medium choice is clearly stated.

      Since this is not precisely controlled by the experimental design, the "fluctuating" condition itself may not be stable during long-term evolution. For example, as the population evolves, the rate and amount of nitrogen uptake might change. I feel a better experimental setup for a "fluctuating" condition would be something like 24 hours of "low nitrogen (ammonium)" followed by 24 hours of "no ammonium", and so on. If the adaptive mutations (e.g. adaptive Ty) respond specifically to such a "fluctuating" condition rather than to the chemostat, the authors can measure their fitness under nitrogen starvation, where they would be expected to be fitter than mutants observed only in the chemostat (e.g. copy number variation of nitrogen transporters).

      The reviewer correctly points out that nitrogen availability will change as the population adapts, and it is likely that, over time, some portion of the population becomes better at utilizing the newly available nitrogen upon transfer into fresh medium. This is in fact the intention of this experimental design. We have rephrased the text of the main paper to emphasize that our fluctuating conditions represent fluctuations in the nutrient availability in fresh medium upon transfer, and not strictly oscillating nitrogen concentrations that cells experience locally throughout all generations.

      We note that in the reviewer-proposed experimental design (using two stages of low- and no-nitrogen media), the low-nitrogen condition would still exhibit the same population-dependent nitrogen usage dynamics as the population adapts over time. We chose our evolution conditions to apply a selective pressure for cells to become best adapted to the environmental fluctuations associated with this transfer regimen, and we have updated the main paper to make this explicit. We thank the reviewer for helping us clarify this important point.

      The authors compared their results with a published dataset from nitrogen-limited chemostats, and the mutation spectrum is different. In addition to the "fluctuating" versus "non-fluctuating" difference mentioned above, other factors need to be considered. First, the nitrogen-limited conditions in the two studies are different. The authors used 0.04% ammonium sulfate while Hong et al used "800 uM nitrogen regardless of the molecular form of the nitrogen", which may influence the mutation spectrum and needs to be discussed. Second, bottlenecks were applied at each transfer in this study, in contrast to the constant population size in a chemostat, which will influence the efficiency of selection and, in turn, the evolutionary dynamics and outcomes. Thus, population size and bottlenecks need to be taken into account when comparing mutation spectra.

      We thank the reviewer for this point. We have expanded the section of the main text that addresses the differences in how serial transfer and chemostat conditions are applied, the media differences these approaches necessitate, and specifically how the conditions differ between our study and the Hong et al study. We believe the additional detail now better highlights our point that how selection is applied shapes adaptive events, and we thank the reviewer for their helpful input.

      The authors found that Ty mutagenesis accounts for a substantial number of adaptive mutations in nitrogen limitation. I am wondering, for adaptive clones, whether Ty insertions occurred independently or are more likely to co-exist with other drivers.

      We appreciate the reviewer’s question. In the clones with adaptive Ty insertions, the only co-occurring adaptive mutation is autodiploidization. There were no additional adaptive mutational classes that co-occur with adaptive Ty insertions in our dataset. However, many novel Ty insertions are neutral, and these DO co-occur with beneficial mutations. These data are captured in Figure 5A, and in detail in Supplemental File 1. The blue bar in the adaptive haploids reflects neutral-fitness Ty insertions that co-occur with other mutations that drive the fitness increase. These are distinct from the Ty insertions that are themselves responsible for the fitness increase, which are captured in the orange bar. We have clarified the text surrounding the Fig 5A results to better emphasize these findings.

      What is the distribution of number of clones with one, two, and multiple mutations? If there is co-existence of driver mutations, what is the relative contribution of each to adaptation? The phenotypic validation of Ty mutagenesis for adaptation is expected while it seems only one case was presented in Figure 2 (mep1Ty−731427).

      Aside from diploidization events, only one clone with two nitrogen-adaptive mutations was identified in this study: a double mutant with mutations in both gat1 and tor1. Please see Supplemental File 1 (which is sortable) for a complete outline of all clones with mutations and fitness remeasurements. In the case of diploids that have additional beneficial mutations, those data are shown in Figure 3 with diploids indicated as well as the ploidy of the secondary beneficial mutation, and again in detail in Supplemental File 1.

      The reviewer is correct that only one Ty mutation was dissected and validated in Figure 2. However, we inferred adaptation by Ty insertion from the observation of parallel adaptation and from fitness remeasurements of many independent Ty insertion mutants.

      Statistical analysis needs to be reinforced in the manuscript, including but not limited to Figure 2 fitness comparison among clones with different genotypes, Figure 5 Ty enrichment comparison, etc.

      We thank the reviewer for their helpful suggestion. We have updated figures and figure legends to more clearly include statistical comparisons between genotypes for Figures 2 and 5: specifically describing the analyses used and the associated p-values for differences between WT and adaptive alleles and significance of Ty class enrichments.

      Minor points

      We thank the reviewer for their detailed and careful edits below and have addressed them in the main text and figures as applicable.

      "For diploids, we only sequenced those with estimated fitness greater than diploidy alone would provide." Main text clarified with additional explanation

      "either through impacting alternate start (green triangle) or alternate stop sites (yellow and red triangles)." I do not see yellow and red triangles in Fig. 3. Legend updated to reflect current figure color palette.

      Fig.2. FCY2 mutant fitness can be added as well?

      Unfortunately, data for FCY2 backcrossed mutants were not generated.

      "while we found only 212 novel Ty insertions in 488 glucose evolved clones (Figure 5B)" The value in the text does not match the one in the figure.

      We appreciate the reviewer’s attention to detail and have corrected the main text to match the correct value in Fig 5B.

      In addition to adaptive Ty insertion, what is the genome-wide distribution or characteristics of other Ty, especially for nitrogen-limited condition? Is that distinct from glucose-limited condition?

      Figure S5 addresses the major locations of Ty insertions upstream of tRNA genes, the insertion location previously published to be preferred, in both glucose- and nitrogen-limited evolutions. The only difference between glucose and nitrogen is that there are more insertions in the nitrogen-limited condition; the profile of insertions upstream of tRNAs is essentially the same. In addition to insertions upstream of tRNAs, all other specific insertion locations are available in Supplemental File 1 and Supplemental File 4.

      "Studies determining at which step(s) of the Ty life cycle nitrogen starvation shapes ty activity would be needed to determine the specific mechanism underlying the increase in transposon insertions." Here "ty" => "Ty"

      Corrected! We thank the reviewer for their detailed reading.

      Reviewer #1 (Significance (Required)):

      The manuscript is a follow-up to Levy et al. 2015 and Blundell et al. 2019. In general, the research is interesting and points out the important role of Ty in adaptive evolution in nitrogen-limited environments. It also compares the spectrum of adaptive mutations in response to nitrogen limitation under serial transfer (this work) and in chemostats (especially the work of the Gresham lab). The paper is well written. Audiences in the fields of genetics, genomics and evolution will be interested in this work.

      My field of expertise: genetics, experimental evolution, budding yeast

      We thank the reviewer for their kind comments, constructive input, and generosity with their time.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Hays et al. sequence and analyze the mutational spectrum from a set of S. cerevisiae strains evolved in a nitrogen limiting environment, and detail genes that recurrently are found to be mutated in a fluctuating nitrogen limiting environment. These data are contrasted to evolution under glucose limited environments and non-fluctuating environments. Specifically, Hays et al. observe a high proportion of Ty element-mediated mutations arising from strains evolved under the fluctuating nitrogen limiting regime. Their fitness data are robust and clearly demonstrate that these mutations reproducibly lead to improved fitness under nitrogen limitation (based on the authors' defined criteria). Overall, the observed bias of the high proportion of Ty-mediated mutation in fluctuating nitrogen starvation is unexpected and an important finding. Further, the discussion was thoughtful and well executed in detailing interpretations of the data more broadly. We are generally positive about this work and find the analyses robust and convincing. The authors should address the concerns listed below prior to acceptance/publication.

      We thank the reviewer for their kind words and enthusiasm for our study, we have worked to address their constructive feedback as detailed below.

      Reviewer #2 (Significance (Required)):

      Major comments to be addressed:

      The claim that the 3' UTR Ty insertions in MEP1 are apparently gain of function is very interesting. The authors should consider performing RT-PCR or strand specific RNAseq to see whether the antisense transcript is reduced and the MEP1 transcript is increased in the presence of the 3' UTR insertion. This would provide much stronger support for their claim that MEP1 3' Ty insertions are gain of function. Orientation information is critical to provide!

      We agree that these future directions are exciting and of extreme interest! We however believe they are out of the scope of this current study which already includes substantial data and analysis. We note that we did not claim that the 3’ UTR insertions are gain of function – instead, we suggested that “Ty insertions in the 3’ region unique to the MEP1 locus may affect fitness in nitrogen limitation via a mechanism different than the putative gain of function missense mutations in the coding region itself”. We did not speculate on the mechanism by which these insertions are adaptive, but it is an active line of research and we look forward to discovering the mechanism.

      The authors seemed to miss a golden opportunity to measure Ty1 expression or transposition under fluctuating/non-fluctuating nitrogen starvation. Otherwise, the claims of increased Ty activity are unsupported. The authors measured an endpoint (Ty insertion), but this says nothing directly as to the rate of activity, although it is presumably correlated. However, based on the data one could argue activity may be equal in all environments, but the mutational events caused by Ty activity are uniquely selected for in fluctuating nitrogen starvation. As it stands, either model (increased activity vs. differential strength of selection) are equally likely. At a minimum, the authors should at least address this point.

      We appreciate the reviewer bringing this concern to our attention, and we address it in three ways. First, we have rephrased the text to more explicitly consider the possibility that the observed difference in novel Ty insertions could be driven at the level of selection, not activity. Second, we have clarified the main text to better emphasize why we consider greater Ty activity under nitrogen starvation the more likely explanation: the level of presumptively neutral Ty insertions is greater in nitrogen than in glucose, even after normalization for the number of evolved generations. Third, we have performed additional experiments which support that, at least with an artificial retrotransposition reporter construct, these starvation conditions show more Ty activity in nitrogen than in glucose (note that we have not carried out such experiments in chemostats, and do not currently have a functioning chemostat setup). We include these results below but have not added them to the manuscript, as we intend to generate additional data for a subsequent study to make these claims more robust. We feel that adding them to this manuscript would make it less focused.

      To assess Ty activity in yeast experiencing different nutrient conditions, we used a modified version of a plasmid-based Ty reporter created previously by Curcio and Garfinkel, 1991, PNAS 88(3):936-40. The original reporter construct used an inducible GAL promoter to initiate Ty transcription from the plasmid, and new Ty insertions confer the ability for the strain to grow on SC-His. To assess Ty activity induced by nitrogen limitation, we excised the GAL promoter and instead used the native Ty promoter from the insertion found at YPLWTy1-1. This Ty promoter was selected based on having recovered novel Ty insertions in evolved clones that originated from this locus.

      Plasmid pGS234 was created by replacing the promoter-containing XhoI fragment of pGTy1mhis3-AI with an XhoI fragment containing the promoter from the chromosomal location of YPLWTy1-1.

      Strains bearing the Ty reporter plasmid pGS234 were subjected to nitrogen limited media and glucose limited media to assess transposon activity in these conditions. We observe significantly more Ty activity from the reporter plasmid in nitrogen-limited conditions than in glucose limited conditions or in SC-ura medium (see Figure below).

      Panel A: Bars represent average of three WT strains with transposon reporter plasmid; each value is number of colonies on SC-His medium with each His+ colony representing independent Ty transposition events. Strains were grown in SC-Ura and then shifted to M14, M3 or SC-Ura as a control for 48 hours and plated on SC-His plates.

      Panel B. One WT strain with pGS234 was subjected to a fluctuation test (16x 5ml tubes) in M14 and M3 media. Each dot represents the number of colonies on each SC-His plate. Kruskal-Wallis chi-squared = 23.341, df = 1, p-value = 1.357e-06
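      The p-value quoted for Panel B can be sanity-checked from the test statistic alone. With one degree of freedom (two groups), the chi-squared upper-tail probability reduces to erfc(sqrt(H/2)). The sketch below applies that identity to the reported statistic and, for context, includes a small pure-Python Kruskal-Wallis H with the standard tie correction. The function names are hypothetical, and the colony counts passed to `kruskal_wallis_h` are invented for illustration; they are not the data behind the figure.

```python
import math
from collections import Counter


def chi2_sf_df1(stat: float) -> float:
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with the standard tie correction.

    Assumes at least two distinct values across all groups.
    """
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # Average rank for each distinct value (tied values share the mean of their ranks).
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3.0 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1.0 - ties / (n ** 3 - n))


# Statistic quoted in the Panel B legend (two media, so df = 1).
print(f"p from reported statistic: {chi2_sf_df1(23.341):.3e}")

# ASSUMPTION: made-up counts, used only to exercise the function.
demo_h = kruskal_wallis_h([1, 2, 3], [4, 5, 6])
print(f"demo H = {demo_h:.3f}, demo p = {chi2_sf_df1(demo_h):.3f}")
```

      The closed form only holds for df = 1; comparing three or more media would need a general chi-squared survival function, for example from a statistics library.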

      In line with the above, we think the authors should soften some points in the discussion as it stands. For example: "The significant increase of Ty activity under this specific fluctuating nitrogen-starvation..." We feel the data do not exclusively support increased activity of Ty; that would require the aforementioned assays. As it stands, we feel this is more appropriate: "The significant increase of Ty insertions under this specific fluctuating nitrogen-starvation..."

      We edited the main text to include this suggested language change.

      Minor comments to be addressed:

      Please provide a citation for the following statement "The single copy of Ty5 in the ancestor is known to be inactive and gives rise to no new insertions under either glucose or nitrogen limitation" - Voytas & Boeke. Nature 1992.

      We appreciate the reviewer catching this, and the reference has been added.

      We found the following to be a confusing sentence: "Indeed, if global Ty derepression reflects a host-parasite coevolution that minimizes host cost and maximizes potential for survival of both, the role of transposons in host evolvability is important (Levin and Moran 2011)."

      We have clarified this sentence by editing it to: “Indeed, the role of transposons in host evolvability is important: global Ty derepression could reflect host-parasite coevolution towards a less parasitic lifestyle: resulting in minimal host cost and maximized potential for survival of both, especially under detrimental environmental conditions (Levin and Moran 2011)”

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Hays et al. studied the genomic changes that lead to adaptation under fluctuating nitrogen starvation. In addition to loss-of-function alleles, the authors identified adaptive gain-of-function alleles. Furthermore, their results demonstrate that Ty and microhomology-facilitated mutations in several candidate genes contribute substantially (though not exclusively) to the adaptation under nitrogen-limited serial transfer. Importantly, a novel lineage tracking method provides high-resolution fitness measurements.

      We appreciate the reviewer’s helpful edits in clarifying and improving the manuscript, and appreciate their time and constructive input.

      Despite the clear merits of the study, we also have a few relatively minor questions and suggestions

      • Please elaborate on the criteria used to identify adaptive loci. The fact that these mutations occurred repeatedly is highlighted in Table 1, but perhaps numbers could also be included in the text, to increase clarity.

      We have added the pertinent numbers to the main text to accompany the values captured in Table 1 and Supplemental File 1, and to further emphasize the selection criteria outlined in the main text.

      • "Were also validated to a fitness effect of >0.01 in nitrogen-limited media". More details about the selection of this cut-off value need to be provided in either the text or the Methods section to increase clarity.

      We agree and have clarified the limit of detection used in the methods section.

      • In Figure 3 it seems that the type of observed mutation was less important than the gene where the mutation occurred. Therefore, it seems that some genes, e.g. GAT1, contribute more to the observed fitness change. It would be beneficial if the authors discussed this observation.

      We thank the reviewer for their observation and have included some additional discussion in the main text around the per-locus fitness observations as shown in Figure 3.

      • What was the reason to select samples from the 88th generation for glucose and from the 192nd generation for nitrogen, as presented in Figure 5? How does this affect the observations?

      We thank the reviewer for their question. These generations were determined to best capture peak adaptive diversity (as discussed in Blundell et al. 2019), based on population barcode dynamics in the original evolutions (Levy et al. 2015, Blundell et al. 2019). The challenge is to pick a time point late enough that there are sufficient numbers of adaptive clones within independent lineages, yet early enough that few mutations have occurred (ideally only a single adaptive mutation per sequenced clone) and that no very fit clones have taken over the population. Because the fitness effects of beneficial mutations were larger in glucose-limited media than in nitrogen-limited media, it was necessary to choose a later time point in the nitrogen-limited evolutions so that a sufficient fraction of the population carried adaptive mutations. We believe this peak diversity makes these samples the most relevant for broadly assessing the adaptive mutational spectra.

      • The use of statistics is not always clear. Please provide a clear indication of the statistical methods/tests used, e.g. for Figure 5.

      We thank the reviewer for this important point and have updated figures 2 and 5 and their corresponding legends for clarity surrounding statistical analysis used.

      • The authors could include a supplementary table summarising their findings on the GAT1 locus, since the text is extensive and it is difficult to put all the information into perspective.

      We note that row one of Table 1 in the main text is exactly this overview of the mutations observed at the GAT1 locus. These mutations plus specific location and their fitness remeasurements are shown in Figure 3 panel A, and detailed descriptions of the mutations for each clone are also available in the sortable table in Supplemental File 1. For these reasons we’ve not included an additional GAT1-specific table.

      • The introduction is extremely detailed and informative, but at the same time quite lengthy; shortening it and only keeping the most relevant parts may increase readability.

      We appreciate the reviewer’s perspective but have not made substantial changes to remove information from the introduction, as we feel that each of its subsections is necessary to provide appropriate context for the study.

      • More detailed figure legends (which should also include a brief mention of the statistics and sample size) would benefit comprehensibility. For example, the black lines in Figure S4 are not described anywhere in the text.

      We agree and have added further description of statistics used in legends throughout. Description of the black lines in Figure S4 has been included.

      • "Many of the 332 clones ... were beneficial" → rephrase.

      We have updated this sentence to clarify our intent.

      Reviewer #3 (Significance (Required)):

      Apart from the elegant characterization of adaptive mutations, perhaps the most important part of the study is that it highlights the importance of a particular selection regime. Together, the findings extend our knowledge on this important topic.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Hays et al. sequence and analyze the mutational spectrum from a set of S. cerevisiae strains evolved in a nitrogen limiting environment, and detail genes that recurrently are found to be mutated in a fluctuating nitrogen limiting environment. These data are contrasted to evolution under glucose limited environments and non-fluctuating environments. Specifically, Hays et al. observe a high proportion of Ty element-mediated mutations arising from strains evolved under the fluctuating nitrogen limiting regime. Their fitness data are robust and clearly demonstrate that these mutations reproducibly lead to improved fitness under nitrogen limitation (based on the authors' defined criteria). Overall, the observed bias of the high proportion of Ty-mediated mutation in fluctuating nitrogen starvation is unexpected and an important finding. Further, the discussion was thoughtful and well executed in detailing interpretations of the data more broadly. We are generally positive about this work and find the analyses robust and convincing. The authors should address the concerns listed below prior to acceptance/publication.

      Significance

      Major comments to be addressed:

      The claim that the 3' UTR Ty insertions in MEP1 are apparently gain of function is very interesting. The authors should consider performing RT-PCR or strand specific RNAseq to see whether the antisense transcript is reduced and the MEP1 transcript is increased in the presence of the 3' UTR insertion. This would provide much stronger support for their claim that MEP1 3' Ty insertions are gain of function. Orientation information is critical to provide!

      The authors seem to have missed a golden opportunity to measure Ty1 expression or transposition under fluctuating/non-fluctuating nitrogen starvation. Without such measurements, the claims of increased Ty activity are unsupported. The authors measured an endpoint (Ty insertion), but this says nothing directly about the rate of activity, although the two are presumably correlated. Based on the data, one could equally argue that activity is the same in all environments, but that the mutational events caused by Ty activity are uniquely selected for in fluctuating nitrogen starvation. As it stands, the two models (increased activity vs. differential strength of selection) are equally likely. At a minimum, the authors should address this point.

      In line with the above, we think the authors should soften some points in the discussion. For example: "The significant increase of Ty activity under this specific fluctuating nitrogen-starvation..." We feel the data do not exclusively support increased activity of Ty; that would require the aforementioned assays. As it stands, we feel this is more appropriate: "The significant increase of Ty insertions under this specific fluctuating nitrogen-starvation..."

      Minor comments to be addressed:

      Please provide a citation for the following statement "The single copy of Ty5 in the ancestor is known to be inactive and gives rise to no new insertions under either glucose or nitrogen limitation" - Voytas & Boeke. Nature 1992.

      We found the following to be a confusing sentence: "Indeed, if global Ty derepression reflects a host-parasite coevolution that minimizes host cost and maximizes potential for survival of both, the role of transposons in host evolvability is important (Levin and Moran 2011)."

    1. ests, "the Charlie Chan character became institutionalized as the nonthreatening Asian (read: a physical wimp, a sexual deviant, and a political yes-man)."20 Indeed, while Chan's expertise as a detective may be the result of his "Chinese" understanding of human nature, his appeal to American audiences was the fact that he was a polite, soft-spoken, well-groomed, family man who had adopted middle-class, American v

      I think it is interesting that being well groomed and polite could lead to someone being seen as a stereotype. While I feel that we shouldn't be restrained to any type of role or stereotype, I also feel that it is strange to say that doing certain things out of free will makes you a stereotype, or almost a traitor to your origins.

  6. westernsydney.pressbooks.pub westernsydney.pressbooks.pub
    1. Systematic Review Subject Guide Endnote Subject Guide Discipline specific Subject Guides Databases by title

      I wonder if we should specify these LibGuides - as I think the outcome of the LibGuide review may mean these could change??

    1. Jan. 22. To set down such choice experiences that my own writings may inspire me and at last I may make wholes of parts. Certainly it is a distinct profession to rescue from oblivion and to fix the sentiments and thoughts which visit all men more or less generally, that the contemplation of the unfinished picture may suggest its harmonious completion. Associate reverently and as much as you can with your loftiest thoughts. Each thought that is welcomed and recorded is a nest egg, by the side of which more will be laid. Thoughts accidentally thrown together become a frame in which more may be developed and exhibited. Perhaps this is the main value of a habit of writing, of keeping a journal,—that so we remember our best hours and stimulate ourselves. My thoughts are my company. They have a certain individuality and separate existence, aye, personality. Having by chance recorded a few disconnected thoughts and then brought them into juxtaposition, they suggest a whole new field in which it was possible to labor and to think. Thought begat thought.

      !!!!

      Henry David Thoreau from 1852

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: The authors use an unclassified quaranjavirus, Wǔhàn mosquito virus 6 (WuMV-6), to demonstrate the possibility of orthomyxovirid global transmission dynamic analyses. The focused surface protein analysis strongly indicates a vertebrate host for WuMV-6 in addition to the insect host. The analysis is then expanded to other quaranjaviruses, which differ considerably in their surface glycoproteins, indicating a complex evolution. Finally, the authors scientifically demonstrate that orthomyxovirids are undersampled and hence that this family will have to expand considerably in the future.

      Major comments: none

      We thank the reviewer for a succinct summary of our study and we are very glad the key messages were sufficiently clear.

      Minor comments: The article lacks precision and hence some global edits are in order. Generally:

      1. For clarity to the reader, please introduce the family Orthomyxoviridae, i.e., its current official composition (9 genera, 21 species, and 22 viruses), so the reader is not confused by terms such as "quaranjavirus" or "isavirus".

      This is a fair request though we would prefer to err on the side of caution with regards to the precise number of taxonomic ranks given the flux viral taxonomy has experienced and in light of the deluge of new taxa being discovered all the time. We refer to the “traditional” view of orthomyxovirid taxonomy at the genus level, encompassing the genera described up until 2011.

      After that, please clearly indicate which viruses are classified and which ones are not. For instance, the main virus dealt with in this paper is unclassified, and so are Astopletus and Ūsinis viruses.

      We do not think this is reasonable since virtually all RNA viruses discussed in the text are not classified and their status as such has little bearing on any of our findings.

      Please ensure correct spelling, including diacritics, of the viruses and abbreviations throughout: Wǔhàn mosquito virus 6 (WuMV-6); Húběi orthomyxo-like virus 2 [note the deletion of one "virus"]; Wēnlǐng orthomyxo-like virus 2

      Thank you for the comment, we have added the diacritics where we could identify them but may have missed some.

      For orientation of the reader, please refer to family groups of viruses as -virids (e.g., "orthomyxovirids", "human coronavirids", "some rhabdovirids"). This way it is clear to the reader that, for instance, "quaranjaviruses" refers to a genus-level group

      Thank you, we agree that this adds much needed precision in terminology.

      "influenza" is a disease. There are several viruses that can cause influenza; they belong to four different genera. Please scan for "influenza" and replace each either with a virus name (for instance, in the abstract, "...RNA viruses containing influenza A virus") or with a genus name (e.g., "alphainfluenzaviruses")

      Our apologies for that misnomer. The text has been corrected.

      Please ensure the differentiation of taxa (concepts), such as species, and viruses (things). Orthomyxoviridae cannot infect anything, it can also not be sampled etc. Orthomyxovirids, the physical members of Orthomyxoviridae can infect things. Most instances of "Orthomyxoviridae" should be replaced accordingly.

      Thank you for the comment, this has been corrected as suggested.

      In particular:

      1. The title doesn't make much sense. Orthomyxovirids are not taxonomically incomplete - they are things that we simply may not have sampled or may have characterized incompletely. Also, the analyses are largely restricted to quaranjaviruses. Hence, I would suggest "...genome evolution, and broad diversity of quaranjaviruses"

      Our apologies for the confusion. The analysis we carried out to quantify the orthomyxovirid evolutionary diversity likely waiting to be discovered covered all members of Orthomyxoviridae known at the time, and thus the title must still refer to the entire family rather than to quaranjaviruses. We felt that the term “taxonomic incompleteness” imparts on the reader exactly what the reviewer refers to, namely that new taxonomic ranks are likely to come as more evolutionary diversity gets uncovered. Alternative and more precise formulations, like referring to evolutionary incompleteness or something similar, would miss the fact that it is taxonomy that discretises the otherwise continuous evolutionary change.

      Abstract: genomes are not employed and do not make money. Please replace "employed" with "used"

      We have to respectfully disagree since the definition of the word “employ” also includes the meaning “to make use of”.

      Re: point 6 above, Introduction: species/families etc. cannot be discovered. They are being established by people for viruses that may be discovered. Please fix here and elsewhere (in most cases, "species" should be replaced with "viruses")

      We agree that taxonomic ranks are designated and not discovered and have changed the text accordingly.

      P3, second paragraph: please place "jingmenviruses" in quotation marks as this is not an official term (yet). Please add "potentially" ("as potentially causing human disease"). Even the authors only speak of an "association" and do not fulfill Koch's postulates

      We have to respectfully disagree here too. “Jingmenviruses” as a term is unambiguous in referring to a group of related segmented flaviviruses even though the group is not officially assigned a taxonomic rank. We have altered the text to add uncertainty to the claim that jingmenviruses cause disease in humans.

      P3, top right column: "e.g., the tick-borne Johnston Atoll quaranja- and thogotoviruses" is ambiguous. Please change to "e.g., the tick-borne quaranja- and thogotoviruses" or list particular viruses and clarify which belong to which genus

      Apologies for the confusion. We fixed this instance.

      P3, right column "smaller number" - change to "lower number"

      We have altered the offending sentence in response to reviewer 2 and this combination of words is no longer present.

      P3, right column "or only the polymerase" - makes no sense to the reader as it has not been introduced; and grammatically needs to be improved as the polymerase is also encoded on a segment. Likewise, PB1 makes no sense to an unacquainted reader - maybe add a few sentences to the intro right after the family introduction on general genome composition and that PB1 is part of the polymerase holoenzyme?

      We have altered the offending sentence in response to reviewer 2 but we take the point. We’ve added detail about the RNA-directed RNA polymerase of orthomyxovirids to the introduction.

      P4: the Ebola virus glycoprotein is called GP1,2 [with 1,2 in subscript] (also Figure 2 legend)

      Respectfully, while the reviewer is technically correct in that the glycoprotein of Ebola virus is referred to as GP_1,2 in proteomics literature (the 1,2 referencing the protein held together by a cysteine bridge post-cleavage), calling it GP is not out of place in evolutionary studies and the term “Ebola virus GP” is unambiguous to the reader.

      P4: please change "West Africa" to "Western Africa" (the designation of the area by the UN)

      Unfortunately, while we agree that the reviewer is correct in that the UN refers to the region as “Western Africa”, references to the “West African Ebola virus epidemic” are ubiquitous in the literature and thus we do not see the reason to change the term here either.

      P6: change "with Rainbow / Steelhead trout orthomyxviruses" to "with mykissviruses (rainbow trout orthomyxovirus and steelhead trout orthomyxovirus)" [note that virus names are not capitalized except for proper noun components; hence also "infectious salmon anemia virus", bottom right column]

      While we recognise that viruses related to infectious salmon anaemia virus discovered in trout have received a separate taxonomic designation we feel very strongly about not mentioning it in our manuscript. Our fear is that “mykissviruses” have been designated too hastily on the basis of a handful of representatives and that relatives discovered in the future may show an indiscernible continuum between “mykissviruses” and isaviruses, invalidating the former as a valid term. We would therefore strongly prefer to keep references to specific viruses rather than a taxonomic designation that may disappear so that a future reader may have an easier time with our study.

      P6, right column: please change "RNA-dependent" to the IUPAC/IUB-correct "RNA-directed"

      Done.

      Figure 2 is too small. I could not figure out B with or without my confocals... Likewise S2, S3 are way too small. In Fig 2 legend, please place "spike" into lower case

      We understand the reviewer’s concern here but Figure 2B was a compromise between vertical space available on a page, the number of taxa in the PB1 tree, and what we thought important to communicate - the variation in segment number across orthomyxoviruses and mapping of PB1 diversity to gp64 diversity. This was done at the expense of individual taxon name visibility whilst fully zoomed out. To remedy this Figure 2B was rendered in 300 dpi resolution such that zooming in will show individual taxon names clearly. We ultimately hope to publish our study in an online-only journal where printing will not present an issue. Likewise for figures S2 and S3. We have changed “Spike” to be lower case in the legend.

      Figure 3: correct spelling of virus names (from top to bottom): rainbow trout orthomyxovirus, infectious salmon anemia virus, influenza C virus, influenza D virus, influenza A virus, influenza B virus, Wēnlǐng orthomyxo-like virus 2, Dhori virus, Thogoto virus, Jos virus, Aransas Bay virus, ... Johnston Atoll virus, Quaranfil virus, Húběi orthomyxo-like virus 2, Hǎinán orthomyxo-like virus 2, Wǔhàn mosquito virus 6. Also apply to S6 and others where applicable.

      The names for viruses in Figure 3 were taken directly from their NCBI records and since we do not show their accessions there is no other way to disambiguate them to the reader. We have, however, added the necessary diacritics where appropriate.

      [PS: based on the somewhat backward, non-UNICODE editorial manager system, I am worried that the diacritics in virus names above are not rendered correctly. If so, please look up the Pinyin spelling of Wuhan, Hainan, Wenling etc. - easiest way is to search Wikipedia for the terms and then identify the Pinyin spelling, which is typically pointed out]

      CROSS-CONSULTATION COMMENTS

      I think we (all reviewers) are largely in agreement - this is a very useful study; the manuscript just needs various adjustments. I agree with the requests of the other two reviewers.

      Reviewer #1 (Significance (Required)):

      The strength of the paper is that it provides a road map on how undersampled taxa may be analyzed and which kind of information can be gleaned from these analyses. The paper also demonstrates that the analysis of seemingly "unimportant" viruses can prove important. The limitation of the paper is that there is no true novel revelation here. The sampling sites of WuMV-2 GenBank records already suggest broad distribution, which often goes along with sequence diversity; the continued discovery of orthomyxovirids in metagenomic studies clearly implied undersampling (but it is nice to have this "gut feeling" scientifically fortified now). The paper is useful for evolutionary virologists, virus taxonomists, orthomyxovirid specialists, and invertebrate virologists.

      We respectfully disagree with the reviewer and believe they may have missed an important point raised by our study. We do not claim that the global distribution of WuMV6 is what makes it remarkable, but rather that 1) its sampled diversity is sufficient to calibrate molecular clocks (in our experience this is not always the case for arthropod viruses) and 2) WuMV6 has reached its current global distribution recently.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This is a nice example of bringing together a variety of data from metatranscriptomic studies to answer fundamental evolutionary questions in the field of viral evolution. There is a focus on a single virus family, and although some might see this as a little restrictive, I think the 'deep-dive' presented in this paper leaves space for a relatively detailed and comprehensive analysis. No doubt, other studies will gain inspiration from the approach presented here and expand this work to other viral groups.

      Overall, the paper is very well written, and the figures are of a very high quality. It is a shame that there are only 3 main figures in the paper because the supplementary figures are well presented and informative.

      We thank the reviewer for the kind words.

      The manuscript discusses the importance of host quite a bit, and for that reason it would have been nice to try and incorporate the host of the various viruses into the figures somehow (perhaps as a supplementary, since the trees are already quite busy). This might help orientate the reader.

      While we appreciate that host information is of interest, we foresee several issues. For one, we refer to broad host classes (essentially arthropod versus vertebrate) because they are largely determined by membrane fusion protein classes, the actual focus of our study, which exhibit strong phylogenetic signal. Secondly, host information in metagenomic studies can be imprecise, incorrect or unavailable.

      I have some minor comments or suggestions for the authors to consider below. Note, please use line numbers in the future for your submissions.

      A paragraph in the discussion laying out the limitations of this approach would be useful to the reader and would make this excellent paper even more robust.

      Thank you for the suggestion. We presume the reviewer is referring to our interpolation of orthomyxovirid diversity and have included a few sentences about the limitations of this approach in the Discussion.

      Pg 3. The sentence starting 'The vast majority of known orthomyxoviruses use one...' should be made into two sentences to make it easier to read. A second sentence for the arthropod description is the obvious edit.

      We appreciate the suggestion and have included it in the manuscript.

      Pg 3. 'The number of segments of orthomyxoviruses with genomes known to be complete varies from 6 to 8'. Rephrase to - 'Orthomyxovirus genomes are known to have 6-8 segments, but many metagenomically discovered viruses in this group have incomplete genomes...etc...'

      Thank you for the suggestion, it has been included.

      Figure 1 - what do the white triangles mean? Are these the directions of reassortment? This should be explained in the legend...

      We apologise for the omission, this is now explained.

      New Zealand is covered up by the circular tree. It looks like there is a point which is partially obscured.

      The reviewer spotted a mistake on our part here. The figure included the coordinates for Wellington, New Zealand when the detection was actually in Wellington Shire, Australia. This has been fixed.

      PD analysis - I think you assume that viruses are static in this analysis. As we all know, they continue to mutate and eventually new species will evolve. Is it possible to consider the mutation rate in this analysis and the evolution of new variants, eventually leading to new species? It might be complicated, and maybe a matter for future work, but it might be worth discussing this as a limitation at the very least. Especially when extrapolating to the future (although you do not extrapolate too far, so maybe this is not an issue here...). You could choose to discuss this in relation to the bird analogy (which was great), and compare the rate of mutation which will lead to the evolution of new species on a totally different time scale.

      We appreciate the point raised by the reviewer and while we wholly agree that the possibility of new viral taxa arising over time is an important caveat, we felt the discussion ends up being rather short. On one hand taxa definitions for different viral groups can be different, and on the other speciation in RNA viruses is difficult to place in absolute time because of a phenomenon called time-dependence of evolutionary rates. Methods accounting for the latter using sophisticated models or external calibration points would seem to imply that speciation timescales exceed those of research.

      Discussion: When discussing the hypothesis that WMV6 diversity is a result of repeat exposure to vertebrate hosts, can you also describe the alternative hypothesis here, and why the evidence leads you to put more weight on the former?

      This is a fair question, and we have now mentioned in the discussion an alternative hypothesis that has been brought up by our colleagues before: that alternating between different hosts induces divergent selection pressures on gp64. We contend that, since gp64 proteins are thought to use a highly conserved host receptor (NPC1), it is likely that no major changes are required when switching hosts. We are open to discussing other alternatives if the reviewer has suggestions.

      CROSS-CONSULTATION COMMENTS

      Seems like we are all in agreement and that after some minor adjustments this will be an excellent contribution.

      Reviewer #2 (Significance (Required)):

      Please see my review above. I did not use your formatting suggestions since I only saw it upon completing my review.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary

      This manuscript describes the use of data from metagenomic analyses to make inferences about the evolutionary and geographic history of the Orthomyxoviridae family of viruses and their hosts. Data from Wuhan Mosquito Virus 6 (WMV6) derived from various RNA-seq analyses is used to analyse loss and gain of virus segments over time, the time since the last common ancestor of these segments and the selection pressure acting on different genes. These results are used to hypothesise about which species have vectored this virus in the past and their geographic distribution. The additional phylogenetic diversity provided by characterisation of additional viruses of this species is quantified and projected into the future to demonstrate the value of further work in this area. The study also demonstrates more generally the benefit of additional sequencing and of characterising viruses in metagenomic datasets, even in cases where novel viruses are not identified.

      Major Comments

      The methodology in this manuscript appears to be sound and the results support the conclusions. Appropriate and detailed analyses have been performed and are described in detail. Code is provided to allow the results to be reproduced. The figures are informative and very well presented. I do not think any additional analyses are required.

      We thank the reviewer for the kind words.

      Minor Comments

      The manuscript is a little hard to follow in places. I think a brief introduction of WHV6 in the introduction section would help with this - where has it been isolated previously and what is known about its evolutionary history (if anything), how is it related to other Orthomyxoviruses. This information is included later but it would improve the flow of the paper to include it in the introduction.

      We apologise for the inconvenience and agree with the reviewer. We have improved the flow of the manuscript per reviewer suggestion.

      I think including a little more about the Method in the Results section would also be helpful, to save the reader jumping back and forth in order to understand the results. For example, at the beginning of the results section, briefly detailing how many samples were included, their broad geographic location and what the analysis is intended to show (e.g. "three full length sequences isolated from China, seven from Australia [...], between 1995 and 2019, were used to generate a reassortment network, in order to show.....") would be helpful. Each of the subsections of the Results would benefit from something similar.

      Apologies for the lack of clarity on our part. We have added more methodological information to each section in the results.

      Although it is clear in the Materials and Methods which datasets have been included, it is less apparent why these were selected. For example, in Figure 1A there are five countries listed - are these countries for which a particularly large amount of full length sequences were available or for which any full length sequence is available? Similarly, for Figure 1B, are these all of the countries where a dataset has originated containing any segment of WHV6?

      The confusion is entirely our fault as we have clearly not provided sufficient detail. This has been fixed now by explaining this better in the methods and Figure 1 legend.

      In the Discussion, it is stated that the frequency and fast evolution of WMV6 place it uniquely to enable tracking of mosquito populations, however there is no evidence presented to support this - does WMV6 evolve faster or occur more frequently than other mosquito RNA viruses?

      Our apologies for the jump in logic. We now expand on what we meant by the following sentence in the discussion: “In our experience, metagenomically discovered RNA viruses can be rare or, when encountered often, do not always contain sufficient signal to calibrate molecular clocks (Webster et al. 2015).”

      CROSS-CONSULTATION COMMENTS

      I also agree with the requests of the other two reviewers and that the manuscript will be in great shape once these are included.

      Reviewer #3 (Significance (Required)):

      This manuscript is very interesting, for the specific results presented here but, more importantly, in opening up further avenues for investigation. The study provides a proof of concept for using viruses derived from metagenomic data for specific and detailed evolutionary and ecological analyses of a single species. The scope of the analysis performed on WMV6 is not particularly broad, but it differs from the typical analysis of viruses in metagenomic datasets, which tends to focus on identification and characterisation of novel viruses only. I believe that this work is valuable to others working in the field, reveals additional potential in existing data and could provide inspiration for many future studies. To my knowledge, it is one of the first studies to focus on a single, fairly under-studied virus, and draw ecological conclusions based on only bioinformatic analyses.

      I think the results presented here for WMV6 may be of interest to a specialised audience, but that the manuscript overall is valuable to a broad audience, including ecologists, evolutionary biologists and virologists conducting fundamental science research.

      We appreciate the reviewer’s kind words.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors use an unclassified quaranjavirus, Wǔhàn mosquito virus 6 (WuMV-6), to demonstrate the possibility of orthomyxovirid global transmission dynamic analyses. The focused surface protein analysis strongly indicates a vertebrate host for WuMV-6 in addition to the insect host. The analysis is then expanded to other quaranjaviruses, which differ considerably in their surface glycoproteins, indicating a complex evolution. Finally, the authors scientifically demonstrate that orthomyxovirids are undersampled and hence that this family will have to expand considerably in the future.

      Minor comments:

      The article lacks precision and hence some global edits are in order. Generally:

      1. For clarity to the reader, please introduce the family Orthomyxoviridae, i.e., its current official composition (9 genera, 21 species, and 22 viruses), so the reader is not confused by terms such as "quaranjavirus" or "isavirus".

      2. After that, please clearly indicate which viruses are classified and which ones are not. For instance, the main virus dealt with in this paper is unclassified, and so are Astopletus and Ūsinis viruses.

      3. Please ensure correct spelling, including diacritics, of the viruses and abbreviations throughout: Wǔhàn mosquito virus 6 (WuMV-6); Húběi orthomyxo-like virus 2 [note the deletion of one "virus"]; Wēnlǐng orthomyxo-like virus 2

      4. For orientation of the reader, please refer to family groups of viruses as -virids (e.g., "orthomyxovirids", "human coronavirids", "some rhabdovirids"). This way it is clear to the reader that, for instance, "quaranjaviruses" refers to a genus-level group

      5. "influenza" is a disease. There are several viruses that can cause influenza; they belong to four different genera. Please scan for "influenza" and replace each either with a virus name (for instance, in the abstract, "...RNA viruses containing influenza A virus") or with a genus name (e.g., "alphainfluenzaviruses")

      6. Please ensure the differentiation of taxa (concepts), such as species, and viruses (things). Orthomyxoviridae cannot infect anything, it can also not be sampled etc. Orthomyxovirids, the physical members of Orthomyxoviridae can infect things. Most instances of "Orthomyxoviridae" should be replaced accordingly.

      In particular:

      1. The title doesn't make much sense. Orthomyxovirids are not taxonomically incomplete - they are things that we simply may not have sampled or may have characterized incompletely. Also, the analyses are largely restricted to quaranjaviruses. Hence, I would suggest "...genome evolution, and broad diversity of quaranjaviruses"

      2. Abstract: genomes are not employed and do not make money. Please replace "employed" with "used"

      3. Re: point 6 above, Introduction: species/families etc. cannot be discovered. They are being established by people for viruses that may be discovered. Please fix here and elsewhere (in most cases, "species" should be replaced with "viruses")

      4. P3, second paragraph: please place "jingmenviruses" in quotation marks as this is not an official term (yet). Please add "potentially" ("as potentially causing human disease"). Even the authors only speak of an "association" and do not fulfill Koch's postulates

      5. P3, top right column: "e.g., the tick-borne Johnston Atoll quaranja- and thogotoviruses" is ambiguous. Please change to "e.g., the tick-borne quaranja- and thogotoviruses" or list particular viruses and clarify which belong to which genus

      6. P3, right column "smaller number" - change to "lower number"

      7. P3, right column "or only the polymerase" - makes no sense to the reader as it has not been introduced; and grammatically needs to be improved as the polymerase is also encoded on a segment. Likewise, PB1 makes no sense to an unacquainted reader - maybe add a few sentences to the intro right after the family introduction on general genome composition and that PB1 is part of the polymerase holoenzyme?

      8. P4: the Ebola virus glycoprotein is called GP1,2 [with 1,2 in subscript] (also Figure 2 legend)

      9. P4: please change "West Africa" to "Western Africa" (the designation of the area by the UN)

      10. P6: change "with Rainbow / Steelhead trout orthomyxviruses" to "with mykissviruses (rainbow trout orthomyxovirus and steelhead trout orthomyxovirus)" [note that virus names are not capitalized except for proper noun components; hence also "infectious salmon anemia virus", bottom right column]

      11. P6, right column: please change "RNA-dependent" to the IUPAC/IUB-correct "RNA-directed"

      12. Figure 2 is too small. I could not figure out B with or without my confocals... Likewise S2, S3 are way too small. In Fig 2 legend, please place "spike" into lower case

      13. Figure 3: correct spelling of virus names (from top to bottom): rainbow trout orthomyxovirus, infectious salmon anemia virus, influenza C virus, influenza D virus, influenza A virus, influenza B virus, Wēnlǐng orthomyxo-like virus 2, Dhori virus, Thogoto virus, Jos virus, Aransas Bay virus, ... Johnston Atoll virus, Quaranfil virus, Húběi orthomyxo-like virus 2, Hǎinán orthomyxo-like virus 2, Wǔhàn mosquito virus 6. Also apply to S6 and others where applicable.

      [PS: based on the somewhat backward, non-UNICODE editorial manager system, I am worried that the diacritics in virus names above are not rendered correctly. If so, please look up the Pinyin spelling of Wuhan, Hainan, Wenling etc. - easiest way is to search Wikipedia for the terms and then identify the Pinyin spelling, which is typically pointed out]

      CROSS-CONSULTATION COMMENTS

      I think we (all reviewers) are all largely in agreement - this is a very useful study; the manuscript just needs various adjustments. I agree with the requests of the other two reviewers.

      Significance

      The strength of the paper is that it provides a road map on how undersampled taxa may be analyzed and which kind of information can be gleaned from these analyses. The paper also demonstrates that the analysis of seemingly "unimportant" viruses can prove important. The limitation of the paper is that there is no true novel revelation here. The sampling sites of WuMV-6 GenBank records already suggest broad distribution, which often goes along with sequence diversity; the continued discovery of orthomyxovirids in metagenomic studies clearly implied undersampling (but it is nice to have this "gut feeling" scientifically fortified now). The paper is useful for evolutionary virologists, virus taxonomists, orthomyxovirid specialists, and invertebrate virologists.

  7. jeffreycwitt.com jeffreycwitt.com
    1. Can you think of a way our social identity (e.g. our identity as “students”, Baltimoreans, Americans, etc.) depends on our understanding of the past?

      How we identify socially may be viewed entirely differently by different individuals, depending on their understanding of the history or past surrounding that identity. You may have one understanding of the past surrounding who you are (for example, you may view your identity positively after understanding your history), but someone else may view your social identity negatively because they interpret your past differently, or with bias.

    1. Strong Schools. Inclusion, Wellbeing and Community Building. 10. Social Emotional Learning and Mental Wellbeing. Chapter outline: Learning Objectives, Introduction, Components of SEL (5 components), Pillars of SEL (Three Pillars), Childhood Development and SEL, Mental Wellbeing and Trauma Navigation/Healing, Conclusion, Review Questions, Mental Health Check In Activity, Key Terms. Alizeh Sultan; Sawdah Hoque; and Khurshid Arif. Learning Objectives  After reading this chapter you will be able to: Define Social Emotional Learning (SEL). Discuss the importance of SEL in schools. 
Identify and deal with challenges faced by students and educators in school settings. Learn about and apply conflict resolution and mediation. Effectively promote SEL in school. Navigate the Teacher-Student-Parent dynamic. Utilize SEL to ensure the mental well-being of students. Train stakeholders to increase safety in schools.   Abstract: Social-emotional learning and well-being is not only an emerging research agenda but also a critical issue for individual as well as societal development, because how the issue is viewed has major theoretical and practical implications. In academia, some argue that social-emotional learning and well-being lie in the subjective perception of one’s life or psychological functioning, while others argue that they lie in objective conditions and the broader environment. Many programs have been developed to help schools enhance students’ health and reduce trauma, bullying, violence, and ill-being. How should educators set it up for students in schools? This article describes the importance of social-emotional learning (SEL) in schools, well-being, the best educational practices for applying conflict resolution in schools, mediation in schools, the teacher-student-parent dynamic, and trauma. The SEL framework can be used to guide teachers in schools and parents outside of schools in dealing with problems caused by ill-being (mental illness), trauma, and violence, and in using mental health (mental well-being) to teach social-emotional learning.   Introduction  Social and emotional learning (SEL) aims to help students better understand their thoughts and emotions, become more self-aware, and develop more empathy for others in their community and the larger world; it takes differences among students into account and helps put all students on an equal footing to succeed. 
SEL is the process by which children and adults regulate their emotions, set goals, demonstrate empathy, build and maintain healthy relationships, and make constructive choices (CASEL, 2022). Nowadays students frequently meet people for the first time in the classroom who come from various backgrounds, hold different opinions, and have unique abilities. It can be difficult for students to adapt to a new and diverse community, to feel at home there and to show respect and understanding of people with different viewpoints and values. Recently, an increasing number of educational institutions are realizing that competence in socio-emotional development and academic performance are closely intertwined. 7.4% of American children between the ages of 3 and 17 have been diagnosed with a behavioural problem, which means that, on average, at least two students in a class of 30 have behavioural issues and would benefit from positive reinforcement (Banks, 2022). In addition to student education, educational institutions are now regarded as “an important if not central arena for health promotion and primary prevention” (Roeser, 2000). SEL is a methodology that teaches students of all ages how to better understand their emotions, feel them fully, and show empathy for others. These learned behaviours are then applied to assist students in making positive, responsible decisions, developing frameworks for achieving their goals, and developing positive relationships with others. 
SEL should be included in the school curriculum because it improves the quality of education: it improves the school and class environment, reduces behavioural problems, and boosts the academic achievement of students. SEL with an integrated, coordinated approach not only teaches children how to pass exams but also teaches them how to develop life skills and deal with social challenges. SEL helps students maximize their potential not only in school but also throughout their lives. A meta-analysis of 213 school-based, universal social and emotional learning (SEL) programs involving 270,034 kindergarten through high school students found that students significantly improved their social and emotional skills, attitudes, behaviour, and academic performance, resulting in an 11-percentile-point increase in achievement (Durlak et al., 2011). By incorporating SEL into the school curriculum, students will learn how to control their emotions and behaviours in order to reduce stress. Some of these skills can assist children in self-regulating when they are required to do so independently. In order to find solutions, they will sometimes be better able to communicate their thoughts and feelings to others. According to Durlak et al. (2011), the four components of SAFE (sequenced activities that lead to skills in a coordinated and connected way, active learning, focused activities to develop one or more social skills, and explicit targets regarding specific skills) are present in the most successful SEL programs.  Teachers who use SEL curricula ensure that high expectations are communicated to students by eliciting their thoughts, displaying model work, providing specific feedback to spur improvement, and emphasizing that making mistakes is an important part of learning (Paterson, 2021). Social Emotional Learning: What is SEL and Why SEL Matters – Video  Components of Social-Emotional Learning (SEL): There are many components to SEL, but most of the literature identifies 5 key components; 
they are – Figure: Five main components of social-emotional learning (MLSD, 2022). Self-Awareness: Self-awareness is a conscious awareness of one’s own strengths, weaknesses, actions, and presence. Self-awareness necessitates a clear understanding of your mental and emotional states. Recognizing your emotions and how they influence your behaviour; recognizing your strengths and weaknesses in order to gain confidence in your abilities. Self-Management:  Self-management consists of setting and achieving goals, as well as taking responsibility for your thoughts, emotions, and actions in various situations. Self-management is built on self-awareness. If students can accurately pinpoint their feelings and how they influence their behaviours, they will be better able to act on them. It can be very empowering to assist them in developing their capacity to manage their emotions and behaviour. Social Awareness:  Social Awareness is the ability to put yourself in the shoes of another person who comes from a different background or culture than you. To act with empathy and integrity in your home, school, and community. Recognizing others, understanding the perspectives of others and empathizing with them, including those from diverse backgrounds, cultures, and contexts, is what social awareness is all about. Relationship Skills:  Relationship skills are the ability to form and maintain healthy relationships with people from various backgrounds. This competency focuses on knowing when to ask for or offer assistance, listening to others, and communicating with them. Students who participate in SEL learn how to handle conflicts in their relationships more effectively, making it easier for them to maintain their friendships (Banks, 2022). Because of their problem-solving abilities, they can work in groups and even enjoy teamwork in the classroom. 
Responsible Decision-Making: Responsible decision-making refers to the ability to decide how to appropriately act or react in a given situation based on learned behaviours such as ethics, safety, weighing consequences, and the welfare of others as well as yourself. To make responsible decisions, students need to develop critical thinking, open-mindedness, sound judgment, reason, problem-solving, and solution-finding skills. They need to think about others as well as themselves when making decisions.   Other components that are important to keep in mind (especially for school children) are – Stress Management:  People of all ages, particularly children, are affected by trauma and stress. According to a 2014 American Psychological Association study, many teenagers have even higher stress levels than adults. Stress management is tools, strategies, or techniques that help you feel less stressed and lessen the negative effects of stress on your mental or physical health. There are numerous techniques for dealing with stress. These techniques include behavioural, emotional, and mental ones. The first step in assisting your students in overcoming their stress is to comprehend what is causing it. Peer pressure, abuse, and familial expectations can all have an impact on your student’s mental and emotional well-being. Stress management in the classroom necessitates consistent effort. A teacher’s creativity and patience are frequently required. To reduce classroom tension, incorporate these strategies into your lesson plans. Conflict Resolution and Mediation: There will always be disagreements between children, no matter how minor. For example, during recess kids can argue about anything, whether it’s a toy or another student. Conflict resolution techniques can be used in the classroom to keep the peace among the students. 
Peaceful conflict resolution refers to working through a problem or conflict in an early childhood setting in a way that does not negatively impact anyone involved on a physical, emotional, or social level. When disputes are settled amicably, children can gain confidence in their ability to handle situations and relationships. Teaching young people how to navigate conflict is one of the most important ways to facilitate their growth. Even some adults have poor mediation skills and can’t navigate their professional and personal lives properly. When we’re so angry that we can’t think straight, it’s difficult to solve a problem. This is why it is of utmost importance to teach young people from an early age how to resolve conflicts, overcome their negative emotions, and act accordingly. Although mediation is frequently described as a win-win process, with a focus on reaching a mutually satisfactory conflict resolution, the research shows that there are additional and possibly larger benefits for students who facilitate mediation as well as for the school community as a whole (DeVoogd, 2016). Teaching students to get some perspective, put themselves in the other person’s shoes, and think beyond just one incident is crucial. It is important to build their listening and problem-solving skills through conflict resolution. Conflict resolution is an important personal skill, and in schools the role of peer mediation in conflict resolution is also important. Mediation is an important method that requires focus, an open mind, and the willingness to compromise. Both parties in a conflict need to work respectfully towards a solution in order not to waste time and to reach a satisfactory outcome. Having a mediator to help with conflict resolution is a good way to ensure that. 
The researcher DeVoogd (2016) states that “Student mediators also demonstrate better attendance than non-mediators and report feeling safer and more connected in positive ways to their school, with a sense of belonging.”  This shows that mediation training and being mediators, in general, is useful. But I think we should give more importance to building each individual student’s conflict resolution skill than just having some students in the role of mediators.   Here is a video that will explain the importance of SEL in our Children and Adult life. SUBTOPIC – PILLARS OF SEL – Three Pillars: Social and emotional learning (SEL) is a term that broadly refers to the process through which people learn and put into practice a variety of social, emotional, and associated skills, attitudes, behaviours, and values that help guide pupils. This involves having ideas, emotions, and behaviours that help people do well in school. But SEL has been described in a number of different ways (Humphrey et al., 2011). In today’s increasingly diverse world, children frequently meet people for the first time in the classroom who come from a variety of different origins, have diverse ideas, and have special talents. Social and emotional learning (SEL) aims to assist students in better understanding their thoughts and emotions, growing in self-awareness, and developing more empathy for others in their community and the wider world in order to take into account these differences and help put all students on an equal footing to succeed. These skills can be fostered in the classroom to help kids become stronger, more effective, self-aware, and socially aware citizens in the years to come. 
Learn more about the significance of social-emotional learning and the advantages it has in the classroom and outside of it. Here we discuss the three pillars of the SEL which are Culture, Adult Skills and Curriculum. Culture From a cultural perspective, the kind of skills associated with SEL seems to be based on a theory of emotions that views them as internal, individual states that call for active management control in order to be channelled in socially beneficial, healthy ways. The main focus is on controlling or containing emotions that can “boil over,” leading people to behave irrationally (Lakoff & Kovecses, 1987). The SEL literature frequently suggests verbalization or visualization methods that involve verbalizing feelings, using visualization techniques, or engaging in breathing or counting activities. Children in the primary grades should be able to recognize and appropriately label simple emotions such as sadness, anger, and happiness, according to CASEL’s 2007 assessment of essential skills in emotional identification, labelling, and discussion. Students must “recognize and appropriately describe emotions and how they are linked to action [as well as] use language skills to understand other people’s thoughts and perspectives,” such as being able to employ “I messages” while discussing feelings, according to Illinois state requirements (Illinois State Board of Education, 2006). Another part of connection skills is that “students should be able to describe ways to make and keep friends” (CASEL, 2007). The norms surrounding emotional expression, emotional experience, and emotional regulation are, however, strongly conditioned by culture, as research on emotion in non-Western cultural contexts has long demonstrated (Briggs, 1998; Chao, 1995; Lutz, 1987, 1988; Markus & Kitayama, 1994; Miller, 1982, 1996; Shweder & LeVine, 1984; White, 1987). 
Not all cultures share the same regulatory or expressive behaviours (like talking) of the White, American middle class, nor do they interpret emotional experience in the same ways (see also Ballenger, 1992). According to Wierzbicka (1994), the Anglo script for emotional expression places a significant focus on behavioural control and the notion that speaking about one’s emotions qualifies as adequate expression, in contrast to other cultural scripts for this purpose (p. 178). She makes a compelling case for the cultural influences on the relationship between emotion and language in ways that directly contradict the universalizing assertions of a lot of psychology research on emotions in cognition. In addition to stressing the need for SEL training to be “culturally relevant, empowering children within their own cultural surroundings,” Denham and Weissberg (2004) also raise the potential that “some SEL categories may be unique to the child’s home culture” (p. 41). Adult Skills Any classroom in the world, from the most basic, without walls, to the most complex, needs good relationships between teachers and students in order for learning to occur. The combination of skills that enables kids to collaborate with others, study effectively and play vital roles in their families, communities, and workplaces is known as social-emotional skills or “emotional intelligence.” According to research, social and emotional learning can be taught to pupils, and their presence in classrooms and schools enhances academic performance. Students are more likely to retain and apply what they are taught when academic and social-emotional learning are both incorporated into the educational process. Additionally, they weave into their education a sense of accountability, compassion, and interest in the welfare of others as well as their own. Thus, learning can be said to affect both the “brain” and the “heart,” which leads to better-run classrooms and motivated students. 
Therefore, academic and social-emotional learning are intertwined in every school, everywhere. Curriculum SEL can be incorporated into a school’s curriculum even though it is not a defined subject like math or history. Students may be more likely to participate and may be less prone to mentally drift off during their classes when teachers personalize and relate academic topics to them. SEL can have a beneficial lifetime influence by encouraging self-awareness, empathy, and emotions of safety and inclusion in the classroom. SEL is approached from various angles. A more formally specified period of the school day, sometimes taught in homeroom, is devoted to SEL by certain teachers. To help students better understand the SEL basic skills, these lessons are repeated throughout the rest of the school day. In order to foster a sense of community or common ground between students of different ages, teachers may choose to assign students to write or journal about their thoughts and feelings regarding a certain SEL topic. Other teachers incorporate SEL teachings into topics that are more formal, like math, history, or literature. As an example of SEL in action, assigning a group project where students self-delegate roles to work together for the benefit of the group, having students role-play historical figures to comprehend the motivations behind their actions, or having students conduct formal interviews with one another to gauge current events are all examples of SEL in action. (“What is social-emotional learning (SEL): Why it matters,” 2022) What is social-emotional learning (SEL): Why it matters. (2022, August 17). National University. 
https://www.nu.edu/blog/social-emotional-learning-sel-why-it-matters-for-educators/ SUBTOPIC – Childhood Development and SEL SEL in the context of daily classroom instruction includes daily check-ins with students, embedded SEL content in the Reading and English Language Arts, and general awareness of the social well-being of students in their virtual/classroom environment (even if only with a simple greeting enquiring about their emotional state upon arrival to the class or an enquiry regarding the previous evening). According to a study by Babalis et al. (2013), SEL definitely affects primary school pupils’ emotional competence and academic achievement. Another study by Cook (2014) found that school culture and practices hinder students’ academic achievement. This suggests that, in the absence of SEL institutional practices and curriculum-supported content, students would struggle academically because their social and emotional needs are not met in the classroom. This study found that English language learners have been more negatively impacted by the absence of SEL-supported surroundings and material in schools. In addition to SEL having an effect on these learners’ academic achievement, additional factors include acculturation difficulties, encountering racism and discrimination, and poverty (as described in Benner & Graham, 2011) also negatively affect students’ academic performance. SUBTOPIC –  Mental Wellbeing and Trauma Navigation/Healing Wellbeing, Well being or Well-being  According to the Merriam-Webster Dictionary, the correct term is well-being, not wellbeing or well being, and it is hyphenated. The noun “well-being” is defined as “the state of being healthy, happy, or prosperous.” Being can be a verb or a noun, whereas well is an adjective. To create a noun, a hyphen must be added. Now, what is the difference between mental health and mental well-being? 
While we wish that was the case, the definition of mental well-being does not imply that life is trouble-free. Instead, it indicates that you are equipped to deal with whatever life throws at you. These are abilities that you can pick up and hone, making your mental health better today than it was yesterday  (Slade, 2010). On the other hand, your mental well-being is your state of mind. It resembles physical health in many ways, but only with regard to your mind. It fluctuates daily, just like your physical health does. In general, your mental health is influenced by your experiences, your environment, your relationships, and the strength of the community in which you live. You need to be as healthy as possible on both counts because your physical and mental health is closely related (Lawrence, et al., 2017). [Video podcast 1] (Mental Health & Emotional Well-being ,2022) Finally, mental well-being can be mental health but mental health cannot be mental well-being, because they are related but both of them are independent ( Wheeler, 2021). Definition and Meaning of Well-being Well-being is often described as the state of being comfortable, healthy or happy. A feeling of health and vitality that results from your thoughts, emotions, actions, and experiences is referred to as well-being. When we are in a state of well-being, we frequently feel joyful, healthy, socially connected, and purposeful (Lawrence, et al., 2017). 
These skills include:  Self-fulfilment The realization that you are part of something bigger  Ability to care for yourself independently Identifying and employing character strengths Accurate perception of reality  Desire to learn new skills  Emotional resilience  Interested in the world around you Recognizing and staying true to your values Forming and maintaining healthy relationships Having a sense of hope Understanding that happiness comes from within Being determined Taking action to improve your life [Video podcast 2] As ill-being is a lack of prosperity, happiness, or health. But well-being is often described as the state of being comfortable, healthy or happy. Well-being is completely the opposite of ill-being (Headey, Holmström , & Wearing , 1984). for more information, look at the figure. Why do we have to pay attention to the well-being of children in schools? Schools have a responsibility to consider ways to enhance the educational process while also paying attention to the students by safeguarding their mental health, especially well-being. Well-being is a useful strategy for enhancing the health of school-aged children. “The fact that we as leaders have been entrusted with the most priceless resource in the world—children—is the most vital reason to care about wellbeing. Every student, regardless of how they frustrate, perplex, delight, or impress us, has inherent value and potential. Because we genuinely care about the students entrusted to us, are aware of our moral duty to care for those under our stewardship, and want what is best for them both now and in the future, we should therefore place a priority on well-being” (Dewey, 1897, p. 78). How to apply Well-being in schools?  These techniques can be used to implement well-being in schools: Mental well-being training: To combat ill-being, educational facilities can offer teachers training courses. It is crucial to create a place for specialized mental health professionals in educational settings. 
However, all teachers must believe they have received the necessary instruction and encouragement to help their students with their mental well-being. Implement mental health and well-being into the curriculum:  The influence of mental health and well-being on a person’s quality of life must be emphasized to students. Students’ increased mental health literacy will increase their personal awareness of particular problems. This can be accomplished by including specialized lessons on mental health and relationship education in school curricula. Promote healthy eating: Both students and staff must eat healthily. This can be supported by including lessons on healthy eating in the curriculum, starting cooking groups, and providing nutritious food on school grounds and university campuses. Encourage students and staff to stay hydrated: Drinking enough water every day is vital for both mental and physical health. What Is Trauma? Trauma is an important phenomenon these days among students, teachers, and parents, so first, what does it mean? Trauma can be described as a mental health condition that should be treated with therapy as soon as possible. There are several methods for teaching a community how to address trauma. Trauma refers to the effects of harmful experiences in a person’s life, such as bad accidents or harmful actions. For example, a person who has a car accident may find that it affects their morale so badly that they cannot continue driving (Escudero & Wong – RAND Corporation, 2001). Trauma is a normal emotional response to distressing events such as violence, abuse, losing close relatives, conflict, and natural disasters. Trauma can be acute, chronic, or complex, depending on the type of event experienced. Anybody may experience trauma in their life, but children are the most vulnerable group and are adversely affected by it. 
Parents and communities are responsible for introducing children to the basic concepts of stress, anxiety, and trauma and for letting them know that these are normal reactions to different occurrences. Children need to be able to identify their reactions to stress and to release that stress through methods that should be taught in schools. Providing mental and emotional health support related to the cognitive behaviours of children and youth is therefore a significant issue that school curricula need to address. Adding emotional and psychosocial support subjects to schools has a large impact on the mental and physical well-being of students, making them more resilient, confident, and innovative, and better at critical thinking (Escudero & Wong – RAND Corporation, 2001). According to the sources provided in the reading sections on the cognitive behaviour of children, many factors affect the well-being of children, even during adolescence, such as safe, protective, and healthy environments (especially in schools), social and emotional support, and strategies for overcoming stressful and traumatic events. One main reason a student is passive in learning, forming relationships, and behaving properly is adverse childhood experiences such as neglect, war, domestic violence, and harassment (Slade, 2010). Studies have shown that children, youth, and teachers who were trained through emotional and psychosocial programs had a significant decrease in stress and trauma. They became empowered, confident, focused, and optimistic, were able to form relationships, and enhanced their ability to take part in community decision-making. 
Development programs for releasing tension and stress are implemented in schools in advanced countries, with students, teachers, and parents as the main target groups. They aim to build skills for coping with stress, shift mindsets toward restorative practices and mindful breathing, and enable participants to better manage their own resources. All 15 development skills programs validate the Whitaker Peace & Development Initiative (WPDI) program, whose major focus is the psychological, social, and emotional well-being of children, as well as increasing parents’ knowledge of how to communicate in a friendly way with their children, identify their problems, and provide solutions for them (Escudero & Wong – RAND Corporation, 2001). Students from developing countries are adversely affected by mental and psychosocial health problems. In Afghanistan, for example, students face conflict, child labour, early forced marriage, and violence every day. As a result, they are either aggressive or isolated. They lose their self-esteem, their concentration on certain issues, and their hope for the future (Ibrahim et al., 2020). Therefore, there is a dire need to add emotional and social support subjects to Afghan schools’ curricula. Adding psychosocial support subjects in schools would help students with mental health problems who are affected by ongoing conflict and disaster. In my country, students need special education and psychological and emotional support. Unfortunately, these important subjects have never been addressed in our school curriculum (Ibrahim et al., 2020). That is why millions of students, in spite of endless attempts, could not get promoted at school, which has resulted in a high drop-out rate in Afghanistan. Sadly, the drop-out rate is much higher for girls than for boys because of many cultural reasons that do not let adult girls have access to education facilities. How to deal with Trauma? 
In addition to the previously mentioned methods of conflict resolution and mediation, the following strategies can help. Speak with therapists or social workers. They are excellent resources for learning how to identify and understand the effects of trauma, and they can give specific information about your students. Ensure order and consistency. Write the agenda on the board. Use exit and entry procedures. A student may feel more secure if she knows what to expect. Give warnings before changes in activity. If you’re going to do something unexpected, like turn off the lights or make a loud noise, let students know in advance. Develop their skills and passions. To support a positive self-concept, concentrate on one area of competence and promote its growth. Create a backup plan. Make it possible for a student to leave the classroom if she becomes agitated or overwhelmed. Set aside a location inside or outside the school so that you will know where to look for her if she needs to calm down or take a sensory break. You can also give a student access to a box or kit of sensory-calming equipment (Silly Putty, coloring, puzzles). Show them how to look after themselves. This is one of the most crucial things to keep in mind (Venet, 2014). Trauma causes challenging behavior. There is evidence that trauma exposure impairs the stress response system’s ability to regulate itself, which can result in impulsivity and poor emotional control (Tarullo & Gunnar, 2006; Bright & Thompson, 2018). Young people who have experienced trauma are therefore more likely to exhibit internalizing or externalizing behavioral issues in response to subsequent stressful events (Wilton, 2020). Conclusion: Good mental health is the key to living a good life, and social-emotional learning is important for building and maintaining good mental health. 
Teaching young children how to handle mental stress and navigate the ups and downs of life is essential, and including SEL in the school curriculum is the best way to make sure that they learn the skills to take care of their mental well-being. Integrating SEL into the school curriculum is essential, but we also have to be careful about how we implement it. Teachers have to be properly trained so that they don’t accidentally make any issues worse. There also has to be a balance between collective and individual well-being. To achieve the best outcome, both school teachers and staff need to be trained properly. Review Questions: Think about the class you are in or teaching. Do you or your school incorporate SEL in the curriculum? If SEL is incorporated, are there any gaps? How can the curriculum be improved to maximize the benefit of SEL, and how do you think SEL should be modified for your context? Mental Health Check-In Activity – Organize a Mental Health Awareness session / Play Mental Health Management Bingo with students. Introduce your topic, Know Your Feelings, and Mental Health Management Bingo. Map out how the activity will work and its steps for the students/teachers. Students will explain and draw 3 of their coping mechanisms for stress or another mental health issue. We can put all of the mechanisms on a bingo board and play MHM BINGO. To play, students require a copy of each sheet and a pencil, and each Bingo worksheet will contain a list of positive coping mechanisms that are related to maintaining good mental health. Students can check off each box as the teacher/facilitator explains and the class as a whole discusses each coping mechanism, its benefits, and other implications. It’s easy for students to play, and just as easy for teachers or parents to join in! You can find a summary and some complimentary resources & information about our Chapter here. 
Here is a podcast episode for you where we talk about SEL with Expert Psychologist Nabila Afroz from Asian University for Women: https://pressbooks.pub/app/uploads/sites/2449/2022/09/SEL_074255.m4a Earn A Badge: You can get a micro-credential after finishing this chapter. All you have to do is write a review of this chapter and answer the review questions. Our Strong Schools team will issue you a badge through Badgr. Email sawdah.hoque@gmail.com if you want to earn a badge for this chapter. Key Terms: Trauma: the response to a deeply distressing or disturbing event that overwhelms an individual’s ability to cope, causes feelings of helplessness, and diminishes their sense of self and their ability to feel a full range of emotions and experiences. Conflict: mental struggle resulting from incompatible or opposing needs, drives, wishes, or external or internal demands. Conflict resolution: the methods and processes involved in facilitating the peaceful ending of conflict and retribution. SEL (Social Emotional Learning): an educational method that aims to foster social and emotional skills within school curricula. Mediation: intervention in a dispute in order to resolve it. Maslow’s model: Maslow’s hierarchy of needs is a theory by Abraham Maslow, which puts forward that people are motivated by five basic categories of needs: physiological, safety, love, esteem, and self-actualization. Classroom Management: the wide variety of skills and techniques that teachers use to keep students organized, orderly, focused, attentive, on task, and academically productive during a class. Reference: Collaborative for Academic, Social, and Emotional Learning. (2019). What is SEL? Retrieved from https://casel.org/what-is-sel Committee for Children (2020). Building a foundation of success. 
Retrieved from https://www.cfchildren.org/what-is-social-emotional-learning/schools/ Banks, A. (2022). 6 benefits of social and emotional learning in the classroom. Insights to Behavior. Retrieved October 25, 2022, from https://insightstobehavior.com/blog/6-benefits-social-emotional-learning-classroom/ DeVoogd, K., Lane-Garon, P., & Kralowec, C. A. (2016). Direct instruction and guided practice matter in conflict resolution and social-emotional learning. Conflict Resolution Quarterly, 33, 279–296. https://doi.org/10.1002/crq.21156 Dewey, J. (1897). My pedagogic creed. The School Journal, 54(3), 77–80. Durlak, J. A., Weissberg, R. P., Dymnicki, A. B., Taylor, R. D., & Schellinger, K. B. (2011). The impact of enhancing students’ social and emotional learning: A meta-analysis of school-based universal interventions. Child Development, 82(1), 405–432. https://doi.org/10.1111/j.1467-8624.2010.01564.x Lawrence, C., Sajni, G., Pierre, H., Alison, R., Bruce, A., & Dora, M. (2017). Health and Wellbeing. Research Gate, 236–254. MLSD. (2022). Social-emotional learning. [Infographic]. Social-Emotional Learning – Mental Wellness – Medical Lake School District. Retrieved January 1, 2023, from https://www.mlsd.org/apps/pages/index.jsp?uREC_ID=1726968&type=d&pREC_ID=1897652 Paterson, J. (2021). Three principles for using SEL in the classroom. NEA. Retrieved October 24, 2022, from https://www.nea.org/advocating-for-change/new-from-nea/three-principles-using-sel-classroom Roeser, R. W., Eccles, J. S., & Sameroff, A. J. (2000). School as a context of early adolescents’ academic and social-emotional development: A summary of research findings. The Elementary School Journal, 100(5), 443–471. Slade, M. (2010). Mental illness and well-being: The central importance of positive psychology and recovery approaches. BMC Health Services Research, 1–14. Venet, A. S. (2014, September 14). 8 Ways to Support Students Who Experience Trauma. 
Retrieved from edutopia: https://www.edutopia.org/discussion/8-ways-support-students-who-experience-trauma Wheeler, K. (2021, June 17). Well-being. Retrieved from HappiFul: https://happiful. Wilton, J. (2020). Trauma, challenging behaviour and restrictive interventions in schools. Centre for Mental Health, 1–24. About the Authors. Name: Alizeh Sultan. Institution: Asian University for Women. My name is Alizeh Sultan. I am an Afghan girl who was born in 1996 in Maidan Wardak, a province of Afghanistan. I earned my bachelor’s degree in journalism at Kabul University. I have some experience in journalism. During my time at university, I did some work-study. At Nai, I worked as a reporter, announcer, and program manager. I worked as an interviewer for PHC (Pearl Horizon Consulting), interviewing some powerful women in Afghanistan. I was also a member of IWA (Integrity Watch of Afghanistan) for about two years. I observed teachers’ teaching methods, the school environment, the curriculum, and students’ situations at school. During my service, I enjoyed being a part of Afghanistan’s educational system. Name: Sawdah Hoque. Sawdah Rubai Bente Hoque is currently a graduate student at Asian University for Women (AUW) pursuing an MA in Education. She completed her bachelor’s degree at AUW, majoring in Environmental Science, in 2021. Born in Chittagong, Bangladesh, Sawdah wishes to work on reforming the education system of Bangladesh, raising awareness about mental health, and creating an inclusive curriculum that ensures students’ social-emotional well-being. Name: Khurshid Arif. Institution: AUW. Khurshid Arif is originally from Ghazni province, Afghanistan, but she grew up and studied in Mazar-e-Sharif, Balkh province. She graduated from Balkh University, Faculty of Mining and Environment Engineering, Department of Petroleum Engineering. 
During her studies she participated in several leadership programs, such as AWDP (Afghanistan Workforce Development Program) and WLD (Women’s Leadership Development), in which she learned various methods to improve her leadership skills. After her graduation, since she was one of the top students at the university, she started work as an assistant professor at Balkh University. This lasted only about one year, and she lost her job because of COVID-19. She then started a new job at Afghan-Turk Maarif Girls’ High School as a mathematics teacher. She worked there for one year, then applied to the MA program in Bangladesh, was selected, and is now pursuing her master’s degree there. These experiences taught her a lot, and she is very happy about them. For instance, she learned how to interact with her students, how to learn from them, how to be flexible, and how to deal with conflicts while teaching. License: Strong Schools Copyright © by Alizeh Sultan; Sawdah Hoque; and Khurshid Arif. All Rights Reserved.

      Dear Alizeh, Sawdah and Khurshid, I read your chapter carefully twice, and I hope I have provided constructive feedback. In general, while I was reading, I noticed the great effort you put into writing this chapter. Stay successful and talented.

    1. Author Response

      Reviewer #1 (Public Review):

      This is a carefully-conducted fMRI study looking at how neural representations in the hippocampus, entorhinal cortex, and ventromedial prefrontal cortex change as a function of local and global spatial learning. Collectively, the results from the study provide valuable additional constraints on our understanding of representational change in the medial temporal lobes and spatial learning. The most notable finding is that representational similarity in the hippocampus post-local-learning (but prior to any global navigation trials) predicts the efficiency of subsequent global navigation.

      Strengths:

      The paper has several strengths. It uses a clever two-phase paradigm that makes it possible to track how participants learn local structure as well as how they piece together global structure based on exposure to local environments. Using this paradigm, the authors show that - after local learning - hippocampal representations of landmarks that appeared within the same local environment show differentiation (i.e., neural similarity is higher for more distant landmarks) but landmarks that appeared in different local environments show the opposite pattern of results (i.e., neural similarity is lower for more distant landmarks); after participants have the opportunity to navigate globally, the latter finding goes away (i.e., neural similarity for landmarks that occurred in different local environments is no longer influenced by the distance between landmarks). Lastly, the authors show that the degree of hippocampal sensitivity to global distance after local-only learning (but before participants have the opportunity to navigate globally) negatively predicts subsequent global navigation efficiency. Taken together, these results meaningfully extend the space of data that can be used to constrain theories of MTL contributions to spatial learning.

      We appreciate Dr. Norman’s generous feedback here along with his other insightful comments. Please see below for a point-by-point response. We note that responses to a number of Dr. Norman’s points were surfaced by the Editor as Essential revisions; as such, in a number of instances in the point-by-point below we direct Dr. Norman to our responses above under the Essential revisions section.

      Weaknesses:

      General comment 1: The study has an exploratory feel, in the sense that - for the most part - the authors do not set forth specific predictions or hypotheses regarding the results they expected to obtain. When hypotheses are listed, they are phrased in a general way (e.g., "We hypothesized that we would find evidence for both integration and differentiation emerging at the same time points across learning, as participants build local and global representations of the virtual environment", and "We hypothesized that there would be a change in EC and hippocampal pattern similarity for items located on the same track vs. items located on different tracks" - this does not specify what the change will be and whether the change is expected to be different for EC vs. hippocampus). I should emphasize that this is not, unto itself, a weakness of the study, and it appears that the authors have corrected for multiple comparisons (encompassing the range of outcomes explored) throughout the paper. However, at times it was unclear what "denominator" was being used for the multiple comparisons corrections (i.e., what was the full space of analysis options that was being corrected for) - it would be helpful if the authors could specify this more concretely, throughout the paper.

      We appreciate this guidance and the importance of these points. We have taken a number of steps to clarify our hypotheses: we now distinguish a priori predictions from exploratory analyses, and we explicitly indicate throughout the manuscript how we corrected for multiple comparisons. For full details, please see above for our response to Essential Revisions General comment #1.

      General comment 2: Some of the analyses featured prominently in the paper (e.g., interactions between context and scan in EC) did not pass multiple comparisons correction. I think it's fine to include these results in the paper, but it should be made clear whenever they are mentioned that the results were not significant after multiple comparisons correction (e.g., in the discussion, the authors say "learning restructures representations in the hippocampus and in the EC", but in that sentence, they don't mention that the EC results fail to pass multiple comparisons correction).

      Thank you for encouraging greater clarity here. As noted directly above, we now explicitly indicate our a priori predictions, we state explicitly which results survive multiple comparisons correction, and we added necessary caveats for effects that should be interpreted with caution.

      General comment 3: The authors describe the "flat" pattern across the distance 2, 3, and 4 conditions in Figure 4c (post-global navigation) and in Figure 5b (in the "more efficient" group) as indicating integration. However, this flat pattern across 2, 3, and 4 (unto itself) could simply indicate that the region is insensitive to location - is there some other evidence that the authors could bring to bear on the claim that this truly reflects integration? Relatedly, in the discussion, the authors say "the data suggest that, prior to Global Navigation, LEs had integrated only the nearest landmarks located on different tracks (link distance 2)" - what is the basis for this claim? Considered on its own, the fact that similarity was high for link distance 2 does not indicate that integration took place. If the authors cannot get more direct evidence for integration, it might be useful for them to hedge a bit more in how they interpret the results (the finding is still very interesting, regardless of its cause).

      Based on the outcomes of additional behavioral and neural analyses that were helpfully suggested by reviewers, we revised discussion of this aspect of the data. Please see our response above under Essential Revisions General comment #4 for full details of the changes made to the manuscript.

      Reviewer #2 (Public Review):

      This paper presents evidence of neural pattern differentiation (using representational similarity analysis) following extensive experience navigating in virtual reality, building up from individual tracks to an overall environment. The question of how neural patterns are reorganized following novel experiences and learning to integrate across them is a timely and interesting one. The task is carefully designed and the analytic setup is well-motivated. The experimental approach provides a characterization of the development of neural representations with learning across time. The behavioral analyses provide helpful insight into the participants' learning. However, there were some aspects of the conceptual setup and the analyses that I found somewhat difficult to follow. It would also be helpful to provide clearer links between specific predictions and theories of hippocampal function.

      We appreciate the Reviewer’s careful read of our manuscript and their thoughtful guidance for improvement, which we believe strengthened the revised product. We note that responses to a number of the Reviewer’s points were surfaced by the Editor as Essential revisions; as such, in a number of instances in the point-by-point below we direct the Reviewer to our responses above under the Essential revisions section.

      General comment 1: The motivation in the Introduction builds on the assumption that global representations are dependent on local ones. However, I was not completely sure about the specific predictions or assumptions regarding integration vs. differentiation and their time course in the present experimental design. What would pattern similarity consistent with 'early evidence of global map learning' (p. 7) look like? Fig. 1D was somewhat difficult to understand. The 'state space' representation is only shown in Figure 1 while all subsequent analyses are averaged pairwise correlations. It would be helpful to spell out predictions as they relate to the similarity between same-route vs. different-route neural patterns.

      We appreciate this feedback. An increase in pattern similarity across features that span tracks would indicate the linking of those features together. ‘Early evidence’ here describes the point in experience where participants had traversed local (within-track) paths but had yet to traverse across tracks.

      Figure 1D seeks to communicate the high-level conceptual point about how similarity (abstractly represented as state-space distance) may change in one of two directions as a function of experience.
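
      To make the two directions of change concrete, the following is a minimal sketch in Python, not the analysis code used in the study. It assumes each item is represented by a short list of voxel values measured before and after learning; the item names, track labels, and numbers are hypothetical toy data. It averages pairwise Pearson correlations separately for same-track and different-track pairs and reports the post-minus-pre change, so that a decrease reads as differentiation and an increase as integration.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length activity patterns."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def similarity_by_context(patterns, track_of):
    """Mean pairwise similarity for same-track and different-track item pairs."""
    sims = {"same": [], "different": []}
    for a, b in combinations(sorted(patterns), 2):
        key = "same" if track_of[a] == track_of[b] else "different"
        sims[key].append(pearson(patterns[a], patterns[b]))
    return {k: mean(v) for k, v in sims.items() if v}


def similarity_change(pre, post, track_of):
    """Post-minus-pre change in mean similarity, per context."""
    pre_s = similarity_by_context(pre, track_of)
    post_s = similarity_by_context(post, track_of)
    return {k: post_s[k] - pre_s[k] for k in post_s}


# Hypothetical toy data: two items on track 1, one item on track 2.
track_of = {"A1": 1, "A2": 1, "B1": 2}
pre = {"A1": [1, 2, 3, 4], "A2": [2, 4, 6, 8], "B1": [4, 3, 2, 1]}
post = {"A1": [1, 2, 3, 4], "A2": [4, 3, 2, 1], "B1": [1, 2, 3, 4]}

change = similarity_change(pre, post, track_of)
# Negative "same" change = within-track differentiation;
# positive "different" change = across-track integration.
print(change)
```

      In this toy example the same-track similarity drops while the different-track similarity rises, which is the combination of local differentiation and global integration that the schematic is meant to convey.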

      General comment 2: The shared landmarks could be used by the participants to infer how the three tracks connected even before they were able to cross between them. It is possible that the more efficient navigators used an explicit encoding strategy to help them build a global map of the world. While I understand the authors' reasoning for excluding the shared landmarks (p. 13), it seems like it could be useful to run an analysis including them as well - one possibility is that they act as 'anchors' and drive the similarity between different tracks early on; another is that they act as 'boundaries' and repel the representations across routes. Assuming that participants crossed over at these landmarks, these seem like particularly salient aspects of the environment.

      We agree that these shared landmarks play an important role in learning the global environment and guiding participants’ navigation. However, they also add confounding elements to the analyses; mainly, shared landmarks are located near multiple goal locations and associated with multiple tracks, and transition probabilities differ at shared landmarks because they have an increased number of neighboring landmarks and fractals. In the initial submission, shared landmarks were included in all analyses except (a) global distance models and (b) context models (which compare items located on the same vs different tracks).

      With respect to (a) the global distance models, we ran these models while including shared landmarks and the results did not differ (see figure below and compare to Fig. 5 in the revised manuscript):

      Distance representations in the Global Environment, with shared landmarks included. These data can be compared to Figure 5 of the revised manuscript, which does not include shared landmarks (see page 5 of this response letter).

      We continue to report the results from models excluding shared landmarks due to the confounding factors described above, with the following addition to the Results section:

      “We excluded shared landmarks from this model as they are common to multiple tracks; however, the results do not differ if these landmarks are included in the analysis.”

      With respect to (b) the context analyses (which compare items located on the same vs different tracks), we cannot include shared landmarks in these analyses because they are common amongst multiple tracks and thus confound the analyses. Finally, we are unable to conduct additional analyses investigating shared landmarks specifically (for example, examining how similarity between shared landmarks evolves across learning) due to very low trial counts. We share the Reviewer’s perspective that the role of shared landmarks during the building of map representations promises to provide additional insights and believe this is a promising question for future investigation.

      General comment 3: What were the predictions regarding the fractals vs. landmarks (p. 13)? It makes sense to compare like-to-like, but since both were included in the models it would be helpful to provide predictions regarding their similarity patterns.

      We are grateful for the feedback on how to improve the consistency of results reporting. In the revision, we updated the relevant sections of the manuscript to include results from fractals. Please see our above response to Essential Revisions General comment #5 for additions made to the text.

      General comment 4: The median split into less-efficient and more-efficient groups does not seem to be anticipated in the Introduction and results in a small-N group comparison. Instead, as the authors have a wealth of within-individual data, it might be helpful to model single-trial navigation data in relation to pairwise similarity values for each given pair of landmarks in a mixed-effects model. While there won't be a simple one-to-one mapping and fMRI data are noisy, this approach would afford higher statistical power due to more within-individual observations and would avoid splitting the sample into small subgroups.

      We appreciate this very helpful suggestion. Following this guidance, we removed the median-split analysis and ran a mixed-effects model relating trial-wise navigation data (at the beginning of the Global Navigation Task) to pairwise similarity values for each given pair of landmarks and fractals (Post Local Navigation). We also altered our approach to the across-participant analysis examining brain-behavior relationships. Please see our above response to Essential Revisions General comment #3 for additions to the revised manuscript.
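
      As an illustration of the kind of trial-wise brain-behavior relationship described here, the sketch below uses a two-stage summary-statistics approach in plain Python: an ordinary least-squares slope is fitted per participant, and the slopes are then tested against zero. This is a simplified stand-in for a mixed-effects model, not the model fitted in the manuscript, and the participant labels, similarity values, and path-inefficiency values are hypothetical toy numbers that do not reflect the reported results.

```python
from math import sqrt
from statistics import mean, stdev


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def group_slope_test(trials_by_participant):
    """Per-participant slopes and a one-sample t statistic against zero."""
    slopes = [ols_slope(sim, ineff) for sim, ineff in trials_by_participant.values()]
    t_stat = mean(slopes) / (stdev(slopes) / sqrt(len(slopes)))
    return slopes, t_stat


# Hypothetical toy data: for each participant, pairwise pattern similarity
# (first list) and path inefficiency on the matching trials (second list).
toy_trials = {
    "p1": ([0.0, 0.1, 0.2], [0.5, 0.4, 0.3]),
    "p2": ([0.0, 0.1, 0.2], [0.6, 0.4, 0.2]),
    "p3": ([0.0, 0.1, 0.2], [0.9, 0.6, 0.3]),
}

slopes, t_stat = group_slope_test(toy_trials)
print(slopes, t_stat)
```

      A full mixed-effects model would estimate participant-level and trial-level variance jointly rather than in two stages; the two-stage version is shown only because it keeps the within-participant logic visible without extra dependencies.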

      General comment 5: If I understood correctly, comparing Fig. 4B and Fig. 5B suggests that the relationship between higher link distance and lower representational similarity was driven by less efficient navigators. The performance on average improved over time to more or less the same level as within-track (Fig. 2). Were less efficient navigators particularly inefficient on trials with longer distances? In the context of models of hippocampal function, this suggests that good navigators represented all locations as equidistant while poorer navigators showed representations more consistent with a map - locations that were further apart were more distant in their representational patterns. Perhaps more fine-grained analyses linking neural patterns to behavior would be helpful here.

      Following the above guidance, we removed the median-split analyses when exploring across-participant brain-behavior relationships (see Essential Revisions General comment #3), replacing them with a mixed-effects model analysis, and we revised our discussion of the across-track link distance effects (see Essential Revisions General comment #4). For this reason, we ultimately decided against conducting the proposed fine-grained analyses on the median-split data.

      General comment 6: I'm not completely sure how to interpret the functional connectivity analysis between the vmPFC and the hippocampus vs. visual cortex (Fig. 6). The analysis shows that the hippocampus and visual cortex are generally more connected than the vmPFC and visual cortex - but this relationship does not show an experience-dependent relationship and is consistent with resting-state data where the hippocampus tends to cluster into the posterior DMN network.

      We expected to see an experience-dependent relationship between vmPFC and hippocampal pattern similarity, and agree that these findings are difficult to interpret. Based on comments from several reviewers, we removed the second-order similarity analysis from the manuscript in favor of an analysis which models the relationship between vmPFC pattern similarity and hippocampal pattern similarity. Moreover, given the exploratory nature of the vmPFC analyses, and following guidance from Reviewer 1 about the visual cortex control analyses, both were moved to the Appendix. Please see our above response to Essential Revisions General comment #7 for further details of the changes made to the manuscript.

      Reviewer #3 (Public Review):

      Fernandez et al. report results from a multi-day fMRI experiment in which participants learned to locate fractal stimuli along three oval-shaped tracks. The results suggest the concurrent emergence of a local, differentiated within-track representation and a global, integrated cross-track representation. More specifically, the authors report decreases in pattern similarity for stimuli encountered on the same track in the entorhinal cortex and hippocampus relative to a pre-task baseline scan. Intriguingly, following navigation on the individual tracks, but prior to global navigation requiring track-switching, pattern similarity in the hippocampus correlated with link distances between landmark stimuli. This effect was only observed in participants who navigated less efficiently in the global navigation task and was absent after global navigation.

      Overall, the study is of high quality in my view and addresses relevant questions regarding the differentiation and integration of memories and the formation of so-called cognitive maps. The results reported by the authors are interesting and are based upon a well-designed experiment and thorough data analysis using appropriate techniques. A more detailed assessment of strengths and weaknesses can be found below.

      Strengths

      1) The authors address an interesting question at the intersection of memory differentiation and integration. The study is further relevant for researchers interested in the question of how we form cognitive maps of space.

      2) The study is well-designed. In particular, the pre-learning baseline scan and the random-order presentation of stimuli during MR scanning allow the authors to track the emergence of representations in a well-controlled fashion. Further, the authors include an adequate control region and report direct comparisons of their effects against the patterns observed in this control region.

      3) The manuscript is well-written. The introduction provides a good overview of the research field and the discussion does a good job of summarizing the findings of the present study and positioning them in the literature.

      We thank Dr. Bellmund for his positive evaluation of the manuscript. We greatly appreciate the insightful feedback, which we believe strengthened the manuscript’s clarity and potential impact. We note that responses to a number of Dr. Bellmund’s points were surfaced by the Editor as Essential revisions; as such, in a number of instances in the point-by-point below we direct the Reviewer to our responses above under the Essential revisions section.

      Weaknesses

      General comment 1: Despite these distinct strengths, the present study also has some weaknesses. On the behavioral level, I am wondering about the use of path inefficiency as a metric for global navigation performance. Because it is quantified based on the local response, it conflates the contributions of local and global errors.

      We appreciate this point with respect to path inefficiency during global navigation. As noted below, following Dr. Bellmund’s further insightful guidance, we now complement the path inefficiency analyses with additional metrics of across-track (global) navigation performance, which effectively separate local from global errors (please see below response to Author recommendation #1).

      General comment 2: For the distance-based analysis in the hippocampus, the authors choose to only analyze landmark images and do not include fractal stimuli. There seems to be little reason to expect that distances between the fractal stimuli, on which the memory task was based, would be represented differently relative to distances between the landmarks.

      We are grateful for the feedback on how to improve the consistency of results reporting. In the revision, we updated the relevant sections of the manuscript to include results from fractals. Please see our above response to Essential Revisions General comment #5 for full details.

      General comment 3: Related to the aforementioned analysis, I am wondering why the authors chose the link distance between landmarks as their distance metric for the analysis and why they limit their analysis to pairs of stimuli with distance 1 or 2 and do not include pairs separated by the highest possible distance (3).

      We appreciate the request for clarification here. Beginning with the latter question, we note that the highest possible distance varies between within-track vs. across-track paths. If participants navigate in the Local Navigation Task using the shortest or most efficient path, the highest possible within-track link distance between two stimuli is 2. For this reason, the Local Navigation/within-track analysis includes link distances of 1 and 2. For the Global Navigation analysis, we also include pairs of stimuli with link distances of 3 and 4 when examining across-track landmarks.

      Regarding the use of link distance as the distance metric, we note that the path distance (a.u.) varies only slightly between pairs of stimuli with the same link distance. As such, a categorical treatment of link distance accounts for the vast majority of the variance in path distance and thus is a suitable approach. Please note that in the new trial-level brain-behavior analysis included in the revised manuscript (which replaces the median-split analysis), we used the length of the optimal path.
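As an illustration only (this is not the authors' code), link distance on a closed track can be sketched as the number of links along the shorter way around the loop. The number of positions per track (`n_positions = 4`) and the integer position labels are assumptions made for the example; they are chosen so that the largest within-track link distance is 2, as stated in the response.

```python
def link_distance(i: int, j: int, n_positions: int = 4) -> int:
    """Shortest number of links between positions i and j on a closed loop.

    n_positions is a hypothetical track size used only for illustration.
    """
    d = abs(i - j) % n_positions
    # On a loop, the shorter of the two directions is the efficient path.
    return min(d, n_positions - d)


# All within-track pairs on the hypothetical 4-position loop.
within_track = {
    (i, j): link_distance(i, j)
    for i in range(4)
    for j in range(i + 1, 4)
}
max_within_track = max(within_track.values())
```

Under these assumptions every pair of distinct positions has a link distance of 1 or 2, which is why a within-track analysis would contain only those two categories.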

      General comment 4: Surprisingly, the authors report that across-track distances can be observed in the hippocampus after local navigation, but that this effect cannot be detected after global, cross-track navigation. Relatedly, the cross-track distance effect was detected only in the half of participants that performed relatively badly in the cross-track navigation task. In the results and discussion, the authors suggest that the effect of cross-track distances cannot be detected because participants formed a "more fully integrated global map". I do not find this a convincing explanation for why the effect the authors are testing would be absent after global navigation and for why the effect was only present in those participants who navigated less efficiently.

      We appreciate Dr. Bellmund’s input here, which was shared by other reviewers. We revised and clarified the Discussion based on reviewer comments. Please see our above response to Essential Revisions General comment #4 for full details.

      General comment 5: The authors report differences in the hippocampal representational similarity between participants who navigated along inefficient vs. efficient paths. These are based on a median split of the sample, resulting in a comparison of groups including 11 and 10 individuals, respectively. The median split (see e.g. MacCallum et al., Psychological Methods, 2002) and the low sample size mandate cautionary interpretation of the resulting findings about interindividual differences.

      We appreciate the feedback we received from multiple reviewers with respect to the median-split brain-behavior analysis. We replaced the median-split analysis with the following: 1) a mixed-effects model predicting neural pattern similarity Post Local Navigation, with a continuous metric of task performance (each participant’s median path inefficiency for across-track trials in the first four test runs of Global Navigation) and link distance as predictors; and 2) a mixed-effects model relating trial-wise navigation data to pairwise similarity values for each given pair of landmarks and fractals (as suggested by Reviewer 2). Please see our above response to Essential Revisions General comment #3 for additions to the revised manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      This study used GWAS and RNAseq data of TCGA to show a link between telomere length and lung cancer. Authors identified novel susceptibility loci that are associated with lung adenocarcinoma risk. They showed that longer telomeres were associated with being a female nonsmoker and early-stage cancer with a signature of cell proliferation, genome stability, and telomerase activity.

      Major comments:

      1) It is not clear how are the signatures captured by PC2 specific for lung adenocarcinoma compared to other lung subtypes. In other words, why is the association between long telomeres specific to lung adenocarcinoma?

      We thank the reviewer for raising this point (similarly mentioned by reviewer #2). Indeed, it is unclear why genetically predicted LTL appears more relevant to lung adenocarcinoma. We have used a LASSO approach to select important features of PC2 in lung adenocarcinoma and inferred PC2 in lung squamous cell carcinoma tumours to better explore the differences between histological subtypes. The new results are presented in Figure 5, as well as being described in the methods and results sections. In addition, we have expanded upon this point in the discussion with the following paragraph (page 11, lines 229-248):

      ‘An explanation for why long LTL was associated with increased risk of lung cancer might be that individuals with longer telomeres have lower rates of telomere attrition compared to individuals with shorter telomeres. Given a very large population of histologically normal cells, even a very small difference in telomere attrition would change the probability that a given cell is able to escape the telomere-mediated cell death pathways (24). Such inter-individual differences could suffice to explain the modest lung cancer risk observed in our MR analyses. However, it is not clear why longer TL would be more relevant to lung adenocarcinoma compared to other lung cancer subtypes. A suggestion may come from our observation that longer LTL is related to genomically stable lung tumours (such as lung adenocarcinomas in never smokers and tumours with lower proliferation rates) but not genomically unstable lung tumours (such as heavy-smoking-related, highly proliferating lung squamous carcinomas). One possible hypothesis is that histologically normal cells exposed to highly genotoxic compounds, such as tobacco smoke, might require an intrinsic activation of telomere length maintenance at early steps of carcinogenesis that would allow them to survive, and therefore genetic differences in telomere length are less relevant in these cells. By contrast, in more genomically stable lung tumours, where the TL attrition rate is more modest, the hypothesis related to differences in TL may be more relevant, potentially explaining the heterogeneity in genetic effects between lung tumours (Figure 2). Alternatively, we note that the cell of origin may also differ: lung adenocarcinoma is postulated to be mostly derived from alveolar type 2 cells, whereas squamous cell carcinoma is derived from bronchiolar epithelium cells (19), possibly suggesting that LTL might be more relevant to the former.’

      2) The manuscript is lacking specific comparisons of gene expression changes across lung cancer subtypes for identified genes such as telomerase etc since all the data is presented as associations embedded within PCs.

      The genes associated with telomere maintenance, such as TERT and TERC, are expressed at very low levels in these tumours (Barthel et al NG 2017). In this context, no sample has more than 5 normalised read counts by RNA-sequencing for TERT within TCGA lung cohorts (TCGA-LUSC, TCGA-LUAD). As such, we have not explored the differences in individual telomere-related genes. Nevertheless, we have explored an inferred telomerase activity gene signature, developed by Barthel et al, in the context of lung adenocarcinoma tumours. We have added a note in the results section to inform the reader why we did not directly test TERT/TERC expression (page 9, lines 184-187).

      3) It is not clear how novel are the findings given that most of these observations have been made previously i.e. the genetic component of the association between telomere length and cancer.

      Others, including ourselves, have studied TL and lung cancer. We have built on that work using the most up-to-date TL genetic instrument and the largest lung cancer study available. In addition, we provided insights into the possible mechanisms by which telomere length might affect lung adenocarcinoma development. Using colocalisation analyses, we reported novel shared genetic loci between telomere length and lung adenocarcinoma (MPHOSPH6, PRPF6, and POLI), genes/loci that have not previously been linked to lung adenocarcinoma susceptibility. For the MPHOSPH6 locus, we showed that the risk allele of rs2303262 (a missense variant annotated for the MPHOSPH6 gene) colocalised with increased lung adenocarcinoma risk, lower lung function (FEV1 and FVC), and increased MPHOSPH6 gene expression in lung, as highlighted in the discussion section of the revised manuscript.

      In addition, we have used a PRS analysis to identify a gene expression component associated with genetically predicted telomere length in lung adenocarcinoma but not in the squamous cell carcinoma subtype. The aspects of this gene expression component associated with longer telomere length are also associated with molecular characteristics related to genome stability (lower accumulation of DNA damage, fewer copy number alterations, and lower proliferation rates), being female, early-stage tumours, and never smokers, which is an interesting but not completely understood lung cancer stratum. As far as we are aware, this is the first time an association has been reported between a PRS related to an etiological factor, such as telomere length, and a particular expression component in the tumour.

      We have adjusted the discussion section of the revised manuscript to further highlight the novel aspects.

      Reviewer #2 (Public Review):

      The manuscript of Penha et al performs genetic correlation, Mendelian randomization (MR), and colocalization studies to determine the role of genetically determined leukocyte telomere length (LTL) and susceptibility to lung cancer. They develop an instrument from the most recent published association of LTL (Codd et al), which here is based on n=144 genetic variants, and the largest association study of lung cancer (including ~29K cases and ~56K controls). They observed no significant genetic correlation between LTL and lung cancer, in MR they observed a strong association that persisted after accounting for smoking status. They performed colocalization to identify a subset of loci where LTL and lung cancer risk coincided, mainly around TERT but also other loci. They also utilized RNA-Seq data from TCGA lung cancer adenocarcinoma, noting that a particular gene expression profile (identified by a PC analysis) seemed to correlate with LTL. This expression component was associated with some additional patient characteristics, genome stability, and telomerase activity.

      In general, most of the MR analysis was performed reasonably (with some suggestions and comments below), it seems that most of this has been performed, and the major observations were made in previous work. That said, the instrument is better powered and some sub-analyses are performed, so adds further robustness to this observation. While perhaps beyond the scope here, the mechanism of why longer LTL is associated with (lung) cancer seems like one of the key observations and mechanistically interesting but nothing is added to the discussion on this point to clarify or refute previous speculations listed in the discussion mentioned here (or in other work they cite).

      Some broad comments:

      1) The observations that lung adenocarcinoma carries the lion's share of risk from LTL (relative to other cancer subtypes) could be interesting but is not particularly highlighted. This could potentially be explored or discussed in more detail. Are there specific aspects of the biology of the substrata that could explain this (or lead to testable hypotheses?)

      We thank the reviewer for these comments. A similar point was raised by reviewer #1. Please see our response above, as well as the additional analysis described in Figure 5 that considers the differences by histological subtype.

      2) Given that LTL is genetically correlated (and MR evidence suggests also possibly causal evidence in some cases) across a range of traits (e.g., adiposity) that may also associate with lung cancer, a larger genetic correlation analysis might be in order, followed by a larger set of multivariable MR (MVMR) beyond smoking as a risk factor. Basically, can the observed relationship be explained by another trait (beyond smoking)? For example, there is previous MR literature on adiposity measures, for example (BMI, WHR, or WHRadjBMI) and telomere length, plus literature on adiposity with lung cancer; furthermore, smoking with BMI. A bit more comprehensive set of MVMR analyses within this space would elevate the significance and interpretation compared to previous literature.

      Indeed, there are important effects related to BMI and lung cancer (Zhou et al., 2021. Doi:10.1002/ijc.33292; Mariosa et al., 2022. Doi: 10.1093/jnci/djac061). We have tested the potential for influence on our finding using MVMR, modelling LTL and BMI using a BMI genetic instrument of 755 SNPs obtained from UKBB (feature code: ukb-b-19953). This multivariable approach did not result in any meaningful changes in the associations between LTL and lung cancer risk.

      3) In the initial LTL paper, the authors constructed an IV for MR analyses, which appears different than what the authors selected here. For example, Codd et al. proposed an n=130 SNP instrument from their n=193 sentinel variants, after filtering for LD (n=193 >>> n=147) and then for multi-trait association (n=147 >> n=130). I don't think this will fundamentally change the author's result, but the authors may want to confirm robustness to slightly different instrument selection procedures or explain why they favor their approach over the previous one.

      We appreciate the reviewer’s suggestion. Our study is designed for a Mendelian Randomization framework and we chose to be conservative in the construction of our instrumental variable (IV). We therefore applied more stringent filters to the LTL variants relative to Codd et al’s approach. We applied a wider LD window (10MB vs. 1MB) centered around the LTL variants that were significant at genome-wide level (p<5e-08) and we restricted our analyses to biallelic common SNPs (MAF>1% and r2<0.01 in the European population from 1000 Genomes). Nevertheless, the LTL genetic instrument based on our study (144 LTL variants) is highly correlated with the PRS based on the 130 variants described by Codd et al. (correlation estimate=0.78, p<2.2e-16). The MR analyses based on the 130-variant LTL instrument described by Codd et al showed similar results to our study.
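The filtering steps described in this response resemble a greedy LD-clumping procedure, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Variant` record, the `clump` function, and the pairwise `r2` lookup table are all hypothetical, and the window is simplified to a maximum base-pair distance between two variants on the same chromosome. Only the thresholds (p<5e-08, MAF>1%, r2<0.01, 10 Mb) are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Hypothetical record for one GWAS hit.
    rsid: str
    chrom: str
    pos: int
    pval: float
    maf: float


def clump(variants, r2, p_max=5e-8, maf_min=0.01, r2_max=0.01,
          window_bp=10_000_000):
    """Greedy clumping: keep the most significant variant, then drop any
    later candidate that lies within window_bp of a kept variant and has
    pairwise r2 >= r2_max with it.

    r2 is a dict mapping frozenset({rsid_a, rsid_b}) to an LD estimate;
    pairs missing from the dict are treated as unlinked (r2 = 0).
    """
    candidates = sorted(
        (v for v in variants if v.pval < p_max and v.maf > maf_min),
        key=lambda v: v.pval,
    )
    kept = []
    for v in candidates:
        in_ld = any(
            k.chrom == v.chrom
            and abs(k.pos - v.pos) <= window_bp
            and r2.get(frozenset((k.rsid, v.rsid)), 0.0) >= r2_max
            for k in kept
        )
        if not in_ld:
            kept.append(v)
    return kept


# Toy, made-up variants used only to exercise the function.
toy = [
    Variant("rsA", "5", 1_000_000, 1e-20, 0.30),
    Variant("rsB", "5", 1_500_000, 1e-10, 0.25),  # in LD with rsA
    Variant("rsC", "3", 2_000_000, 1e-9, 0.10),
    Variant("rsD", "3", 9_000_000, 1e-3, 0.40),   # not genome-wide significant
    Variant("rsE", "7", 4_000_000, 1e-12, 0.005), # too rare
]
toy_r2 = {frozenset(("rsA", "rsB")): 0.5}
instrument = [v.rsid for v in clump(toy, toy_r2)]
```

A wider window or a lower r2 threshold makes the procedure more conservative, since more candidates are compared against each retained variant and more are dropped.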

      4) Colocalization analysis suggests that a /subset/ of LTL signals map onto lung cancer signals. Does this mean that the MR relationships are driven entirely by this small subset, or is there evidence (polygenic) from other loci? Rather than do a "leave one out" the authors could stratify their instrument into "coloc +ve / coloc -ve" and redo the MR analyses.

      Mainly here, the goal is to interpret if the subset of signals at the top (looks like n=14, the bump of non-trivial PP4 > 0.6, say) which map predominantly to TERT, TERC, and OBFC1 explain the observed effect here. I.e., it is biology around these specific mechanisms or generally LTL (polygenicity) but exemplified by extreme examples (TERT, etc.). I appreciate that statistical power is a consideration to keep in mind with interpretation.

      We appreciate the reviewer’s comment and, indeed, we considered this idea. However, the analytical approach used the lung cancer GWAS to identify variants that colocalise. To validate the hypothesis that a subset of colocalised variants would be driving all the MR associations, we would need an independent lung cancer case-control study to act as an out-of-sample validation set. This is not available to us at this point. Nevertheless, we slightly re-worded the discussion to highlight that the colocalised loci tend to be near genes related to telomere length biology, and we are also exploring the colocalisation approach to select variants for PRS analysis elsewhere.
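For readers unfamiliar with the stratified analysis the reviewer proposes, the idea can be sketched with the standard fixed-effect inverse-variance weighted (IVW) estimator applied separately to colocalising and non-colocalising variants. This is a hedged illustration, not an analysis from the manuscript: the function names, the dictionary keys, and the PP4 cut-off of 0.6 (the value the reviewer floats) are assumptions, and the sketch does not address the authors' point that selecting variants in the same lung cancer GWAS would call for an independent validation sample.

```python
def ivw_estimate(instruments):
    """Fixed-effect IVW estimate from (beta_exposure, beta_outcome, se_outcome)
    triples. Returns (estimate, standard_error), or None if there are no
    instruments."""
    instruments = list(instruments)
    if not instruments:
        return None
    # Weighted regression of outcome effects on exposure effects through
    # the origin, with weights 1 / se_outcome^2.
    num = sum(bx * by / se ** 2 for bx, by, se in instruments)
    den = sum(bx ** 2 / se ** 2 for bx, by, se in instruments)
    return num / den, (1.0 / den) ** 0.5


def stratified_ivw(variants, pp4_threshold=0.6):
    """Split variants by colocalisation posterior (PP4) and run IVW per stratum.

    Each variant is a dict with hypothetical keys 'bx', 'by', 'se', 'pp4'.
    """
    pos = [(v["bx"], v["by"], v["se"]) for v in variants
           if v["pp4"] > pp4_threshold]
    neg = [(v["bx"], v["by"], v["se"]) for v in variants
           if v["pp4"] <= pp4_threshold]
    return {"coloc_pos": ivw_estimate(pos), "coloc_neg": ivw_estimate(neg)}
```

Comparing the two stratum estimates would indicate whether the overall association is carried by a few loci or is spread across the instrument, subject to the selection caveat above.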

    1. Tis but fortune, all is fortune. Maria once told me she did affect me, and I have heard herself come thus near, that should she fancy, it should be one of my complexion. Besides, she uses me with a more exalted respect than anyone else that follows her. What should I think on ’t?

      Malvolio, despite being a character who should not be trifled with, falls for a poorly planned prank, first and foremost due to his desire for power (which is evident in his verbal juggling), which he believes will only come to him through his marriage to Olivia. Additionally, because Shakespeare wanted to mock the Puritan tendencies of the time, he purposefully created Malvolio to fall for this prank in this scene. A typical set of issues is brought up by the practical joke played on Malvolio, including identity instability, the significance of clothing in establishing one's identity and place, and the illusions and delusions we allow ourselves to fall into in the name of love. Malvolio succumbs to the seduction of romance just like everyone else, including Orsino and Viola. He is as romantic as anyone, despite his outward puritanism, although his idea of marrying Olivia is motivated more by social aspiration than by love. Malvolio's self-delusion is caused by his desire to surpass his class, which also helps to explain why Sir Toby and the others find his fantasy so absurd. Malvolio is not a good match for Olivia due to both his undesirable demeanour and the fact that he lacks aristocratic blood. He is an ordinary person, whereas Olivia is a lady. They find it offensive that Malvolio would envisage himself and Olivia getting married. We may remember how intrigued Olivia is, in an earlier scene, when she learns from young Cesario, on whom she has a crush, that he is a "gentleman", meaning that he is of noble birth. A noblewoman marrying a lower-class man would have been quite unusual in the class system of Shakespeare's day.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      All the conclusions are based on solid evidence and convincing, and the methodology are in detail to follow or repeat. The writing of the manuscript is logical and easy to follow.

      We thank the reviewer for these comments.

      1. The mutation experiments indicated that nkd enhanced the phenotype of scr, but there is no leaf phenotype variation in nkd mutants; this is somewhat difficult to understand, and it would be much better if the authors could give more explanation in the discussion.

      We have added more discussion on this point. One possibility is that collectively the four genes function redundantly; however, due to the transcriptional negative feedback loop discovered here (Figure 3B), SCR expression is enhanced when NKD genes are mutated, and hence phenotypic perturbations are less likely to be observed than when SCR genes are mutated.

      2. The word green millet in the first paragraph should be changed to green foxtail. Millet means domesticated small cereal grains, such as foxtail millet, finger millet, proso millet etc.

      We thank the reviewer for this feedback and have made the suggested change.

      Reviewer #1 (Significance (Required)):

      The manuscript, titled "Mutations in NAKED-ENDOSPERM IDD genes reveal functional interactions with SCARECROW and a maternal influence on leaf patterning in C4 grasses" by Hughes et al., is the first to report that SCR regulates both inner leaf patterning and epidermal stomatal patterning in the C4 model plant green foxtail. The functional difference of this gene in Setaria from that in maize and rice indicates that the regulation of inner leaf cell patterning by SCR is not a characteristic of C4 species; this gives insight into the complexity of C4 leaf cell patterning. In addition to this important discovery, the authors found that mutations in NKD IDD genes enhance loss-of-function scr phenotypes in the leaves of C4 maize and Setaria but not in C3 rice, indicating that NKD IDD genes are involved in leaf cell patterning in C4 species but not in C3 species. They also identified a maternal effect on cell-type patterning in maize leaves that are initiated during embryogenesis.

      We thank the reviewer for their kind comments and suggestions.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The leaf anatomy that distinguishes C4 from C3 plants has been known for decades, with veins in C4 plants separated by 1 to 3 (generally 2) mesophyll cells whereas those in C3 plants are considerably farther apart. This anatomical pattern appears to be critical for the function of the C4 pathway, which under some environmental conditions is a more efficient way to fix carbon than the C3 pathway. Despite the obvious importance of close vein spacing, the genetic mechanisms that control it have been surprisingly difficult to untangle. The statement on the bottom on p. 2 ("To date, very few regulators of cell-patterning in inner leaf tissues have been identified...") is an understatement. The paper by Hughes et al. offers a major step in uncovering the basis of C4 vein spacing.

      We thank the reviewer for this feedback and agree that this work represents a major step forward in understanding C4 vein spacing.

      The authors build on their previous work in Scarecrow-like proteins in maize and rice. In maize, SCR controls patterning of the mesophyll, whereas in rice it controls development of stomata. This paper pursues the possibility that the differences in SCR roles may have to do with interacting proteins. Based on work in Arabidopsis the authors focus on proteins with an indeterminate domain (IDD) and specifically on the NAKED ENDOSPERM genes.

      The paper presents an analysis of an impressive set of mutants in three species. A major step in this paper is the comparison among three species of grasses - maize, rice, and Setaria - rather than the more common two species, usually maize and rice. Maize and rice differ in photosynthetic pathway but they also differ in many other traits that reflect the ca. 50 million years since their last common ancestor. Setaria is, like maize, C4 and the two species are more closely related to each other than either is to rice, although they represent two independent acquisitions of C4. This paper shows that SCR orthologs control stomatal patterning in both rice and Setaria implying that the stomatal function of SCR may be ancestral in the grasses and also is not directly connected to photosynthetic pathway.

      The availability of allelic combinations of SCR and NKD in maize in particular permits the inference of possible maternal effect on the vein spacing phenotype, although exactly how this happens remains unclear.

      The discussion provides a careful and logical assessment of the state of knowledge on SCR and IDD proteins in general, and the new data on SCR and NKD in particular. Many questions remain unresolved, and many additional experiments could be suggested. However, the power of the genetics and the phenotypic analysis together provide a novel direction for research on vein spacing. I will refrain in this review from suggesting what additional information would be nice to have since I think a review should assess the quality of the paper as it stands, not as it could be with months more of work.

      My only really substantive suggestion is that the micrographs of the Setaria leaves need to be improved. Specifically, in Figure 6E it is hard to see the details of the fused veins. Either the section is too thick or the camera was not focused properly. Because this image in particular is central to the entire paper I would recommend aiming for the clarity of the images of Zea cross sections, which are fine.

      We thank the reviewer for this suggestion. Obtaining leaf cross section micrographs from the Setaria scr1;scr2;nkd mutants was extremely challenging as the growth phenotype is so severe (Figure 5), meaning that the available leaves are small and extremely fragile. Multiple attempts to fix and section leaves using a microtome failed, with leaves consistently collapsing. In our hands, Setaria is not as amenable to fresh vibratome sectioning as maize, and this, combined with the additional challenges of handling the tiny triple mutant leaves, means that the resultant images are not of the same quality as the maize figures. We have included a supplemental figure (Figure S8) with additional examples of fused veins identified in our screening.

      Very minor point: p. 3 - "double Zmscr1;Zmscr1h mutants" - what does the "h" in Zmscr1h refer to?

      In this context h refers to this gene being a homeologous gene duplicate, as first explained in Hughes et al. (2019). We have included an explanation in the revision.

      Reviewer #2 (Significance (Required)):

      Strengths of the paper are 1) the inclusion of three species to help determine which aspects of the gene function may be ascribed to C4; 2) thoughtful and comprehensive genetic analysis; 3) careful sections of leaves; 4) outlines of the limitations of the approach. Limitations (several of which the authors acknowledge in the Discussion) include a general lack of molecular genetic data (protein interactions, DNA binding sites, RNA-seq, etc.). While this information would be great to have, I think the strength of the genetics is such that the paper will be foundational for future work in any case. The one bit of additional data that would be ideal would be information bearing on the two mechanistic hypotheses laid out on p. 10. The model that SCR and NKD promote cell division and specify mesophyll identity is the opposite of the model that SCR and NKD inhibit vein formation. An experiment that helped point the reader toward one or the other of these models would be very valuable.

      We agree that an experiment that could distinguish these possibilities would be extremely valuable, and will undoubtedly be the subject of future experimentation.

      The paper fills a critical gap. Little to nothing is known about how the internal anatomy of leaves is patterned and the data presented provide evidence that SCR and NKD are two important players. The paper also provides a conceptual advance in offering a couple of genes and some plausible mechanisms of how they might function.

      The audience will be primarily developmental geneticists and physiologists. The paper addresses an important problem that is of broad interest to developmental biologists and is potentially important for global agriculture.

      We thank the reviewer for their kind comments and suggestions.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript of Hughes et al. aimed to demonstrate the functional interactions between Naked-Endosperm IDD genes and the transcription factor SCARECROW and a maternal effect on leaf patterning in C4 grasses. To this end, the authors conducted a greenhouse and labor experiment to create mutants of related genes and assess the expression of these genes through qRT-PCR combined with fluorescence microscopic images in Rice, Maize, and Setaria. They found an increase in the proportion of fused veins with no intervening mesophyll cells in scr;nkd mutants in C4 species (Maize and Setaria) but not in C3 species (rice). In the end, they revealed a maternal effect of derived NKD on patterning cells in leaf primordia during embryogenesis.

      Major comments - Optional: the authors should have conducted a whole transcriptome experiment through RNA-seq on the mutants as compared to the controls to check if these genes were significantly up-regulated, followed by qRT-PCR for validation. By doing so, the authors should be able to get a broad overview of all key players involved in leaf patterning.

      We agree with the reviewer that it would be useful to have this data, and such an approach will undoubtedly inform future research.

      • Optional: although the authors may evoke the statistical significance of observing fused veins in mutants sr;nkd, the presence of fused veins in one mutant Svscr1;Svscr2 and Zmscr1-m2;Zmscr1h-m1 may contradict the claim that the authors made regarding the association between scr and nkd. Moreover, the sampling size is not also large enough to draw a substantial conclusion.

      We disagree with the reviewer that our sampling size is not large enough to draw a substantial conclusion. In maize we surveyed 11 quadruple mutants and 588 veins. Although this phenotype is occasionally seen in Zmscr1;Zmscr1h mutants, it is far more penetrant in Zmscr1;Zmscr1h;Zmnkd1;Zmnkd2 quadruple mutants and easily distinguished by eye when viewing each mutant, the statistical analysis only serves to make this point. In Setaria we agree that the differences are less stark, and the sampling size is necessarily lower due to the challenges of working with the triple mutant leaves which are extremely small and fragile (far more so than the maize quadruple mutant leaves). We have already included discussion as to why the phenotype may be less penetrant in setaria. Together we think that the fact the direction of the phenotype matches that of maize is convincing evidence that the increase in fused veins is a real consequence of combining the scr and nkd mutations.

      • There are two copies of nkd in maize but only one copy in rice and Setaria. Does the presence of two copies in maize have any evolutionary or functional meaning? Does the presence or absence of one or two copies have any effect on leaf patterning? It would be interesting to discuss this in the discussion section.

      We thank the reviewer for this comment and have added discussion of this in the manuscript. This situation is common in maize, which underwent a more recent whole genome duplication since its divergence from rice and setaria. Most of these gene-pairs function redundantly, however, there is evidence of functional divergence in terms of expression in some gene-pairs. We have added a sentence in the results explaining why we think the presence of two NKD gene copies in maize is unlikely to have functional significance in this case.

      • The methods section is not easy to read for a non-specialized audience. I suggest providing an explanation of the abbreviations used to describe mutants.

      We thank the reviewer for this suggestion and have made the suggested change.

      • For the results section, you should provide a table summarizing the differences between mutants and controls regarding the leaf structure.

      We have added such a table at the end of the results section and referred to it in the discussion.

      Minor comments: - "Zmscr1-m2;Zmscr1h-m1 seed were" seeds instead

      We have made the suggested change.

      • "Loss of NKD gene function enhances SCR mutant phenotypes in maize and setaria" This section is confusing because several perturbations were observed in triple mutants of Setaria and quadruple mutants of Maize as compared to their double mutants (Svscr1;Svscr2 and Zmscr1;Zmscr1h). You should rewrite this subtitle for clarity.

      We have changed this subtitle to read “In maize and setaria, but not in rice, nkd loss of function mutations enhance scr mutant phenotypes”.

      • "The accumulation of transcripts in the ground meristem cells" How do you estimate the accumulation of transcripts? What do you mean by the accumulation of transcripts? What do you consider transcripts?

      We use this term as opposed to ‘gene expression in the ground meristem cells’ because we do not know whether the presence/absence/level of detectable RNA is regulated by transcriptional or post-transcriptional mechanisms.

      Reviewer #3 (Significance (Required)):

      The manuscript of Hughes et al. is very interesting in the context of C4 photosynthesis research because there are many transcription factor candidates involved in the development of C4 leaf anatomy but few of them have been validated. However, a whole comparative transcriptome of mutants and controls should provide a broad overview and probably new insight into key players involved in leaf patterning.

      We agree with the reviewer that this would be of great interest, but we feel it is beyond the scope of this study and will be a productive avenue of future research.

      This study goes far beyond the simple validation by outlining the potential interactions between transcription factors. The authors made a substantial effort by combining gene expression results with visual data that strengthen the quality of this manuscript. Therefore, this manuscript is very interesting for the C4 research communities and for the field of developmental biology.

      We thank the reviewer for their kind comments and suggestions.

      A plant biologist working on the evolution and regulation of morphological characters using transcriptomics and genomics.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1

      Major comments: The main conclusions of this work are that promoters of the different classes of genes display differing usage of GTFs and cofactors to promote transcription and likely recruit polymerase by different mechanisms. The in vivo experiments using factor depletion offer strong evidence that certain factors including TBP/TRF2 are differentially required for transcription at the housekeeping/developmental gene classes. The in-depth analysis of different promoter types combined with the genetic approaches outlined above provide compelling mechanistic insights into promoter-specific engagement of regulatory factors. In general, the data supports the authors' suggestions.

      One important shortcoming of these experiments is in the in-vitro DNA binding analysis of GTFs at differing core promoter contexts. The lack of GTFs binding to the housekeeping promoters may be a reflection of low intrinsic transcription activity. If the housekeeping promoters don't assemble active transcription complexes in this in vitro system but the developmentally-regulated promoters do, then a simple comparison of proteins bound to each promoter type is potentially misleading as to the factors required for transcription. For example, results of the in-vivo analysis suggest that the +1 nucleosome is an important factor in the positioning of the transcription start site at housekeeping promoters, therefore the use of chromatinized templates rather than naked DNA would likely better reflect the intrinsic binding properties of factors at promoters.

      We thank the reviewer for highlighting that the in vivo experiments constitute strong evidence for the differential requirements of certain factors at different promoter types and that our work provides compelling mechanistic insights into promoter-specific engagement of regulatory factors. We are also grateful to the reviewer for pointing out that we had not sufficiently clearly explained the aim and rationale of the initial in vitro DNA binding analyses (Figures 1 & 2). These were not meant to assess different factor requirements but to assess whether short core-promoter DNA is sufficient to recruit transcription-related proteins, as had been reported for TATA promoters, and whether different core-promoter types differ in this ability. We therefore based the in vitro DNA binding assays on the fact that 121bp-short TATA core-promoter DNA is able to recruit and assemble the PIC even in the absence of activators, i.e. when the core promoters are transcriptionally inactive, and assayed all other core-promoter types under identical conditions. Interestingly, while the TATA core promoters enrich for canonical PIC components as expected, housekeeping promoter DNA does not, suggesting that the core-promoter DNA fragments’ abilities to recruit and assemble the PIC differ.

      We agree with the reviewer that one could possibly find conditions in which the different promoter types are active in vitro, e.g. by providing activators or chromatinized templates, and we hope that our explanations above clarify why this has not been the goal of these analyses. As the reviewer pointed out, we assay functional requirements of various TFs and GTFs in vivo in the remainder of the manuscript. We revised the manuscript to clarify the aim and scope of these sections (pages 4-9) and are grateful to the reviewer for allowing a discussion of this topic as an alternative (see below), many thanks.

      One way to address this issue is to test transcription activity of the promoters used in the mass spec analysis. After incubation of promoters with extract, add NTPs and quantitate the basal transcription activity of each type of promoter. If they are the ~same - great. If not, at a minimum, the authors need to acknowledge this as a limitation of the study. The suggested transcription experiment is a simple extension of the work already completed.

      As outlined above, we deliberately assay all core promoter types under identical conditions, such that differences in protein binding reflect the different DNA fragments’ distinct functional properties. Please also note that while all core-promoter fragments are transcriptionally inactive, they can be activated by input from a strong enhancer (please see Supplementary Figure 2C; housekeeping and developmental core promoters can be induced to comparable levels, and thus weaker binding of GTFs to housekeeping promoters is not a reflection of weaker inducibility or activity).

      We note that all statements and claims are strictly in line with what we tested, namely the core promoter DNA’s ability to recruit transcription-related proteins in vitro. However, we agree with the reviewer that the notion that the core promoters are assayed under identical conditions but are not active is important and discuss it in the main text (pages 8 – 9) and the ‘limitations of this study’ section.

      The authors suggest from the depletion experiments of TBP/TRF2 that the factors are functionally redundant since the level of transcription for target genes recovers after prolonged depletion, though there is not specific functional evidence to support this claim. A suggested experiment to test the functional redundancy of TBP/TRF2 at subsets of genes is to assess the levels of proteins and/or protein binding to promoters after factor depletion. For instance, is there a global upregulation/stabilization of TBP after TRF2 depletion? Or is there an increase in TBP binding at promoters? These can be addressed by western blot for overall protein levels and ChIP-seq or similar method to assess binding to promoters, which are fairly straightforward experiments given that the cells lines have already been produced.

      We thank the reviewer for suggesting potential compensatory mechanisms regarding the redundancy of TBP and TRF2 at a subset of tested promoters. To address the question regarding the stability of TBP or TRF2 in the absence of one or the other, we have performed label-free quantitative mass spectrometry on the TRF2-AID cell line and examined TBP levels (Supplementary Figure 4E). We do not see a stabilization of TBP upon the depletion of TRF2 with auxin. The apparent functional redundancy (e.g. Fig. 4J) thus indeed suggests that there might be increased TBP binding. Unfortunately, we are not able to directly test this experimentally due to a lack of resources. We now add a discussion of the potential compensatory mechanisms to the main text (page 14), many thanks.

      A discussion would be appreciated on the generality of the suggested mechanism in metazoans. For example, is DREF conserved only in insects but could other eukaryotes use a similar mechanism at housekeeping genes?

      We agree that some of the specific TFs don’t have one-to-one orthologs outside insects, yet that other prominent features of Drosophila housekeeping promoters are shared more widely. We now discuss the parallel between dispersed patterns of initiation at different promoter types across species, including Drosophila housekeeping and vertebrate CpG island promoters. We also provide an outlook towards future functional, biochemical and structural studies that might reveal more diverse transcription initiation mechanisms at the different promoter types in our genomes (pages 23-24).

      Minor comments: The manuscript is very difficult to read. One major problem is the large number of figures - many of which are not essential for understanding the results. I strongly suggest that the authors think carefully about which figures to include in the manuscript and keep only the most important.

      We agree that the manuscript is complex with six main figures and several different approaches, including biochemistry and mass spectrometry but also genomics and bioinformatics. In addition, the manuscript includes in vitro tests of DNA-protein binding and in vivo assays to probe functional requirement (by depletion) and sufficiency (by recruitment). These different assays assess different properties and complement and validate each other, which is why we feel they are required. We hope that the clarification of the different aspects and their purpose makes the manuscript more easily accessible, many thanks.

      Second, the legends on many of the graphs are very tiny and difficult to read.

      We have revised the figures to improve font size and readability of the figures, many thanks.

      Third, it would greatly help readability if the main figures and legends were imbedded in the manuscript and if the supplemental figures + legends were in a separate document.

      We have now included the main figures and legends into the manuscript, thanks.

      Fig 4E: very difficult to understand what was done.

      We now add further explanations to the figure legend to describe the different promoter groups compared in the analysis of ChIP-seq coverage of TBP and TRF2.

      Fig 4A vs G: why are ~ the same number of genes affected by TRF2 vs TBP + TRF2 depletion? I got the impression from the text that there should be a large difference in the number of affected genes.

      We had the same prior expectation, but indeed observed a similar number of downregulated genes upon TRF2 depletion versus TBP and TRF2 double depletion. This may partly be technical, e.g. relating to clonal selection of the different AID-cell lines or thresholding effects, but is likely explained by the relatively few TBP dependent genes (200) that don’t contribute substantially to the larger group of TRF2 dependent genes (3826). The observed number 3935 is 98% of the sum, even ignoring potential overlap. We now clarified this in the text.

      Fig 5A and similar figures: include the number of affected genes in the figure.

      We added the number to the figure, thanks.

      Fig S2C: hard to understand what was done from the legend.

      We have added additional explanations to the figure legend, thanks.

      Fig S2F and similar figures: hard to distinguish the legend and the green colors used.

      Proofreading: Add citation for Cut&run in the methods.

      We did not analyze CUT&RUN data; however, the ATAC-seq and ChIP-seq data sets are cited.

      In supplemental Fig1a, the percentage of "INR only" is greater than 100%.

      We thank the reviewer and fixed the typo.

      Supplemental Fig 1a legend-should 170,000 protein coding genes read "17,000"? Santana et al. reference on pg 8 should read 2022.

      We thank the reviewer and fixed the typos.

      Readability: The categorizations of genes classes based on core promoter elements is somewhat unclear-from 1a, is it the case that all TATA contain INRs? A different way of representing the data to capture overlaps in motifs other than a pie chart may better convey these motif relationships. Work could be done to increase clarity in general on the promoter motif subtypes used and how mutually exclusive these elements are in the tested subsets.

      We thank the reviewer for the suggestion. We have added a heatmap in Supplementary Figure 1A showing the percent match score to motif PWMs across Drosophila promoters. As the reviewer suspects, most developmental core promoters have a high-scoring INR motif and some have an additional TATA box or DPE motif. We have also revised the remainder of the text and rewritten the methods section regarding the motif analysis (pages 36 to 38) to improve clarity. Many thanks.

      Figure 5: authors state "all protein coding genes" are downregulated with TFIIA depletion, though it appears some transcripts are unchanged or upregulated in 5B/C. Suggest change in language.

      We thank the reviewer for this comment. Less than 70 genes are not downregulated upon TFIIA depletion, and manual inspection shows that these genes include intronic non-coding RNAs such as tRNAs that hinder accurate PRO-seq quantification. However, we agree with the reviewer and revised the text to reflect that essentially all promoters are downregulated, affecting all promoter types.

      A discussion on the developmental context of the S2 cell line seems appropriate. If S2 cells represent a late stage developmental cell line, would the authors expect the relative utilization of cofactors to be the same/different in other cellular contexts?

      We thank the reviewer for this comment. We indeed expect the relative utilization of cofactors to be the same in most cellular contexts and have now added a discussion with relevant references (page 23), many thanks.

      Reviewer #2

      1. The DNA affinity purification method is excellent as a discovery method, but it has some potential caveats. One is that it cannot capture remodeling events that could potentially remove otherwise stably bound factors to allow for transient PIC assembly and gene activation. It is possible that some of the insulator factors such as BEAF-32 and Ibf1/2, which selectively bind housekeeping sequences, could prevent or reduce binding by PIC factors. This could occur if BEAF-32 and/or Ibf1/2 inhibit PIC assembly if bound to DNA and if these factors bind housekeeping promoters with high affinity and slow off-rates. That is, in live cells, a competition could exist between binding of these enriched housekeeping factors and PIC assembly. By contrast, this caveat is not relevant at developmental promoters due at least in part to low/sub-nM TBP binding affinity. Ultimately, this is a minor concern but the authors should address in the article to inform readers about potential limitations of the experiments.

      We thank the reviewer for highlighting that DNA affinity purification is an excellent discovery method and for pointing out important differences between such in vitro assays and the in vivo situation. We agree and interpret our results from the DNA affinity purification carefully and specifically regarding differences observed for different types of core promoters under identical experimental conditions. We now highlight these differences more clearly throughout the relevant sections on pages 4-8 and expand the discussion of this issue in the ‘limitations of the study’ section. Many thanks.

      1. More information about how the PRO-seq spike-ins were implemented is recommended. For example, were they fit to a linear regression of read counts/chromosome between all samples, or did they take all hg19 reads as raw fold-change of all samples compared to a control replicate?

      We thank the reviewer for addressing the insufficient information provided about the spike-ins used for PRO-seq. We have added this information to the materials and methods section: We calculated the ratio of spiked-in reads representing the percentage of reads mapping to the human genome over all reads. This ratio was used to determine a scaling factor representing the fold-change of total transcriptional output between the auxin-treated sample and the control samples.
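
      The spike-in normalization described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names and the read counts are hypothetical, and the scaling factor is assumed here to be the control spike-in fraction divided by the treated spike-in fraction (if total transcriptional output drops, the constant spike-in takes up a larger share of the reads).

```python
def spike_in_fraction(human_reads: int, total_reads: int) -> float:
    """Fraction of all mapped reads that map to the human (spike-in) genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return human_reads / total_reads


def scaling_factor(control_fraction: float, treated_fraction: float) -> float:
    """Approximate fold-change of total transcriptional output (treated vs control).

    Assumption (not stated in the rebuttal): the same amount of spike-in was
    added to every sample, so a higher spike-in fraction means lower output.
    """
    if treated_fraction <= 0:
        raise ValueError("treated_fraction must be positive")
    return control_fraction / treated_fraction


# Hypothetical read counts, for illustration only.
control = spike_in_fraction(human_reads=1_000_000, total_reads=20_000_000)
treated = spike_in_fraction(human_reads=2_000_000, total_reads=20_000_000)
factor = scaling_factor(control, treated)
print(f"control={control:.3f} treated={treated:.3f} scaling factor={factor:.2f}")
```

      In this toy case the spike-in share doubles after treatment, so the auxin-treated signal would be scaled by about one half relative to the control.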

      1. Figure S1C should be cited (not S1B) to support the statement "Mutating either the TATA box or DRE motifs reduced TBP or DREF binding to control levels..."

      We thank the reviewer for this correction and implemented the correct panel citation.

      The authors could note that TATA box mutants still show slight enrichment for TBP compared to negative controls.

      We now note this in the figure legend and explain that it is consistent with TBP binding to non-TATA-box developmental core promoters (Figure 2 B & E).

      In Figure 2A, it would help to remind readers here that TATA, DPE, INR = developmental and TCT, Ohler1/6, DRE = housekeeping.

      We thank the reviewer for this suggestion and have implemented it.

      Figure S2A shows only 121bp and 350bp DRE core promoters but the text refers to 450bp and 1000bp sequences as well. Can the authors show representative results from these longer sequences?

      We thank the reviewer for pointing out these inconsistencies, which we now fixed by revisions to the text and supplementary figures.

      1. In comparing data in Fig 2B and 2E, it seems the statement "the ChIP signals reflected the differential binding preferences observed in vitro for the respective promoter subtypes" should be modified. It is true to an extent but it is more nuanced than indicated by the text.

      We have reworded the section and now discuss the observed trends for GTFs and TFs.

      In Fig S2I, Ohler1 + Ohler6 and TCT are difficult to distinguish because of color scheme choice.

      We agree and now explain in the figure legend that the brighter green corresponds to the Ohler1/6 promoters and the darker green to the TCT promoters. We have additionally edited the legend for better color visibility, many thanks.

      In Fig 3F, perhaps add that Gld has TATA and Fit2 has DRE?

      We now indicate the presence of TATA-box and DRE motifs in the figure, thanks.

      Fig 5D: legend is cut off in the Figure.

      We thank the reviewer for this comment and now fixed the cropped legend.

      11. Fig S2B needs more description and clarification in the main text and the legend.

      We now deleted Fig.S2B.

      12. Page 8, 2nd paragraph "avoiding potential" should be replaced with "minimizing" or similar.

      We thank the reviewer for this comment and have changed the word choice.

      13. Page 16, penultimate paragraph: "Essentially" should be replaced with "Essentiality"

      We thank the reviewer for this comment and correct the wording.

      Reviewer #3

      1. The authors perform a k-means clustering of PWM match scores within 17,000 promoter sequences. They describe in the Methods section that this data revealed 9 groups of promoters. However, although it is likely that several of these promoters contain matches for multiple core promoter motifs, the promoter classes are simply named DRE-promoters, TATA-promoters, TCT-promoters, etc., disregarding any combinatorial association. Furthermore, the clustering data is not visualized to support this naming. The authors should at least provide a heatmap showing the PWM match scores for these clusters and indicate which promoters were used. This is crucial for interpretation of results.

      We thank the reviewer for pointing out that the description of the motif analysis lacked clarity and that the clustering of Drosophila promoters should be visualized. We agree and now provide the k-means clustering heatmap of all 17118 protein coding gene promoters, visualizing the position-weight-matrix (PWM) match scores for the different promoter motifs in Supplementary Figure 1A. This visualization confirms the reviewer’s suspicion that core-promoter motifs often co-occur in the same core-promoter. For example, TATA promoters typically contain TATA-boxes and INR motifs, etc, which is now clearly seen in the newly provided heatmap. We have also revised the main text, figure legends and have rewritten the method section (pages 36-38) to clarify the analysis of motifs throughout the manuscript. Many thanks.
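
      To make the clustering step concrete, here is a toy k-means over PWM match scores. It is a sketch under stated assumptions, not the analysis in the manuscript: the six promoters, the three motif columns (TATA, INR, DRE), the scores, and the farthest-first seeding are all invented for illustration, whereas the manuscript reports 9 clusters over 17118 promoters.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(rows, k):
    """Deterministic seeding: start at row 0, then repeatedly add the row
    that is farthest from every centroid chosen so far."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        nxt = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(rows, k, max_iter=100):
    """Plain Lloyd's algorithm: assign rows to the nearest centroid, then
    recompute each centroid as the mean of its members, until stable."""
    centroids = farthest_first_init(rows, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(row, centroids[c])) for row in rows
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical PWM match scores per promoter; columns are TATA, INR, DRE.
scores = [
    [0.95, 0.90, 0.10],  # TATA + INR co-occur
    [0.90, 0.85, 0.05],
    [0.92, 0.88, 0.12],
    [0.10, 0.20, 0.95],  # DRE-dominated
    [0.05, 0.15, 0.90],
    [0.12, 0.18, 0.93],
]
labels, centroids = kmeans(scores, k=2)
print(labels)
```

      The cluster centroids double as a compact summary of which motifs co-occur in each group, which is the same information the heatmap is meant to convey.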

      2. Relatedly, this paper uses a seemingly over-simplified terminology to describe promoters as housekeeping or developmental. While this terminology has been used in several studies from the Stark lab, this is not well supported by data and the usage of this terminology will likely lead to confusion among readers. Here, housekeeping seems to refer solely to the presence of a motif match in the promoter sequence rather than to ubiquitous expression across cell types. Similarly, developmental promoters seem to refer to anything that is not housekeeping. Are S2 cells best reflecting the activity of developmental genes? What about genes that are not expressed as part of a specific developmental trajectory, but still cell-type restricted? Since focus here is on the behavior of promoters with respect to their core promoter elements, why not just refer to them according to their promoter elements? A good example where the developmental versus housekeeping distinction is not useful is the authors' desire to generalize differences observed in Figure 2B, in which it is quite obvious that there is no clear developmental versus housekeeping split. Rather the data demonstrate that TATA-containing and DRE-containing promoters behave differently.

      We thank the reviewer for raising a concern about the terminology of functionally distinct promoter types in Drosophila. The use of functionally distinct promoter types enriched in different motifs is built on extensive evidence by our lab and others (e.g. the Ohler or Kadonaga groups) that found extensive agreement between promoter sequence, promoter function, initiation pattern, gene annotation, and ubiquitous vs. cell-type-restricted activities. Ubiquitously active housekeeping promoters tend to contain the TCT, DRE and Ohler 1/6 motifs, while cell-type-restricted developmental promoters tend to contain TATA-box, DPE and INR motifs (Arnold & Zabidi, Nat Biotech 2017, Haberle et al. Nature 2019, Ngoc et al. Genetics 2019, Ohler et al. Genome Biol 2002, Ohtsuki et al. Genes & Dev 1998, Rach et al. Plos Genetics 2011).

      We find that the terminology is simple and thus accessible for the non-specialist reader. We agree with the reviewer that clarity is key and revise the introduction of the terminology to clarify that it is based on multiple lines of evidence. We also clarify that Figure 2B – in contrast to the reviewer’s claim – does support a clear developmental versus housekeeping split (please see the dendrogram on top of the heatmap). We now clarified this in the main text and legend to Figure 2B, many thanks.

      1. The authors state that the "prevalent model" in the community is that PIC assembly is the same at all promoters. This is not true. For instance, it is well established that certain core promoter elements have a strong positional effect on TSS selection, while dispersed promoters lack strong positional features. What is less known is how the dispersed pattern, e.g. of non-TATA promoters, arises. The authors should more clearly specify the unknowns and the novel findings of their paper.

      We agree with the reviewer that certain core promoter elements have strong positioning effects on TSS selection and that these occur in promoters with focused initiation patterns such as TATA promoters and developmental non-TATA promoters (e.g. promoters with INR and/or DPE motifs). We also agree that it is unclear how dispersed patterns at housekeeping promoters arise, especially because the initiation sites don’t co-occur with the TF motifs present in these promoters (e.g. DRE or M1BP motifs; see Figure 6A).

      However, the question we address goes beyond TSS selection: we have not seen any study of PIC recruitment and assembly at any promoter with a dispersed initiation pattern, and the idea of a single uniform Pol II PIC assembly has been the predominant view of transcription initiation during the past two decades (Schier & Taatjes, Genes & Dev 2020). Here, we provide evidence that protein recruitment and GTF usage differ between promoter types, which has mechanistic implications beyond TSS choice alone. In particular, we show that at least two modes of transcription initiation exist that differ between focused developmental and dispersed housekeeping promoters, whereby the developmental promoter DNA directly engages the Pol II PIC via TBP and TFIID, while the housekeeping promoter DNA does not and instead, housekeeping promoters recruit TFs, which recruit COFs and TFIIA. This is exciting and inconsistent with uniform GTF recruitment and assembly, and we hope that this work motivates the study of these different PIC assembly mechanisms at different promoter types.

      One of the major claims made by the authors in the paper is that PIC is recruited directly or indirectly depending on the presence of TATA or DRE. However, their approach seems to pick up a lot of indirect bindings, especially for TATA. This raises concerns of potential biases, which if addressed would strengthen the authors' claims. The results do not exclude that TFIIA is directly recruited to TATA but might simply reflect stronger binding to other factors compared to DRE. It is also puzzling that DRE is the only one selected for further validation as it appears to have the lowest affinity for PIC binding and the focus on Ohler1/6 motifs in the final model. Disclaimer, this reviewer is not an expert on DNA-affinity purification assays.

      We thank the reviewer for pointing out that we had not sufficiently clearly explained the DNA affinity purifications. They were performed under identical conditions for all promoter types, such that the differential binding to TATA vs DRE promoters reflects the respective promoter DNA’s affinity to various transcription-related proteins – they are key results of our work. Please note that, despite the high number of TATA interactions, many of these interactors are expected and reflect the binding of multi-subunit protein complexes such as the Mediator and TFIID (please see Figure 2B) and reflect the fact that we did not purify the PIC nor reconstitute it from purified components but determine nuclear proteins that bind to TATA-box promoter DNA. We now introduce and discuss these aspects more clearly.

      It is possible that the fewer interactors found for housekeeping promoters stem from lower affinity of the PIC, the lack of chromatin, or the stable binding of sequence-specific binders such as DREF, BEAF-32 and M1BP in our assay (please see our response to reviewer 2 above). As these result from identical experiments under identical conditions, the fewer interactors for housekeeping promoters are also an important result that likely reflects lower affinity or more transient binding. We now clarify these results and their interpretation in the main text and discuss differences between this assay and transcription in vivo in the “limitations of the study” paragraph.

      As the reviewer might appreciate, the follow-up experiments, including the creation of AID cell lines, PRO-seq, etc., are a lot of work such that we did them for promoters at the two extreme ends of the spectrum and their respective DNA-binding factors TBP and DREF identified in Figure 1. We think that these representatives sufficiently strongly demonstrate that PIC assembly and factor requirement are distinct for different promoter types, many thanks.

      Their final model is supported by results by Baumann et al (2018), which directly shows binding and interactions between M1BP, putzig, gfzf and TRF2. However, these factors bind to Ohler1, while most of the work within this study (Figures 1, 3) focused on DRE. How do DRE-containing promoters fit with the final model? Currently, these promoters are not even represented in the model figure.

      We thank the reviewer for pointing out that the final model highlights the Ohler 1 motif but omits the DRE motif. Based on the functional analyses shown in Figure 6 (pages 19-21), we think that the different motifs function equivalently in recruiting housekeeping cofactors and activating housekeeping transcription and have now included DRE motifs in the final cartoon. Our original choice was indeed based on the fact that previous reports from Baumann et al 2018 corroborate our findings for M1BP. As DRE promoters also recruit and depend on TRF2 (Hochheimer et al. Nature 2002), we now show a model by which housekeeping DRE promoters recruit a TRF2 containing PIC through TFIIA, but would like to stress that both likely function equivalently, leading to dispersed initiation. We also revised the data presentation and the final discussion regarding these promoters, many thanks.

      Minor comments

      1. The TSS patterns of promoters were evaluated using STAP-seq (in vitro data) and developmental CAGE data. For the purpose of the paper and to better match the in vitro DNA-affinity purification data, it would be more reasonable to make use of S2 cell CAGE data (e.g. Rennie et al, 2018 PMID: 29659982).

      We thank the reviewer for bringing up this point. For figure 6 we have used CAGE data from Drosophila embryos instead of S2 cells in order to capture a larger proportion of expressed developmental genes and their promoters, rather than just the ones that are expressed in S2 cells. As promoter motifs are found in stereotypical positions in relation to the TSS (Ohler et al. Genome Biol 2002) and because non-S2-cell core promoters can be activated in STAP-seq (Arnold 2017; Haberle 2019), our use of CAGE data from Drosophila embryos allows us to base all subsequent analyses on many more core promoters and also exclude any cell-type specific effects that may arise in TSS selection.

      Previous models on TSS selection within non-TATA promoters have highlighted the dinucleotide frequency of +1 nucleosomal DNA as a strong positional feature. Here, the authors investigate this model using a rather weak analytical approach. We know that nucleosomes can vary between cells (fuzzy positioning). Variability across promoters may cause larger variability in relative TSS positioning. Hence, what is observed here as a TSS spread relative to the +1 nucleosome positioning might in fact be caused by averaging. A more suitable approach would be to analyze the positional cross-correlation between TSS locations (e.g. revealed by CAGE reads) and nucleosomal positions (e.g. revealed by MNase-seq reads). This would better support claims regarding specific TSS positioning with respect to nucleosome positioning.

      We agree that the analysis of cross-correlation between TSS locations and nucleosomal positions at individual promoters would provide a more precise measure of TSS positioning relative to the nucleosome. We had originally chosen a visualization that more directly assesses whether the +1 nucleosome determines the TSSs by centering on the predicted +1 positions. In response to this comment, we have performed two additional analyses: a cross-correlation analysis on CAGE and MNase-seq read coverage in relation to the dominant CAGE TSS (new Supplementary Figure 6I) and a TSS-centric analysis of MNase-seq coverage (new Supplementary Figure 7). Both analyses agree with the original analysis and we thank the reviewer for pointing out how to strengthen this analysis.

      The cross-correlation analysis reveals a peak in the mean correlation score 125 base pairs downstream of housekeeping TSS (at TCT, Ohler1 and DRE) promoters but not downstream of developmental promoters (TATA-box, DPE and INR), in line with housekeeping TSS being positioned upstream of the +1 nucleosome.

      The analysis assessing +1 nucleosome positions as derived from MNase-seq coverage relative to the position of the dominant TSS reveals the expected phasing of downstream nucleosomes in housekeeping promoters but not at developmental promoters. Many thanks.
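The cross-correlation idea described in this response can be illustrated with a minimal sketch. It is not the authors' pipeline, which the rebuttal does not describe; the function name `lagged_correlation`, the synthetic coverage vectors and the 125 bp offset built into the toy data are assumptions chosen for illustration only. The sketch computes the Pearson correlation between a TSS signal (standing in for CAGE) and a nucleosome signal (standing in for MNase-seq) at each lag and reports the lag with the highest score.

```python
import numpy as np


def lagged_correlation(tss_signal, nuc_signal, max_lag):
    """Pearson correlation between a TSS signal and a nucleosome signal at each lag.

    A positive lag pairs each TSS position with the nucleosome signal
    `lag` base pairs downstream of it.
    """
    tss = np.asarray(tss_signal, dtype=float)
    nuc = np.asarray(nuc_signal, dtype=float)
    lags = np.arange(-max_lag, max_lag + 1)
    scores = np.empty(lags.size)
    for i, lag in enumerate(lags):
        if lag >= 0:
            a, b = tss[: tss.size - lag], nuc[lag:]
        else:
            a, b = tss[-lag:], nuc[: nuc.size + lag]
        # Pearson correlation is undefined for a constant vector; report 0 in that case.
        if a.std() == 0 or b.std() == 0:
            scores[i] = 0.0
        else:
            scores[i] = np.corrcoef(a, b)[0, 1]
    return lags, scores


# Synthetic example (hypothetical data): one dominant TSS at position 400 and
# a nucleosome-like coverage bump centred 125 bp downstream of it.
positions = np.arange(1000)
tss = np.zeros(positions.size)
tss[400] = 1.0
nuc = np.exp(-0.5 * ((positions - 525) / 20.0) ** 2)

lags, scores = lagged_correlation(tss, nuc, max_lag=200)
best_lag = int(lags[np.argmax(scores)])
print("lag with highest correlation:", best_lag)
```

With real data, the same calculation would be run per promoter and the scores averaged within each promoter class, so that a peak at a positive lag indicates nucleosome signal positioned downstream of the TSS.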

      It is interesting that tethering of housekeeping-associated coactivators leads to a higher positional dispersion compared to the result of developmental-associated coactivators. However, the positional TSS dispersion of housekeeping promoters seems to always be larger than that of developmental promoters regardless of coactivator recruitment. Can the authors explain these results?

      We agree that CAGE data typically show TSS dispersion at housekeeping promoters, yet this reflects the promoters’ transcriptionally active states during which endogenous TFs and coactivators are present. Our analyses are based on short, transcriptionally inactive core promoters that can be activated by cofactor recruitment, leading to the observed outcomes. We now clarify this in the manuscript and highlight that the differences in focused versus dispersed patterns occur even on the very same DNA sequences upon the recruitment of developmental or housekeeping activators (e.g. Fig. 6F).

      The authors seem to suggest that positional dispersion of TSSs within housekeeping promoters is due to stochastic initiation after non-positional specific PIC recruitment mediated via certain co-activators. If TSS selection is truly stochastic, why do these promoters then have dominant TSSs?

      We thank the reviewer for pointing out that our phrasing might have suggested that TSS selection was entirely random or stochastic, which is neither true for STAP-seq nor for endogenous CAGE data. In fact, not all positions have the same probability to initiate transcription, but certain positions or nucleotides seem to be inherently favored. We speculate that favorable positions relate to the local DNA structure, the energy barrier landscape for both DNA helix melting to occur and for the first phospho-diester bond to form (e.g. Dineen, D. et al. NAR 2009 and Vanaja, A. et al. ACS Publications 2022). We now added this discussion and the corresponding references to our manuscript (page 21).

      The authors find Chromator as a likely cofactor for indirect recruitment of TFIIA to housekeeping promoters. BEAF-32 is another factor the authors highlight as being enriched at housekeeping promoters (DRE promoters). Both of these factors have previously been considered insulator proteins or architectural proteins involved in the formation of chromatin folding (Ramirez et al, 2018, PMID: 29335486; Wang et al, 2018. PMID: 29335463). Could the authors comment on this link with their own findings?

      We thank the reviewer for addressing the importance of chromatin topology in the light of our findings, which we now discuss in the main text (pages 22-23).

      1. Can the authors justify PWM match thresholds used and why these were changed from Haberle et al 2019?

      We thank the reviewer for pointing out that these changes had not been justified. We adjusted them to be more stringent (e.g. DPE) or more sensitive (e.g. TATA-box) exclusively for the motif enrichment analysis, which we did outside the rule-based promoter-annotation effort. These adjusted thresholds reflect the motifs' vastly different information content, which is low for DPE and high for TATA-box motifs.

      Figure related comments/concerns:
      • General: Sometimes wrong ordering of figure panels with regards to their first mention in the main text, varying font sizes, and minimal figure legends that are often inconsistent (e.g. PRO-seq is sometimes specified when used, but not always)
      • Typo: Supp Fig 1: INR only 121.37%
      • Fig 1E not explained, what does x axis describe and how is it calculated?
      • Figure 2C-D: The CAGE signal is poorly visualized in panel C, it also poorly describes that this is supposedly done using a pool of promoters. Where is the 450bp blot (it seems plausible that the 450bp fragment could actually facilitate a luciferase signal in Fig S2-B)? How was this pool selected, is it exclusively based on DRE-containing promoters?
      • Fig 2D: apparent gel leakage and loading on the second panel is low. Preferably, provide positive control on the same gel.
      • Figure 4C: all classes are negatively affected by TRF2 depletion, thus enrichment (4B) makes little sense here
      • Figure 5C: Missing axis labels
      • Figure 6F: A y scale would help here

      We thank the reviewer for these recommendations and have implemented all of them.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this study, mice were exposed to a specific form of so-called Intermittent Fasting (IF) and the effects of IF on adult neogenesis in the hippocampus were determined. The specific IF protocol used had no effect on activation, proliferation, or maintenance of adult Neural Stem Cells (aNSCs) and displayed a decrease in number of new neurons in the neurogenic niche but only after 1 month of the IF protocol. These results contrast previously published results from multiple studies that concluded that IF promotes survival of new neurons and by extension promote adult neurogenesis. The unresponsiveness of aNSCs or their immediate cell progeny, the Intermediate Neural Progenitors (IPCs), to IF is a novel finding. The authors make several relevant points in the discussion about the publication bias towards positive results (or omission of negative results), which may reinforce established dogmas. However, the presented results did not convincingly demonstrate that the absence of effects of IF on aNSCs or adult neurogenesis is simply not a result of a specific IF paradigm, which is not robust enough to elicit changes in adult neurogenesis. In other words, there is a lack of positive controls and alternative protocols that would rule out that the observed absence of effects is not a consequence of type II error (the error of omission), or more colloquially, a consequence of false negatives.

      We thank the reviewer for acknowledging the importance and novelty of our findings. As to whether they result from a specific IF paradigm, we must point out that we used the same IF paradigm as in previous studies that had shown changes in neurogenesis upon IF. We do not claim that IF is unable to increase neurogenesis in all conditions, but report that IF is not a reliable method to increase adult neurogenesis (in particular, every-other-day intermittent fasting with food re-administration in the evening). We have repeated the experiment multiple times in different strains, always with enough animals to make our experiments conclusive, and we never observed an increase in adult neurogenesis, effectively ruling out that our results are a false negative. Of note, even if other protocols might indeed increase neurogenesis (which we never claimed they cannot), that would not make our results a false negative.

      Major Comments:

      1. Protocol-driven absence of effects: The absence of IF effects on aNSCs and IPCs observed in this study does not lend it the authority to conclude that aNSCs are resilient to IF or all IF paradigms and protocols. The absence of IF effects on aNSCs and neurogenesis could be specifically related to the chosen IF paradigm. Indeed, not all previous studies that observed IF-driven effects on adult neurogenesis used the same "night-time every-other-day fasting" protocol chosen in this study. For example, Brandhorst et al., 2015 (cited in this paper) used 4 days of IF 2x per month and observed an increase of DCX+BrdU+ cells. On the other hand, certain previous studies used the same or similar IF protocol used here, but often with longer duration or with a post-fasting ad libitum feeding period, which may be responsible for the pro-neurogenic or pro-survival effects. In fact, the authors acknowledge this in the discussion (page 7, lines 289-290 and 292-294). Why would the authors then not include similar feeding/IF paradigm in their study and determine if these would generate effects on survival of new neurons but also on aNSCs and/or IPCs?

      As just stated above, we never claimed that aNSCs are resilient to all IF paradigms. We refer to fasting in general in the introduction but quickly focus on every-other-day fasting throughout the paper and directly compare our results only to similar IF paradigms. We chose the most commonly used IF paradigm that had been shown to increase adult neurogenesis. As the reviewer points out, we speculate in the discussion that a refeeding period may explain the differences between our results and others. This is because a post-fasting ad libitum period was introduced in the study published in Dias et al. 2021. We are currently analysing a new experiment in which we replicate the IF protocol in that study, which we will include in our revised version.

      In addition, the authors acknowledge that the chosen IF paradigm may have affected the stress levels or behaviour of mice (page 9, lines 372-378). Why did they not test if their IF protocol does not increase stress or anxiety of mice by simple behaviour tests such as open field or elevated T maze?

      While testing all possible causes for the lack of positive results in our experiments is not viable, we do agree with the reviewer that stress levels might indeed influence the outcome of the experiments. We will collect blood from ad libitum-fed and fasted mice to analyse the levels of stress hormones (e.g. corticosterone). The results will be included in our revised version. These measurements will give us a more accurate reading of stress levels than behavioural tests. Of note, regardless of the outcome of this experiment, our conclusions will remain identical. We will not be able to compare stress levels with previous publications, as they were not tested. And if the protocol did increase stress levels, it would still argue that IF is not a reliable method to increase neurogenesis (as it presumably might or might not increase stress to levels that affect neurogenesis).

      Alarmingly, the IF protocol used does not result in changes in final weight or growth curves (S.Fig.2), which is surprising and raises the question of whether the IF protocol used is robust enough or appropriate.

      We were also surprised by the lack of change in the final weight of our IF mice with respect to controls. Differences in final weight between different labs despite using the exact same protocol are one of the reasons why we conclude that this IF paradigm is not a robust intervention. However, we are not the first ones to report little or no difference in weight upon IF in C57BL/6J mice (Goodrick et al., 1990 and Anson et al., 2003) and this would not be a reason to dismiss the experiment since the benefits in crucial circulating factors induced by IF seem to be independent of weight loss (Anson et al., 2003).

      Finally, the authors acknowledge that their own results do not support well-established findings such as aging-related reduction in number of aNSCs (page 4, lines 177-179). This again questions whether the selected protocols and treatments are appropriate.

      As we already discuss, we believe this might be due to a difference between strains in the time when aNSC numbers decline. Nevertheless, we will complement our current data by counting the number of aNSCs at 1 and 3 months post-tamoxifen (3 and 5 month old mice) using GFAP, Sox2 and Nestin triple stainings (as suggested by another reviewer).

      Lack of topic-specific positive controls: The authors successfully demonstrated that the used IF protocol differentially impacts the adipose tissue and liver, while also inducing body weight fluctuations synchronized with the fasting periods. However, these peripheral effects outside the CNS do not directly imply that the chosen IF protocol is robust enough to elicit cellular or molecular changes in the hippocampus. The authors need to demonstrate that their IF protocol affects previously well-established CNS parameters associated with fasting such as astrocyte reactivity, inflammation or microglia activation, among other factors. In fact, they acknowledge this systemic problem in the discussion (page 8, lines 359-360).

      We fully agree with the reviewer that even though the chosen IF protocol induces peripheral effects, it is not robust enough to elicit cellular or molecular changes in the hippocampus, and this is precisely the message of our paper. We have looked for references showing the influence of IF on astrocyte reactivity or microglia activation, but the studies we have found so far look at the effects of IF and other forms of fasting in the CNS in combination with pathologies such as Alzheimer’s disease, Multiple Sclerosis, physical insults or aging (Anson et al., 2003; Chignarella et al., 2018; Rangan et al., 2022; Dai et al., 2022. Reviewed in Bok et al., 2019 and Gudden et al., 2021). Fasting seems to reduce astrocyte reactivity, inflammation or microglia activation in these pathological situations with respect to the same pathology in ad libitum mice, but its effect in healthy control mice is far less clear. In fact, the only reference that we could find where healthy mice were included in the analysis showed that these benefits only happened in the context of the injury (Song et al., 2022).

      Problematic cell analyses: Cell quantification should be performed under stereological principles. However, the presented results did not adhere to stereological quantification. Instead, the authors chose to quantify specific cell phenotypes only in subjectively selected subsets of regions of interest, i.e., the Subgranular Zone (SGZ). This subjective pre-selection may have been responsible for the absence of effects, especially if these are either relatively small or dependent on anatomical sections of SGZ. For example, IF may exert effects on caudal SGZ more than on rostral SGZ. But if the authors quantified only (or predominantly) rostral SGZ, they may have missed these effects by biasing one segment of SGZ versus another. The authors should apply stereological quantification at least to the quantification of new neurons and test if this approach replicates previously observed pro-survival effects of IF. Also, the authors should describe how they pre-selected the ROI for cell quantification in greater detail.

      We did analyse only the more septal region of the hippocampus, which we will make clear in the text. As also suggested by other reviewers, we will include stereological counts of the neuronal output of aNSCs in the revised version. As for selecting the SGZ for aNSC counts, this is the standard in the field, as one of the criteria to identify aNSCs is precisely the location of their nucleus in the SGZ. Neuroblasts and new neurons were counted both in the SGZ and the granule cell layer. There was no subjective pre-selection of areas of interest since we counted the whole DG in each section and not a specific random region.

      Alarming exclusion of data points: There appears to be different number of data points in different graphs that are constructed from same data sets. For example, in the 3-month IF data set in Figure 4, there are 14 data points for the graph of Ki67+ cells (Fig.4B), but 16 (or 17) data points for the graph of DCX+ cells (Fig.4D). How is that possible? If data points were excluded, what objective and statistical criteria were applied to make sure that such exclusion is not subjective and biased? In fact, the authors state that "Samples with poor staining quality were also excluded from quantifications" (page 12, line 528-529). Poor preparation of tissue is not only suboptimal but not a valid objective reason for data point exclusion. This major issue needs to be explained and corrected.

      As we disclose in the methods, those stainings that did not work were excluded. This was done always before counting. Different samples were used in different counts because of the variability of staining quality between different antibodies. We will look back into the samples that failed in at least one of the stainings and exclude them from all counts, so that only samples for which all stainings worked are considered. These revised graphs will be provided in our revised version of the manuscript.
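The exclusion rule proposed in this response (keep only the samples for which every staining worked) amounts to a set intersection. The sketch below illustrates it; the function name `samples_passing_all`, the marker names used as keys and the sample IDs are hypothetical and do not come from the manuscript.

```python
def samples_passing_all(stainings):
    """Return the sample IDs whose staining worked for every marker."""
    passed_sets = [set(ids) for ids in stainings.values()]
    if not passed_sets:
        return set()
    return set.intersection(*passed_sets)


# Hypothetical sample IDs, for illustration only.
stainings_ok = {
    "Ki67": {"m01", "m02", "m03", "m05"},
    "DCX": {"m01", "m02", "m03", "m04", "m05"},
    "NeuN": {"m01", "m03", "m04", "m05"},
}

shared = samples_passing_all(stainings_ok)
print(sorted(shared))
```

Restricting every graph to the shared set gives each panel the same number of data points, which addresses the reviewer's concern about differing n between graphs built from the same animals.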

      Different pulse-and-chase time-points: One of the reasons why this study has found that aNSCs may not be responsive to IF could be the use of less appropriate pulse-and-chase time-points either after EdU or after Tamoxifen for cell lineage tracing. The authors observed that IF has negative effects on new neurons initially (Fig.4F). Similarly, it is well established that voluntary physical exercise affects SGZ adult neurogenesis only during the first 2 weeks. After this period, the neurogenic effects of exercise are diminished beyond observational detection (i.e., van Praag's and Kempermann's papers in the past 25 years). These two arguments suggest that the observed absence of aNSC responsiveness might be a consequence of the chosen EdU administration and the EdU pulse should not be administered 15 days after Tamoxifen/IF protocol start but earlier, in the first week of the IF protocol. In fact, the decreased number of new neurons during the initial IF phase may not be only a consequence of reduced survival but of higher aNSC quiescence during the first week of the IF protocol.

      We fully agree with the reviewer that BrdU or EdU pulses can give a biased view of the effects of any intervention on neurogenesis and that the EdU and Tamoxifen protocols would not allow us to detect an increase in neurogenesis during the first few days of IF. We cannot rule out that IF has a transient effect on aNSCs at some point of the treatment, but this hypothetical effect does not seem to have any consequences on neuronal output or aNSC maintenance. As for the effects on neurogenesis in the longer IF treatments, we used the same EdU protocol as in previous publications: administration after 2/3 months of IF and analysis after one month of chase.

      Discussion needs more specificity and clarity: The authors claim that the absence of IF effects on neurogenesis is multi-layered (including the influence of age, sex, specific cell labelling protocols etc.) but they do not specifically address why certain studies did find IF-driven neurogenic effects while they did not. In addition, some statements and points in the discussion are not clear. For example, when the authors refer to their own experiments (page 8, lines 331-334), it is not clear, which experiments they have in mind.

      We will double check our discussion and improve its clarity and direct comparison to other studies.

      Minor comments:

      1. Change in the title: The authors have shown that a very specific IF protocol does not affect aNSCs but initially decreases number of new neurons in SGZ. The title should reflect this. For example, it could state "Specific (night-time every-other-day) fasting does not affect aNSCs but initially decreases survival of new neurons in the SGZ".

      We find that our title, together with the abstract, clearly and faithfully represents our findings and would prefer to keep our current title unmodified.

      Data depiction: Data in 3 datasets were found not normally distributed (Fig. S5A, B and S6A) and were correctly analysed with non-parametric tests. However, the corresponding graphs wrongly depict the data as mean +/- SD while they should depict median +/- IQR (or similar adequate value) because non-parametric statistical tests do not compare means but medians.

      We thank the reviewer for spotting this, we will correct the graphs in Fig. S5A, B and S6A.
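The reviewer's point can be illustrated with a short sketch that summarises a skewed sample by its median and interquartile range rather than by its mean. The function name `median_iqr` and the example values are assumptions for illustration; the quartile convention (`method="inclusive"` in Python's `statistics.quantiles`) is one of several in common use and may differ from the one used by the authors' plotting software.

```python
import statistics


def median_iqr(values):
    """Return (median, first quartile, third quartile) of a sample."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return med, q1, q3


# Hypothetical, strongly skewed counts: one outlier dominates the mean.
counts = [1, 2, 3, 4, 100]

med, q1, q3 = median_iqr(counts)
mean = statistics.mean(counts)
print(f"median = {med}, IQR = {q1} to {q3}, mean = {mean}")
```

In this toy sample the single outlier pulls the mean far above the median, which is why a median with IQR is the more faithful summary when a non-parametric test has been used.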

      Statistical analysis: For ANOVA, the F and p values are not listed anywhere. The presented asterisks in the graphs are only for non-ANOVA or ANOVA post-hoc tests. This does not allow readers to judge statistical significance well and should be corrected.

      Again, thanks for spotting this, we will include them.
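As an illustration of what reporting the omnibus ANOVA result involves, the sketch below computes F, its degrees of freedom and the p value for a one-way design using `scipy.stats.f_oneway`. The helper name `one_way_anova_report` and the three groups of values are hypothetical; the manuscript's actual designs may call for a different test (for example a two-way ANOVA).

```python
from scipy import stats


def one_way_anova_report(*groups):
    """Run a one-way ANOVA and return F, p and both degrees of freedom."""
    f_stat, p_value = stats.f_oneway(*groups)
    df_between = len(groups) - 1
    df_within = sum(len(g) for g in groups) - len(groups)
    return {
        "F": float(f_stat),
        "p": float(p_value),
        "df_between": df_between,
        "df_within": df_within,
    }


# Hypothetical cell counts for three groups, for illustration only.
result = one_way_anova_report([1, 2, 3], [2, 3, 4], [5, 6, 7])
summary = (
    f"F({result['df_between']}, {result['df_within']}) = {result['F']:.2f}, "
    f"p = {result['p']:.4f}"
)
print(summary)
```

A line of this form in each figure legend, alongside the post-hoc asterisks, would give readers the information the reviewer asks for.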

      Asymmetric vs Symmetric cell divisions: Representative images in Fig.2B suggest that IF may affect the plane of cell division for the Type-1 aNSCs. The plane of cell division is an indirect indicator of symmetric vs asymmetric (exhaustive vs maintaining) modes of cell division. Is it possible that IF influences this, especially during the first week of IF (see major comment 5)?

      This is an interesting hypothesis. However, since we do not see any effects on aNSC maintenance, it is unlikely that IF produces any long-lasting effects on the mode of division of aNSCs. In general, we did not notice a difference in the plane of division of aNSCs between control and IF mice, although we did not systematically test for this (would require specific short EdU pulses to capture aNSCs in M-phase). In Figure 2B, the two stem cells shown in the control are unlikely to be the two daughter cells after the division of one aNSC, as one of them is positive and the other negative for Ki67. We only pointed to the second one to show a Ki67-negative aNSC. We will emphasize this in the figure legend.

      Improved and more accurate citations: Some references are not properly formatted (e.g., "Dias", page 7, line 288). Some references are included in generalizing statements when they do not contain data to support such statements. For example, Kitamura et al., 2006 did not determine the number of new neurons (only BrdU+ cells) in the SGZ, yet this reference is included among sources supporting that IF "promote survival of newly born neurons" (page 2, line 60). Authors should be more careful how they cite the references.

      Thanks for spotting these mistakes; we will correct them and check all our references again. As for the sentence where the Kitamura paper is cited, most of the other references also use only BrdU+ cells while concluding that IF enhances the survival of new neurons. We will change "new neurons" to "new cells" to reflect this, a point we already bring up in the discussion (see also extended discussion in previous BioRxiv version).

      How do the authors explain that they observe 73-80% caloric restriction and yet the final body weight is not different between IF and control animals? Would it suggest that the selected IF protocol or selected diet are not appropriate (see major point 4)?

      We also found this surprising and were expecting a change in overall activity in IF mice, which we did not observe. Many factors might play a role, like, as the reviewer suggests, changes in stress levels, which we will measure and show in the revised version.

      Given that aNSCs rely more on de novo lipogenesis and fatty acids for their metabolism as shown by Knobloch et al., Nature 2013 and given the interesting changes in RER with the IF shown in this study, it would be interesting to see whether there are differences in Fasn expression in aNSCs between control and IF animals (see minor point 4).

      This is an interesting suggestion, but given that we see no effect on aNSCs, we find such differences unlikely and consider it unnecessary to test for Fasn expression in our IF protocol.

      Determining apoptosis in the SGZ by pyknotic nuclei (Figure S6A) should be supplemented by determining the number and/or proportion of YFP+ cells positive for Activated Caspase 3.

      We previously found that counting pyknotic nuclei is a more accurate and sensitive readout of cell death in the DG, as cells positive for caspase 3 are extremely rare due to the high efficiency of phagocytosis of apoptotic cells by microglia (see Urbán et al., 2016).

      Reviewer #1 (Significance (Required)):

      General assessment:

      This study concludes that aNSCs do not respond to intermittent fasting. This expands and supplements previous findings that suggest that intermittent fasting promotes adult neurogenesis by increasing survival and/or proliferation in the Subgranular Zone. The study is well designed, however, it over-extends its conclusions beyond a specific fasting paradigm and does not acknowledge serious limitations in the experimental design and analyses. In fact, until a major revision is done, which would rule out that the absence of effects of fasting on aNSCs is due to false negative results, many conclusions from this study cannot be accepted as valid.

      Advance:

      As mentioned above, the study has a potential to advance our understanding of how fasting affects neurogenesis and fills the knowledge gap of how fasting specifically affects the stem cells. However, unless the study addresses its limitations, its conclusions are not convincing.

      Audience:

      This study would be particularly interesting for the niche readers from the neurogenesis field. However, the study can also be interesting for researchers in metabolomics and dietology.

      My expertise:

      adult neurogenesis, neural stem cells, dietology, metabolism

      We disagree with the reviewer and find our conclusions well balanced, as we acknowledge our results are to be compared only with similar IF protocols. We also do not believe our results can be attributed to a false negative, as we consistently observe the same with different strains and protocols, always with sufficient animals to make our counts conclusive.

      We nevertheless thank the reviewer for assessing our paper and for the advice to improve it. We hope that the reviewer will maintain the same level of scrutiny and scepticism with all IF-related papers.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Gabarro-Solanas et al. question the suitability of IF (Intermittent fasting - non-pharmacological strategy to counteract ageing, which has been previously shown to increase the number of adult-born neurons in the dentate gyrus of mice) as a pro-neurogenic intervention, since IF treatment did not stimulate adult hippocampal neurogenesis, neither at the stem cell level nor on immature and/or dividing neurons. The Authors used a tamoxifen inducible transgenic model (Glast-CreERT2;RYFP mice) to trace neural stem cell lineage and found that IF did not enhance neural stem cell proliferation, nor the abundance of immature, DCX+ neurons. Three-months of IF failed to increase the number of new adult-born neurons (NeuN+/YFP+), while one month of IF significantly reduced the number of new adult-born neurons.

      The study appears technically sound, including many different approaches in order to reach its conclusions.

      For instance, tamoxifen has been reported to impair various physiological processes, including neurogenesis (Smith et al., 2022), and most studies on adult hippocampal neurogenesis use the C57BL/6J strain of mice; hence, the use of Tamoxifen or that of the GlastCreERT2;RYFP model may have underscored these observations. However, to account for this potentially confounding factor, the Authors characterised the effect of their IF treatment in C57BL/6J mice, also reporting no evident effects of IF as a pro-neurogenic intervention.

      I think the study was carefully planned and the analyses well done. Several possible variables were considered, including sex, labelling method, strain, tamoxifen usage or diet length. Several controls were performed in other organs and tissues (liver, fat) to establish the fasting protocol and to check its effects.

      Data are presented in a clear way. Quality of images is high level.

      In general, it appears as a highly reliable paper reaching an authoritative conclusion for the absence of effect of IF on adult neurogenesis.

      Major comments:

      I think that the key conclusions are convincing and no further experiments are required.

      The methods are presented in such a way that they can be reproduced, and the experiments adequately replicated with proper statistical analysis.

      We thank the reviewer for the encouraging remarks and the appreciation of our efforts.

      Minor comments:

      Prior studies are referenced appropriately, both regarding the IF protocols and the adult neurogenesis modulation.

      Line 288 - a reference is incomplete (Dias); integrate with: (Dias et al., 2021)

      We will re-format the reference, thanks for spotting the mistake.

      There is one concept that is not expressed in the manuscript. Maybe it is not strictly necessary, but I think it can be useful to mention it here. It is the fact that most information currently available strongly indicates that adult neurogenesis in humans is not present after adolescence. Of course the research described here is carried out on mice, and in the manuscript it is stated many times that adult hippocampal neurogenesis is strongly decreasing with age, also due to age-related stem cell depletion. Yet, it seems that in humans the exhaustion of such a process can start after adolescence. We know that a sort of controversy is currently present on this subject, because DCX+ neurons can be detected in adult and old human hippocampi. Yet, it is also clear that there is no substantial cell division (stem cells are depleted) to sustain such hypothetical neurogenesis. Hence, it has been hypothesized that non-newly born, "immature" neurons can persist in the absence of cell division, as it has been well demonstrated in the cerebral cortex (see La Rosa et al., 2020 Front Neurosci; Rotheneichner et al., 2018, Cereb Cortex).

      This point can be important in case someone wants to use dietary approaches such as IF (or any other pharmacological treatment) to stimulate neurogenesis in humans.

      We agree with the reviewer and also find this a very interesting and timely topic. However, we find it a bit far from our results and would prefer not to comment on it in the context of the current paper.

      Reviewer #2 (Significance (Required)):

      The significance of this study relies on the fact that the adult neurogenesis (AN) field has often been damaged by the search for "positive" results, aiming at showing that AN does occur "always and everywhere" and that most internal/external stimuli do increase it. This attitude created a bias in the field, persuading many scientists that a result in AN is worthy of publication (or of high impact factor publication) only when a positive result is found.

      Personally, I found particularly meaningful the last sentences of the Discussion (reported below), which might seem "off topic" in a research paper, while - I think - underline the real significance of the manuscript:

      "In addition, publication bias might be playing a role in skewing the literature on fasting and neurogenesis towards reporting positive results.

      In some reviews, even studies reporting no effect are cited as evidence for improved neurogenesis upon IF. Reporting of negative results, especially those challenging accepted dogmas, and a careful and rigorous evaluation of the publications cited in reviews are crucial to avoid unnecessary waste of resources and to promote the advancement of science."

      Reviewer field of expertise - keywords: adult neurogenesis, brain structural plasticity, non-newly born immature neurons, comparative neuroplasticity.

      We are very happy that the reviewer shares our concern with the biased publication of positive results in the field. We hope our work (and that of Roberts et al., 2022) will encourage other labs to publish their negative results.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Gabarro-Solanas et al. investigate the effects of intermittent fasting (IF) on adult hippocampal neurogenesis in young adult mice. IF has been reported to increase the number of adult-born neurons in the hippocampus, a region that is important for learning and memory. However, it is not well understood what stages of adult neurogenesis are regulated by IF. To address this, the authors utilized lineage tracing and label retention assays in mice undergoing an IF diet. The authors used 2-month-old Glast-CreERT2;RYFP mice in combination with EdU label retention to characterize adult NSCs and placed these mice on 1 and 3 months of IF. Despite seeing a decrease in neural stem cell proliferation with age, the authors did not observe a change due to diet. The authors then used immunohistochemistry to characterize changes in cell proliferation, neuroblasts, and new neurons following 1 month and 3 months of IF. Only 1 month of IF seemed to decrease the number of new neurons; however, by 3 months the neuronal output was the same. There were no differences in neuroblasts or cell proliferation due to diet. Gabarro-Solanas et al. conclude that IF transiently and mildly inhibits neurogenesis. Due to contradicting results, the authors then try to determine what variables (sex, labeling method, strain, tamoxifen usage, or diet length) could be affecting their data. The authors saw no substantial differences due to any of their variables.

      Major Points

      1. The authors analyze NSC homeostasis and neurogenesis in young adult mice and do not observe any significant changes with their chosen alternate day intermittent fasting paradigm. However, a lot of the data and cell counts appear to be highly variable between animals in the same group. At times, there is an order of magnitude difference between the highest and lowest counts (e.g. Figure 2C,E). According to the method section, it appears that the authors predominantly analyzed a single DG (section?) for most immunostainings, which may explain the large variability in their data. If this is indeed the case, it is insufficient to quantify only a single section for each animal. The authors should quantify several DG sections for each mouse from a pre-defined range along the rostral-caudal axis of the hippocampus in accordance with a standard brain reference atlas. There are also several quantifications, especially of Ki67, where several individuals appear to have no Ki67+ NSCs (Figure 3B, 6D). These findings are surprising given the still young age of these mice and may be another reflection of the limited brain sections that were analyzed.

      The counts are indeed very variable. The counts were made on 1 to 4 DG sections (counted in full), depending on the staining. We will more clearly disclose this information in the revised version. In addition, we will re-count the neuronal output after fasting using stereology. Regarding the very low number of Ki67+ aNSCs, our counts are lower than those in other publications because we are much more stringent with our aNSC identification. Instead of using merely Sox2 (which also labels IPCs), we rely on the presence of a radial GFAP+ process.

      There appear to be significant cutting or imaging artifacts across most fluorescent images, further raising concerns regarding the accuracy of the quantifications (e.g. Figure 3D, 4C,E, 6B) and the publication quality of the images and data. Importantly, uneven section thickness, either from cutting artifacts or imaging issues, may lead to inaccurate cell quantifications and could possibly account for the high variability. This issue would further exacerbate concerns regarding the quantification of a single DG section for each animal.

      We only processed those samples that passed our QC after sectioning, meaning any unevenly cut brains were never considered (or stained). The stitched images do show artifacts (lower signal in the image junctions), particularly in the NeuN staining. However, this did not affect quantifications, as the measured levels were always clearly above the threshold to consider a cell positive, regardless of the position within the image. The images were cropped to improve the visualisation of NSCs, and to avoid the display of empty tiles. A low magnification image will be provided in the revised version to show that there were no staining artifacts.

      It is unclear how NSCs were counted in the B6 mice (Fig 6D,E). The authors only provide a description for the Glast-CRE mice, where they used YFP labeling and GFAP. We assume they performed Sox2/GFAP or Nestin labeling, however, this is not clear at all. The authors should describe their methodology and provide representative images.

      We used GFAP, location and morphology to count aNSCs in non-YFP mice. We will make this clear in the text and will also add one more count using Sox2, GFAP and Nestin to identify aNSCs.

      NSC populations represent a heterogenous group of stem cells with different replicative properties. As such, the Glast-Cre approach used for the majority of this study may represent a specific subset of NSCs. In line with the previous point, we recommend the authors complement their NSC counts with Sox2/GFAP and Nestin immunostainings.

      aNSCs labelled with Glast-Cre are the great majority of aNSCs (>90%) in both ad libitum fed and fasted mice. The data will be included in the revised version. Nevertheless, we will add counts using Sox2, GFAP and Nestin for key experiments.

      Stress is a significant negative regulator of neurogenesis. Is it possible that the IF mice display higher stress levels, which could counteract any beneficial effects of the IF intervention? The authors should provide some measures of stress markers to rule out this potential confounding factor in their IF paradigm.

      This is a great suggestion. We will collect blood from control and fasted mice and measure the levels of stress factors (e.g. corticosterone). We will include the data in our revised version.

      Minor Point

      1. The authors state that "Experimental groups were formed by randomly assigning mice from different litters within each mouse strain and all experiments were conducted in male and female mice". Given that neurogenesis, especially at young ages, is highly sensitive to the exact age of the mice, the authors should provide a rationale why animals from different litters instead of littermate controls were used in these experiments.

      Littermate controls were always used in the experiments. However, more than one litter was used for each experiment, since a single litter never generated enough mice. We will clarify this point in the text.

      Currently, the statistical tests are only described in the method section; however, it would be helpful if this information were integrated into the figure legends as well. Additionally, the authors provide individual data points for some but not all bar graphs (e.g. Figure 1D).

      We will consider including the statistical information in the figure legend, provided there is not a maximum length for figure legends. In the case of figure 1D, data points are not shown because of how the food intake was calculated: as an average per cage instead of per animal (included in the materials and methods). We therefore do not consider it useful to show the datapoints in the final version of the manuscript, but will provide them for the reviewer.

      Cell counts per AU is a rather unorthodox unit. With a representative selection of tissue for each animal, the authors could avoid the need to normalize to the DG length and may be able to extrapolate an estimate of cell counts for the entire DG instead.

      Thanks for the suggestion. Our arbitrary units (AU) were in fact already equivalent to cells per mm of DG, and we have updated our graphs to reflect this.
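As a small illustration of the unit discussed here (not the authors' analysis code), a raw count can be expressed as cells per mm of DG by dividing by the measured DG length. The function name and the example numbers below are hypothetical, and the sketch assumes the DG length was measured in µm.

```python
def cells_per_mm(cell_count: int, dg_length_um: float) -> float:
    """Normalize a raw cell count to cells per millimetre of dentate gyrus.

    Assumption: dg_length_um is the measured DG length in micrometres.
    """
    if dg_length_um <= 0:
        raise ValueError("DG length must be positive")
    # 1 mm = 1000 µm, so convert the length before dividing.
    return cell_count / (dg_length_um / 1000.0)


# Hypothetical example: 30 cells counted along 1500 µm of DG.
density = cells_per_mm(30, 1500.0)
print(density)
```

Reporting the value per mm rather than in arbitrary units makes counts comparable across figures, which is what both reviewers asked for.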

      In Figure 4D, the authors highlight a few NSC with arrowheads. At a quick glance this is rather confusing as it appears that the authors only counted 3 NSCs in each picture. It may be a better option to show a zoomed in picture to highlight an example of a representative NSC.

      Examples of representative NSCs are already shown in Fig 2. With this image, we intended to show a larger number of NSCs. We realise the arrows only pointed to some of them, making the message confusing. We will consider removing them from the figure in the revised version.

      In Supplementary Figure S6, the authors should complement the quantification of the nuclei with representative images.

      We will include representative images in Figure S6.

      For the daytime IF, did the authors assess weights, food intake, RER as well as liver/fat measurements similar to night-time IF? If so, this data should be provided in the supplement.

      We do have data for the daytime IF in the metabolic cages, which was taken from mice housed in groups (during the preliminary phase of our study). We also have the weight and data on neurogenesis, which we will show as a supplement.

      Reviewer #3 (Significance (Required)):

      The authors are commended for compiling a manuscript on what is commonly considered 'negative data', that, at the same time, are also contradicting independent reports on the effects of IF on neurogenesis. The studies outlined in this manuscript are comprehensive and mostly well designed. Given the broad, growing interest in dietary restriction as an aging intervention the study is timely.

      We thank the reviewer for the positive assessment of the significance of our work.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this manuscript, Gabarró-Solanas et al. tested the effect of intermittent fasting (IF, every-other-day fasting) on adult neural stem cells and neurogenesis. They demonstrate that the paradigm they have used does not affect NSC activation or maintenance, and does also not promote neurogenesis. As previous reports showed increased neurogenesis with IF, the authors controlled for various parameters such as mouse strain, sex, and diet length. They also used different methods of identification of newborn neurons, such as tamoxifen-induced lineage-tracing versus birth-dating with thymidine-analogues to substantiate their findings.

      Major comments:

      This study is very well done with carefully designed and controlled experiments. The manuscript reads nicely and the data are presented in a clear way, making it easy to follow. The authors have done a "tour de force" to rule out confounding factors that might influence their findings that IF affects neither NSCs nor neurogenesis.

      The claims and conclusions are supported by the data. The methods are clearly described and should allow the data to be reproduced independently. The number of replicates (i.e. the number of mice analyzed) is impressive and statistical analysis is adequate.

      The major finding, namely that the chosen IF does not affect NSCs and neurogenesis, is not in line with some previous studies. Despite a careful ruling out of potentially confounding factors (see also "significance" below), it remains unclear why other studies have found an increase in neurogenesis with IF. As each of these studies has some specific experimental design, it is difficult to judge these data in the context of previous data without going through all the details of the other studies. It would thus be a great help for the reader if the authors could provide a table or schematic, which lists the major parameters of each of these studies, such as detailed paradigm of IF, age of mice at start, sex, duration of the intervention, method of identification of NSCs and neurogenesis etc.

      This is a very good suggestion, and we had already created such a table. We, however, consider that it might be better suited for a review on the effects of IF on neurogenesis than for this work. We will include the table in our response to the reviewers together with our revised version.

      Two points that the authors have not discussed might also be worth mentioning in the discussion part:

      1.) The mice in the night-time IF were single caged, could there be a potential negative effect on neurogenesis that would mask the presumably beneficial effect of IF? Although the controls were also single caged, the stress of social isolation might play a role?

      The mice were only single caged for the metabolic phenotyping, but not for the neurogenic counts. We will make this clearer in the text. In any case, we do agree that stress might play a role and we will measure stress levels in the control and fasted mice and will include this data in the revised version.

      2.) The IF mice gained the same weight over time (Fig. S2), but had a ~20% reduction in overall calorie intake. This would be explainable by a reduction in energy expenditure, but the overall activity was also not significantly changed (Fig. S1). Can the authors speculate why they reach the same weight with fewer calories?

      We also found this surprising and were expecting a reduction in the overall activity of the fasted mice. We do not have an explanation for this discrepancy, but perhaps stress levels might explain part of it (we will check stress levels in the revised version). We will also look at whether energy expenditure and activity levels changed over time.

      Minor comments:

      1.) It would be nice to replace the arbitrary units (AU) in the graphs where this is used (e.g. Fig. 2F, 3C, 4B, D and F etc) with the actual number of cells per a given length (µm) of DG, so that the number of cells can be put in context and compared between the figures.

      Yes, our AU already corresponded to mm and we will update our figures accordingly.

      2.) Fig 3 D: can the authors also show the Ki67 channel to illustrate how it looks after a 3 month IF?

      We find it does not help much, as Ki67+ cells are mostly IPCs and that data is already shown in Fig. 4A. We will nevertheless include the image in our response to the reviewers together with our revised version.

      3.) Fig. 4E: the NeuN staining looks strangely interrupted; might this be due to tile-stitching? In that case, it would be better to either only show one segment or to try to get a better stitching algorithm.

      It is indeed because of the tile-stitching and uneven illumination. However, this did not affect the counts, as already discussed in the response to reviewer #3 (major point #2).

      4.) Fig. 6D shows negative values on the Y-axis; the axis should only run from 0 to positive values, as it is a percentage of cells and cannot be negative.

      True, thanks for spotting this. We will correct the graphs in the revised version.

      5.) Fig.6 B: the same problem with the NeuN staining as mentioned under point 3. This should be improved.

      As with point 3, the stitching did not affect the quantification. We find it more accurate to show the image with the stitching, as that was the one used for quantification. We will provide a new picture with lower magnification to better show the quality of the staining.

      6.) Fig. S6B: maybe add a comment in the result part or in the figure legend that a 10 day chase after an EdU pulse is not the classical protocol to look at mature NeuN positive neurons. But apparently enough newborn neurons were already NeuN positive for this quantification.

      We fully agree 10 days is not the standard for neuronal identification. We did find neurons after the 10-day chase but in low numbers. We will add a comment in the text of the revised version to clarify this.

      7.) The authors refer to personal communications with M. Mattson and S. Thuret to underline that circadian disruption is not enough to explain the differences (line 367 onwards). Can they refer the reader to published data instead?

      While the results are published in their papers, the methods did not specify the time at which the food was added/removed for the IF protocol. That is why we refer to personal communication.

      Further showing that disruption of circadian rhythms is not enough to explain the difference in outcome of the IF protocol, we will show the data for the 1-month daytime IF, which again does not increase adult neurogenesis (reviewer #3, minor point #6).

      Reviewer #4 (Significance (Required)):

      Given the great interest in the seemingly positive effects on health of IF in general, and also for increasing neurogenesis, it is important to better understand the mechanism of this intervention. The study by Gabarró-Solanas et al. clearly demonstrates that IF is not a universal, "works all the time" way of increasing neurogenesis. The study is very well done, with well controlled and measured parameters. It shows that a physiological interference such as IF might depend on many factors and might be less robust across laboratories than anticipated. This study is a very good example that all the details of the experimental settings need to be taken into consideration and are ideally reported with every IF study. It is also a good example of how to follow up "no effect" data in a way that makes them conclusive.

      The significance of this study is to point out that IF as a strategy to increase neurogenesis needs to be reconsidered. It raises the question of how IF can be beneficial in some studies and not in others, asking for more experiments to better understand the detailed mechanisms of IF action. In a systematic approach, this study rules out some of the potentially confounding factors and shows that at least with the chosen IF paradigm, these factors are not the reason for not seeing increased neurogenesis. The study is thus of clear interest for the neurogenesis field and will also need to be considered by the broader field of IF research, although it speaks against the beneficial effects of IF. It might have the potential to bring together the different study authors who did or did not see increased neurogenesis with IF and discuss together the non-published details of their study design to advance the field.

      We thank the reviewer for the positive assessment of our work and for acknowledging its importance for the broader field of IF research.

      List of references used in the response to reviewers:

      Anson, R. M. et al. Intermittent fasting dissociates beneficial effects of dietary restriction on glucose metabolism and neuronal resistance to injury from calorie intake. Proceedings of the National Academy of Sciences 100, 6216–6220 (2003).

      Bok, E. et al. Dietary Restriction and Neuroinflammation: A Potential Mechanistic Link. International Journal of Molecular Sciences 20, 464 (2019).

      Cignarella, F. et al. Intermittent Fasting Confers Protection in CNS Autoimmunity by Altering the Gut Microbiota. Cell Metabolism 27, 1222-1235.e6 (2018).

      Dai, S. et al. Intermittent fasting reduces neuroinflammation in intracerebral hemorrhage through the Sirt3/Nrf2/HO-1 pathway. Journal of Neuroinflammation 19, 122 (2022).

      Dias, G. P. et al. Intermittent fasting enhances long-term memory consolidation, adult hippocampal neurogenesis, and expression of longevity gene Klotho. Mol Psychiatry 1–15 (2021).

      Goodrick, C. L., Ingram, D. K., Reynolds, M. A., Freeman, J. R. & Cider, N. Effects of intermittent feeding upon body weight and lifespan in inbred mice: interaction of genotype and age. Mechanisms of Ageing and Development 55, 69–87 (1990).

      Gudden, J., Arias Vasquez, A. & Bloemendaal, M. The Effects of Intermittent Fasting on Brain and Cognitive Function. Nutrients 13, 3166 (2021).

      Lee, J., Seroogy, K. B. & Mattson, M. P. Dietary restriction enhances neurotrophin expression and neurogenesis in the hippocampus of adult mice. Journal of Neurochemistry 80, 539–547 (2002).

      Rangan, P. et al. Fasting-mimicking diet cycles reduce neuroinflammation to attenuate cognitive decline in Alzheimer’s models. Cell Reports 40, 111417 (2022).

      Roberts, L. D. et al. The 5:2 diet does not increase adult hippocampal neurogenesis or enhance spatial memory in mice. 2022.10.03.510613 BioRxiv Preprint (2022).

      Song, M.-Y. et al. Energy restriction induced SIRT6 inhibits microglia activation and promotes angiogenesis in cerebral ischemia via transcriptional inhibition of TXNIP. Cell Death Dis 13, 449 (2022).

      Urbán, N. et al. Return to quiescence of mouse neural stem cells by degradation of a proactivation protein. Science 353, 292–295 (2016).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Gabarro-Solanas et al. question the suitability of IF (Intermittent fasting - non-pharmacological strategy to counteract ageing, which has been previously shown to increase the number of adult-born neurons in the dentate gyrus of mice) as a pro-neurogenic intervention, since IF treatment did not stimulate adult hippocampal neurogenesis, neither at the stem cell level nor on immature and/or dividing neurons. The Authors used a tamoxifen inducible transgenic model (Glast-CreERT2;RYFP mice) to trace neural stem cell lineage and found that IF did not enhance neural stem cell proliferation, nor the abundance of immature, DCX+ neurons. Three months of IF failed to increase the number of new adult-born neurons (NeuN+/YFP+), while one month of IF significantly reduced the number of new adult-born neurons.

      The study appears technically sound, including many different approaches in order to reach its conclusions. For instance, tamoxifen has been reported to impair various physiological processes, including neurogenesis (Smith et al., 2022), and most studies on adult hippocampal neurogenesis use the C57BL/6J strain of mice; hence, the use of Tamoxifen or that of the GlastCreERT2;RYFP model may have underscored these observations. However, to account for this potentially confounding factor, the Authors characterised the effect of their IF treatment in C57BL/6J mice, also reporting no evident effects of IF as a pro-neurogenic intervention. I think the study was carefully planned and the analyses well done. Several possible variables were considered, including sex, labelling method, strain, tamoxifen usage or diet length. Several controls were performed in other organs and tissues (liver, fat) to establish the fasting protocol and to check its effects. Data are presented in a clear way. The quality of the images is high. In general, it appears as a highly reliable paper reaching an authoritative conclusion for the absence of effect of IF on adult neurogenesis.

      Major comments:

      I think that the key conclusions are convincing and no further experiments are required. The methods are presented in such a way that they can be reproduced, and the experiments adequately replicated with proper statistical analysis.

      Minor comments:

      Prior studies are referenced appropriately, both regarding the IF protocols and the adult neurogenesis modulation. Line 288 - a reference is incomplete (Dias); integrate with: (Dias et al., 2021) There is one concept that is not expressed in the manuscript. Maybe it is not strictly necessary, but I think it can be useful to mention it here. It is the fact that most information currently available strongly indicates that adult neurogenesis in humans is not present after adolescence. Of course the research described here is carried out on mice, and in the manuscript it is stated many times that adult hippocampal neurogenesis is strongly decreasing with age, also due to age-related stem cell depletion. Yet, it seems that in humans the exhaustion of such a process can start after adolescence. We know that a sort of controversy is currently present on this subject, because DCX+ neurons can be detected in adult and old human hippocampi. Yet, it is also clear that there is no substantial cell division (stem cells are depleted) to sustain such hypothetical neurogenesis. Hence, it has been hypothesized that non-newly born, "immature" neurons can persist in the absence of cell division, as has been well demonstrated in the cerebral cortex (see La Rosa et al., 2020 Front Neurosci; Rotheneichner et al., 2018, Cereb Cortex). This point can be important in case someone wants to use dietary approaches such as IF (or any other pharmacological treatment) to stimulate neurogenesis in humans.

      Significance

      The significance of this study relies on the fact that the adult neurogenesis (AN) field has often been damaged by the search for "positive" results, aiming at showing that AN does occur "always and everywhere" and that most internal/external stimuli do increase it. This attitude created a bias in the field, persuading many scientists that a result in AN is worthy of publication (or of high impact factor publication) only when a positive result is found.

      Personally, I found particularly meaningful the last sentences of the Discussion (reported below), which might seem "off topic" in a research paper, while - I think - underline the real significance of the manuscript: "In addition, publication bias might be playing a role in skewing the literature on fasting and neurogenesis towards reporting positive results.

      In some reviews, even studies reporting no effect are cited as evidence for improved neurogenesis upon IF. Reporting of negative results, especially those challenging accepted dogmas, and a careful and rigorous evaluation of the publications cited in reviews are crucial to avoid unnecessary waste of resources and to promote the advancement of science."

      Reviewer field of expertise - keywords: adult neurogenesis, brain structural plasticity, non-newly born immature neurons, comparative neuroplasticity.

    1. we must acknowledge that our styles of teaching may need to change. Let's face it: most of us were taught in classrooms where styles of teachings reflected the notion of a single norm of thought and experience, which we were encouraged to believe was universal

      This quote addresses our society's ever-changing nature, and the need for our teaching methods to change as a result. The way that our current educational institutions are structured is a sort of "one-size-fits-all". However, it only fits a select group and does not take into account that people think and learn differently.

    2. we must acknowledge that our styles of teaching may need to change. Let's face it: most of us were taught in classrooms where styles of teachings reflected the notion of a single norm of thought and experience, which we were encouraged to believe was universal. This has been just as true for nonwhite teachers as for white teachers. Most of us learned to teach emulating this model.

      I agree with the author that if the next generation is to understand and change perceptions, teachers and parents will need to show respect for different groups in society by example, and I think it is beneficial for children to learn about groups that are different from their own social class. Changing teaching styles is also an important point: as the article mentions, students have different learning styles, so teachers need to learn more diverse teaching styles to improve the quality of teaching.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Marta Sanvicente-García and colleagues developed a comprehensive and versatile genome editing web application tool and a nextflow pipeline to give support to gene editing experimental design and analysis.

      The manuscript is well written and all data are clearly shown.

      While I did not test it extensively, the software seems to work well and I have no reason to doubt the authors' claims.

      I usually prefer ready to use web applications like outknocker, they are in general easier to use for rookies (it would be good if the author could cite it, since it is very well implemented) but the nextflow implementation is anyway well suited.

      We have been able to analyze the testing dataset that they provide, but when we tried to run it with our dataset we were not able to obtain results. We have also tried to run it with the testing datasets of CRISPRnano and CRISPResso2 without obtaining results. In all cases, the error message was: “No reads mapping to the reference sequence were found.”

      Few minor points:

      Regarding the methods to assess whether the genome editing is working or not, I would definitely include High Resolution Melt Analysis, which is by far the fastest and probably more sensitive amongst the others.

      Following the Reviewer 1 suggestion, we have added this technique in the introduction: “Another genotyping method that has been successfully used to evaluate genome editing is high-resolution melting analysis (HRMA) [REFERENCE]. This is a simple and efficient real-time polymerase chain reaction-based technique.”

      Another point that would be important to tackle is that often these pipelines do not define the system they are working with (e.g. diploid vs. haploid, etc.). This will change the number of reads needed to unambiguously call the genotypes detected and to perform the downstream analysis (the CRISPRnano authors mentioned this point).

      In the introduction, it is already said: " it is capable of analyzing edited bulk cell populations as well as individual clones". In addition, following this suggestion we have added in the help page of CRISPR-A web application and in the documentation of the nextflow pipeline a recommended sample coverage to orient the users on that.

      I am also wondering whether the name CRISPR-A is appropriate since someone could confuse it with CRISPRa.

      CRISPR-A is an abbreviation for CRISPR-Analytics. Even if it is true that it can be pronounced in the same way as CRISPRa screening libraries, it is spelled differently and would be easily differentiated by context.

      CROSS-CONSULTATION COMMENTS

      Reviewer 2 did excellent work and raised important concerns about the software that need to be addressed carefully.

      In the meantime we had more time to test the software and can confirm some of the findings of Reviewer 1:

      1) We spent hours running (unsuccessfully) CRISPR A on Nextflow. The software does not seem to run properly.

      2) No manual or instructions can be found in either of their repositories (https://bitbucket.org/synbiolab/crispr-a_nextflow/, https://bitbucket.org/synbiolab/crispr-a_figures/)

      We have added a readme.md file to both repositories and we hope that with the new documentation the software can be downloaded and run easily. We have also added an example test to the CRISPR-A nextflow pipeline to facilitate testing of the software. Currently, the software is implemented in DSL1 instead of DSL2, making it impossible to run with the latest version of nextflow. We are planning to make the update soon, but we want to do it while moving the pipeline to the crisprseq nf-core pipeline to follow better standards and make it fully reproducible and reusable.

      A few more points to be considered:

      • UMI clustering is not proper terminology. Barcode multiplexing/demultiplexing (SQK-LSK109 from Oxford Nanopore).

      We have added more details in the methods section “Library prep and Illumina sequencing with Unique Molecular Identifiers (UMIs)” to clarify the process and used terminology: “Uni-Molecular Identifiers are added through a 2 cycles PCR, called UMI tagging, to ensure that each identifier comes just from one molecule. Barcodes to demultiplex by sample are added later, after the UMI tagging, in the early and late PCR.”

      We had already explained the computational pipeline through which these UMIs are clustered together to obtain a consensus of the amplified sequences in “CRISPR-A gene editing analysis pipeline” section in methods:

      “An adapted version of extract_umis.py script from pipeline_umi_amplicon pipeline (distributed by ONT https://github.com/nanoporetech/pipeline-umi-amplicon) is used to get UMI sequences from the reads, when the three PCRs experimental protocol is applied. Then vsearch⁴⁸ is used to cluster UMI sequences. UMIs are polished using minimap2³² and racon⁴⁹ and consensus sequences are obtained using minialign (https://github.com/ocxtal/minialign) and medaka (https://github.com/nanoporetech/medaka).”

      We also have added the following in “CRISPR-A gene editing analysis pipeline” methods section to help to understand differences between the barcodes that can be used: “In case of working with pooled samples, the demultiplexing of the samples has to be done before running CRISPR-A analysis pipeline using the proper software in function of the sequencing used platform. The resulting FASTQ files are the main input of the pipeline.”

      Then, SQK-LSK109 from Oxford Nanopore is followed through the steps specified in methods: “The Custom PCR UMI (with SQK-LSK109), version CPU_9107_v109_revA_09Oct2020 (Nanopore Protocol) was followed from UMI tagging step to the late PCR and clean-up step.”

      Finally, we want to highlight that, as can be seen in methods as well as in discussion, UMIs are used to group sequences that have been amplified from the same genome and not to identify different samples: “Precision has been enhanced in CRISPR-A through three different approaches. [...] We also removed indels in noisy positions when the consensus of clusterized sequences by UMI are used after filtering by UBS.” As well as in results (Fig. 5C).

      • Text in Figure 5 is hard to read.

      We have increased the letter size of Figure 5.

      • They should test the software based on the ground truth data

      We have added a human-classified dataset for the final benchmarking, and we can see that CRISPR-A has an accuracy higher than 0.9 for all examined samples. As shown in the figure with manually curated data, CRISPR-A shows good results in noisy samples using the empiric noise removal algorithm, without the need to filter by editing windows.

      • The alignment algorithm is not the best one, I think minimap2 would be better for general purpose (at least it works better for ONT).

      As can be seen in figure 2A, minimap is one of the alignment methods that gives better results for the aim of the pipeline. In addition, we have tuned the parameters (Figure 2B) for a better detection of CRISPR-based long deletions, which can be more difficult to report in a single open gap of the alignment.

      • The minimum configuration for installation was not mentioned (for their Docker/next flow pipeline).

      Proper documentation to indicate the configuration requirements for installation has been added to the readme.md of the repository.

      • Fig 2: why do they use PC4/PC1?

      Principal Component Analysis is used to reduce the number of dimensions in a dataset and helps to understand the effect of the explanatory variables, detect trends or samples that are labeled in incorrect groups, and simplify data visualization. Even though PC4 explains less variability than PC2 or PC3, it helps us to understand and better decipher the effect of the 4 different analyzed parameters, even if the differences are not big. We have decided to include the other PCs as a supplementary figure to show these.

      • There are still typos and unclear statements throughout the whole manuscript.

      One more drawback is that the software seems to only support single FASTQ uploading (or we cannot see the option to add more FASTQ).

      In the case of paired-end reads instead of single-end reads, in the web application, these can be selected at the beginning answering the question “How should we analyze your reads? Type of Analysis: Single-end Reads; Paired-end Reads”. In the case of the pipeline, now it is explained in the documentation how to mark if the data is paired-end or single-end. It has to be indicated in “input” and “r2file” configuration variables.

      In the case of multiple samples, and for that reason multiple FASTQ files, there is the button to add more samples in the web application. In the pipeline, multiple samples can be analyzed in a single run by putting all together in a folder and indicating it with variable “input”.

      Since usually people analyze more than one clone at a time (we usually analyze 96 clones together) this would mean that I have to upload manually each one of them.

      All files can be added to the same folder and analyzed in a single run using the nextflow pipeline. The web application has a limit of ten samples, which can be added by clicking the “Add more” button.

      Also, the software (the webserver, the docker does not work) works with Illumina data in our hands but not with ONT.

      This should be clarified in the manuscript.

      If a fastq is uploaded to CRISPR-A, the analysis can be done even if we haven't specifically optimized the tool for long reads sequencing platforms. We have checked the performance of CRISPR-A with CRISPRnano nanopore testing dataset and we have succeeded in the analysis. See results here: https://synbio.upf.edu/crispr-a/RUNS/tmp_1118819937/.

      Summary of the results:

      | Sample | CRISPRnano | CRISPR-A |
      |---|---|---|
      | 'rep_3_test_800' | 42.60 % (-1del); 12.72 % (-10del) | 71% (-1del); 16% (-10del) – 36 (logo) |
      | 'rep_3_test_400' | 37.50 % (-1del); 15.63 % (-10 del) | 65% (-1del); 28% (-10del) – 38 (logo) |
      | 'rep_1_test_200' | 39.29 % (-1del); 8.33 % (-17del) | 10del; 17del; 1del |
      | 'rep_1_test_400' | 80.11 % (-17 del) | del17; del20; del18; del16; del16 |
      | 'rep_0_test_400' | 80.11% (-17 del) | del17; del20; del18; del16; del16 |
      | 'rep_0_test_200' | 71.91% (-17 del) | del17; del18 |

      As we can see from these examples, CRISPR-A reports all indels in general without classifying them as edits or noise. Since nanopore data has a high number of indels as sequencing errors, the percentages of CRISPR-A are not accurate. Even so, CRISPR-A reports more diverse outcomes, which are probably edits, than CRISPRnano.

      Therefore, we have added the following text in results:

      “Even single-molecule sequencing (eg. PacBio, Nanopore..) can be analyzed by CRISPR-A, targeted sequencing by synthesis data is required for precise quantification.”

      Reviewer #1 (Significance (Required)):

      As I mentioned above, I think this could be useful software for people who are screening genome-edited cells. Since CRISPR is widely used, I assume that the audience is broad.

      There are many other software tools that perform similarly to CRISPR-A, but this software seems to add a few more things and to be more precise. It is hard to judge whether everything the authors claim is accurate, since that requires a lot of testing and time, and the reviewing time is just two weeks. But 1) I have no reason to doubt the authors and 2) the software works.

      Broad audience (people using CRISPR)

      Genetics, Genome Engineering, software development (we develop a very similar software), genetic compensation, stem cell biology

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      CRISPR-Analytics, abbreviated as CRISPR-A, is a web application implementing a tool for analyzing editing experiments. The tool can analyze various experiment types - single cleavage experiments, base editing, prime editing, and HDR. The required data for the analysis consists of NGS raw data or simulated data, in fastq, protospacer sequence and cut site. Amplicon sequence is also needed in cases where the amplified genome is absent from the genome reference list. The tool pipeline is implemented in NextFlow and has an interactive web application for visualizing the results of the analysis, including embedding the results into an IGV browser.

      The authors developed a gene editing simulation mechanism that enables the user to assess an experiment design and to predict expected outcomes. Simulated data was generated by SimGE over primary T-cells. The parameters and distributions were also fitted for 3 cell lines to make it more generalized (Hek293, K562, and HCT116). The process simulated CRISPR-CAS9 activity and the resulting insertions, deletions, and substitutions. The simulation results are then compared to the experimental results. The authors report the Jensen-Shannon (JS) divergence between the results. The exact distributions that served as input to the JS are not well defined in the manuscript (see below).

      To clarify the used distributions in the JS divergence calculation, we have changed the following piece of text in section “Simulations evaluation” of methods:

      “ Afterward, we tested the performance on the fifth fold, generating the simulated sequences with the same target and gRNA as the samples that belong to the fifth fold, in order to calculate the distance between these. The final validation, with the mean parameters of the different training interactions, was performed on a testing data set that was not used in the training. Validation was done with samples that had never taken place in the training process. Jensen distance is used to compare the characterization of real samples and simulated samples since this is the explored distance that differentiates better replicates among samples. In order to obtain the different distributions, the T cell data, including 1.521 unique cut sites, was split into different datasets based on the different classes: deletions, insertions and substitutions. For each of these classes, giving as input the datasets with only that class, we obtained the distribution for size and then for position of indels. The same was done for the other three cell lines: K562, HEK293 and HCT116, which included 96 unique cut sites, with three replicates each. The whole datasets (with 1521 and 96 unique cut sites) were split into five-folds (4 for training and one for test) and validation, in order to train and validate the simulator. Using the parameters obtained during the training-test iterations (the average value of the 5 iterations), we generate simulated sequences with the same target and gRNA as the samples that are assigned to the test subset to calculate the Jensen-Shannon (JS) divergence between the simulated and real samples of that subset. Finally, the same was performed for validation. The input for the distance calculations were the generated simulated subset and its real equivalent (same target and gRNA) distributions of the classes. ”
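      The comparison described in this passage, a Jensen-Shannon (JS) divergence between the indel size (or position) distribution of a real sample and that of its simulation, can be sketched as follows. The function names and the deletion sizes are hypothetical, and the actual CRISPR-A code may bin or normalize the data differently.

```python
import math
from collections import Counter


def to_distribution(values, support):
    """Convert observations (e.g. deletion sizes) into probabilities over a shared support."""
    counts = Counter(values)
    total = sum(counts[v] for v in support)
    if total == 0:
        raise ValueError("no observations fall inside the support")
    return [counts[v] / total for v in support]


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logarithms, bounded between 0 and 1."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same support")
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical deletion sizes for one target: real sample vs. its simulation.
real_sizes = [1, 1, 1, 2, 10, 10, 17]
sim_sizes = [1, 1, 2, 2, 10, 17, 17]
support = sorted(set(real_sizes) | set(sim_sizes))

divergence = js_divergence(
    to_distribution(real_sizes, support), to_distribution(sim_sizes, support)
)
print(f"JS divergence: {divergence:.4f}, JS distance: {math.sqrt(divergence):.4f}")
```

      The same function would be applied once per class (deletions, insertions, substitutions) and once per feature (size, position), which matches the two-column distributions described in the response.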

      The authors also report an investigation of different alignment approaches and how they may affect the resulting characterization of editing activity.

      The authors examine three different approaches to increase what they call "edit quantification accuracy" (aka, in a different place - "precise allele counts determination" - what is this???): (1) spike-in controls (2) UMI's and (3) using mock to denoise the results. See below for our comments about these approaches.

      Moreover, the authors developed an empirical model to reduce noise in the detection of editing activity. This is done by using mock (control), and by normalization and alignment of reads with indels, with the notion and observation that indels that are far from the cut site tend to classify as noise.

      The authors then perform a comparison between 6 different tools, in the context of determining and quantifying editing activities. One important comparison approach uses manually curated data. However - the description of how this dataset was created is far from being sufficiently clear. The comparison is also performed for HDR experiment type, which can be compared only to 2 other tools.

      We have replaced “alleles” with “editing outcomes” in the section title “Three different approaches to increase precise editing outcomes counts determination” to make it clearer.

      There is already a section in methods “Manual curation of 30 edited samples” explaining how the manual curation was done.

      We see the potential contribution aspects of the paper to be the following:

      1. NextFlow pipeline implementation is an important engineering contribution. Same is true for the interactive web application
      2. The option to simulate an experiment to assess it is a nice feature and can help experiment design
      3. Identification of amplicons when not provided as input
      4. CRISPR-A seeks substitutions along the entire amplicon sequence and is less dependent on the quantification window and on the putative cutsite
      5. Analysis of the difference, in edit activity, comparing different cell lines
      6. CRISPR-A supports the use of UMIs
      7. Interesting sequence pattern insights - like "...found certain patterns associated with low diversity outcomes: free thymine or adenine at the 3' nucleotide upstream of the cut site that leads to insertions of the same nucleotide, a free cytosine at the same place that leads to its loss, and strong micro-homology patterns that lead to long deletions "

      We further comment on the soundness of these contributions in our comments below and on their significance in our comments related to the general potential significance of the paper.

      Major comments:

      • Upon attempting to run an analysis from the web interface (https://synbio.upf.edu/crispr-a) and using: fastq of Tx and mock (control), the human genome and the gRNA sequence provided as input for the protospacer field, our run was not successful. In fact the site crashed with no interpretable error message from CRISPR-A.

      We have improved the error handling together with the explanations in the help page, where you will find a video. Hopefully these improvements will prevent unexpected crashes.

      • Moreover, there should be more clear context. There is no information regarding the type of experiments that can be analyzed with the tool. We figure it is multiplex PCR and NGS but can the tool also be used for GUIDESeq, Capture, CircleSeq etc.?

      The experiments that can be analyzed are specified in Results: “CRISPR-A analyzes a great variety of experiments with minimal input. Single cleavage experiments, base editing (BE), prime editing (PE), predicted off-target sites or homology directed repair (HDR) can be analyzed without the need of specifying the experimental approach.” We have also specified this in the nextflow pipeline documentation as well as in the web application help page.

      • No off target analysis. Only on-target.

      The accuracy of the tool allows checking whether edits are produced in predicted off-target sites. This is an off-target analysis with some restrictions, since only variants at the predicted off-target sites are assessed. Translocations or other structural off-targets will not be detected by CRISPR-A since the input data analyzed by this tool are demultiplexed amplicon or targeted sequencing samples.

      • No translocations and long/complex deletions.

      The type of input data used does not allow us to do this. There are other tools, like CRISPECTOR, available for this kind of analysis. We have added this to supplementary table 1.

      • We view the use of a mock experiment as control as a must for any sound attempt to measure edit activity. This is even more so when off-target events need to be assessed (any rigorous application of GE, certainly any application aiming for clinical or crop engineering purposes). We therefore think that all investigation of other approaches should be put in this context.

      We agree with the necessity of using negative controls to assess editing. For that reason we have included the possibility of using mocks in the quantification. In addition, there are few tools that include this functionality.

      • It's a nice feature to have simulated data, however, it is not a good approach to rely on it.

      As can be seen in the manuscript, we highlight the support that simulations can give without intending to replace experimental data with simulated data alone. Simulated data has been useful in the development and benchmarking of CRISPR-A, but we are aware of the limitations of simulations. Here are some examples from the manuscript explaining how we have used simulated data or how it can be used:

      “Analytical tools, and simulations are needed to help in the experimental design.”

      “simulations to help in design or benchmarking”

      “We developed CRISPR-A, a gene editing analyzer that can provide simulations to assess experimental design and outcomes prediction.”

      “Gene editing simulations obtained with SimGE were used to develop the edits calling algorithm as well as for benchmarking CRISPR-A with other tools that have similar applications.”

      Even though simulated data has been useful for the development and benchmarking of CRISPR-A, we have also used real data and human validated data.

      • In p7 the authors indicate the implementation of three approaches to improve quantification. They should be clear as to the fact that many other tools and experimental protocols are also using these approaches. For example, ampliCan, CRISPResso2 and CRISPECTOR all take into account a mock experiment run in parallel to the treatment.

      Even though on page 7 (results) we don’t mention the other tools that also use mocks for noise correction, we detail this information in Supplementary Table 1. CRISPResso2 was not originally included since it can run mocks in parallel but only to compare results qualitatively, i.e. there is no noise reduction in its pipeline. It has now been added to the table.

      • Figure 1:

      ○ The figure certainly provides what seems to be a positive indication of the simulations approach being close to measured results. Much more details are needed, however, to fully understand the results.

      We have added more details.

      ○ Squema = scheme ??

      We have replaced the word “schema” with “diagram”.

      ○ What was the clustering approach?

      As is said in the caption of Figure 1 the clustering is hierarchical: “hierarchical clustering of real samples and their simulations from validation data set.” And we have added that “The clustering distance used is the JS divergence between the two subsets.”

      ○ What is the input to the JS calculation? What is the dimension of the distributions compared? These details need to be precisely provided.

      The distribution has two dimensions, sizes and counts or positions and counts.

      As said before, to clarify the used distributions in the JS divergence calculation, we have changed the following piece of text in section “Simulations evaluation” of methods:

      “ Afterward, we tested the performance on the fifth fold, generating the simulated sequences with the same target and gRNA as the samples that belong to the fifth fold, in order to calculate the distance between these. The final validation, with the mean parameters of the different training interactions, was performed on a testing data set that was not used in the training. Validation was done with samples that had never taken place in the training process. Jensen distance is used to compare the characterization of real samples and simulated samples since this is the explored distance that differentiates better replicates among samples. In order to obtain the different distributions, the T cell data, including 1.521 unique cut sites, was split into different datasets based on the different classes: deletions, insertions and substitutions. For each of these classes, giving as input the datasets with only that class, we obtained the distribution for size and then for position of indels. The same was done for the other three cell lines: K562, HEK293 and HCT116, which included 96 unique cut sites, with three replicates each. The whole datasets (with 1521 and 96 unique cut sites) were split into five-folds (4 for training and one for test) and validation, in order to train and validate the simulator. Using the parameters obtained during the training-test iterations (the average value of the 5 iterations), we generate simulated sequences with the same target and gRNA as the samples that are assigned to the test subset to calculate the Jensen-Shannon (JS) divergence between the simulated and real samples of that subset. Finally, the same was performed for validation. The input for the distance calculations were the generated simulated subset and its real equivalent (same target and gRNA) distributions of the classes. ”
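      The training scheme in the quoted methods text (hold out a validation set, split the rest into five folds, average the parameters obtained over the five training iterations) can be sketched as below. The validation fraction, the `fit_parameters` stand-in, and the toy data are assumptions made for illustration. They are not taken from the manuscript or from SimGE.

```python
import random


def split_validation(sample_ids, validation_fraction=0.2, seed=0):
    """Shuffle once and hold out a validation subset that never enters training."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_val = int(len(ids) * validation_fraction)
    return ids[n_val:], ids[:n_val]


def five_folds(sample_ids, n_folds=5):
    """Yield (train, test) pairs: four folds for training, one for testing."""
    ids = list(sample_ids)
    folds = [ids[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test


def mean_parameters(per_fold):
    """Average each parameter over the training iterations."""
    n = len(per_fold)
    return {key: sum(p[key] for p in per_fold) / n for key in per_fold[0]}


def fit_parameters(train_ids, deletion_size):
    """Hypothetical stand-in for simulator training: a single summary parameter."""
    sizes = [deletion_size[s] for s in train_ids]
    return {"mean_deletion_size": sum(sizes) / len(sizes)}


# Toy data: one hypothetical deletion size per sample.
deletion_size = {f"sample_{i}": 1 + (i % 7) for i in range(50)}
dev_ids, validation_ids = split_validation(deletion_size)
per_fold = [fit_parameters(train, deletion_size) for train, _ in five_folds(dev_ids)]
final_parameters = mean_parameters(per_fold)
print(final_parameters, len(validation_ids))
```

      In the scheme described in the response, the averaged parameters would then drive the simulator for the samples in the test and validation subsets, and the JS divergence would be computed between each simulated sample and its real equivalent.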

      ○ What clustering/aggregation approach did the authors use here (average dist, min dist, dist of centers?)

      Hierarchical clustering.
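      Hierarchical clustering still requires a linkage rule, which is what the reviewer's question about average or minimum distance refers to. The sketch below uses average linkage as an assumption, since the response does not say which rule was used, and the sample labels and JS divergences are hypothetical.

```python
def average_linkage(labels, dist):
    """Agglomerative clustering with average linkage.

    `dist` maps frozenset({a, b}) to the pairwise distance between samples a and b.
    Returns the merges in order as (cluster_1, cluster_2, height).
    """
    clusters = [(label,) for label in labels]
    merges = []

    def cluster_distance(c1, c2):
        # Average of all pairwise distances between members of the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        height, i, j = min(
            (cluster_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical JS divergences between two real samples and their simulations.
labels = ["A_real", "A_sim", "B_real", "B_sim"]
dist = {
    frozenset(("A_real", "A_sim")): 0.05,
    frozenset(("B_real", "B_sim")): 0.08,
    frozenset(("A_real", "B_real")): 0.40,
    frozenset(("A_real", "B_sim")): 0.45,
    frozenset(("A_sim", "B_real")): 0.42,
    frozenset(("A_sim", "B_sim")): 0.43,
}
merges = average_linkage(labels, dist)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

      With these toy values each real sample first joins its own simulation, and the two pairs merge last, which is the pattern Figure 1C is described as showing.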

      ○ 5 pairs were selected out of how many? Call that number K.

      We have 100 samples in the validation set. Following the suggestion of indicating the total number of samples in the testing set, we have added this information to the figure caption.

      ○ What does the order of the samples in 1C mean? Is 98_real closer to 22_sim than to 98_sim? If so then state it. If not - what is the meaning of the order? Furthermore - how often, over K choose 2 pairs does this mis-matching occur for the CRISPR-A simulator??

      Exactly, it is a hierarchical clustering, where samples are sorted by JS divergence. It was already stated in Results: “In addition, on top of comparing the distance between the experimental sample and the simulated, we have included two experimental samples, SRR7737722 and SRR7737698, which are replicates. These two and their simulated samples show a low distance between them and a higher distance with other samples.” As well as in Figure 1 caption: “For instance, SRR7737722 and SRR7737698, which cluster together, are the real sample and its simulated sample for two replicates.” Since these samples are replicates, their simulations come from the same input, and a low distance is expected between the two real samples as well as between each of them and their simulations. We have stated it in the discussion.

      • "From the characterized data we obtained the probability distribution of each class" (page 3) - How is this done? how many guides? how many replicates? what is class? where do you elaborate regarding it? how you obtain the distributions? More details of the methods need to be provided.

      Added in methods.

      • The 96 samples used for development here - where are they taken from? This should be indicated in the first time these samples are mentioned. Namely - bottom of P6

      Added: “The 96 samples, from these cell lines, are obtained from a public dataset BioProject PRJNA326019.”

      • CRISPECTOR is not mentioned in the comparison in the section: "CRISPR-A effectively calls indels in simulated and edited samples" (Table S2). Is there a specific reason for having left it out?

      CRISPECTOR, as well as ampliCan, is not in Table S2, since this table shows detailed data from Figure 2. CRISPECTOR is compared with CRISPR-A in figure 5, where the different approaches to enhance precision, like using a negative control, are explored.

      • In the section "Improved discovery and characterization of template-based alleles or objective modifications" - part of the analysis was made over simulated data and then over real data. The authors state "it is difficult to explain the origin of these differences...". Thus, needs to be investigated in more detail ... :) (P5) Moreover - the performance over real data is, at the end of the day, the more interesting one for comparison purposes.

      We have added this sample to the human validated dataset to understand better what was happening in this case and the results and pertinent discussion have been added in the manuscript: “CRISPResso2 is detecting a 2% more of reads classified as WT. These 2% correspond with the percentage classified as indels by CRISPR-A. In total, the percentage difference between CRISPResso2 and CRISPR-A template-based class is 0.6%, higher in CRISPR-A. CRISPR-A percentage is closer to the ground truth data than CRISPResso2.”

      • We found no explanation of "spike-in"/"spike experimental data" across the entire article. There is some general language about lengths but the scheme is still totally unclear.

      We have indicated in the methods section where we refer to the spike-in controls.

      • Description of the 96 gRNAs? Is this data from REF26? If so - where do you state this? If so - how do the methods described herein avoid the unique characteristics of the data of REF26?

      We have added the reference: “The 96 samples, from these cell lines, are obtained from a public dataset BioProject PRJNA326019.” In addition, there are other sources of data, simulations and now even human validated data.

      • "distance between the percentage of microhomology mediated end-joining deletions of samples with the same target was calculated and the mean of all these distances was used to reduce the information of the 96 different targets to a single one." (P6) What is the exact calculation used? which distance? How was clustering performed? What is the connection for gene expression?

      The distance used was the Euclidean distance, and the clustering was performed using hierarchical clustering. We have added this information to the manuscript. Regarding the connection with gene expression, we are exploring the correlation of two phenotypes: the gene expression of the proteins differentially related to the NHEJ and MMEJ pathways, and the gene editing landscape (indel patterns that are related to MMEJ and those that are more prone to be generated by NHEJ). We have tried to improve this explanation in the manuscript.
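      One possible reading of the calculation described in this response is sketched below: for a pair of cell lines, a Euclidean distance between their MMEJ deletion percentages is computed per target, and the mean over targets summarizes the pair in a single number. Treating the replicates of a target as the components of a vector is an assumption, and the target names and percentages are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equally long vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_target_distance(mmej_a, mmej_b):
    """Mean over shared targets of the distance between MMEJ deletion percentages.

    Each argument maps a target to a tuple of percentages, one per replicate.
    """
    shared = sorted(set(mmej_a) & set(mmej_b))
    if not shared:
        raise ValueError("the two cell lines share no targets")
    return sum(euclidean(mmej_a[t], mmej_b[t]) for t in shared) / len(shared)


# Hypothetical MMEJ deletion percentages (two replicates per target).
cell_line_1 = {"target_1": (10.0, 12.0), "target_2": (30.0, 30.0)}
cell_line_2 = {"target_1": (13.0, 16.0), "target_2": (30.0, 30.0)}
print(mean_target_distance(cell_line_1, cell_line_2))
```

      The resulting pairwise values between cell lines would form the distance matrix passed to the hierarchical clustering mentioned in the response.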

      • "we have fitted a linear model to transform the indels count depending on its difference in relation to the reference amplicon" (P7) - needs more explanation. Is this part of the pipeline?

      We have explained better how we have fitted the linear model in methods: “A linear regression model was fitted to obtain the parameters of Equation 1 using spike-in controls experimental data (original count, observed count and size of the change in the synthetic molecules). We have used the lm function from R. Parameter m in Equation 1 is equivalent to the obtained coefficient estimate of x which was 0.156 and n is the intercept (n=10).”

      The model is optionally used as part of the pipeline as explained at the end of section “CRISPR-A gene editing analysis pipeline” to correct amplification biases due to differences in amplicon size. Then, what is part of the pipeline is the use of this model to make the transformation of counts from the observed counts to the predicted original counts. This is done with Equation 1 and can be found in the pipeline (VC_parser-cigar.R).
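      The fitting step mentioned here, a straight line fitted by ordinary least squares as R's `lm` does for a single predictor, can be sketched as follows. The exact form of Equation 1 is not reproduced in this response, so `x` and `y` are placeholders, and the data points are synthetic values placed on a line with the reported slope (0.156) and intercept (10) purely to show that the fit recovers them.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = m * x + n; returns (m, n)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


# Synthetic placeholder data: x could be the size change of a spike-in molecule.
xs = [-20, -10, 0, 10, 20]
ys = [0.156 * x + 10 for x in xs]

m, n = fit_line(xs, ys)
print(f"m = {m:.3f}, n = {n:.3f}")
```

      In the pipeline the fitted `m` and `n` would then be plugged into Equation 1 to transform observed counts into predicted original counts. That transformation is not shown here because its form is not given in the response.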

      • What is it "...manually curated data set"? (page 8)

      This is explained in “Manual curation of 30 edited samples” in methods.

      • Section "CRISPR-A empiric model removes more noise than other approaches" - with what data were the comparisons performed? Moreover, how were the comparison criteria selected (efficiency and sensitivity)? The literature already used several approaches to compare data analysis tools for editing experiments. See for example ampliCan, Crispresso (1 and 2) and CRISPECTOR. Maybe the authors should follow similar lines.

      The data used in this comparison comes from reference 26: “26. van Overbeek, M. et al. DNA Repair Profiling Reveals Nonrandom Outcomes at Cas9-Mediated Breaks. Mol. Cell 63, 633–646 (2016).” We have added it to the manuscript.

      The values of efficiency and sensitivity were not used directly for the comparison. We first wanted to evaluate our own algorithm. For that we obtained the values of efficiency and sensitivity for the previously mentioned dataset. These values were chosen to get an idea of, firstly, how much noise the algorithm is able to detect and, secondly, how much of it can be removed after the Tx vs M process. That established a framework in which we can then directly compare the percentage of editing reported by the different tools.

      Regarding the approaches used to compare data analysis tools for editing experiments, we are going to explain why we haven’t followed similar lines or how we have now included it:

      In the case of ampliCan, the comparison that they do is with a synthetic dataset with introduced errors:

      "synthetic benchmarking previously used to assess these tools (Lindsay et al. 2016), in which experiments were contaminated with simulated off-target reads that resemble the real on-target reads but have a mismatch rate of 30% per base pair".

      In CRISPResso2, they benchmarked the efficiency against an in-house dataset, but this dataset is not published. Finally, for the benchmarking of CRISPECTOR, a manually curated dataset is used as a standard: "Assessment of such classification requires the use of a gold standard dataset of validated editing rates. In this analysis, we define the validated percent indels as the value determined through a detailed human investigation of the individual raw alignment results". In this sense, we have added a human validated dataset to do something similar to complement the analysis that we had already done.

      In the end, we consider that simulated or synthetic datasets, such as those used by ampliCan or CRISPResso2, do not capture the complete landscape of confounding events that can be detrimental to the analysis results. Similar limitations are found in the use of a gold standard dataset of validated editing rates, since the number of reads or samples that can be validated by humans is small because validation is time consuming. In addition, humans can also make errors and have biases. Even so, we found it very valuable to add a human validated dataset to complete our exploration.

      • In the section "CRISPR-A empiric model removes more noise than other approaches" the authors state, incorrectly, that CRISPECTOR only reports the percentage of editing activity per site (there is much more information reported in the HTML report, including the type of edit event detected - deletion, of various lengths, insertions, substitutions etc). (P8) We thank the reviewer for the observation, as the statement is indeed incorrect. What we wanted to express is that with CRISPECTOR we cannot individually trace each of the called indels, as no spreadsheet or other file with this content is provided in the output. Therefore, we cannot investigate which events have been corrected. To be precise in our statement, we changed this sentence to the following:

      “CRISPECTOR, although providing extensive information on the statistics and information about the indels, is not possible to track the reads along their pipeline, thus we cannot know which have been corrected and which have not.”

      • Section "CRISPR-A noise subtraction pipeline" describes a pretty naive method for noise subtraction (P12). Should be rigorously compared, for Tx vs Mock experiments, to CRISPECTOR and to CRISPResso2. In the section "CRISPR-A empiric model removes more noise than other approaches", we perform an exhaustive comparison with a dataset that contains 288 Mock Files vs 864 Tx files. This can be better appreciated in the, now included, figure Sup. 13A. CRISPResso2 was intentionally left out since their pipeline does not use a model to reduce noise but other approaches like reducing the quantification window.

      • "recalculated using a size bias correction model based on spike-in controls empiric data.." (P14). Where is the formula? The formula comes from Equation 1. Now it is correctly referenced.

      • Section "Noise subtraction comparison with ampliCan and CRISPECTOR" - fake mock was generated for comparison. We consider the avoidance of a Mock control in experiments designed to measure editing activity to not be best practice. It is OK to support this approach in CRISPR-A. However - the comparison to tools that predominantly work using a Mock control (including ampliCan and CRISPECTOR) should be done with actual Mock. Not with fake Mock .... (P15) We understand the reviewer's concern on this point, as the use of a “fake” mock may not be best practice for general comparisons. Nevertheless, what we wanted to compare here is the difference in editing percentages with and without a mock. Since CRISPECTOR requires a mock to run on-target data, the only way to replicate the “no mock” condition was to use a synthetic file with the same characteristics as the treated files in terms of depth, but with no editing or noise events, to avoid any correction outside this framework. The other run was made with the 288 real Mocks. This was an ad hoc solution for CRISPECTOR; with ampliCan we used only real mocks, since it allows runs without a mock for on-target analysis.

      We replaced the word “fake” with “synthetic” in the “Noise subtraction comparison with ampliCan and CRISPECTOR” section:

      “As for CRISPECTOR, since it requires a mock file to perform on-target analysis, synthetic mock files were generated”.

      Minor comments:

      • "Also, most of these tools lack important functionalities like reference identification, clustering, or noise subtraction" - bold part incorrect for CRISPECTOR, although it is not aiming only for CRISPECTOR. Supplementary Table 1 already shows which functionalities each tool has. We have also added more context to that statement to highlight the differences between the tools:

      “Even not all of them have the same missing functionalities, as can be seen in the Supplementary table 1, CRISPR-A is the only tool that can identifies the amplicon reference from in a reference genome, correct errors through UMI clustering and sequence consensus, correct quantification errors due to differences in amplicon size, and includes interactive plots and a genome browser representation of the alignment.”

      • "Same parameters and probability distributions were fitted for three other cell lines: Hek293, K562, and HCT11626, to make SimGE more generalizable and increase its applicability" (page 3) - how was fitted? It was fitted in the same way as for the T-cell samples, as specified in the Methods. We have expanded the Methods to explain how SimGE is built.

      • What is the "nature of modification"? (P5) We have replaced “nature” with “type” for better understanding.

      • In the section "CRISPR-A effectively calls indels in simulated and edited samples" (P5) towards the end, the authors write that the CRISPR-A algorithm did not give good results for a few examples. They then state that this was corrected and then yielded good results. There is no explanation of what correction was done, if it was implemented in the code and how to avoid/detect it in further cases. The problem was that the used reference sequence was too short. There is no modification in CRISPR-A code, we have just used the whole amplicon reference sequence obtained with the amplicon reference identification functionality of CRISPR-A. We have tried to explain it better in the manuscript: “Once the reference sequence is corrected used is the one corresponding to the whole reference amplicon, obtained with CRISPR-A amplicon sequence discovery function, CRISPR-A shows a perfect edition profile”

      • Cell culture, transfection, and electroporation - explanation only for HEK293, what about the others? (P15) We had already explained it for HEK293 and for C2C12, which are the experiments done by us. For the analysis of the three cell lines and 96 targets, we reference the source of the data, as these data were not produced in our lab.

      • Typos and unclear wording: ○ "obtention" (P8) → changed to "obtaining"

      ○ "mico" >> micro (P 7,10) → changed

      ○ "Squema" >> scheme (Fig.1) → changed

      ○ "decombuled" (P10) → changed to "separated"

      ○ "empiric" >> empirical (P8 and other places) → changed

      ○ "Delins" (P14) → this is not a typo; it is used to indicate that a deletion and an insertion have taken place (http://varnomen.hgvs.org/recommendations/DNA/variant/delins/)

      ○ "performancer" (P9) → changed to "performance"

      ○ Change word across all article - "edition" to "editing" → changed. In the case of "edition windows", it has been changed to "quantification windows".

      ○ "...has enough precision to find" (P6) not related to "results" section → We have moved to discussion.

      • Comments on figures: ○ Fig. 2C:

      ■ No CRISPECTOR in the analysis

      It is not included because for on-target analysis this tool requires a mock control sample. For this reason, it is compared in Figure 5D, where samples using negative controls are compared, and in Figure 5E where all tools and their different analysis options are compared.

      ■ It is simulated data only

      Yes, it is. Comparison with real data is done in Figure 2D and 2E. We have now also added ground truth data to our comparisons, obtained from human validation of the classification of more than 3,000 different reads.

      ■ It is not violin plot as mentioned in the description

      It is a violin plot, but in general there is not much dispersion of the data points making the density curves flat.

      ○ Fig 3A - Is it significant? Yes, it is. We have added this information in the caption of the figure.

      ○ Fig. 4:

      ■ A

      • Each row/column is a vector of 96 guides? No, as it is said in the caption of the figure, it is the “mean between the distances calculated for each of the 96 different targets.”

      • How is the replicate number decided? Is it a different experiment by date? What is separating between experiments? Rep numbers? All this information can be found in the paper from which this dataset comes, which is already referenced.

      ■ B - Differential expression:

      We realized that the caption was incorrect: the explanation for Fig. 4B was missing, and all the following explanations had been shifted to the previous letter.

      • How? did you measure RNA? It is already stated in methods that RNAseq data was obtained from SRA database and the analysis was done using nf-core/rnaseq pipeline: “RNAseq differential expression analysis of samples from BioProject PRJNA208620 and PRJNA304717 was performed using nf-core/rnaseq pipeline⁵².”

      • Is the observed data in the figure sufficiently strong in terms of P-value? Yes, as highlighted in the plot with ** and ***. We have also added the p-value to the caption of the figure.

      • Where is the third cell-line? As mentioned in the text, we have chosen only the cell lines that show the largest difference in the percentage of MMEJ: “HCT116 than in K562, which are the cell lines with the major and minor ratios of MMEJ compared with NHEJ, respectively”.

      ○ Fig.13 - There is no A and B as mentioned in the text

      We thank the reviewer for the observation as we mistakenly uploaded the wrong figure. We corrected it.

      Reviewer #2 (Significance (Required)):

      We repeat the aspects of contribution, as listed in the first part of the review, and comment about significance:

      • NextFlow pipeline implementation is an important engineering contribution. Same is true for the interactive web application

        Significant engineering contribution. Nonetheless, we were not able to run the analysis. So - needs to be checked.

      Hopefully now that the documentation is properly added to the repository it will be easier to run analysis.

      • The option to simulate an experiment to assess it is a nice feature and can help experiment design

        An important methodology contribution

      • Identification of amplicons when not provided as input

        Not important in the context of multiplex PCR and NGS measurement assays, as amplicons will be known. Not clear what other contexts the authors were aiming at.

      It saves time, as there is no need to look up the sequence of each amplicon and add it as input. It can also help to detect unspecific amplification, since all amplicons of the same genome can be retrieved by the amplicon discovery process. In addition, we have already found one example where this avoids getting incorrect results: “Once the reference sequence used is the one corresponding to the whole reference amplicon, obtained with CRISPR-A amplicon sequence discovery function, CRISPR-A shows a perfect edition profile”. We have added this to the discussion of the manuscript.

      • CRISPR-A seeks substitutions along the entire amplicon sequence and is less dependent on the quantification window and on the putative cutsite

        Importance/significance needs to be demonstrated

      Figure 3 shows the results of template-based edit and substitution detection. CRISPR-A is a versatile and agnostic tool for gene editing analysis. This means that it is prepared for the analysis of gene editing by future tools, since the cut site and other elements of the experiment design are not required. In addition, it has been shown that when a mock is used its performance is comparable to filtering by editing windows, while avoiding the loss of edits when the cut site is shifted.

      • Analysis of the difference, in edit activity, comparing different cell lines

        Significant contribution. However - the methods need to be much better explained and the results better described in order for this to be useful to the community.

      We have made an effort to try to be more clear in the description of the results.

      • CRISPR-A supports the use of UMIs

        Mildly significant technical contribution. However - only addresses on-target. Also addressing off-target would have been significant.

      The use of UMIs is something that has never been done before in this context. Otherwise, sequencing biases are not taken into account and editing percentages are reported as observed. Being able to differentiate between the individual molecules present at the start of amplification allows higher precision, avoiding under- or overestimation of each of the species in a bulk of cells.

      Off-target analysis can certainly be done by sequencing the predicted off-target sites. In addition, there are other methods, like GuideSeq, that can be used to discover off-targets, but this kind of data is out of the scope of CRISPR-A. Even so, we are aware of the importance of being able to analyse off-targets in the context of a broad analysis platform, and we will take this into consideration when participating in the building of the crisprseq pipeline from nf-core.
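
      To illustrate the point about UMIs, the following is a minimal sketch, not the CRISPR-A implementation. It assumes reads arrive as hypothetical `(umi, allele)` pairs, that reads sharing an identical UMI come from the same original molecule, and that the consensus allele of a UMI group is simply its most frequent allele. Real pipelines typically also tolerate sequencing errors within the UMI itself, which is omitted here.

```python
from collections import Counter, defaultdict


def percent_edited_raw(reads):
    """Percentage of reads whose allele is not wild type ("WT")."""
    edited = sum(1 for _, allele in reads if allele != "WT")
    return 100.0 * edited / len(reads)


def percent_edited_by_umi(reads):
    """Percentage of edited molecules after collapsing reads by UMI.

    Assumption: identical UMI means same original molecule, and the
    most frequent allele within a UMI group is taken as its consensus.
    """
    by_umi = defaultdict(Counter)
    for umi, allele in reads:
        by_umi[umi][allele] += 1
    consensus = [counts.most_common(1)[0][0] for counts in by_umi.values()]
    edited = sum(1 for allele in consensus if allele != "WT")
    return 100.0 * edited / len(consensus)


# Hypothetical example: two original molecules, one of which was
# amplified three times as much as the other.
example_reads = [
    ("AAA", "WT"),
    ("CCC", "del3"),
    ("CCC", "del3"),
    ("CCC", "del3"),
]
raw_pct = percent_edited_raw(example_reads)
umi_pct = percent_edited_by_umi(example_reads)
```

      In this toy input the raw read count suggests 75% editing, while counting one consensus per UMI gives 50%, which is the kind of overestimation the response refers to.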

      • Interesting sequence pattern insights - like "...found certain patterns associated with low diversity outcomes: free thymine or adenine at the 3' nucleotide upstream of the cut site that leads to insertions of the same nucleotide, a free cytosine at the same place that leads to its loss, and strong micro-homology patterns that lead to long deletions "

        As stated - interesting.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      CRISPR-Analytics, abbreviated as CRISPR-A, is a web application implementing a tool for analyzing editing experiments. The tool can analyze various experiment types - single cleavage experiments, base editing, prime editing, and HDR. The required data for the analysis consists of NGS raw data or simulated data, in fastq, protospacer sequence and cut site. Amplicon sequence is also needed in cases where the amplified genome is absent from the genome reference list. The tool pipeline is implemented in NextFlow and has an interactive web application for visualizing the results of the analysis, including embedding the results into an IGV browser. The authors developed a gene editing simulation mechanism that enables the user to assess an experiment design and to predict expected outcomes. Simulated data was generated by SimGE over primary T-cells. The parameters and distributions were also fitted for 3 cell lines to make it more generalized (Hek293, K562, and HCT116). The process simulated CRISPR-CAS9 activity and the resulting insertions, deletions, and substitutions. The simulation results are then compared to the experimental results. The authors report the Jensen-Shannon (JS) divergence between the results. The exact distributions that served as input to the JS are not well defined in the manuscript (see below).
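
      For readers unfamiliar with the metric, the Jensen-Shannon divergence mentioned above can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' code: the inputs are assumed to be two vectors of non-negative counts or frequencies over the same ordered set of outcome classes (for example, hypothetical indel classes), which is exactly the detail the manuscript is asked to specify. With logarithm base 2 the result lies between 0 and 1.

```python
import math


def js_divergence(p, q, base=2.0):
    """Jensen-Shannon divergence between two discrete distributions.

    p and q are sequences of non-negative weights over the same
    ordered classes; each is normalized to sum to 1 before comparison.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each distribution must have positive total weight")
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # Mixture distribution M = (P + Q) / 2
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a_i == 0 contribute 0.
        return sum(x * math.log(x / y, base) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical class frequencies for a simulated and a real sample.
simulated = [40, 30, 20, 10]
observed = [35, 30, 25, 10]
divergence = js_divergence(simulated, observed)
```

      The value depends on how outcomes are binned into classes, so the dimension and definition of the compared distributions need to be reported alongside it.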

      The authors also report an investigation of different alignment approaches and how they may affect the resulting characterization of editing activity. The authors examine three different approaches to increase what they call "edit quantification accuracy" (aka, in a different place - "precise allele counts determination" - what is this???): (1) spike-in controls (2) UMI's and (3) using mock to denoise the results. See below for our comments about these approaches. Moreover, the authors developed an empirical model to reduce noise in the detection of editing activity. This is done by using mock (control), and by normalization and alignment of reads with indels, with the notion and observation that indels that are far from the cut site tend to classify as noise. The authors then perform a comparison between 6 different tools, in the context of determining and quantifying editing activities. One important comparison approach uses manually curated data. However - the description of how this dataset was created is far from being sufficiently clear. The comparison is also performed for HDR experiment type, which can be compared only to 2 other tools. We see the potential contribution aspects of the paper to be the following:

      1. NextFlow pipeline implementation is an important engineering contribution. Same is true for the interactive web application
      2. The option to simulate an experiment to assess it is a nice feature and can help experiment design
      3. Identification of amplicons when not provided as input
      4. CRISPR-A seeks substitutions along the entire amplicon sequence and is less dependent on the quantification window and on the putative cutsite
      5. Analysis of the difference, in edit activity, comparing different cell lines
      6. CRISPR-A supports the use of UMIs
      7. Interesting sequence pattern insights - like "...found certain patterns associated with low diversity outcomes: free thymine or adenine at the 3' nucleotide upstream of the cut site that leads to insertions of the same nucleotide, a free cytosine at the same place that leads to its loss, and strong micro-homology patterns that lead to long deletions " We further comment on the soundness of these contributions in our comments below and on their significance in our comments related to the general potential significance of the paper.

      Major comments:

      • Upon attempting to run an analysis from the web interface (https://synbio.upf.edu/crispr-a) and using: fastq of Tx and mock (control), the human genome and the gRNA sequence provided as input for the protospacer field, our run was not successful. In fact the site crashed with no interpretable error message from CRISPR-A.
      • Moreover, there should be more clear context. There is no information regarding the type of experiments that can be analyzed with the tool. We figure it is multiplex PCR and NGS but can the tool also be used for GUIDESeq, Capture, CircleSeq etc.?
      • No off target analysis. Only on-target
      • No translocations and long/complex deletions
      • We view the use of a mock experiment as control as a must for any sound attempt to measure edit activity. This is even more so when off-target events need to be assessed (any rigorous application of GE, certainly any application aiming for clinical or crop engineering purposes). We therefore think that all investigation of other approaches should be put in this context.
      • It's a nice feature to have simulated data, however, it is not a good approach to rely on it.
      • In p7 the authors indicate the implementation of three approaches to improve quantification. They should be clear as to the fact that many other tools and experimental protocols are also using these approaches. For example, ampliCan, CRISPResso2 and CRISPECTOR all take into account a mock experiment run in parallel to the treatment.
      • Figure1:
        • The figure certainly provides what seems to be a positive indication of the simulations approach being close to measured results. Much more details are needed, however, to fully understand the results.
        • Squema = scheme ??
        • What was the clustering approach?
        • What is the input to the JS calculation? What is the dimension of the distributions compared? These details need to be precisely provided.
        • What clustering/aggregation approach did the authors use here (average dist, min dist, dist of centers?)
        • 5 pairs were selected out of how many? Call that number K.
        • What does the order of the samples in 1C mean? Is 98_real closer to 22_sim than to 98_sim? If so then state it. If not - what is the meaning of the order? Furthermore - how often, over K choose 2 pairs does this mis-matching occur for the CRISPR-A simulator??
      • "From the characterized data we obtained the probability distribution of each class" (page 3) - How is this done? How many guides? How many replicates? What is a class? Where do you elaborate on it? How do you obtain the distributions? More details of the methods need to be provided.
      • The 96 samples used for development here - where are they taken from? This should be indicated in the first time these samples are mentioned. Namely - bottom of P6
      • CRISPECTOR is not mentioned in the comparison in the section: "CRISPR-A effectively calls indels in simulated and edited samples" (Table S2). Is there a specific reason for having left it out?
      • In the section "Improved discovery and characterization of template-based alleles or objective modifications" - part of the analysis was made over simulated data and then over real data. The authors state "it is difficult to explain the origin of these differences...". Thus, needs to be investigated in more detail ... :) (P5) Moreover - the performance over real data is, at the end of the day, the more interesting one for comparison purposes.
      • We found no explanation of "spike-in"/"spike experimental data" across the entire article. There is some general language about lengths but the scheme is still totally unclear.
      • Description of the 96 gRNAs? Is this data from REF26? If so - where do you state this? If so - how do the methods described herein avoid the unique characteristics of the data of REF26?
      • "distance between the percentage of microhomology mediated end-joining deletions of samples with the same target was calculated and the mean of all these distances was used to reduce the information of the 96 different targets to a single one." (P6) What is the exact calculation used? which distance? How was clustering performed? What is the connection for gene expression?
      • "we have fitted a linear model to transform the indels count depending on its difference in relation to the reference amplicon" (P7) - needs more explanation. Is this part of the pipeline?
      • What is it "...manually curated data set"? (page 8)
      • Section "CRISPR-A empiric model removes more noise than other approaches" - with what data were the comparisons performed? Moreover, how were the comparison criteria selected (efficiency and sensitivity)? The literature already used several approaches to compare data analysis tools for editing experiments. See for example ampliCan, CRISPResso (1 and 2) and CRISPECTOR. Maybe the authors should follow similar lines.
      • In the section "CRISPR-A empiric model removes more noise than other approaches" the authors state, incorrectly, that CRISPECTOR only reports the percentage of editing activity per site (there is much more information reported in the HTML report, including the type of edit event detected - deletion, of various lengths, insertions, substitutions etc). (P8)
      • Section "CRISPR-A noise subtraction pipeline" describes a pretty naive method for noise subtraction (P12). Should be rigorously compared, for Tx vs Mock experiments, to CRISPECTOR and to CRISPResso2.
      • "recalculated using a size bias correction model based on spike-in controls empiric data.." (P14). Where is the formula?
      • Section "Noise subtraction comparison with ampliCan and CRISPECTOR" - fake mock was generated for comparison. We consider the avoidance of a Mock control in experiments designed to measure editing activity to not be best practice. It is OK to support this approach in CRISPR-A. However - the comparison to tools that predominantly work using a Mock control (including ampliCan and CRISPECTOR) should be done with actual Mock. Not with fake Mock .... (P15)
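
      The question above about the "exact calculation" behind the per-target distance can be made concrete with a small sketch. It is only one possible reading of the quoted sentence, not the authors' method: it assumes each sample is a hypothetical mapping from target ID to its percentage of MMEJ deletions, and that the distance per target is the absolute difference between the two samples. The quoted text does not say which distance was used.

```python
from statistics import mean


def mean_target_distance(sample_a, sample_b):
    """Mean absolute difference in MMEJ percentage over shared targets.

    sample_a and sample_b map a target ID to the percentage of
    microhomology-mediated end-joining deletions for that target.
    Assumption: the per-target distance is |a - b|.
    """
    shared = sorted(set(sample_a) & set(sample_b))
    if not shared:
        raise ValueError("samples share no targets")
    return mean(abs(sample_a[t] - sample_b[t]) for t in shared)


# Hypothetical MMEJ percentages for three targets in two samples.
sample_1 = {"t1": 10.0, "t2": 20.0, "t3": 30.0}
sample_2 = {"t1": 12.0, "t2": 17.0, "t3": 31.0}
distance = mean_target_distance(sample_1, sample_2)
```

      Applying such a function to every pair of samples would reduce the per-target information to one value per pair, which could then feed a clustering step; the choice of distance and linkage would still need to be stated.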

      Minor comments:

      • "Also, most of these tools lack important functionalities like reference identification, clustering, or noise subtraction" - bold part incorrect for CRISPECTOR, although it is not aiming only for CRISPECTOR
      • "Same parameters and probability distributions were fitted for three other cell lines: Hek293, K562, and HCT11626, to make SimGE more generalizable and increase its applicability" (page 3) - how was fitted?
      • What is the "nature of modification"? (P5)
      • In the section "CRISPR-A effectively calls indels in simulated and edited samples" (P5) towards the end, the authors write that the CRISPR-A algorithm did not give good results for a few examples. They then state that this was corrected and then yielded good results. There is no explanation of what correction was done, if it was implemented in the code and how to avoid/detect it in further cases.
      • Cell culture, transfection, and electroporation - explanation only for HEK293, what about the others? (P15)
      • Typos and unclear wording:
        • "obtention" (P8)
        • "mico" >> micro (P 7,10)
        • "Squema" >> scheme (Fig.1)
        • "decombuled" (P10)
        • "empiric" >> empirical (P8 and other places)
        • "Delins" (P14)
        • "performancer" (P9)
        • Change word across all article - "edition" to "editing"
        • "...has enough precision to find" (P6) not related to "results" section
      • Comments on figures:
        • Fig. 2C:
          • No CRISPECTOR in the analysis
          • It is simulated data only
          • It is not violin plot as mentioned in the description
        • Fig 3A - Is it significant?
        • Fig. 4:
          • A
            • Each row/column is a vector of 96 guides?
            • How is the replicate number decided? Is it a different experiment by date? What is separating between experiments? Rep numbers?
          • B - Differential expression:
            • How? did you measure RNA?
            • Is the observed data in the figure sufficiently strong in terms of P-value?
            • Where is the third cell-line?
        • Fig.13 - There is no A and B as mentioned in the text

      Significance

      We repeat the aspects of contribution, as listed in the first part of the review, and comment about significance:

      • NextFlow pipeline implementation is an important engineering contribution. Same is true for the interactive web application
        • Significant engineering contribution. Nonetheless, we were not able to run the analysis. So - needs to be checked.
      • The option to simulate an experiment to assess it is a nice feature and can help experiment design
        • An important methodology contribution
      • Identification of amplicons when not provided as input
        • Not important in the context of multiplex PCR and NGS measurement assays, as amplicons will be known. Not clear what other contexts the authors were aiming at.
      • CRISPR-A seeks substitutions along the entire amplicon sequence and is less dependent on the quantification window and on the putative cutsite
        • Importance/significance needs to be demonstrated
      • Analysis of the difference, in edit activity, comparing different cell lines
        • Significant contribution. However - the methods need to be much better explained and the results better described in order for this to be useful to the community.
      • CRISPR-A supports the use of UMIs
        • Mildly significant technical contribution. However - only addresses on-target. Also addressing off-target would have been significant.
      • Interesting sequence pattern insights - like "...found certain patterns associated with low diversity outcomes: free thymine or adenine at the 3' nucleotide upstream of the cut site that leads to insertions of the same nucleotide, a free cytosine at the same place that leads to its loss, and strong micro-homology patterns that lead to long deletions "
        • As stated - interesting.
  8. docdrop.org
    1. Many issues in education policy have therefore come down to an apparent choice between the individual success of comparatively privileged students and the collective good of all students or the nation as a whole.

      I have always believed that individualism is a very selfish behavior pattern. I agree that individualism can make one's life better, or let one have a higher quality of life. But people live in a collective. In a collective, too strong individualism may harm the interests of the collective. There is an old Chinese saying that when there is no lip, the teeth will be very cold. It means that the collective is the umbrella of the individual. Individualism can exist, but I think we should put collective interests first.

    1. so you’ve ignored the potential benefits to your current situation as university students.

      I think another relevant thing here is what the student's goal is. We talked a bit about this in class, but some students just want to get the paper/assignment done in the shortest time and with the least effort. They probably won't use this process because it's not a requirement like it was in high school. They may not even consider that it's worth writing the best paper they can and considering strategies to help them do so.

    1. Reviewer #2 (Public Review):

      By now, the public is aware of the peculiarities underlying the Omicron variant's emergence and global dissemination. This study investigates the mutational biography underlying how mutation effects and epistasis manifest in binding to therapeutic receptors.

      The study highlights how epistasis and other mutation effect measurements manifest in phenotypes associated with antibody binding with respect to spike protein in the omicron variant. It rigorously tests a large suite of mutations in the omicron receptor binding domain, highlighting differences in how mutation effects affect binding to certain therapeutic antibodies.

      Interestingly, mutations of large effect drive escape from binding to certain antibodies, but not others (S309). The difference in the mutational signature is the most interesting finding, and in particular, the signature of how higher-order epistasis manifests in the partial escape in S309, but less so in the full escape of other antibodies.

      The results are timely, the scope enormous, and the analyses responsible.

      My only main criticisms walk the stylistic/scientific line: many of the authors have pioneered discussions and methods relating to the measurement of epistasis in proteins and other biomolecules. While I recognize that the purpose of this study is focused on the public health implications, I would have appreciated more of a dive into the peculiarity of the finding with respect to epistasis. I think the authors could achieve this by doing the following:

      a) Reconciling discussions around the mutation effects in light of contemporary discussions of global epistasis "vs" idiosyncratic epistasis, etc. Several of the authors of the manuscript have written other leading manuscripts of the topic. I would appreciate it if the authors couched the findings within other studies in this arena.

      b) While the methods used to detect epistasis in the manuscript make sense, the authors surely realize that the choice of methods used to measure it is a contentious dimension of the field. I'd appreciate an appeal/explanation as to why their methods were used relative to others. For example, the Lasso correction makes sense, but there are other such methods. Citations and some explanation would be great.

      Lastly (somewhat relatedly), I found myself wanting the discussion to be bolder and more ambitious. The summary, as I read it, is on the nose and very direct (which is appropriate), but I want more: What do the findings say for greater discussions surrounding evolution in sequence space? For discussions of epistasis in proteins of a certain kind? In my view, this data set offers fodder for fundamental discussion in evolutionary biology and evolutionary medicine. I recognize, however, the constraints: such topics may not be within the scope of a single paper, and such discussions may distract from the biomedical applications, which are more relevant for human health.

      But I might say something similar about the biomedical implications: the authors do a good job outlining exactly what happened, but what does this say about patterns (the role of mutations of large effect vs. higher-order epistasis) in some traits vs others? Why might we expect certain patterns of epistasis with respect to antibody binding relative to other pathogenic virus phenotypes?

      In summary: rigorous and important work, and I congratulate the authors.

    1. Author Response

      Reviewer #3 (Public Review):

      In this manuscript, Kim et al. use a deep generative model (a Variational Auto Encoder previously applied to adult data) to characterize neonatal-fetal functional brain development. The authors suggest that this approach is suitable given the rapid non-linear development taking place in the human brain across this period. Using two large neonatal and one fetal datasets, they describe that the resultant latent variables can lead to improved characterization of prenatal-neonatal development patterns, stable age prediction and that the decoder can reveal resting state networks. The study uses already accessible public datasets and the methods have been also made available.

      The manuscript is clearly written, the figures are excellent, and the application in this group is novel. The methods are generally appropriate, although there are some methodological concerns which I think would be important to address. The authors demonstrate that the methods are broadly generalisable across study populations; however, I am unsure about the general interest of the work beyond application of their previously described VAE approach to a new population, and about what new insight this offers into how the human brain develops. This is a particular consideration given that the major results are age prediction (which is easily done with various imaging measures, including something as simple as whole brain volume) and recapitulation of known patterns of functional activity in neonates. As such, the work will be of interest to researchers working in fMRI analysis methods and deep learning, but perhaps less so to a wider neuroscience/clinical readership.

      Specific comments:

      1) (M1) If I understand correctly, the method takes the functional data after volume registration into template space and then projects this data onto the surface. Given the complexities of the changing morphology of the developing brain, would it not be preferable to have the data in surface space for standard space alignment (rather than this being done later)? This would certainly help with one of the concerns expressed by the authors of "smoothing" in the youngest fetuses leading to a negative relationship between age and performance.

      While projecting onto the cortical surface has its advantages, as suggested here [18], several studies have also shown that with careful registration, such as in the current study, volumetric registration can yield comparable performance [19]. Regardless, we did attempt to directly generate cortical surfaces for our fetuses. We refer the reviewer to our response to the RE-M2 [page 9].

      Regarding the “smoothing” effect in the youngest fetuses, we want to clarify that the smoothing effect in the scans of young fetuses is not unique to the choice of registration method. In other words, the same smoothing effect must be seen with cortical registration as well. Regarding this perspective, we kindly refer the reviewer to our response to RE-M1 [page 7]. Regarding the specific change made in the revised manuscript, we kindly refer to our response to R1-m5 [p21] or [page 9 line 191-213] in the main manuscript.

      2) (M2) A key limitation which I feel is important to consider if the method is aiming to be used for fetuses is the effects of the analysis being limited only to the cortical surface - and therefore the role of subcortical tissue (such as developmental layers in the immature white matter and key structures like the thalami) cannot be included. This is important, as in the fetal (and preterm neonatal) brain, the cortex is still developing and so not only might there be not the same kind of organisation to the activity, but also there is likely an evolving relationship with activity in the transient developmental layers (like the subplate) and inputs from the thalamus.

      The reviewer raises an important point. We agree with the reviewer that the subcortical region plays a critical role in fetal and newborn neurodevelopment. Unfortunately, our current VAE model cannot utilize such information without a major change in the model structure. We added this as a limitation of our study and discussed why our VAE model, in its current form, did not include subcortical areas. Please see our detailed response to RE-M1 [page 4] or [page 25 line 558-570] in the main manuscript.

      3) (M3) As the authors correctly describe, brain development and specifically functional relationships are likely evolving across the study time window. Beyond predicting age and a different way of estimating resting state networks using the decoding step, it is not clear to me what new insight the work is adding to the existing literature - or how the method has been specifically adapted for working with this kind of data. Whilst I agree that these developmental processes are indeed likely non-linear, to put the work in context, I think the manuscript would benefit from explaining how (or if) the method has been adapted and explicitly mentioning what additional neuroscientific/biological gains there are from this method.

      We appreciate the reviewer’s critical insights. In the revised paper, we included additional results that, we hope, can address the reviewer’s concerns. We believe that the strength of the VAE model is that, relative to linear models, it can be more generalizable across different datasets and ages (adult vs. full-term babies vs. preterm babies vs. fetuses). In the original manuscript, this was supported by the superior age prediction performance of the VAE over linear models when applied to different datasets covering the fetal to neonatal periods. Age prediction could also be done using other imaging modalities, as the reviewer pointed out. However, we do not think this undermines the potential impact of having the ability to accurately estimate age based on functional connectivity patterns. Brain function-structure relationships may not be exactly one-to-one [20]. It is entirely possible that for a given disease, brain functional connectivity alterations precede structural changes such that delayed growth trajectories will first manifest in the functional space. There are also certain aspects of brain function that cannot be mapped directly to its structural characteristics (i.e., structural connectivity patterns). For example, the brain changes its functional connectivity patterns dynamically across different brain states (resting vs. task-engaging) [21], mental disorders (depression [22], anxiety [23], schizophrenia [24]), cognitive traits [25, 26], and individual uniqueness [25]. Therefore, we believe that estimating the functional age of fetuses and neonates given their functional connectivity profiles may provide a biomarker for tracking neurodevelopmental trajectories, allowing clinicians to identify deviations early and intervene in a timely manner if necessary. For these reasons, we believe that the superior age prediction performance of the VAE model compared to linear models is scientifically significant.

      The value of the VAE lies in its ability to capture FC features that are otherwise not modeled by linear strategies. For example, here, we showed that only the VAE model can extract latent variables representing brain networks that are similar across different datasets. In contrast, linear models showed higher network pattern similarity between full-term and preterm infants within the dHCP dataset. This suggests that the VAE model can be a very useful tool for capturing common brain networks in datasets acquired using different recording parameters and preprocessing steps. Moreover, the VAE representations predicted age with higher accuracy compared to linear representations. Together, these findings show that the methodology is effective in extracting functionally relevant features of the brain. Please see RE-M1 [page 3] and R1-m13 regarding the specific changes made in the revised manuscript.

      4) (M4) The unavoidable smoothing effect of VAE is very noticeable in the figures - does this suggest that the method will be relatively insensitive to the fine granularity which is important to understand brain development and the establishment of networks (such as the evolving boundaries between functional regions with age) - reducing inference to only the large primary sensory and associative networks? This will also be important to consider for the individual "reconstruction degree" - (which it would likely then overstate - and would need careful intersubject comparison also) if it was to be used as a biomarker or predictor of cognition as suggested by the authors.

      Regarding the first concern, yes. Greater smoothing will tend to yield less granular network patterns; this is true for all representational models (not only VAE, but also models like ICA or PCA). The effect becomes more pronounced when representations consist of fewer components (e.g., IC50): the smoothing becomes stronger, leading to coarser brain patterns (see Fig. 3 in the revised manuscript). In this regard, a higher number of components is desirable, but on the flip side, IC maps with more components are generally less interpretable. In short, there will always be a trade-off between interpretability and spatial resolution. Also, a higher number of components tends to cause over-fitting, as shown in our age prediction performance across different datasets (worse performance with IC300 than with IC50). In this sense, what matters for the representations is how informative each latent variable (or component) is. In the revised Fig. 2, we showed that latent variables from the VAE model were more informative in representing rsfMRI than linear representations. It is also noteworthy that the smoothing effect of the VAE is comparable to that of IC300 (similar to manual smoothing at FWHM = 5 mm; revised Fig. 3). Given the above results, we believe the VAE model may be more suitable than linear models for investigating brain networks at a finer scale. The above perspective was added to the revised manuscript at [page 23 line 506-511]:

      "Another interesting observation was that the smoothing effect of the VAE is comparable to IC300 (similar effect to manual smoothing at the level of FWHM=5mm; Fig. 3). Given the above, we believe the VAE model may be more suitable for investigating finer scale of brain networks, than linear models. Perhaps, the VAE model with a greater number of latent variables (e.g., 512 or 1024 instead of 256 in the current VAE) can be utilized to find brain networks at finer scale."

      On top of the points raised above, network mapping with linear models is limited, because of their linear nature, when it comes to mapping the spatial evolution of brain networks with age. This limitation can be observed in the ICA study with the dHCP dataset (Fig. 4 in [7]). On the other hand, thanks to its nonlinear nature, the VAE model may have the potential to capture the spatial gradient of brain networks with age, although this expectation needs confirmation. To that end, we revised our discussion to reflect our perspective. For the full change made in the revised manuscript, we refer the reviewer to our response to R1-m13.
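
      For readers less familiar with the FWHM convention used above, a Gaussian kernel with FWHM = 5 mm corresponds to a standard deviation of roughly 2.12 mm. The following minimal Python sketch shows the conversion and builds a normalized 1-D kernel. The 2 mm voxel size in the usage note and the function names are illustrative assumptions, not parameters taken from the manuscript.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian full width at half maximum to its standard deviation."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_kernel_1d(fwhm_mm, voxel_mm, radius_sigmas=3.0):
    """Return a normalized 1-D Gaussian kernel sampled on the voxel grid.

    voxel_mm is a hypothetical voxel size; radius_sigmas sets where the
    kernel is truncated (an illustrative default, not a study setting).
    """
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm
    half_width = max(1, int(math.ceil(radius_sigmas * sigma_vox)))
    weights = [math.exp(-0.5 * (i / sigma_vox) ** 2)
               for i in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

      Calling `gaussian_kernel_1d(5.0, 2.0)` would give the weights for a 5 mm FWHM kernel on a hypothetical 2 mm grid; a larger FWHM spreads the weights over more voxels, which is the coarsening effect discussed above.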

    1. Author Response

      We thank the reviewers for their positive feedback and thoughtful suggestions that will improve our manuscript. Here we summarise our plan for immediate action. We will resubmit our manuscript once additional experiments have been performed to clarify all the major and minor concerns of the reviewers and the manuscript has been revised. At that point, we will respond to all the reviewers’ points and highlight the changes made in the text.

      Reviewer #1 (Public Review):

      The authors have tried to correlate changes in the cellular environment by means of altering temperature, the expression of key cellular factors involved in the viral replication cycle, and small molecules known to affect key viral protein-protein interactions with some physical properties of the liquid condensates of viral origin. The ideas and experiments are extremely interesting as they provide a framework to study viral replication and assembly from a thermodynamic point of view in live cells.

      The major strengths of this article are the extremely thoughtful and detailed experimental approach; although this data collection and analysis are most likely extremely time-consuming, the techniques used here are so simple that the main goal and idea of the article become elegant. A second major strength is that, in order to understand some of the physicochemical properties of the viral liquid inclusion, they used stimuli that have been very well studied, and thus one can really focus on a relatively easy interpretation of most of the data presented here.

      There are three major weaknesses in this article. The first is the writing, which, especially at the beginning, is extremely confusing. I would suggest that the authors review the text extensively to improve the use of English; in particular, the abstract and introduction are extremely hard to understand. In addition, in the abstract and introduction, the authors use terms such as "hardening", "perturbing the type/strength of interactions", "stabilization", and "material properties", to cite just a few. It is clear that the authors do know exactly what they are referring to, but the definitions come so late in the text that it all becomes confusing. The second major weakness is that there is a lack of deep discussion of the physical meaning of some of the measured parameters like "C dense vs inclusion", and "nuclear density and supersaturation". There is a need to explain further the physical consequences of all the graphs. Most of them are discussed in a very superficial manner. The third major weakness is a lack of analysis of phase separations. Some of their data suggest phase transition and/or phase separation; thus, a more in-depth analysis is required. For example, could they calculate the change of entropy and enthalpy of some of these processes? Could they find some boundaries for these transitions between the "hard" (whatever that means) and the liquid?

      The authors have achieved almost all their goals, with the caveat of the third weakness I mentioned before. The work presented in this article is of significant interest and can become extremely important if a more detailed analysis of the thermodynamic parameters is carried out and a better description of the physical phenomenon is provided.

      We thank reviewer 1 for the comments and, in particular, for being so positive regarding the strengths of our manuscript and for raising concerns that will surely improve the manuscript. At this point, we propose the following actions to address the concerns of Reviewer 1:

      1) We will extensively revise the use of English, particularly, in the abstract and introduction, defining key terms as they come along in the text to make the argument clearer.

      2) We acknowledge the importance of discussing our data in more detail and we propose the following. We will discuss the graphs and what they mean as exemplified in the paragraph below.

      Regarding Figure 3 - As the concentration of vRNPs increases, we observe an increase in supersaturation until 12 hpi. This means that, contrary to what is observed in a binary mixture, in which the Cdilute is constant (Klosin et al., 2020), the Cdilute in our system increases with concentration. It has been reported that Cdilute increases with bulk concentration in a multi-component system (Riback et al., 2020). Our findings have important implications for how we think about the condensates formed during influenza infection. As the 8 different genomic vRNPs have a similar overall structure, they could, in theory, behave as a binary system between units of vRNPs and Rab11a. However, a change in Cdilute with concentration shows that our system behaves as a multi-component system. This means that the differences in length, RNA sequence and valency between vRNPs are key for the integrity of condensates.
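
      The binary versus multi-component distinction above amounts to asking whether Cdilute stays flat as the bulk concentration rises. The minimal Python sketch below fits a least-squares line to paired measurements and flags a non-flat trend. The function names, the 5% relative tolerance and the numbers in the usage note are illustrative assumptions, not values or methods from our analysis.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def cdilute_varies_with_bulk(bulk, c_dilute, rel_tol=0.05):
    """Return True if the fitted change in Cdilute over the sampled bulk range
    exceeds rel_tol times the mean Cdilute (hypothetical flatness criterion)."""
    slope = least_squares_slope(bulk, c_dilute)
    fitted_change = slope * (max(bulk) - min(bulk))
    mean_c = sum(c_dilute) / len(c_dilute)
    return abs(fitted_change) / mean_c > rel_tol
```

      With invented values, `cdilute_varies_with_bulk([1, 2, 3, 4], [0.5, 0.5, 0.5, 0.5])` describes the flat, binary-like case, whereas `[0.5, 0.7, 0.9, 1.1]` for Cdilute describes the rising, multi-component-like case.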

      3) The reviewer calls our attention to the lack of analysis of phase separations. We think that phase separation (or percolation coupled to phase separation) governs the formation of influenza A virus condensates. However, we think we ought to exert caution at this point, as the condensates we are working with are very complex and the physics of our system in cells may not be sufficient to claim phase separation without an in vitro reconstitution system. In fact, IAV inclusions contain cellular membranes, different vRNPs and Rab11a. So far, we can only speculate that the liquid character of IAV inclusions may arise from a network of interacting vRNPs that bridge several cognate vRNP-Rab11 units on flexible membranes, similarly to what happens in phase-separated vesicles in neurological synapses. However, the speculative model for our system, although supported by correlative light and electron microscopy, currently lacks formal experimental validation.

      For this reason, we thought of developing the current work as an alternative to explore the importance of the liquid material properties of IAV inclusions. By finding an efficient method to alter the material properties of IAV inclusions, we provide proof of principle that it is possible to impose controlled phase transitions that reduce the dynamics of vRNPs in cells and negatively impact progeny virion production. Despite having discussed these issues in the limitations of the study, we will make our point clearer.

      We are currently establishing an in vitro reconstitution system to formally demonstrate, in an independent publication, that IAV inclusions are formed by phase separation. For this future work, we teamed up with Pablo Sartori, a theoretical physicist, to derive an in-depth analysis of the thermodynamics of the viral liquid condensates. Collectively, we think that cells have too many variables to derive meaningful physical parameters (such as entropy and enthalpy) and models, and that cellular data need to be complemented by in vitro systems. For example, increasing the concentration inside a cell is not a simple endeavour, as it relies on cellular pathways to deliver material to a specific place. At the same time, the 8 vRNPs, as mentioned above, have different sizes, valencies and RNA sequences and can behave very differently in the formation of condensates and the maintenance of their material properties. Ideally, they should be analysed individually or in selected combinations. For the future, we will combine data from in vitro reconstitution systems and cells to address this very important point raised by the reviewer.

      From the paper on the section Limitations of the study: “Understanding condensate biology in living cells is physiologically relevant but complex because the systems are heterotypic and away from equilibria. This is especially challenging for influenza A liquid inclusions that are formed by 8 different vRNP complexes, which although sharing the same structure, vary in length, valency, and RNA sequence. In addition, liquid inclusions result from an incompletely understood interactome where vRNPs engage in multiple and distinct intersegment interactions bridging cognate vRNP-Rab11 units on flexible membranes (Chou et al., 2013; Gavazzi et al., 2013; Haralampiev et al., 2020; Le Sage et al., 2020; Shafiuddin & Boon, 2019; Sugita, Sagara, Noda, & Kawaoka, 2013). At present, we lack an in vitro reconstitution system to understand the underlying mechanism governing demixing of vRNP-Rab11a-host membranes from the cytosol. This in vitro system would be useful to explore how the different segments independently modulate the material properties of inclusions, explore if condensates are sites of IAV genome assembly, determine thermodynamic values, thresholds accurately, perform rheological measurements for viscosity and elasticity and validate our findings”.

      Reviewer #2 (Public Review):

      During Influenza virus infection, newly synthesized viral ribonucleoproteins (vRNPs) form cytosolic condensates, postulated as viral genome assembly sites and having liquid properties. vRNP accumulation in liquid viral inclusions requires its association with the cellular protein Rab11a directly via the viral polymerase subunit PB2. Etibor et al. investigate and compare the contributions of entropy, concentration, and valency/strength/type of interactions to the properties of the vRNP condensates. For this, they subjected infected cells to the following perturbations: temperature variation (4, 37, and 42 °C), the concentration of viral inclusion drivers (vRNPs and Rab11a), and the number or strength of interactions between vRNPs using nucleozin, a well-characterized vRNP sticker. Lowering the temperature (i.e. decreasing the entropic contribution) leads to a mild growth of condensates that does not significantly impact their stability. Altering the concentration of drivers of IAV inclusions impacts their size but not their material properties. The most spectacular effect on condensates was observed using nucleozin. The drug dramatically stabilizes vRNP inclusions, acting as a condensate hardener. Using a mouse model of influenza infection, the authors provide evidence that the activity of nucleozin is retained in vivo. Finally, using a mass spectrometry approach, they show that the drug affects vRNP solubility in a Rab11a-dependent manner without altering the host proteome profile.

      The data are compelling and support the idea that drugs that affect the material properties of viral condensates could constitute a new family of antiviral molecules as already described for the respiratory syncytial virus (Risso Ballester et al. Nature. 2021).

      Nevertheless, there are some limitations in the study. Several of them are mentioned in a dedicated paragraph at the end of a discussion. This includes the heterogeneity of the system (vRNP of different sizes, interactions between viral and cellular partners far from being understood), which is far from equilibrium, and the absence of minimal in vitro systems that would be useful to further characterize the thermodynamic and the material properties of the condensates.

      We thank reviewer 2 for highlighting specific details that need improving and raising such interesting questions to validate our findings. We will address all the minor comments of Reviewer 2. To address the comments of Reviewer 2, we propose the actions described in blue below each point raised that is written in italics.

      1) The concentrations are mostly evaluated using antibodies. This may be correct for Cdilute. However, measurement of Cdense should be viewed with caution as the antibodies may have some difficulty accessing the inner of the condensates (as already shown in other systems), and this access may depend on some condensate properties (which may evolve along the infection). This might induce artifactual trends in some graphs (as seen in panel 2c), which could, in turn, affect the calculation of some thermodynamic parameters.

      The concern about using antibodies to calculate Cdense is valid. We will address this concern by validating our results using a fluorescently tagged virus that has mNeonGreen fused to the viral polymerase PA (PA-mNeonGreen PR8 virus). Like NP, PA is a component of vRNPs and labels viral inclusions, colocalising with Rab11 when vRNPs are in the cytosol, without the need for antibodies.

      This virus would be the best for evaluating inclusion thermodynamics, were it not attenuated (Figure 1A below), with a delayed infection as demonstrated by the reduced levels of viral proteins (Figure 1B below). Consistently, it shows differences in the accumulation of vRNPs in the cytosol, and viral inclusions form later in infection. After their emergence, inclusions behave as in the wild-type virus (PR8-WT), fusing and dividing (Figure 1C below) and displaying liquid properties. The differences in concentration may shift or alter thermodynamic parameters such as time of nucleation, nucleation density, inclusion maturation rate, Cdense and Cdilute. This is the reason why we performed the thermodynamic profiling using antibodies upon PR8-WT infection. Nevertheless, this virus will be useful for validating our results, provided we take into account its possibly delayed kinetics and the differences that may arise from reduced vRNP accumulation in the cytosol; we will therefore repeat the thermodynamic analysis using it.

      As a side note, vRNPs are composed of viral RNA coated with several molecules of NP and each vRNP also contains 1 copy of the trimeric RNA dependent RNA polymerase formed by PA, PB1 and PB2. It is well documented that in the cytosol the vast majority of PA (and other components of the polymerase) is in the form of vRNPs (Avilov, Moisy, Munier, et al., 2012; Avilov, Moisy, Naffakh, & Cusack, 2012; Bhagwat et al., 2020; Lakdawala et al., 2014), and thus we can use this virus to label vRNPs on condensates to corroborate our studies using antibodies.

      Figure 1 – The PA-mNeonGreen virus is attenuated in comparison to the WT virus. A. Cells (A549) were infected or mock-infected with PR8 WT or PA-mNeonGreen (PA-mNG) viruses, at a multiplicity of infection (MOI) of 3, for the indicated times. Viral production was determined by plaque assay and plotted as plaque forming units (PFU) per milliliter (mL) ± standard error of the mean (SEM). Data are a pool from 2 independent experiments. B. The levels of viral PA, NP and M2 proteins and actin in cell lysates at the indicated time points were determined by western blotting. C. Cells (A549) were transfected with a plasmid encoding mCherry-NP and co-infected with PA-mNeonGreen virus for 16 h, at an MOI of 10. Cells were imaged under time-lapse conditions starting at 16 hpi. White boxes highlight vRNPs/viral inclusions in the cytoplasm in the individual frames. The dashed white and yellow lines mark the cell nucleus and the cell periphery, respectively. The yellow arrows indicate the fission/fusion events and movement of vRNPs/viral inclusions. Bar = 10 µm. Bar in insets = 2 µm.
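
      As a reminder of how the quantities plotted in panel A are typically obtained, the minimal Python sketch below converts a plaque count into a titre (PFU/mL) and summarises replicate titres as mean ± SEM. It uses the standard plaque-assay relation (plaques divided by dilution times plated volume); the function names and the example numbers in the usage note are illustrative assumptions and not data from this study.

```python
import math
import statistics


def titre_pfu_per_ml(plaque_count, dilution, volume_ml):
    """Titre from one countable well: plaques / (dilution factor * plated volume).

    dilution is the fraction of the original sample (e.g. 1e-5 for a
    10^-5 dilution); volume_ml is the inoculum volume in millilitres.
    """
    return plaque_count / (dilution * volume_ml)


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n)); needs n >= 2."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem
```

      For instance, 50 plaques counted at a hypothetical 10^-5 dilution with 0.1 mL plated correspond to about 5 × 10^7 PFU/mL, and `mean_and_sem` can then be applied to the titres obtained from the independent experiments.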

      2) Although the authors have demonstrated that vRNP condensates exhibit several key characteristics of liquid condensates (they fuse and divide, they dissolve upon hypotonic shock or upon incubation with 1,6-hexanediol, FRAP experiments are consistent with a liquid nature), their aspect ratio (with a median above 1.4) is much higher than the aspect ratio observed for other cellular or viral liquid compartments. This is intriguing and might be discussed.

      IAV inclusions have been shown to interact with microtubules and the endoplasmic reticulum, which confer movement, and to undergo fusion and fission events. We propose that these interactions and movements exert forces that deform inclusions, making them less spherical. To test this assumption, we compared the aspect ratio of viral inclusions in the absence and presence of nocodazole (which abrogates microtubule-based movement). The data in Figure 2 show that in the presence of nocodazole, the aspect ratio decreases from 1.42 ± 0.36 to 1.26 ± 0.17, supporting our assumption.

      Figure 2 – Treatment with nocodazole reduces the aspect ratio of influenza A virus inclusions. Cells (A549) were infected with PR8 WT and treated with nocodazole (10 µg/mL) for 2 h, after which the movement of influenza A virus inclusions was captured by live cell imaging. Viral inclusions were segmented, and the aspect ratio was measured in ImageJ, then analysed and plotted in R.
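
      To illustrate what the aspect ratio captures, the minimal Python sketch below estimates it for one segmented object from the second moments of its pixel coordinates, as the ratio of the long to the short principal axis. This is a moment-based approximation written for illustration only; it is not the ImageJ routine used for Figure 2, and the function name and input format (a list of (x, y) pixel coordinates) are assumptions.

```python
import math


def aspect_ratio(pixels):
    """Approximate aspect ratio (>= 1) of a segmented object.

    pixels: list of (x, y) coordinates belonging to one object. The ratio is
    sqrt(largest / smallest eigenvalue) of the coordinate covariance matrix,
    so a circle or square gives 1 and elongated shapes give larger values.
    """
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels) / n
    syy = sum((y - mean_y) ** 2 for _, y in pixels) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels) / n
    # Eigenvalues of the symmetric 2x2 covariance matrix.
    half_trace = (sxx + syy) / 2.0
    det = sxx * syy - sxy ** 2
    disc = math.sqrt(max(half_trace ** 2 - det, 0.0))
    major, minor = half_trace + disc, half_trace - disc
    if minor <= 0.0:
        raise ValueError("object is degenerate (all pixels on one line)")
    return math.sqrt(major / minor)
```

      A filled 4 × 4 pixel square returns 1, whereas an 8 × 4 rectangle returns a value slightly above 2, so a shift in the median from about 1.4 towards 1.26 indicates objects that are closer to round.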

      3) Similarly, the fusion event presented at the bottom of figure 3I is dubious. It might as well be an aggregation of condensates without fusion.

      We will change this, thank you for the suggestion.

      4) The authors could have more systematically performed FRAP/FLAPh experiments on cells expressing fluorescent versions of both NP and Rab11a to investigate the influence of condensate size, time after infection, or global concentrations of Rab11a in the cell (using the total fluorescence of overexpressed GFP-Rab11a as a proxy) on condensate properties.

      We will try our best to be able to comply with this suggestion as we think it is important.

      Reviewer #3 (Public Review):

      This study aims to define the factors that regulate the material properties of the viral inclusion bodies of influenza A virus (IAV). In a cellular model, it shows that the material properties were not affected by lowering the temperature nor by altering the concentration of the factors that drive their formation. Impressively, the study shows that IAV inclusions may be hardened by targeting vRNP interactions via the known pharmacological modulator (also an IAV antiviral), nucleozin, both in vitro and in vivo. The study employs current state-of-the-art methodology in both influenza virology and condensate biology, and the conclusions are well-supported by data and proper data analysis. This study is an important starting point for understanding how to pharmacologically modulate the material properties of IAV viral inclusion bodies.

      We thank this reviewer for all the positive comments. We will fully address the minor issues brought to our attention, including changing the title of the manuscript, and we will investigate the formation and material properties of IAV inclusions in the presence and absence of nucleozin for the nucleozin escape mutant NP-Y289H.

      References

      Avilov, S. V., Moisy, D., Munier, S., Schraidt, O., Naffakh, N., & Cusack, S. (2012). Replication-competent influenza A virus that encodes a split-green fluorescent protein-tagged PB2 polymerase subunit allows live-cell imaging of the virus life cycle. J Virol, 86(3), 1433-1448. doi:10.1128/JVI.05820-11

      Avilov, S. V., Moisy, D., Naffakh, N., & Cusack, S. (2012). Influenza A virus progeny vRNP trafficking in live infected cells studied with the virus-encoded fluorescently tagged PB2 protein. Vaccine, 30(51), 7411-7417. doi:10.1016/j.vaccine.2012.09.077

      Bhagwat, A. R., Le Sage, V., Nturibi, E., Kulej, K., Jones, J., Guo, M., . . . Lakdawala, S. S. (2020). Quantitative live cell imaging reveals influenza virus manipulation of Rab11A transport through reduced dynein association. Nat Commun, 11(1), 23. doi:10.1038/s41467-019-13838-3

      Chou, Y. Y., Heaton, N. S., Gao, Q., Palese, P., Singer, R. H., & Lionnet, T. (2013). Colocalization of different influenza viral RNA segments in the cytoplasm before viral budding as shown by single-molecule sensitivity FISH analysis. PLoS Pathog, 9(5), e1003358. doi:10.1371/journal.ppat.1003358

      Gavazzi, C., Yver, M., Isel, C., Smyth, R. P., Rosa-Calatrava, M., Lina, B., . . . Marquet, R. (2013). A functional sequence-specific interaction between influenza A virus genomic RNA segments. Proc Natl Acad Sci U S A, 110(41), 16604-16609. doi:10.1073/pnas.1314419110

      Haralampiev, I., Prisner, S., Nitzan, M., Schade, M., Jolmes, F., Schreiber, M., . . . Herrmann, A. (2020). Selective flexible packaging pathways of the segmented genome of influenza A virus. Nat Commun, 11(1), 4355. doi:10.1038/s41467-020-18108-1

      Klosin, A., Oltsch, F., Harmon, T., Honigmann, A., Julicher, F., Hyman, A. A., & Zechner, C. (2020). Phase separation provides a mechanism to reduce noise in cells. Science, 367(6476), 464-468. doi:10.1126/science.aav6691

      Lakdawala, S. S., Wu, Y., Wawrzusin, P., Kabat, J., Broadbent, A. J., Lamirande, E. W., . . . Subbarao, K. (2014). Influenza A virus assembly intermediates fuse in the cytoplasm. PLoS Pathog, 10(3), e1003971. doi:10.1371/journal.ppat.1003971

      Le Sage, V., Kanarek, J. P., Snyder, D. J., Cooper, V. S., Lakdawala, S. S., & Lee, N. (2020). Mapping of Influenza Virus RNA-RNA Interactions Reveals a Flexible Network. Cell Rep, 31(13), 107823. doi:10.1016/j.celrep.2020.107823

      Riback, J. A., Zhu, L., Ferrolino, M. C., Tolbert, M., Mitrea, D. M., Sanders, D. W., . . . Brangwynne, C. P. (2020). Composition-dependent thermodynamics of intracellular phase separation. Nature, 581(7807), 209-214. doi:10.1038/s41586-020-2256-2

      Shafiuddin, M., & Boon, A. C. M. (2019). RNA Sequence Features Are at the Core of Influenza a Virus Genome Packaging. J Mol Biol. doi:10.1016/j.jmb.2019.03.018

      Sugita, Y., Sagara, H., Noda, T., & Kawaoka, Y. (2013). Configuration of viral ribonucleoprotein complexes within the influenza A virion. J Virol, 87(23), 12879-12884. doi:10.1128/JVI.02096-13

    1. Author Response

      Reviewer #1 (Public Review):

      The manuscript by Shaikh and Sunagar addresses the question of the origin of spider venom proteins. It has been known for many years that an important component of spider venoms is a diverse group of small proteins known as disulfide-rich peptides (DRPs). However, it has not been clear whether this group of proteins has a common origin or evolved convergently in different lineages. The authors collected sequences of the genes encoding these proteins from publicly available genomes of spiders from a range of families. They aligned the sequences using the structural cysteines as guides and carried out a phylogenetic analysis of the different sequences, ultimately classifying the different proteins into over 50 super-families. One thing that is not clear from the text or from the references cited (I am not an expert on spider venom) is how many of these superfamilies were known before and how many are novel. There is also no clear indication of what criteria were used to define a subset of sequences as a superfamily. Nonetheless, the authors show that all these superfamilies have a single common ancestor, predating the divergence of araneomorphs and mygalomorphs and that the DRPs underwent independent diversification in each of these two lineages.

      We identified 78 novel superfamilies in this study; 33 had been identified previously (Pineda et al. 2020 PNAS). Lines 90, 101 and 106 of the manuscript already distinguish the superfamilies reported in previous studies from those described in this study.

      Line 90 “Recently, using a similar approach, 33 novel spider toxin superfamilies have been identified from the venom of the Australian funnel-web spider, Hadronyche infensa (9).”

      Line 101 “This approach enabled the identification of 33 novel toxin superfamilies along the breadth of Mygalomorphae (Figures S1 and S2).”

      Line 106 “Moreover, analyses of Araneomorphae toxin sequences using the strategy above resulted in the identification of 45 novel toxin superfamilies from Araneomorphae, all of which but one (SF109) belonged to the DRP class of toxins (Figures S3 and S4).”

      Spider toxin superfamilies have been named after gods/deities of death, destruction and the underworld based on nomenclature introduced by Pineda et al. (2014 BMC genomics). We have now included this explanation in the manuscript under the methods and results sections. We have also provided additional details pertaining to this nomenclature in Table S1.

      The authors also looked at selective forces acting on the sequences using dN/dS analyses. They reach the conclusion that there are different modes of selection acting on different sequences based on their role - defensive or predatory venoms - building on previous work by the lead author on venom sequence evolution in diverse animals.

      All in all, this is an admirable piece of molecular evolution work, providing new data on the evolution of spider venom proteins. There are some confusions in terminology that need to be cleared up, and somewhat more context needs to be given for non-specialists as detailed in the points below:

      We thank the reviewer for their constructive and critical suggestions, as well as the kind words of encouragement. Their suggestions have helped us in significantly improving the quality of our work.

      Suggestion 1) Common names of the main spider infraorders should be given.

      We thank the reviewer for their helpful input. We have now introduced spider infraorders with well-known spiders and their common names under the introduction section. Furthermore, we have also included a schematic representation of the spider phylogeny, and highlighted lineages under investigation as Figure 1.

      Suggestion 2) Opisthothelae is not the common ancestor of Mygalomorphae and Araneomorphae, but the clade that encompasses those two clades. This incorrect statement appears in several places. Further on, it is stated that Opisthothelae is the common ancestor of all extant spiders. This is wrong both from a terminological point of view (a clade cannot be ancestral to another clade) and from a factual point of view, since there are extant spiders not included in Opisthothelae.

      We thank the reviewer for pointing out this oversight. We have now corrected it to suborder Opisthothelae as the clade encompassing Mygalomorphae and Araneomorphae spiders.

      Suggestion 3) Several proteins and protein families are mentioned without being introduced, e.g. knottin. Please provide short descriptions.

      We have now provided a short introduction to terms such as Knottin.

      Reviewer #2 (Public Review):

      This interesting study looks into the evolution of putative spider venom toxins, specifically disulfide-rich peptides (DRPs). The authors use published sequence data to gain new insights into the evolution of DRPs, which are the major component of most spider venoms. Through a series of sequence comparisons and phylogenetic analyses they identify a substantial number of new spider toxin superfamilies with distinct cysteine scaffolds, and they trace these back to a primitive scaffold that must have been present in the last common ancestor of mygalomorph and araneomorph spiders. Looking at the taxonomic distribution of these putative venom DRPs, they conclude that mygalomorph and araneomorph DRPs have evolved in different ways, with the former being recruited into venom at the level of genera, and the latter at the level of families. In addition, they perform selection analyses on the DRP superfamilies to uncover the surprising result that mygalomorph and araneomorph DRPs have evolved under different selective regimes, with the evolution of the former being characterised by positive selection, and the latter by purifying (negative) selection.

      However, I don't think that in the current state of the manuscript these conclusions are robustly supported for several reasons. First, it seems that not all previously published data were included in the phylogenetic analyses that were used to identify new superfamilies of DRPs.

      We have, indeed, analysed all spider toxin sequences available to date. We have relied on the signal and propeptide regions for identifying novel superfamilies, which is an accepted convention: Pineda et al. (2014 BMC Genomics); Pineda et al. (2020 PNAS).

      Although many additional superfamilies can be identified, we have only retained those sequences for which there were at least 5 representatives for the identification of toxin superfamilies, and 15 representatives for selection analyses to ensure robustness. This filtering step ensured that the generated alignments, phylogenetic trees, and evolutionary assessments were robust and devoid of noise that stems from single-representative groups. Adding in those sequences would have enabled us to identify many more superfamilies, solely based on the signal and propeptide examination, but it wouldn’t have been possible to support them with other lines of evidence that were provided for all other superfamilies in this study, jeopardising the overall quality of the manuscript. Nonetheless, there is strong evidence that the left-out sequences are also related to the ones analysed in this study (Figure S10). In future, when more transcriptomes are sequenced, it would be possible to designate these newer toxin superfamilies with much stronger support.
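The filtering step described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function name `partition_superfamilies`, the `(sequence_id, superfamily_label)` input format, and the toy labels are all assumptions made here; only the two thresholds (5 and 15 representatives) come from the response.

```python
from collections import Counter

# Thresholds stated in the author response.
MIN_FOR_SUPERFAMILY = 5   # representatives needed to designate a superfamily
MIN_FOR_SELECTION = 15    # representatives needed for selection analyses


def partition_superfamilies(assignments):
    """Split superfamily labels by how many sequences support them.

    `assignments` is an iterable of (sequence_id, superfamily_label) pairs;
    this input format is hypothetical and chosen only for illustration.
    """
    counts = Counter(label for _, label in assignments)
    retained = {sf for sf, n in counts.items() if n >= MIN_FOR_SUPERFAMILY}
    selection_ready = {sf for sf, n in counts.items() if n >= MIN_FOR_SELECTION}
    return retained, selection_ready


# Invented toy data: 20, 7 and 1 sequences in three made-up groups.
toy = (
    [(f"seqA{i}", "SF_A") for i in range(20)]
    + [(f"seqB{i}", "SF_B") for i in range(7)]
    + [("seqC0", "SF_C")]
)
retained, selection_ready = partition_superfamilies(toy)
print(sorted(retained), sorted(selection_ready))
```

In this toy case the single-representative group is dropped entirely, the seven-member group is kept as a superfamily but excluded from selection analyses, and only the largest group passes both thresholds.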

      Second, much of the data were obtained from whole-body transcriptome data, which leaves a degree of uncertainty that these data indeed derive from the venom glands that produce the toxins.

      We respectfully disagree with the reviewer that ‘much of the data’ are from the whole-body transcriptomes. Nearly all sequences in our study are sourced from Pineda et al. (2014 BMC Genomics and 2020 PNAS), Sunagar et al (2013 Toxins), Cole and Brewer (2020 bioRxiv) and transcriptome sequence assembly data from established online repositories NCBI (NR and TSA) and ENA. All the above-mentioned studies (KS is a part of many of these) under their methods section clearly state that the transcriptomes were generated using mRNA isolated from venom gland tissue (BioProject accessions: PRJEB14734; PRJEB6062; PRJNA189679, PRJNA587301 and PRJNA189679, where source tissue type is designated as venom gland).

      We would like to direct the reviewer’s attention to the following excerpts from reference papers from which data for this study has been sourced:

      1. Pineda S et al. (2020 PNAS): “Three days later, they were anesthetized, and their venom glands were dissected and placed in TRIzol reagent (Life Technologies). Total RNA from pooled venom glands was extracted following the standard TRIzol protocol.”
      2. Sunagar et al (2013 Toxins): “Paired venom glands were dissected out and pooled from nine mature females on the fourth day after venom depletion by electrostimulation. Total RNA was extracted using the standard TRIzol Plus method ...”
      3. Cole and Brewer (2020 bioRxiv): “... the venom glands of each ctenid were dissected out, whole RNA was isolated from the venom glands …”

      We would also like to point out that hexatoxins are widely studied and are some of the most well-understood spider venom toxins. Many representatives have been functionally characterised and shown to be potent in affecting prey and predatory species [Sunagar et al (2013 Toxins); Pineda et al. (2014 BMC Genomics and 2020 PNAS); Volker, et al. (2020 PNAS) - KS is a part of most of these studies as well]. However, the current technologies do not permit the high-throughput screening of the enormous diversity of toxins in spiders, which is why not every toxin sequence identified from the venom gland is functionally characterised. Nonetheless, venom researchers will not contest the role of these highly expressed venom gland proteins in envenoming, especially given that they share significant sequence identities with toxins that are functionally well-characterised.

      The only exception to the above is non-ctenid araneomorph toxin superfamily sequences, which are retrieved from whole-body transcriptomes (Cole and Brewer; 2020 bioRxiv). The authors of the paper indicated these as putative toxins. As explained above, homologs of these peptides are well-characterised to be venom toxins. Additionally, in our phylogenetic trees (Figures 3, 4, S6 and S9), they are nested within the toxin clades, reaffirming their identity.

      Third, the taxonomic representation of mygalomorph and araneomorph diversity in this study is so sparse that it becomes impossible to distinguish whether toxin recruitments have happened at the level of genera, families, or even higher-level taxa.

      We respectfully disagree with this suggestion. The taxonomic breadth investigated in this study isn’t sparse. Analysed sequences belong to groups across the breadth of the spider phylogeny. To address this criticism, we are now including a schematic representation of spider phylogeny, where lineages under investigation are highlighted (Figure 1A). Given this broader taxonomic breadth, all of our interpretations are parsimoniously extendable to their common ancestors. For instance, we establish the common origin of all DRPs in the members of these widespread spider families. Therefore, not including sequences from other sister groups will not invalidate this hypothesis, and the most parsimonious explanation will be that the missing members too are likely to have DRPs in their venom (which is also a common understanding of the spider venom research). Whether DRPs dominate the venoms of these missing groups will only come to light upon investigation, but their presence in the venom is highly likely. Moreover, please do note that we have analysed nearly all sequences available in the literature to date.

      As for the recruitment of the toxin superfamily at the taxon level, we would like to point out the phylogenies in Figures 2 and 3 that clearly show the differential recruitment events. We would also like to point out that lines 120 and 136 state that this may not only be a result of recruitment and could arise from differential rates of diversification (also evident in other analyses presented in Figure 5 and Tables S2 and S3).

      Line 120 “Interestingly, the plesiotypic DRP scaffold seems to have undergone lineage-specific diversification in Mygalomorphae, where the selective diversification of the scaffold has led to the origination of novel toxin superfamilies corresponding to each genus (Figure 2).”

      Line 136 “However, we also documented a large number of DRP toxins (n=32) that were found to have diversified in a family-specific manner, wherein, a toxin scaffold seems to be recruited at the level of the spider family, rather than the genus. As a result, and in contrast to mygalomorph DRPs, araneomorph toxin superfamilies were found to be scattered across spider lineages (Figure 3; Figure S6; node support: ML: >90/100; BI: >0.95).”

      Adding any number of missing lineages will change neither the fact that mygalomorphs ‘appear’ to have recruited these superfamilies at the genus level, nor the family-level recruitment of toxin superfamilies in a large number of examined araneomorphs.

      We have now introduced a new figure (Figure 7) that highlights the different scenarios that explain the observed differences in the evolution of mygalomorph and araneomorph spider toxins. We have also included additional text in the manuscript to explain this better.

      Fourth, only a selection of DRP superfamilies was used for natural selection analyses, without the authors explaining how this selection was made. Yet, they attempted to draw general conclusions about toxin evolution in mygalomorphs and araneomorphs, even though most of the striking differences they found were restricted to just two mygalomorph genera, and one family of araneomorphs.

      From our experience and previous reports [Sunagar and Moran (2015, PLoS genetics); Sunagar, et al. (2012, MBE); Yang, Z. (2007, MBE)], the unavailability of enough sequences from datasets results in inaccurate estimation of omega values. For instance, if there are only a couple of sequences in a superfamily, both of which are slightly different from one another, then even these minor differences in them would be exaggerated. Hence, we have resorted to performing selection analysis on datasets for which there are at least 15 sequences. No doubt that this conservative approach reduces the number of datasets analysed, but it also ensures that our findings are well-supported. We have now clarified this in our manuscript under the methods section.

      However, we did previously include sequences from all toxin superfamilies described to date in our alignment figure (Fig S10) and analysed their signal and propeptide regions. They were only excluded from selection analyses. It can be seen that they too are DRPs, but they belong to distinct superfamilies from the ones being described here.

      If these concerns are addressed this study can shed important new light on venom toxin evolution in one of the most diverse venomous taxa on Earth.

      We thank the reviewer for their constructive inputs and suggestions which have enabled us to make this manuscript more accessible to a wider audience.

      Reviewer #3 (Public Review):

      This work aims to elucidate the evolutionary origins of disulfide-rich spider toxin superfamilies and to determine the modes of natural selection and associated ecological pressures acting upon them. The authors provide a compelling line of evidence for a single evolutionary origin and differing factors (e.g., prey capture strategies and methods of anti-predator defense) that have shaped the evolution of these toxins. Additionally, the two major spider infraorders are claimed to have experienced differing selective pressures regarding these toxins.

      The results presented here are novel and generally well-presented. The evidence for a single origin of DRP toxins in spiders is exciting and changes the paradigm of spider venom evolution.

      The data are well analyzed, but the methods lack enough detail to reproduce the results. More information regarding the parameters passed to each software package, version numbers of all software employed, and models of molecular evolution employed in phylogenetic analyses are among the necessary missing information.

      We thank the reviewer for their kind words and constructive and critical suggestions. Their suggestions have contributed towards improving the quality of our work. Upon their suggestion, we have now expanded the methods section to include more details.

      The differences in the evolutionary pressures between mygalomorphs and RTA-clade spider DRP toxins are clear, but expanding RTA results to all araneomorphs may be overreaching. Additional araneomorph sequence data is available, despite the claims within this manuscript (e.g., see Jiang et al. 2013 Toxins; He et al. 2013 PLoS ONE; and Zobel-Thropp et al. 2017 PeerJ). These papers include cDNA sequences of spider venom glands and contain representatives of inhibitory cysteine knot toxins, which are DRP toxins. These data would greatly enhance the strengths of the results presented herein.

      In response to the expansion of RTA results to araneomorphs, we would like to point out that RTA comprises about 50% of the diversity recorded in Araneomorphae. The araneomorph data analysed in our study cover a range of araneomorph family divergence times: Agelenidae (<70 MYA), Pisauridae (<50 MYA) and Theridiidae (~200 MYA, Magalhaes 2020, Biological Reviews 95.1). We report a strong signature of purifying selection influencing the evolution of araneomorph toxin SFs, despite the long evolutionary time separating them (50 - 200 MYA). We firmly believe that further addition of toxin sequence data from other groups will not deviate from the general trend of molecular evolution observed in both these lineages across such a large period of time, barring certain exceptions (such as SF13, a defensive toxin identified from Hadronyche experiencing purifying selection; Volker, et al. 2020 PNAS).

      We had initially excluded non-ctenid datasets from our analyses on account of poor sequence annotation and lack of representative sequence data. However, we have now incorporated the Dolomedes mizhoanus (DRP) (Jiang et al. 2013 Toxins) and Latrodectus tredecimguttatus (non-DRP) (He et al. 2013 PLoS ONE) toxin datasets into our analyses, following the reviewer’s suggestion. This has led to the identification of 5 novel superfamilies, providing additional support to our spider venom evolution hypothesis.

    1. "Individuals are social animals who are influenced by social preferences, social networks, social identities, and social norms: most people care about what those around them are doing and how they fit into their groups..."

      https://www.theguardian.com/global/2019/nov/24/fear-of-missing-out-fomo-making-decision-biology-fobo-christmas-turkey#:~:text=Indecision%20when%20the%20decision%20is,or%20fear%20of%20missing%20out.

      This idea is one we are all familiar with: the fear of missing out, more commonly known as FOMO. The referenced article extends this idea to FOBO, or the fear of better options. This is a good example of why many of us tend to think socially: we are afraid of missing out on the better choices that may be presented to us. Consequently, in decision-making, we may rely on others' decisions, as we are all aware, whether consciously or subconsciously, that collective thinking is "smarter".

    1. Author Response

      Reviewer #2 (Public Review):

      The authors seek to determine how various species combine their effects on the growth of a species of interest when part of the same community.

      To this end, the authors carry out an impressive experiment containing what I believe must be one of the largest pairwise + third-order co-culture experiments done to date, using a high-throughput co-culture system they had co-developed in previous work. The unprecedented nature of this data is a major strength of the paper. The authors also discover that species combine their effect through "dominance", i.e. the strongest effect masks the others. This is important as it calls into question the common assumption of additivity that is implicit in the choice of using Lotka-Volterra models.

      A stronger claim (i.e. in the abstract) is that the joint effect of multiple species on the growth of another can be derived from the effect of individual species. Unless I am misunderstanding something, this statement may have to be qualified a little, as the authors show that a model based on pairwise dominance (i.e. the strongest pairwise) does a somewhat better job (lower RMSD, though granted, not by much, 0.57 vs 0.63) than a model based on single species dominance. That is, the effect of the strongest pair predicts the effect of a trio better than the effect of the strongest single species does.

      This issue makes one wonder whether, had the authors included higher-order combinations of species (i.e. five-member consortia or higher), the strongest-effect trio would have predicted better than the strongest-effect pair, which in turn is a better predictor than the strongest-effect species. This is important, as it would help one determine to what extent the strongest-effect model would work in more diverse communities, such as those one typically finds in nature. Indeed, the authors find that the predictive ability of the strongest effect species is much stronger for pairs than it is for trios (RMSD of 0.28 vs 0.63). Does the predictive ability of the single species model decline faster and faster as diversity grows beyond 4-member consortia?

      Thank you for raising this important point. It is true that in our study we see that single species predict pairs better than trios, and that pairs predict trios better than single species. As we did not perform experiments on more diverse communities (n>4), we are not sure if or how these rules will scale up. We explicitly address these caveats in our revised discussion.

      Reviewer #3 (Public Review):

      A problem in synthetic ecology is that one can't brute-force complex community design because combinatorics make it basically impossible to screen all possible communities from a bank of possible species. Therefore, we need a way to predict phenomena in complex communities from phenomena in simple communities. This paper aims to improve this predictive ability by comparing a few different simple models applied to a large dataset obtained with the use of the authors' "kchip" microfluidics device. The main question they ask is whether the effect of two species on a focal species is predicted from the mean, the sum, or the max of the effect of each single "affecting" species on the focal species. They find that the max effect is often the best predictor, in the sense of minimizing the difference between predicted effect and measured effect. They also measure single-species trait data for their library of strains, including resource niche and antibiotic resistance, and then find that Pearson correlations between distance calculations generated from these metrics and the effect of added species are weak and unpredictive. This work is largely well-done, timely and likely to be of high interest to the field, as predicting ecosystem traits from species traits is a major research aim.
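The three candidate models named in this summary (mean, sum, max) can be written out as a short sketch. Everything below is illustrative: the function names, the reading of "max"/"strongest" as the single effect with the largest absolute value, and the toy numbers are assumptions made here, not the authors' code or data.

```python
import math


def predict_joint(effect_a, effect_b, model):
    """Predict the joint effect of two affecting species on a focal species."""
    if model == "additive":
        return effect_a + effect_b
    if model == "mean":
        return (effect_a + effect_b) / 2
    if model == "strongest":
        # Assumption: "strongest" means the effect with the largest magnitude.
        return max((effect_a, effect_b), key=abs)
    raise ValueError(f"unknown model: {model}")


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Invented single-species effects and pair measurements, chosen only to show
# the mechanics of the comparison; they say nothing about the real dataset.
single_effects = [(-1.0, -0.5), (-0.2, 0.4), (0.3, 0.1)]
observed_pairs = [-1.0, 0.4, 0.3]

errors = {
    model: rmse([predict_joint(a, b, model) for a, b in single_effects],
                observed_pairs)
    for model in ("additive", "mean", "strongest")
}
print(errors)
```

Ranking the models by this error on measured pairs is the kind of comparison the review refers to when it says the max effect minimizes the difference between predicted and measured effects.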

      My main criticism is that the main take-home from the paper (fig 3B)-that the strongest effect is the best predictor-is oversold. While it is true that, averaged over their six focal species, the "strongest effect" was the best overall predictor, when one looks at the species-specific data (S9), we see that it is not the best predictor for 1/3 of their focal species, and this fraction grows to 1/2 if one considers a difference in nRMSE of 0.01 to be negligible.

      As suggested, we have softened our language regarding the take-home message. This matter is addressed in detail above in response to 'Essential Revisions'. Briefly, we see that the strongest model works best when both single species have qualitatively similar effects, but is slightly less accurate when effects are mixed. We also see overall less accurate predictions for positive effects. In light of these findings, we propose that, where the strongest model is not the most accurate for a focal species, this is due to the interaction types involved rather than to the focal species itself.

      We made substantial changes to the manuscript, including the first paragraph of the discussion which more accurately describes these findings and emphasizes the relevant caveats:

      "By measuring thousands of simplified microbial communities, we quantified the effects of single species, pairs, and trios on multiple focal species. The most accurate model, overall and specifically when both single species effects were negative, was the strongest effect model. This is in stark contrast to models often used in antibiotic compound combinations, despite most effects being negative, where additivity is often the default model (Bollenbach 2015). The additive model performed well for mixed effects (i.e. one negative and one positive), but only slightly better than the strongest model, and poorly when both species had effects of the same sign. When both single species’ effects were positive, the strongest model was also the best, though the difference was less pronounced and all models performed worse for these interactions. This may be due to the small effect size seen with positive effects, as when we limited negative and mixed effects to a similar range of effects strength, their accuracy dropped to similar values (Figure 3–Figure supplement 5). We posit that the difference in accuracy across species is affected mainly by the effect type dominating different focal species' interactions, rather than by inherent species traits (Figure 3–Figure supplement 6)." (Lines 288-304)

      The same criticism applies to the result from figure 2-that pairs of affecting species have more negative effects than single species. Considered across all focal species this is true (though minor in effect size, Fig 2A). But there is only a significant effect within two individual species. Again, this points to the effects being focal-species-specific, and perhaps not as generalizable as is currently being claimed.

      Upon more rigorous analysis, and with regard to changes in the dataset after filtering, we see that the more accurate statement is that effects become stronger, not necessarily more negative (in line with the accuracy of the strongest model). The overall trend is towards more negative interactions, due to the majority of interactions being negative, but as stated this is not true for each individual focal. As such the following sentence in the manuscript has been changed:

      "The median effect on each focal was more negative by 0.28 on average, though the difference was not significant in all cases; additionally, focals with mostly positive single species interactions showed a small increase in median effect (Fig. 2D)" (Lines 151-154)

      As well as the title of this section: "Joint effects of species pairs tend to be stronger than those of individual affecting species" (Lines 127-128)

      Another thing that points to a focal-species-specific response is Fig 2D, which shows the distributions of responses of each focal species to pairs. Two of these distributions are unimodal, one appears bimodal, and three appear tri-modal. This suggests to me that the focal species respond in categorically different ways to species addition.

      We believe this distribution of pair effects is related to the distribution of single species effects, and not to the way in which different focal species respond to the addition of second species. Though this may be difficult to see from the swarm plots shown in the paper, below is a split violin plot that emphasizes this point.

      Fig R1: Distribution of single species and pair effects. Distribution of the effect of single and pairs of affecting species for each focal species individually. Dashed lines represent the median, while dotted lines the interquartile range.

      These differences occur even though the focal bacteria are all from the same family. This suggests to me that the generalizability may be even less when a more phylogenetically dispersed set of focal species are used.

      We have added the following sentence to the discussion explicitly emphasizing the phylogenetic limitations of our study:

      "Lastly, it is important to note that our focal species are all from the same order (Enterobacterales), which may also limit the purview of our findings." (Lines 364-366)

      Considering these points together, I argue that the conclusion should be shifted from "strongest effect is the best" to "in 3 of our focal species, strongest effect was the best, but this was not universal, and with only 6 focal species, we can't know if it will always be the best across a set of focal species".

      As mentioned above, we have softened our language regarding the take-home message in response to these evaluations.

      My second main criticism is that it is hard to understand exactly how the trait data were used to predict effects. It seems like it was just Pearson correlation coefficients between interspecies niche distances (or antibiotic distances) and the effect. I'm not very surprised these correlations were unpredictive, because the underlying measurements don't seem to be relevant to the environment tested. What if, rather than using niche data across 20 nutrients, only the growth data on glucose (the carbon source in the experiments) was used? I understand that in a field experiment, for example, one might not know what resources are available, and so measuring niche across 20 resources may be the best thing to do. Here though it seems imperative to test using the most relevant data.

      It is true that much of the profiling data is not directly related to the experimental conditions (different carbon sources and antibiotics), but in addition to these we do use measurements from experiments carried out in the same environment as the interaction assays (i.e. growth rate and carrying capacity when growing on glucose), which also showed poor correlation with the effects on focals. Additionally, we believe that these profiles contain relevant information regarding metabolic similarity between species (similar to metabolic models often constructed computationally). To improve clarity, we added the following sentence to the figure legend of Figure 3–Figure supplement 1:

      "The growth rate, and maximum OD shown in panel A were measured only in M9 glucose, similar to conditions used in the interaction assays." (Lines 591-592)

      Additionally and relatedly, it would be valuable to show the scatterplots leading to the conclusion that trait data were uninformative. Pearson's r only works on an assumption of linearity. But there could be strong relationships between the trait data and effect that are monotonic but not linear, or even that are non-monotonic yet still strong (e.g. U-shaped). For the first case, I recommend switching to Spearman's rho over Pearson's r, because it only assumes monotonicity, not linearity. If there are observable relationships that are not monotonic, a different test should be used.

      Per your suggestion, we have changed the measure of correlation in this analysis from Pearson's r to Spearman's rho. As we observed similar, and still mostly weak, correlations, we did not investigate these relationships further. See Figure 3–Figure supplement 1.

      Additionally, we generated heat maps including scatterplots mapping the data leading to these correlations. We found no notable dependency in these plots, and visually they were quite crowded and difficult to interpret. As this is not the central point of our study, we ultimately decided against adding this information to the plots.
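      The distinction the reviewer draws between Pearson's r and Spearman's rho can be illustrated with a minimal, self-contained sketch. The trait distances and effects below are made-up numbers chosen only to show the behaviour of the two coefficients; they are not data from the study, and the helper functions are illustrative rather than the authors' analysis code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r: strength of the linear relationship between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values, for illustration only.
trait_distance = [1, 2, 3, 4, 5, 6, 7, 8]
monotonic_effect = [d ** 4 for d in trait_distance]        # monotonic, not linear
u_shaped_effect = [(d - 4.5) ** 2 for d in trait_distance]  # strong but non-monotonic

r_monotonic = pearson(trait_distance, monotonic_effect)
rho_monotonic = spearman(trait_distance, monotonic_effect)
r_u_shaped = pearson(trait_distance, u_shaped_effect)
rho_u_shaped = spearman(trait_distance, u_shaped_effect)

print(f"monotonic: r={r_monotonic:.3f}, rho={rho_monotonic:.3f}")
print(f"U-shaped:  r={r_u_shaped:.3f}, rho={rho_u_shaped:.3f}")
```

      In this toy case the monotonic relationship is fully captured by rho but only partly by r, while the U-shaped relationship is missed by both, which is why a visual check of the scatterplots remains useful.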

      In general, I think the analyses using the trait data were too simplistic to conclude that the trait data are not predictive.

      We agree that more sophisticated analyses may help connect species traits to their effects on focal species. In fact, other members of our research group have recently used machine learning to accomplish similar predictions (https://doi.org/10.1101/2022.08.02.502471). As such, we have changed the wording to reflect that this correlation is difficult to find using simple analyses:

      "These results indicate that it may be challenging to connect the effects of single and pairs of species on a focal strain to a specific trait of the involved strains, using simple analysis." (Lines 157-159)

    1. can dive deeper into how you how you practice a spirituality that promotes both an individual well-being and the 00:09:02 health of our society and our environment like well it's interesting you ask that question because at the root of it or you could say the the presumption of that is is the kind 00:09:14 of duality or separation between the two right yeah exactly i mean i'm i'm reminded of something joanna macy uh said um the world has a role to play in our awakening 00:09:28 um i think many of us still have a kind of romanticized idea about the path even the bodhisattva path the idea that somehow you might go off to a cave and meditate really hard or something and 00:09:40 then when you're deeply enlightened then you return to the world and become engaged you know returning to the marketplace and i think frankly that's a bit simplistic if not if not naive it's like 00:09:54 the two go hand in hand uh because they reinforce each other you know um i think that when we start buddhist practice perhaps inevitably there there's a kind 00:10:07 of self-preoccupation because what brings us to it i mean there's some some suffering some dissatisfaction in our own lives why else would we spend so much time energy and money you know 00:10:20 making sore legs and backs for ourselves um but as we progress you know as as we get more insight into what's going on then if things are going well we eventually 00:10:33 begin to realize that at the root of our dissatisfaction is the delusion of separation yes from from other people and from the rest of the world

      !- integrating : individual and collective wellbeing - David Loy offers a clear explanation of the entangled nature of self-and-other - we begin the journey of self improvement due to problems in our personal lives, that is the motivation - but as we continue the journey, we may discover that it is our separation from others and from nature herself that is the cause of our dis-satisfaction - David quotes Joanna Macy, who said that "the world has a role to play in our awakening"

    1. you see a lot of third world debts that uh if the third world debtor countries have to pay uh their foreign debts as the world economy slows down they're going to be subject to austerity to the world bank's and the imf's austerity programs 00:35:01 and they're going to be kept in poverty uh is it really right that they should be kept in poverty just to enrich the bondholders of the one percent the one percent will say yes that's why we're the one percent so that we can impoverish other people that's our liberty our liberty is the right to impoverish other people and reduce them to dependency uh that will happen if you do not write down the debts uh it's already happening in the united states to the student debt uh crisis 00:35:30 where students uh have to pay so much money uh as they fall behind on their student debts that they can't afford to take out mortgages to buy homes uh and you're having the home ownership rates plunge in the united states that's the result of leaving the debts in place uh the mortgage debts uh are causing shrinkage so there is no way to get out of this economic polarization without 00:35:54 a debt write down and that's something that is too radical and uh when we talked about when i was referring to what china's doing i'm referring to what it's doing today and tomorrow about uh the uh real estate company evergrande uh china has a choice is it going to leave evergrande's real estate debts in place and evergrande uh as a real estate company is two to three 00:36:21 percent of the entire chinese economy if it pays the foreign creditors and the domestic one percent of china it's going to impoverish the uh the employees of evergrande it's going to make housing prices more and more expensive in china china has had a debt financed housing boom uh if you leave the debts in place then uh you're going to impoverish china and obviously 00:36:47 china is going to say we're not going to put the creditors first we're not going to do what the west does and say the sanctity of debt service debts uh that you owe are sacred uh it's worth sacrificing the economy it's worth plunging the economy into poverty just to preserve the wealth of the one percent i think china uh is going to make the opposite decision and say we're not going 00:37:12 to commit political suicide we're going to operate for it's a socialist economy and when it comes to debt and credit thank god we have our banking in the public domain and since the public domain the people's bank of china is the creditor they can afford to write down the debt without having any political backlash because it's cancelling debts owed to itself uh which is a great advantage uh and 00:37:38 it's also uh as for the private bond holders uh it's going to say well sorry bondholders you made loans to a company that was way over leveraged uh already uh the american bond rating companies have reduced their bond rating to junk so you knew what you were buying if you continued to hold bonds that uh fitch and other bond raters moody's all say are junk and you lose your money well 00:38:03 you took the risk you got a high rate of interest now you're paying the price that's how markets work uh and uh that really uh is the argument and i think uh you have to uh obviously what i'm suggesting is a radical step just as you're suggesting of taxing wealth would require the radical step of closing down offshore banking centers of simply negating uh if 00:38:28 banks would simply erase all of the deposits they have from the offshore banking centers from the cayman islands from panama from uh from liberia to all the places that began to be set up by the mining companies the oil companies and then were set up beginning in the 1960s essentially by the cia to finance the vietnam war by making america like england the home for criminal capital for flight capital all this uh all this flight 00:38:57 capital and the kleptocracy that you mentioned in russia all this really should be wiped out and if you leave this capital if you leave this one percent in place the economy is going to be sacrificed and shrinking is it worth shrinking the economy just to leave the one percent in place and if you challenge them that's pretty radical that's really what i think marx would say today

      !- Michael Hudson : debt writedown - At a certain point, governments of third world countries that are deeply debt-trapped may simply decide to write down the debts and start over - They may reach a point where, instead of servicing debt owed to the 1%, they decide it's not worth it and save their own economies, freeing themselves from World Bank and IMF debt conditions - It's just as radical a move as your suggestion to stop tax evasion by closing down all offshore banks

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Dear Editor,

      Please find below our detailed responses (in black font) to the Reviewer's comments (in blue). In addition, at the request of Reviewer #1, we added a PDF file called “Reply to the reviewers MS data” that shows MS/MS and quantification information of representative peptides, which were selected based on their (different) caspase/control abundance ratios. We thank the reviewers for their time and valuable comments.

      NOTE: our original reply includes several tables and graphs that were not incorporated into our reply shown below

      Reviewer #1

      Page 4 - In contrast to the hindrance of N-terminal amine ionization by Nt-acetyl groups concluded by the authors, previous studies reported an improved MS-scoring of α-amino-acetylated (tryptic) peptides by the higher numbers of b and y fragment ions observed as compared to α-amino-free (tryptic) peptides (e.g. (Staes et al., 2008)). It is rather the lack of any N-/C-terminal charged residue in case of Lys-N type N-termini which makes LATE less suitable for studying N-terminal protein acetylation.

      We thank the reviewer for this comment. In the HYTANE and LATE workflows, only peptides with modified N-termini (by dimethylation or acetylation) are observed after negative selection, hence we argue that the important comparison here is between Nt-acetylated peptides and Nt-dimethylated peptides with (as in HYTANE) or without basic residue (as in LATE). While we are aware of the study by Staes et al 2008 (PMID: 18318009), we do not believe it contradicts our claim as it discusses the difference between Nt-acetylated peptides and peptides with free N-termini.

      As we indicated in the manuscript (page 5, last sentence of the 1st paragraph), we observed fewer peptide identifications overall in LATE, which was expected due to the lack of a basic C-terminal residue. The reduction in identifications was more pronounced for Nt-acetylated peptides. However, this still does not exclude LATE as a useful tool for identification of such peptides.

      Of note, the overall fragmentation coverage we obtained by LATE and HYTANE for Nt-acetylated and Nt-dimethylated peptides was similar. See the figure below.

      Hence, following Cho et al 2016 (PMID: 26889926), we suggest that the difference in ionization of Nt-dimethylated peptides vs Nt-acetylated peptides is the more dominant factor in peptide identifications.

      Figure 1: Relative ion coverage for modified peptides in LATE and HYTANE

      Page 4 - Besides indicating the retained N-termini with high relative caspase-3/control abundance ratios as putative caspase-3 proteolytic products, also indicate that unique peptides were retained, as many such singletons were reported in previous (caspase-focussed) degradomics studies making use of differential proteomics (e.g. (Van Damme et al., 2005)). Therefore the cut-off ratio of 2 rather seems unsubstantiated, unless the cellular proteomes of so-called control cells were affected by caspase activation. As such, showing some representative MS-spectra of neo-N-termini would be informative.

      We thank the reviewer for this comment. We agree that caspase-3 cleavage generates many singletons. This is indeed what we observed in the in vitro experiment as shown in Figure 2B by the long straight lines at Log2(caspase-3/control) >10. We also add here histograms of the obtained ratios that we hope will make this clearer. We chose a cut-off of 2 due to the basal activity of proteases (including caspase-3) as we did not add caspase-3 inhibitors to the cell lysate. In addition, peptides derived from the putative caspase-3 cleavages in the in vitro experiment were required to be identified only in the caspase-3-treated samples (i.e. to appear only with the heavy labeling). Minor changes to Figure 3 legend have been introduced accordingly. As can be seen in the table below, with a cut-off ratio of 2 (Log2=1) and selection of cleavage sites after D or E we identified >98% of the cleavage sites that were identified only in the caspase-3 treated samples (column text in blue). This rate did not change when the cut-off was set to 8 (Log2=3). Therefore, we have chosen to maintain our selection criteria and cut-off ratio as used before for both experiments.

      Figure 2: Histograms of Log2(Caspase/control) ratios indicating the large number of singleton peptides (marked with arrows)

      Table 1: In vitro experiment selection ratio

      | Method | Cutoff | Time | Sites | Sites identified only caspase-3 treated samples | % of caspase-treated only sites (singleton) | Sites D/E with light | Sites after D/E no light | % of singleton |
      |---|---|---|---|---|---|---|---|---|
      | LATE | Log2=1 | 18H | 936 | 906 | 96.8% | 798 | 786 | 98.5% |
      | LATE | Log2=2 | 18H | 884 | 866 | 98.0% | 767 | 759 | 99.0% |
      | LATE | Log2=3 | 18H | 819 | 810 | 98.9% | 722 | 716 | 99.2% |
      | HYTANE | Log2=1 | 18H | 1186 | 1159 | 97.7% | 1037 | 1032 | 99.5% |
      | HYTANE | Log2=2 | 18H | 1128 | 1110 | 98.4% | 998 | 993 | 99.5% |
      | HYTANE | Log2=3 | 18H | 1035 | 1025 | 99.0% | 924 | 919 | 99.5% |
      | LATE | Log2=1 | 6H | 755 | 732 | 97.0% | 656 | 645 | 98.3% |
      | LATE | Log2=2 | 6H | 711 | 700 | 98.5% | 630 | 623 | 98.9% |
      | LATE | Log2=3 | 6H | 671 | 666 | 99.3% | 601 | 597 | 99.3% |
      | HYTANE | Log2=1 | 6H | 1028 | 988 | 96.1% | 899 | 890 | 99.0% |
      | HYTANE | Log2=2 | 6H | 955 | 931 | 97.5% | 851 | 844 | 99.2% |
      | HYTANE | Log2=3 | 6H | 882 | 871 | 98.8% | 795 | 791 | 99.5% |
      | LATE | Log2=1 | 1H | 445 | 423 | 95.1% | 380 | 372 | 97.9% |
      | LATE | Log2=2 | 1H | 411 | 402 | 97.8% | 361 | 355 | 98.3% |
      | LATE | Log2=3 | 1H | 386 | 380 | 98.4% | 344 | 338 | 98.3% |
      | HYTANE | Log2=1 | 1H | 593 | 559 | 94.3% | 513 | 506 | 98.6% |
      | HYTANE | Log2=2 | 1H | 544 | 532 | 97.8% | 488 | 482 | 98.8% |
      | HYTANE | Log2=3 | 1H | 508 | 498 | 98.0% | 461 | 455 | 98.7% |

      In the cell-based experiments of caspase-3, we induced apoptosis on both cell types (over-expressing caspase-3 and the control). Therefore, in this case, as the reviewer has also mentioned, a cut-off of 2 is appropriate because the control cells are also affected by caspase activation. Following the reviewer’s request we have added (in a separate PDF file) several representative MS/MS spectra of neo-N-term peptides and their corresponding quantification data.
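      The singleton percentages in Table 1 and the relation between the fold-change cut-offs and their Log2 values can be re-derived with a short sketch. It assumes one reading of the table columns: "% of caspase-treated only sites" is taken as treated-only sites divided by all sites, and the final "% of singleton" as D/E sites without a light signal divided by all D/E sites. Only two rows are copied here, and the variable names are illustrative.

```python
from math import log2


def percent(part, whole):
    """Percentage rounded to one decimal place, as reported in Table 1."""
    return round(100.0 * part / whole, 1)


# Fold-change cut-offs expressed on the log2 scale used in Table 1.
cutoffs = {ratio: log2(ratio) for ratio in (2, 4, 8)}

# Counts copied from two 18H rows of Table 1, keyed by (method, log2 cut-off):
# (all sites, sites only in treated samples, D/E sites, D/E sites without light)
table_rows = {
    ("LATE", 1): (936, 906, 798, 786),
    ("HYTANE", 3): (1035, 1025, 924, 919),
}

summary = {}
for key, (sites, treated_only, de_sites, de_no_light) in table_rows.items():
    summary[key] = (
        percent(treated_only, sites),    # share of singleton sites overall
        percent(de_no_light, de_sites),  # share of singletons among D/E sites
    )

for (method, log2_cutoff), (pct_all, pct_de) in summary.items():
    print(f"{method} Log2={log2_cutoff}: {pct_all}% overall, {pct_de}% after D/E")
```

      Under this reading, the recomputed values for the two rows agree with the percentages listed in the table.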

      Page 4 - replace 'without labelling of lysine residues (epsilon-amines)' with 'without notable labelling of lysine residues (epsilon-amines)', as residual labelling of lysine side-chains was observed. Also in case of the latter, do note that reduced MS-ionization potential might impact labelling efficiency calculation, and chromatographic detection of labelling efficiency should be considered to confirm this finding.

      We thank the reviewer for this comment. We have changed the sentence as requested (Page 4 marked in red). Regarding the labeling efficiency calculations, it is true that ionization potential might affect them. We used a common way to test this aspect (see e.g. Hurtado Silva et al 2019 (PMID: 30934878)) and we are not aware of any reduction in ionization potential following lysine dimethylation. Although we did not study this aspect thoroughly, we frequently observe the opposite: that dimethylation improves MS detections.

      Page 6 - The experimental setup comparing caspase-3 overexpressing and ABT-199 induced versus ABT-199 induced cells will be highly biased as it will not be able to detect efficient caspase-3 cleavages (Plasman et al., 2011), as such cleavage events are complete and thus do not require any additional overexpressed caspase-3. I see this as an important flaw and the authors should demonstrate that the list also includes efficient caspase-3 cleavages.

      We thank the reviewer for highlighting this important aspect. We agree that with our setup, we can miss some efficient cleavages of caspase-3. We acknowledged this caveat in the original text (page 6), but chose to perform our experiments this way in order to highlight cleavages that are affected by caspase-3 expression. To address the reviewer’s comment we have added a new experiment and data on caspase cleavages that occur following ABT-199 treatment in HCT116 cells without overexpression of caspase-3. The focus of this experiment was on the relatively short time points following the ABT-199 treatment when no cell death is observed based on XTT assay (see Supplement Figure 6B). This experiment was used to prove that neo-Nt-acetylation of NACA is an early event in apoptosis (Figure 5 E-F page 12). We also used this experiment as an indication of the appearance of efficient cleavages. As can be seen from Supplement Table S10, if we consider all 3 time points of the ABT-199 treatment, we quantified 106 cleavages with free neo-Nt that were cleavages after D and were identified only in the ABT-treated samples. We refer to such cleavages, which appeared prior to noticeable cell death, as "efficient cleavages". Out of these efficient cleavages, 82 were also identified and quantified in the cell-based experiment with overexpression of caspase-3. Twenty efficient cleavages show a high ratio (≥2) in both experiments. Fifty-six efficient cleavages had a high ratio in the new experiment and a ratio below 2 in the cell-based experiment with overexpression of caspase-3. This supports our original claim regarding efficient cleavages and addresses the reviewer’s concern regarding our ability to identify efficient caspase-3 cleavages with the experimental setup of HCT116 cells overexpressing caspase-3.

      Page 12 - The setup doesn't permit ORF N-terminal stability per se, rather the cleavage susceptibility of N-termini holding (a) putative caspase-3 cleavage site(s). Please adjust accordingly. Again since the setup might have missed efficient cleavages, the assessment might be biased.

      Thanks for the comment. As requested, the word “stability” has been deleted. As discussed above, we demonstrate that our setup does allow the identification of efficient cleavages and hence our basis for believing that the assessment is not biased. Please also refer to our reply to the next comment.

      The claim that Nt-acetylation is protective for caspase-3 cleavage should be validated by monitoring cleavage efficiency of an Nt-acetylated versus an Nt-free variant (e.g. by introducing a Pro residue at AA position 2, or comparing cleavage efficiencies in corresponding NAT knockdown versus control cells) of an identified caspase substrate (i.e. a substrate holding a caspase-3 cleavage site in its N-terminal sequence) versus its Nt-free counterpart

      Thanks for raising this point. The reviewer's suggestions have some caveats: a mutation at a protein’s N-terminus in order to generate an Nt-free variant can alter its stability or function, and NAT knockdown might have a profound biological impact on the cells. Therefore we chose a different way to study this aspect by selecting from our data ORF N-terminal peptides that were identified with both free N-termini and acetylated N-termini (i.e. the same peptide was identified in some PSMs as acetylated and in others as dimethylated). We managed to find 136 ORF N-terminal peptides that were quantified in both forms, and out of these, 122 contained Asp or Glu residues (the putative caspase cleavage motifs). We added the comparison of the abundance ratios of these peptides in Figure 4C (see also below). It shows a remarkable difference between the groups: the ratios of the Nt-acetylated peptides did not change as a result of caspase-3 overexpression, while the peptides with free Nt were identified mostly in the control cells (negative Log2(caspase-3/control)). Comparison of the 14 ORF Nt-peptides that do not have Glu or Asp in their sequence shows no difference (see below).

      Figure 3: Abundance ratio distributions of the ORF Nt peptides identified with both Nt-acetylated and free Nt in HCT116 cells overexpressing caspase-3 and in the control. A. Comparison of peptides that contain putative caspase cleavage in their sequence (D or E) B. comparison of peptides without putative caspase cleavage

      These results provide additional support for the notion of the protective or shielding effect of Nt-acetylation against caspase-3 cleavage.

      Page 12 - Since post-translational Nt-acetylation of neo-N-termini could be reproduced in vitro in the non-dialyzed sample, enzymatic over chemical Nt-acetylation should be demonstrated (e.g. by the use of a (bisubstrate) NAT inhibitor).

      We think this is an interesting idea for future work. Yet, in our opinion, the fact that only very few neo-Nt-acetylated peptides were affected in vitro and that a similar trend of few selected neo-Nt-acetylation targets was shown in the cell-based experiments indicates that this process is enzymatic and not chemical in nature.

      Other concerns:

      Abstract - The abstract holds complex/incorrect sentence constructions (e.g. simply indicate 'Protein N-termini', '... undergo ... processing by proteases' (currently: 'not be processed by proteases').

      Thanks for pointing this out. We have changed the abstract accordingly.

      Abstract - 'To expand the coverage of the N-terminome' only applies when this is used in conjunction with other negative enrichment strategies as by itself, LATE doesn't intrinsically provide a better coverage of the N-terminome (this is also noted at page 2).

      We thank the reviewer for pointing this out. We have changed the abstract accordingly.

      Change 'that cannot be identified by other methods' to 'that cannot be identified by other negative selection methods'

      Thanks for pointing this out. We believe that our description here is appropriate as we explicitly state “some of which cannot be identified by other methods”.

      Page 1 - Suggestion to change 'Proteases are typically described as degradative enzymes' to 'Proteases used to be described as degradative enzymes'

      Changed as suggested.

      Page 1 - Not really correct how written; 'N-terminomics methods highlight the N-terminal fragment of every protein (N-terminome)'

      Changed as suggested.

      Page 2 - Positive selection techniques .... Enrichment of unblocked (or Nt-free) N-termini

      We are not sure what the reviewer had in mind here but have added the text in the brackets

      Page 2 - Besides altering charge, Nt-acetylation also alters hydrophobicity ...

      Changed as suggested.

      Page 2 - remove 'to better chart'

      Changed as suggested.

      Page 2 etc. - Do note that caspase-3 can potentially activate downstream caspases in vitro

      Following this comment, we have added a sentence on Page 5 with this reservation

      Page 3 - functional crosstalk between proteolysis and neo-Nt-acetylation has already been demonstrated in the case of co-translational acting methionine aminopeptidases and chloroplast N-terminal acetyltransferases. Adjust accordingly.

      We thank the reviewer for highlighting this aspect. We used the term “neo-Nt-acetylation” to mark that this is not the common (co-translational) acetylation. To make this clearer, we have added the words “post-translational” to better define the novelty of our findings.

      Page 3 - when discussing the identification of ORF N-termini, note that some of the strategies of which note when used to enrich for in vivo blocked N-termini, can also be used without blocking/labelling of Lys residues, and thus trypsin will also result in Lys-ending peptides. This is important to consider in this context.

      Following the reviewer's remark we have changed the sentence so it now states: “Many of these N-terminomics methods……”

      Page 3 - remove the following sentence part; '... or run individually which can be useful for quantifying naturally modified N-termini.', since also a differential/labelled proteomics setup enables such assessment. Related to this, the authors should comment on the observation that much fewer (i.e. less than 40%) Nt-acetylated N-termini were identified by LATE as compared to HYTANE. How is this reflected in the number of PSMs? Probably these differences are further intensified when considering PSMs.

      We have changed the sentence as suggested.

      Regarding the reduction of Nt-acetylation, we thank the reviewer for this question as it led us to find typos in the numbers in Figure 1E which are now corrected. These typos did not change the overall observation that with LATE we identify fewer Nt-acetylated peptides than Nt-free (dimethylated) peptides. As the reviewer anticipated (see below), the reduction in LATE-based “contribution” to the identification of Nt-acetylated peptides as opposed to the identification of dimethylated peptides, is pronounced when considering PSMs but this is not much different than the peptide-based data. Therefore, we prefer to keep the current presentation of Figure 1E.

      Figure 4: Comparison of HACAT cells N-terminal peptides identification with LATE and HYTANE when considering peptide sequences and PSMs. Peptides identified with both methods are in green and those that are unique to one method are in blue. Shared peptides were determined based on the sequence of the first 7 amino acids of the identified peptides. A. comparison for peptides with dimethylated N-terminal (free Nt) B. comparison for Nt-acetylated peptides.

      Page 6 - Informative to indicate how many of the in silico predicted putative DEVD P4-P1 cleavages were actually present in the list of 2049 putative cleavages identified.

      In our dataset, we identified 17 cleavages after the DEVD motif: 11 were identified only with HYTANE, 3 were identified by both methods, and 3 more were identified only with LATE. Of note, it seems that in large-scale proteomic studies of apoptosis, the number of caspase cleavages after the DEVD motif is quite low. For example, in the CASBAH database (PMID: 17273173) there are 10 reports of such cleavage out of 391 reported sites, and in DegraBase (PMID: 23264352), which combined many different apoptotic experiments, there are 64 reported DEVD sites out of a total of 6896 P1-Asp sites.
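      As a rough illustration of what "cleavage after a DEVD motif" means in P4-P1 terms, and of how small the database fractions quoted above are, here is a minimal sketch. The protein sequence is invented for the example and the function names are hypothetical; the counts are the ones stated in the reply.

```python
def p4_to_p1(sequence, p1_position):
    """Return the four residues P4-P1 ending at the 1-based P1 position."""
    if p1_position < 4 or p1_position > len(sequence):
        raise ValueError("P1 position must leave room for P4-P1 in the sequence")
    return sequence[p1_position - 4:p1_position]


def is_devd_site(sequence, p1_position):
    """True when the cleavage after p1_position follows a DEVD motif."""
    return p4_to_p1(sequence, p1_position) == "DEVD"


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Hypothetical sequence: cleavage after residue 8 (Asp) follows DEVD.
toy_protein = "MKTADEVDGSLLK"
print(is_devd_site(toy_protein, 8))  # P4-P1 is DEVD
print(is_devd_site(toy_protein, 5))  # P4-P1 is KTAD

# Counts quoted in the reply.
devd_total = 11 + 3 + 3               # HYTANE only + both methods + LATE only
casbah_pct = percent(10, 391)         # DEVD reports among CASBAH sites
degrabase_pct = percent(64, 6896)     # DEVD sites among DegraBase P1-Asp sites
print(devd_total, casbah_pct, degrabase_pct)
```

      With these figures, DEVD sites account for only a few percent of the reported sites in either database, consistent with the reply's remark that such cleavages are rare in large-scale studies.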

      Page 6 - Unclear if any of the of 2049 putative cleavages, included non-canonical P1 cleavages besides the P1 Asp and Glu cleavages identified.

      These are 2049 putative cleavage sites with P1 Asp or Glu. We have changed the text to make it clearer.

      Page 6 - Were the 'regular' cells mock transfected?

      No. The control cells used for the cell-based experiments were the non-transfected cells from the same culture of HCT116. We chose this option to guarantee that exactly the same cells that were grown in the same dish went through the same FACS sorting as a control.

      Page 6 -Important to note that an ORF can have multiple N-termini besides neo-N-termini (e.g. in the case of alternative translation initiation)

      Thanks for the great point. We have added an indication if the neo-N-termini site has been reported as an alternative translation initiation site to all of the results of the cell-based experiments (Supplementary Tables S4, S5, S6, S9). We also changed the Figures and text accordingly. Our analysis of reported/unreported neo-N-termini is based on the TopFind database which includes information about alternative translation initiation sites from TISdb. Of note, since our focus is on caspase cleavages and we further select putative cleavages based on D/E in P1 and fold change, out of 973 peptides that we reported as putative caspase cleavage (Table S6) only one is in the vicinity of an alternative initiation site.

      Page 6 - The authors should be more careful with generalization when comparing LATE and HYTANE (and other degradomics approaches) as in this study LATE was only applied for the identification of caspase-3 neo-N-termini, which by its extended substrate specificity might hold specific features enabling the preferred detection by one technique over the other. Also note that as compared to less recent studies, evidently the MS instrument used is a key factor in the increase in cleavages reported in the current study.

      It is conceivable that caspase cleavage may differ from other proteases and thus theoretically work better with LATE, but we fail to see why this would also be the case for other N-terminomics methods (like TAILS, Subtiligase, CoFRADIC, ChaFRADIC etc). We showed that LATE provides additional ORF Nt peptide identifications and also demonstrated its effectiveness in E. coli (Supplement Figure S2), which has a proteome with a different amino acid composition to the human proteome. Furthermore, using LATE in the cell-based experiment led to the identification of many neo-Nt-peptides that do not match caspase cleavage patterns (as indicated for both HYTANE and LATE in Figures 3E and 3F). We reviewed the text again, and believe that we have used a fair description of the results especially when we compared them to previous studies.

      Page 9 - The authors should provide some info/supporting statistics in the text regarding the new putative substrates showing GO-enrichments (compared to which control?) similar to previously reported caspase-3 substrates.

      The results of the GO enrichment analysis are presented in Fig. S8 and details about how the test was performed are provided in the Materials & Methods. In the revised version, we are including the numerical data that include results of the statistical tests per GO term as Table S12. The enrichment analysis was performed with respect to the whole human proteome.

      Page 11 - Indicate that the 11 neo-N-terminal peptides of which note are the neo-Nt-peptides matching (signal peptide) cleavages indicated in the Uniprot database. Were any corresponding di-methylated neo-N-termini of these cleavages identified? In case of the 'other' proteolytic cleavages of which note, refer to these as not-annotated in UniProt.

      We thank the reviewer for pointing this out. We have added an indication that this analysis is based on UniProt annotations. Yes, all of the reported 11 neo-Nt-Acet peptides shown in Figure 4 were also found as neo-Nt-DiMet peptides.

      Page 11 - post-translational Nt-acetylation is abundant in plant and the responsible NAT has been identified, please reference these studies as well.

      We thank the reviewer for pointing this out regarding page 11. A relevant reference has been added on Page 11. We had already referenced Nt-acetylation in plants in the discussion (see page 14).

      Page 12 - Define 'undoubtedly dependent on caspase-3 cleavage'

      We thank the reviewer for pointing this out. The word ‘undoubtedly’ has been deleted.

      Page 14 - The NAA30 discussion is not really relevant for the discussion of the post-translational Nt-acetylation of mitochondrial neo-N-termini.

      We thank the reviewer for pointing this out. This sentence has been deleted.

      Viewing the harsh in vitro caspase-3 cleavage condition used, namely 1 µg caspase 3 over 20 µg protein, the P1 specificities of all identified neo-N-termini should clearly be shown.

      The P1 specificities of all neo-N-termini found in the in vitro experiment are listed in the supplementary tables S2 and S3. For the reviewer’s convenience, we are providing the table with the P1 specificities below:

      Since acetylation of serine and threonine residues is a reported form of post-translational modification, and many so-called post-translational Nt-acetylated neo-N-termini harbour such AA residues in their N-terminal sequence, b-ion coverage for these neo-N-termini should be provided/inspected.

      We are not sure that we understand this comment. O-Acetylation of amino acids refers to their side chain. Since we are using Di-methylation labeling in both HYTANE and LATE, if we have a peptide with O-acetylated Ser or Thr at its first position, it is possible to distinguish it from the same peptide with Nt-acetylation by MS1 as illustrated in the following table for a hypothetical peptide SAAANPELKR (mass is MH+1)
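
The MS1 argument can be illustrated with a short calculation. This is a minimal sketch, not the authors' table: it assumes light dimethyl labeling (+28.0313 Da per labeled amine), a HYTANE-style situation in which the lysine side chain is dimethylated in both forms, and standard monoisotopic residue masses. The helper name `mh_plus` is hypothetical.

```python
# Monoisotopic residue masses (Da) for the residues of the hypothetical peptide.
RESIDUE = {
    "S": 87.03203, "A": 71.03711, "N": 114.04293, "P": 97.05276,
    "E": 129.04259, "L": 113.08406, "K": 128.09496, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728
ACETYL = 42.01056     # C2H2O, same mass whether on the N-terminus or on Ser/Thr
DIMETHYL = 28.03130   # light dimethyl label (assumption: light channel)


def mh_plus(sequence, modification_masses):
    """Singly protonated monoisotopic mass of a peptide plus modifications."""
    return (sum(RESIDUE[aa] for aa in sequence)
            + WATER + PROTON + sum(modification_masses))


peptide = "SAAANPELKR"

# Nt-acetylated form: the N-terminus is blocked, so only Lys is dimethylated.
nt_acetylated = mh_plus(peptide, [ACETYL, DIMETHYL])

# O-acetylated Ser1 with a free N-terminus: N-terminus and Lys are both labeled.
o_acetylated = mh_plus(peptide, [ACETYL, DIMETHYL, DIMETHYL])

# The two forms differ by exactly one dimethyl label at the MS1 level.
mass_shift = o_acetylated - nt_acetylated
```

Under these assumptions the two forms are separated by one dimethyl label, which is what allows them to be told apart before any MS/MS inspection.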

      Regardless, we include in the manuscript MS/MS spectra of the NACA neo-Nt-acetylated peptide obtained by HYTANE and LATE.

      Reviewer #2

      Major suggestions:

      • The LATE method relies on digestion with LysN. Can the authors comment on the digestion efficiency of the samples where the LATE workflow was applied?

      The LysN digestion conditions that we used were based on the vendor's (Promega) instructions combined with details from the Nature Protocols paper by Giansanti et al. 2016 (PMID: 27123950), which describes an optimized digestion protocol for LysN. We tested LysN efficiency in terms of the identification of missed cleavages and found that it performed very well with a missed-cleavage rate of

      • The authors state that the number of peptides with acetylated N-termini was lower compared with HYTANE. Yet, the Nt-acetylation can occur co-translationally in approximately 85% of human proteins.

      Did the authors consider optimizing the method (e.g. by fractionating the sample) for better identification of such peptides?

      We thank the reviewer for this important comment. We are certain that it is possible to improve the output of LATE by fractionation and/or optimization by changes to the LC gradient as it is well established for most, if not all, bottom-up proteomics methods. In this work, we concentrated more on the proof of concept of the methodology and hence chose to work without fractionation. We performed one attempt to optimize the LC gradient but found that the results were not significantly different, and we thus used the same LC-MS methods that have been optimized for trypsin.

      Regarding the reduced identification of Nt-acetylated peptides, as we state in the manuscript following Cho et al 2016 (PMID: 26889926), we believe that this is mainly due to the reduced ionization efficiency of Nt-acetylated peptides compared to Nt-dimethylated peptides which is more pronounced when a C-terminal positive charge is missing (due to LysN digestion).

      Also, were the results of the study compared with searches done using other proteomic pipelines (e.g. FragPipe)?

      Unfortunately, when we started this project, MSFragger did not support LysN as the digesting enzyme. At the time, TPP also provided better visualization and quantification support than FragPipe. Recently, we found that MSFragger is faster while providing similar identifications, but we are not convinced by the quantification output via FragPipe. In addition, we compared Comet to X!Tandem, and while the X!Tandem searches took longer than with Comet, the total number of IDs did not improve significantly.

      Can the authors provide details on the settings used for searches done in COMET, especially for the samples treated with LysN?

      The settings are provided in Table S10 in the supplementary information (Page 14 of the PDF file).

      "Fractions containing relatively pure caspase-3 were pooled together and dialyzed against 20 mM HEPES 7.5, and 80 mM NaCl. Aliquots of the protein were stored at -80 °C"

      o What exactly is meant by 'relatively pure'?

      We apologize for the inaccurate description. The relevant text has been updated (Page 17) and now explains that this was based on Coomassie-stained SDS-PAGE.

      Minor suggestions:

      • Please check the link for the Github as this reviewer could not open it.

      We thank the reviewer for pointing this out. We corrected the link. In any case, the relevant scripts can be found here: https://github.com/OKLAB2016

      • Please correct the spelling.

      The manuscript was proofread.

      Comments regarding figures:

      • Figure 2:

      o All figures comparing LATE and HYTANE utilize color green for LATE. Yet, in figure 2G, HYTANE is depicted in green-like color. Please consider staying consistent with the color scheme.

      We thank the reviewer for this comment. Done as suggested.

      Reviewer #2 (Significance (Required)):

      Significance:

      • The LATE method provides an excellent way to study proteases in vitro or in cell-based experiments. It enables deep investigation of the N-terminome based on a simple and cost-effective workflow that utilizes digestion with LysN followed by chemical derivatization of α-amines. This approach allows for the identification of N-terminal peptides that may escape detection by other N-terminomics methods. With LATE, protease cleavage sites that might not so far have been reported due to technical limitations can be studied and characterized. Hence, LATE is a useful addition to the N-terminomic toolbox.

      We thank the reviewer for the positive comments and general assessment of LATE.

      Reviewer #3

      In this manuscript, Hanna et al. report LATE, an N terminomics method similar to N-TAILS and HYTANE, with modifications that enhance or change coverages of the N-terminal proteome in proteomics datasets. LATE relies on selective N-terminal modification of protease-treated, LysN digested samples, enabling internal peptides to be depleted based on the presence of the unblocked lysine epsilon amine. Using LATE in comparison with HYTANE, the authors identified a large number of both known and unknown caspase-3 cleavage sites, both in vitro and in vivo. Because LATE enables identification of both proteolytic neo-N termini and natively blocked N termini such as those that are acetylated, the authors were able to discover a number of post-translationally acetylated proteolytic neo-N termini. This finding points to potential functional cross talk between apoptotic proteolysis and Nt-acetylation. Overall, this is a very nice manuscript that adds a valuable new tool to the N-terminal proteomics toolbox. However, the manuscript could be improved by addressing the following questions and comments.

      We thank the reviewer for this assessment.

      1. One of the benchmark points used to describe the need for a new technology such as LATE is the idea that there are 134 putative caspase-3 substrates in the human proteome, of which only about half can be identified based on ArgC cleavages. However, the 134 substrates seem to include only those that have the exact canonical DEVD motif. Many more substrates than this are already known for caspase-3. For example, >900 caspase-3 substrates were identified by Araya et al. alone. It might make more sense to apply a position-specific scoring matrix to the human proteome to predict a maximum number of possible caspase-3 cleavage sites and how many would be expected to be identified using other technologies. Otherwise, please provide a rationale for why these 134 putative caspase-3 sites are representative.

      The reviewer is correct. Indeed, most of the identified caspase-3 cleavages are not exact matches to the DEVD motif. We used DEVD as an example to illustrate the added value of using lysine-based digestion together with ArgC. We obtained a similar trend, with some variations, when we tested the feasibility of identifying human ORF Nt-peptides, E. coli ORF Nt-peptides and more. We are quite confident that any prediction will show a relatively similar distribution. To demonstrate this, we show here the relative contribution of each method to the identification of any peptide that begins after Asp in the human proteome.

      While the distributions are not identical, they are very similar: the specific additions from LATE (LysN) are between 20% and 22% of the total, and LATE can help to expand the coverage by 42% to 45%.
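
The kind of in-silico comparison described above can be sketched as follows. This is an illustrative simplification, not the authors' script: it assumes ArgC-like cleavage C-terminal to every Arg, LysN cleavage N-terminal to every Lys, no missed cleavages, and a hypothetical identifiable length window of 7 to 35 residues. Function names and thresholds are assumptions.

```python
def neo_nt_peptide(protein, start, enzyme):
    """Peptide beginning at index `start` after digestion with `enzyme`."""
    if enzyme == "ArgC":            # cleaves C-terminal to Arg
        for i in range(start, len(protein)):
            if protein[i] == "R":
                return protein[start:i + 1]
        return protein[start:]
    if enzyme == "LysN":            # cleaves N-terminal to Lys
        for i in range(start + 1, len(protein)):
            if protein[i] == "K":
                return protein[start:i]
        return protein[start:]
    raise ValueError(f"unknown enzyme: {enzyme}")


def classify_asp_sites(protein, min_len=7, max_len=35):
    """Count neo-N-termini following Asp by which digestion could cover them."""
    counts = {"ArgC only": 0, "LysN only": 0, "both": 0, "neither": 0}
    for i, aa in enumerate(protein[:-1]):
        if aa != "D":
            continue
        start = i + 1               # P1' position of a cleavage after Asp
        arg_c = min_len <= len(neo_nt_peptide(protein, start, "ArgC")) <= max_len
        lys_n = min_len <= len(neo_nt_peptide(protein, start, "LysN")) <= max_len
        if arg_c and lys_n:
            counts["both"] += 1
        elif arg_c:
            counts["ArgC only"] += 1
        elif lys_n:
            counts["LysN only"] += 1
        else:
            counts["neither"] += 1
    return counts
```

Summing such counts over every sequence in a proteome FASTA file would give the per-method contributions; the "LysN only" class corresponds to the sites that a lysine-based digestion adds on top of ArgC.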

      It is definitely plausible and has been previously demonstrated that selective N-terminal dimethylation can be achieved under the right reaction conditions, and I do not doubt that it has been achieved here. However, I do not understand how the authors were able to conclude that alpha-amines are blocked with 95% efficiency and lysines are blocked at

      This is a very good point. The reviewer is correct, and indeed we do not have a way to establish whether the dimethylation is on the side-chain amine of lysine or on its N-terminal amine. Partial support for our claim comes from labeling experiments that we (and others) conducted with tryptic and LysC peptides, which clearly demonstrate that under the specified labeling conditions, 95% of the N-terminal amines are labeled and not the lysine side-chain amines. However, at the end of the day, this does not change the outcome of LATE.

      Related to the above comment, Table S10 seems to indicate that MS/MS data from LATE were searched with dimethylation as a fixed modification at the N terminus. Were LATE samples searched with different parameters to generate Figure 1C? Are the dimethylated Ks identified mostly from missed cleavages and therefore not at the N terminus?

      We thank the reviewer for pointing this out. The search parameters used for the generation of Figure 1C have been added to Table S10. The reviewer is correct, the few dimethylated Ks identified in the search used for Figure 1C are mostly from missed cleavages.

      For both the in vitro and in vivo experiments, how many of the new caspase-3 cleavage sites occurred in proteins that were not previously known to be caspase substrates?

      In the in vitro experiments, we identified cleavages of 372 proteins that were not reported as caspase-3 substrates based on the databases we used as references. A line specifying this number was also added to text on page 7. In the cell-based experiment, we identified putative caspase-3 cleavages of 67 proteins that were not reported so far as caspase-3 substrates. This information has been added to the main text on page 10. We have added columns indicating the known/unreported protein substrates to Tables S2, S3, S4, S5, and S6.

      For the experiment in cells, can the authors explain the rationale for comparing cells in which apoptosis is induced with ABT-199 to ABT-199-treated cells with caspase-3 overexpression? What is the advantage over comparing ABT-199 treated cells to untreated cells?

      Great question. An N-terminomics study of “common” apoptosis would lead mainly to the identification of effector caspase (caspase-3 and -7) substrates. Our aim was to focus mostly on the caspase-3 cleavages that occur in the cell during apoptosis. In choosing this gain-of-function approach, we were motivated by the idea that it could provide new insights that would otherwise go undetected when using knockout or other loss-of-function approaches. The advantage of this system over comparing ABT-199 treated to non-treated cells (which we have now added as well) is that it can enhance the identification of caspase-3 specific cleavages.

      Can the authors discuss the timescale of cell death in ABT-199 treated cells vs. ABT-199 treated caspase-3-overexpressing cells. Ideally, data showing cell viability over time (e.g. Cell Titer Glo or MTT assays) would be presented, but if the authors could at least describe whether apoptosis is accelerated in the caspase-3 overexpressing cells, it would be helpful.

      Great suggestion. Following the reviewer’s suggestion, we have characterized the effect of caspase-3 overexpression on the cells by XTT assay, and indeed caspase-3 overexpressing cells do show accelerated cell death in response to ABT-199 compared to non-transfected cells. These results are now presented as Supplement Figure S6B and are mentioned in the results section.

      The authors say that in their experimental design, they expect to see no difference between ABT-199 only and ABT-199/caspase-3 overexpression for substrates that are cleaved efficiently by endogenous caspases. If the new caspase-3 substrates are not cleaved efficiently by endogenous caspase-3, this seems to call into question their physiological relevance. Can the authors explain more thoroughly how these new substrates fit into the apoptotic program?

      We thank the reviewer for raising this issue. We are aware that our original cell-based experimental design may have some limitations, yet we chose this gain-of-function setup in order to identify caspase-3 substrates in a cell-based system. We believe that this setup does allow identification of substrates that are efficiently cleaved by endogenous caspase-3, such as cleavage and acetylation of NACA at Ser34 (and neo-Nt-acetylation after caspase-3 cleavage in general). To study the physiological relevance of the neo-Nt-acetylation, we have added to the revised manuscript a time-course N-terminomics characterization of early apoptosis events conducted in HCT116 cells (without caspase-3 overexpression). The results of these experiments are now shown in Figure 5C and also in the Supplementary Table

      The authors convincingly show that cleaved NACA is a neo-substrate for Nt-acetylation, suggesting functional crosstalk between proteolysis and acetylation. However, it is not clear if this acetylation event has a functional consequence, so it seems inaccurate to say at the top of page 3 that "This is the first demonstration of functional crosstalk between neo-Nt-acetylation and proteolytic pathways."

      The reviewer is correct. We changed the text accordingly.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2022-01588

      Corresponding author(s): Erh-Min, LAI

      1. General Statements [optional]

      The authors thank the reviewers for the positive and valuable comments, which have helped us to improve the quality of this work. We have addressed all comments by providing additional data and/or explanation in a detailed point-by-point response. The revised manuscript includes new data: 1) viable cell counts for the growth inhibition assay (Fig. 2A), 2) quantification of the microscopy data (Fig. 2C, Fig. 4), 3) quantitative data for interbacterial competition (Fig. 5A, 5B), 4) western blotting data for growth inhibition (Fig. S1A and S1B), 5) a secretion assay of single glycine-zipper mutants (Fig. 5C), and 6) full gels of the western blot results (Fig. S3 and S5). By integrating these new results, we have substantially strengthened the finding that a glycine zipper motif of the type VI secretion system (T6SS) effector Tde1 contributes to its translocation across the cytoplasmic membrane of target cells.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: In this manuscript, Ali et al. propose that a glycine zipper motif located at the N-terminus of the Agrobacterium tumefaciens T6SS DNase effector, Tde1, can transport the toxin across the cytoplasmic membrane and into the cytoplasm, where its target is found. To support these claims, they perform a series of secretion, competition, toxicity, and fluorescence microscopy assays showing that a mutation in two glycine residues affects toxicity of the effector during competition and its ability to enter a target cell, but not its secretion through the T6SS or its binding to the adaptor protein Tap1. The concept brought forth in this study is quite interesting and important - the notion that T6SS effectors have domains that aid in their transport into the cytoplasm of the target cell. This is similar to a recent finding that a domain common to bacterial pyocins and T6SS effectors can mediate DNase toxin transport through the target cell's cytoplasmic membrane (Atanaskovic et al., mBio, 2022); the authors should mention and discuss this recent work. Nevertheless, it is my impression that the results do not fully support the conclusions and proposed mechanism, even though the general idea seems correct.

      Ans: We thank the reviewer for finding this work interesting and important. We hope that the revised manuscript, with the new data and more careful interpretation, substantiates the conclusions and proposed mechanisms. We also cite the excellent work by Atanaskovic et al., 2022 and discuss its findings in the revision (see lines 344-349).

      Major comments:

      • An experiment that directly demonstrates the ability of the glycine zipper to mediate transport of a toxin across a membrane would greatly support and solidify the conclusions of this work. For example, showing the ability of a purified protein to enter spheroplasts or liposomes in a glycine zipper dependent fashion. Currently, the authors perform experiments that can only indirectly support the proposed function of the glycine zipper to enable the effector to cross the membrane, and as detailed below, some of these experiments are over-interpreted in my opinion. Ans: We agree that direct evidence for the ability of the glycine zipper to mediate Tde1 transport across the target cell membrane would require an in vitro translocation assay. Unfortunately, our attempts to purify sufficient amounts of full-length Tde1 or its N-terminal version have not been successful. Therefore, we are unable to perform this experiment. Accordingly, we have tried our best to interpret the data carefully and have rephrased the statements where needed.

      • Lines 153-159: It is not clear how much these results are relevant to the activity of the glycine zipper motif during effector delivery by the T6SS. If I understand correctly, the described experiments are of over-expression of the proteins in the E. coli cytoplasm, where glycine zipper-dependent membrane permeability and toxicity are detected. However, one would expect that if the effector is to be transported from the periplasm to the cytoplasm during T6SS delivery, then the glycine zipper should function from the periplasmic face of the cytoplasmic membrane, and not from its cytoplasmic face, as is the case in these experiments. Is it possible that the observed toxicity and membrane permeability be the result of over-expression in the "wrong place"? Ans: The reviewer is right that, based on our proposed model, Tde1 should permeabilize the cytoplasmic membrane from the periplasmic side upon injection from the attacker. The purpose of ectopic expression of Tde1 and its variants in E. coli is to dissect the region and motif responsible for Tde1 DNase-independent toxicity and for enhancing membrane permeability, regardless of the side of the cytoplasmic membrane from which Tde1 mediates toxicity and permeability. The results on glycine zipper-dependent toxicity and membrane permeability provide the groundwork for the secretion and interbacterial competition experiments, performed in the context of active T6SS action, that determine the role of the glycine zipper in Tde1 export and translocation.

      • Fig. 4B: This figure appears to be very important, and the authors base a large part of their main conclusion regarding the role of the glycine zipper in membrane crossing on it. However, some controls are missing and part of the results observed in the figure do not match their description in the text. • Lines 233-237 - While the authors state in the text that GFP and mCherry signals did not overlap in E. coli cells co-cultured with Agrobacterium cells expressing Tde1(M)-GLGL, I see many double-colored cells in this sample (bottom panels in Fig. 4B). Actually, all cells appear to have both green and blue colors, except for a few cells that are only green but that also seem to be dead judging by their ghostly appearance in the phase contrast channel.

      Ans: We thank the reviewer for pointing this out. On looking at this particular image more carefully, it is striking that the majority of cells in this Tde1(M)GLGL sample seem to emit both green and blue colors. We performed a total of three independent experiments for this translocation assay, and all results were consistent across the three experiments except for this particular sample in this particular experiment. Honestly, we cannot explain this result; one possibility is that this sample was accidentally mixed with another sample. Because this is the only sample whose result was inconsistent with the other two independent experiments, we decided NOT to use the results from this experiment and instead performed another independent experiment. We have now included the quantitative data from three valid independent experiments and show the representative images in Figure 4.

      How is it that all cells in the bottom panels are blue (indicating that they are E. coli target cells)? Shouldn't a large portion of the cells be Agrobacterium cells that should not be blue, since these are added at the beginning of the competition assay at a 10:1 ratio in their favor? Ans: As explained above, we have no definitive answer and decided to perform an additional repeat, which is consistent with the results of the other two independent experiments.

      It is quite remarkable that so much GFP signal is transported into the E. coli target cells so that it is so clearly visible under the microscope. How do the authors know that the GFP signal overlapping with the mCherry is really inside the cell and not outside (for example, proteins secreted to the media that attach to the cell envelope)? Will the GFP signal remain if trypsin is added to the media before visualization under the microscope? Ans: Indeed, our quantitative data show that ~50% of cells have GFP overlapping with mCherry in the translocation-positive samples. The signals should be inside the cells because no overlapping signals were observed for N-Tde1GLGL or Tde1(M)GLGL even though they are secreted.

      Can the authors quantify the ratio of E. coli cells that have overlapping green and blue colors over several experiments for each sample, to show that this phenomenon repeats and is statistically significant? Ans: Yes, see quantitative data in Figure 4.

      Can the authors explain why at least some of these E. coli cells should not be dead due to the toxicity mediated by the third effector of the Agrobacterium T6SS, Tae? Ans: In Agrobacterium tumefaciens C58, Tde1/2 are the major effectors contributing to antibacterial activity. Tae effector has little impact on interbacterial competition outcome (see previous publications Ma et al., 2014 doi: 10.1016/j.chom.2014.06.002.; Yu et al., 2020 doi.org/10.1128/JB.00490-20)

      Why were the microscopy competitions performed differently than the regular competition assays? Why wasn't AK media used in these competitions? How active is the T6SS under these conditions compared to the AK media? Ans: We tried to use AK medium for the translocation assay, but only very weak fluorescent signals could be observed, likely due to low expression when cells are grown on this nutrient-poor medium. To correlate the results of the competition assay with the translocation experiment, we have now performed the E. coli killing assay using the LB medium that is used for the translocation experiment. For interbacterial competition against agrobacterial siblings, we still used AK medium because no detectable interbacterial competition activity could be observed between two A. tumefaciens strains on LB agar. As reported earlier, a stronger interbacterial competition outcome was detected from co-culture on AK than on other nutrient-rich media, while the secretion activity in AK medium is lower (Yu et al., 2020 doi.org/10.1128/JB.00490-20). These results indicate that factors other than secretion activity also affect recipient cell susceptibility, which, however, is not the main focus of this work.

      In the N-Tde1 sample, many Agrobacterium cells appear to have the GFP signal in foci rather than distributed throughout the cell (as it is in other samples), while the E. coli cells have a uniform and strong GFP signal. Can the authors comment on that? Ans: We thank the reviewer for raising this question. We are also curious about the Tde1 glycine zipper-dependent GFP foci and now include a potential explanation in the Discussion of the revised manuscript (lines 387-406). At present, we do not have a definitive answer. Because glycine zipper repeats are known to interact with membranes, it is possible that Tde1 proteins preferentially bind to microdomains of the cytoplasmic membrane, which were recently found in A. tumefaciens (Czolkoss et al., 2021). We also found that Tde1 proteins (tagged with either HA or GFP) are prone to truncation when they are ectopically expressed in E. coli or when Tdi1 is absent or not equivalent. Thus, it is possible that Tde1-GFP proteins are truncated after translocation into E. coli cells, in which case most GFP signal is emitted from free GFP instead of Tde1-GFP. The stability of free GFP derived from translocated Tde1-GFP may also explain the high percentage of E. coli cells exhibiting overlapping GFP/mCherry signals.

      It might be easier for readers to visualize this figure and see the signal distribution in the different cells if the authors show a zoomed in version in the main text, and provide the wide field images as a supplementary figure. Ans: We tried to include zoomed-in images, but the resolution was not good. We have improved the quality of the images in Figure 4 and believe that the individual and overlapping fluorescence signals are now clearly visible.

      • Fig. 5C-D: The reduced expression and secretion of the GLGL mutant is considerable. How can the authors rule out that this reduction was the cause for the reduced observed toxicity of the mutant in 5A-B? Moreover, the results show that the GLGL double mutant is hampered in expression, secretion, and DNase activity, and it negatively affects overall T6SS activity. Since this mutant was used throughout the paper, and in the absence of a direct assay showing membrane transport mediated by the glycine zipper motif, the claim of the role of this motif in membrane crossing is not well substantiated by the results. If the authors were to show that the single glycine mutants used in Fig. 5D, which are stable and have an intact DNase activity, behave as claimed in the final conclusion sentence (lines 279-283), then the conclusions will be better substantiated by the results. Ans: Thank you very much for suggesting this important experiment. We have now constructed the single G39L and G43L variants expressed together with Tdi1 in A. tumefaciens tdei mutant for both secretion and interbacterial competition assays (see description in lines 259-280 and Fig. 5). As shown in Figure 5, both the G39L and G43L variants are expressed and secreted at levels similar to or even higher than wild-type Tde1 but have no detectable antibacterial activity against either E. coli or A. tumefaciens 1D1609. This result substantiates the role of this glycine zipper motif in translocation.

      Minor comments:

      • Line 93: I am not sure that Ntox15 should still be referred to as a "novel" domain.

      Ans: Although there is evidence that this domain is a DNase, the name Ntox15 is still in use. We prefer to keep this nomenclature because it makes the domain easier to distinguish from other nuclease or toxin domains.

      • Line 105: The section's heading does not actually describe its content. The results here only show toxicity upon over-expression of the effector or its mutant forms in E. coli. Therefore, this cannot be referred to as a "prey cell" since the effector was not transported into it during competition. Moreover, the results in Fig. 5A do not support DNase-independent toxicity during competition. Ans: The heading is changed to “Tde1 exhibits DNase-independent growth inhibition in E. coli” (line 115).

      • Please consider making all of the symbols in the growth assays semi-transparent. It is impossible to discern between the different, overlapping curves. Ans: The growth curve results are improved by changing line colors and reducing size bars (Fig. 1B, 1C; Fig. 2A, 2D)

      • Please consider making the size bars in all microscopy images more pronounced. They are barely visible in their current form. Also, it would be better to show images of the same magnification/zoom for the different samples, since the current presentation shows cells from different samples at different sizes, and it can be confusing to the readers. Ans: Amended (Fig. 2C; Fig. 4).

      • In Fig. 1B and in Fig. 2A the authors show that expression of Tde1(M) in cells is toxic, yet in Fig. 2D they see no toxicity. Can the authors please comment on this discrepancy? Ans: Fig. 2D showed the viability of E. coli cells after Tde1 variants were induced for 1 hr before ONPG uptake assay to indicate the increased membrane permeability is not due to cell death. In Fig. 1B, the growth inhibition of Tde1(M) is also not evident at 1 hr. So, the results are consistent.

      • I am not convinced that the assay in Fig. 2E can be used to determine bacteriostatic/bacteriolytic effect. It is not clear how such a distinction can be made from OD measurements, since an increase in OD can result from the entire population growing after removal of the stressor, or just part of the population that did not lyse/die. To make such a claim, the authors can spot bacteria on repressing media at different timepoints after protein induction, and then determine CFU.

      Ans: The increased OD600 value during recovery could be caused by either resumed cell division or cell elongation. Based on the newly added growth inhibition assay of all Tde1 variants, which shows a good correlation between CFU counts and OD600 values (Fig. 2A, S2), and on the absence of increased cell size/length in E. coli cells expressing N-Tde1 or Tde1(M), we think that the recovered OD600 value supports bacteriostatic toxicity of N-Tde1 and Tde1(M). In addition, our interbacterial competition data show that Tde1(M)-Tdi1, which still has an intact glycine zipper, does not cause significant detectable killing, supporting the bacteriostatic function of the Tde1 glycine zippers. In fact, we performed this experiment following Mariano et al. (Nat. Commun. 2019 doi: 10.1038/s41467-019-13439-0), who used the recovery of the OD600 value after removal of the inducer as evidence that the Ssp6 toxin is not bacteriolytic.

      • Fig. 3A: A control is missing. To verify that the N-terminal part of Tde1 is not promiscuously interacting with proteins, the authors should include a control sample testing its inability to precipitate a protein other than Tap1 in the same experiment. Ans: Our previous study has shown that Tde1 can co-immunoprecipitate Tap1 but not a non-T6SS protein, RpoA (Bondage et al., 2016 doi:10.1073/pnas.1600428113), indicating that Tde1 is not promiscuously interacting with proteins. Considering that the tight biochemical interaction of Tap1 with N-Tde1 but not C-Tde1 correlates with their ability to be secreted upon loading onto VgrG1, N-Tde1 is unlikely to bind proteins non-specifically. This is also supported by the observation that the non-specific protein bands from cellular fractions recognized by anti-Tap1 are not co-immunoprecipitated by any of the Tde1 variants (Fig. S3). We could repeat the experiments to include additional proteins as negative controls, but we chose to use the limited revision time for other, more critical experiments.

      • Fig. 3B: the blots are very "dirty". It is not clear how the authors were able to determine expression and precipitation of some truncations (for example, C2-Tde1 in the E. coli IP panel looks like a background band found in other lanes too).

      Ans: We agree that the western blots of the co-IP experiments in E. coli are not very clear due to the weak signals of some Tde1 variants and the background. As pointed out by reviewer 3, this result is not conclusive and provides little additional information beyond the co-IP results from A. tumefaciens. Because the interaction between Tde1 variants and Tap1 when expressed in E. coli is not physiologically relevant and not the main focus of this work, we have removed the E. coli co-IP results from this manuscript as suggested by reviewer 3.

      • Lines 222-225 (Fig. 4A): I can't see C-1-Tde1(M)-sfGFP in the cellular blot. All the bands in this lane look like background bands that are also present in all other lanes. Therefore, I am not sure how the conclusion regarding this truncation's ability to be secreted was reached. Ans: We agree that C1-Tde1(M)-sfGFP is barely detectable due to its weak signal overlapping with cross-reacted bands. Since several attempts to improve the western blot quality by changing the antibody and pre-blocking with protein lysates of the vector control strain did not produce convincing results for detection of C1-Tde1(M)-sfGFP, we have rephrased the description of this result as “However, C1-Tde1(M)-sfGFP protein signal could not be unambiguously determined in the cellular fraction due to the overlapping of its predicted protein band with cross-reacted proteins, and no corresponding C1-Tde1(M)-sfGFP band was detected in the extracellular fraction.” (line 234-237).

      • Fig. 4A: the protein names above the lanes should include the sfGFP that is fused to them. Ans: Amended.

      • It would be preferable to show quantitative competition assays with statistics rather than pictures of a plate showing a single competition result, if conclusions or observations on minor differences in toxicity are made (for example, line 253: "The killing activity of Δtdei(Tde1GLGL-Tdi1) was largely compromised"). Since the authors performed each competition assay more than once, these data should be available to them. Ans: Amended. We have repeated the interbacterial competition experiments, including the single G39L and G43L variants, for multiple biological repeats (see details in the legends of Fig. 5A, 5B). The quantitative data with statistical analysis were added, and they show no statistical difference between any glycine zipper mutant and Tde1(M) or the same variants expressed in the T6SS mutant. Thus, there is no detectable antibacterial activity of the glycine zipper mutants against either E. coli or A. tumefaciens siblings.

      • Fig. 5A: The authors claim at the beginning of the manuscript (first results section heading: "Tde1 can cause DNase-independent growth inhibition of prey cells") that the N-terminal region of Tde1 is toxic on its own in the prey cell, yet in this competition assay Tdi1(M) shows no toxicity against the E. coli target cells. In the microscopy assay (Fig. 4B), it appears that a lot of Tdi1(M) enters the prey cell, since we can visualize it under the microscope. Can the authors clarify this discrepancy and explain why they do not expect to see target killing by this mutant even though they claimed it is toxic earlier? Ans: As described in the earlier response, N-Tde1 and Tde1(M) can exhibit toxicity upon ectopic expression in E. coli. We mainly used this ectopic expression assay to dissect the region and motif contributing to the toxicity. In contrast, during interbacterial competition Tde1(M) may permeabilize the cytoplasmic membrane only transiently, as the final destination is the cytoplasm, where wild-type Tde1 but not Tde1(M) exerts DNase toxicity. Thus, the toxicity of N-Tde1 and Tde1(M) can only be observed when the proteins are continuously produced in the cytoplasm. The role of N-Tde1, specifically the glycine zipper motifs, is to mediate Tde1 translocation across the inner membrane rather than to exert toxicity in the context of interbacterial competition.

      • Fig. 5B: the GLGL mutant seems to have some residual toxicity, not dissimilar to what is shown in 5A. Why are these similar results interpreted differently (in 5A they are "largely compromised", while in 5B "killing activity... was not detectable")? Also, why was Tde(M)1-Tdi1 used in Fig. 5A but Tdi1(M) without the immunity gene used in Fig. 5B? Ans: As described above, to better quantify the interbacterial competition outcomes, we have repeated the interbacterial competition experiments and used Tde(M)1-Tdi1 instead of Tdi1(M) for at least three biological replicates. The quantitative data with statistical analysis were carried out to clarify this ambiguity (Fig. 5A, 5B).

      • Fig. 5: Does the remaining third effector, Tae, not play a role in these competition assays? If, as shown in Fig. 5C, the entire T6SS is less active when a GLGL mutant is expressed, couldn't the difference in toxicity shown in Figs. 5A-B be the result of lack of Tae secretion and toxicity?

      Ans: As described above, the Tae effector has little impact on the interbacterial competition outcome. The quantitative interbacterial competition results (Fig. 5A, 5B) also clarify the ambiguity, because the single G39L and G43L variants are expressed and secreted at similar or even higher levels than wild-type Tde1 but have no detectable antibacterial activity against either E. coli or A. tumefaciens 1D1609.

      • Lines 359-362: T6SS effectors that bind the inner Hcp tube were suggested to be only partially folded. Ans: Amended.

      Reviewer #1 (Significance (Required)):

      The concept of T6SS effectors providing their own mechanism of transport from the cytoplasm to the periplasm is very interesting. It will appeal to audience in a wide range of microbiology disciplines, including those interested in toxins, membrane transport, and even translational applications. A similar concept was recently proposed and demonstrated for a domain that is also found in T6SS effectors (Atanaskovic et al., mBio, 2022).

      Expertise: I have been studying the different aspects of T6SS for the past decade.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This manuscript is focused on understanding how the Agrobacterium tumefaciens T6SS effector, Tde1, is translocated across the cell envelope of target cells and how this effector binds to the adapter Tap1. The authors show that GxxxG motifs in the N terminal region of Tde1 are required for delivery into the cytoplasm of target cells and permeabilising the cytoplasmic membrane. Given that these GxxxG motifs resemble glycine zipper structures that are found in proteins involved in membrane channel formation, the authors propose that these Tde1 motifs are involved in channel formation in the target cell. The authors also show that the N terminal region of Tde1 binds to Tap1 to facilitate loading onto the T6SS machinery but that the GxxxG motifs are not involved in this binding. Overall the manuscript was easy to read and followed a logical presentation of the findings. There are a few major comments that this reviewer has below - addressing these would allow the authors' claims to be more robustly supported. Ans: Thank you very much for the positive comments and valuable suggestions. We hope that the revised manuscript, with its new data and more careful interpretation, has substantiated the conclusions and proposed mechanisms.

      Major comments:

      1. Fig 1B: Why is this such a short growth experiment (5 hrs total with 2 hr pre and 3 hrs post induction)? Reporting on a growth experiment would normally be at least until the cells reach stationary phase but here the cells are still clearly in exponential phase. This reviewer would query what happens to growth rate in later exponential growth and into stationary phase? Is the toxic effect lessened in later stages of growth? Ans: We have indeed performed the growth curve analysis over a longer time period. However, we noted that growth at later time points is not always consistent, and our interpretation is that the continuous expression of toxins may lead to the selection of mutants. Since the 3 or 4 hr time period already showed the toxicity phenotype, we have focused on this time frame for the growth experiments.

      2. It is indeed surprising that C2-Tde1(WT) does not inhibit growth despite it having a functional DNase domain and being expressed in the cytoplasm. Did the authors confirm that this protein variant was expressed by Western blot or other means? This should be done to confirm that this variant is indeed not impacting upon growth instead of it not impacting growth simply because it is not being expressed.

      Ans: Amended. All Tde1 variants including C2-Tde1 are expressed (data included in Fig S1)

      1. The letters used to report significance are not clear to this reviewer. The authors say that "The significant differences were shown by the different letters (p value

      For all fluorescence microscopy experiments how many fields of view were imaged for each biological replicate? Were the fields selected at random or was the field selection biased to what was present in the field before taking the image? The answers to all of these questions should be stated in the methods. Also the microscopy data presented in the manuscript is not quantitative. Quantification of the number of cells with PI vs Hoechst signal (in Fig 2C) and mcherry vs gfp signal (in Fig 4B) for all fields of view and for all biological replicates would be very informative and convince the reader that the authors have not just "cherry picked" the images they are showing in the manuscript. This could be performed manually or the authors could use the freely available image analysis program Fiji (https://imagej.net/software/fiji/) to perform these analysis in a semi-automated manner.

      Ans: The numbers of images and experiments are now described in the figure legends, and the quantitative data are included (Fig. 2C).

      1. For the co-IP experiments in Fig 3 where interaction between HA tagged Tde1 and Tap1 is demonstrated the authors should also show that Tap1 does not interact with a different HA-tagged protein i.e. that the interaction is specific to Tde1 and not the HA motif. Ans: All Tde1 variants were tagged with HA. As shown in Fig. 3A, Tap1 was not co-precipitated by C2-Tde1 and C1-Tde1(M), indicating that Tap1 specifically interacts with the N-terminal region of Tde1.

      For all Western blot images there should be at least 2 protein standard markers present in each individual blot - i.e. for Fig 3A and B the bottom panel showing Tap1 detection only has the 35 kDa marker, it should have at least one more marker in it. The same is true for other panels in Fig 2, 3 and 4. Having at least two molecular weight markers in a panel is now standard for most journals when presenting Western blot images. Ans: Amended. We have now included the full gel of western blot results in Fig. S3 and S5 of those shown in main figures.

      For the competition assay serial dilution images in Fig 5A-B the images are a nice way to visually represent the experimental outcome but they should accompany graphs showing the competitive index of CFU/ml of the input prey and attacker vs the output prey and attacker for all biological replicates. This will convince the reader that the authors had equivalent amounts of the prey and the attacker going into the experiment and also that all attackers grew at the same rate and so were equally able to target the prey cell. This quantification could also provide more convincing out competition of ID1609 prey by C58 attacker (Fig 5B). Ans: Amended. As indicated above, we have repeated the interbacterial competition experiments for at least three biological replicates and now show the quantitative data with statistical analysis (Fig. 5A, 5B).
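      The quantification requested here can be illustrated with a minimal sketch. It assumes one common definition of the competitive index, (attacker/prey at output) divided by (attacker/prey at input); the manuscript's exact formula is not stated in this exchange, and the function names and colony counts below are hypothetical illustrations, not data from the study.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Convert a colony count on one dilution plate into CFU/ml."""
    return colonies * dilution_factor / plated_volume_ml


def competitive_index(attacker_in, prey_in, attacker_out, prey_out):
    """Assumed definition: (attacker/prey at output) / (attacker/prey at input).

    A value near 1 means the attacker:prey ratio did not change; values
    well above 1 mean the prey was outcompeted during co-culture.
    """
    return (attacker_out / prey_out) / (attacker_in / prey_in)


# Hypothetical counts for a single biological replicate.
attacker_in = cfu_per_ml(colonies=30, dilution_factor=1e5, plated_volume_ml=0.01)
prey_in = cfu_per_ml(colonies=30, dilution_factor=1e5, plated_volume_ml=0.01)
attacker_out = cfu_per_ml(colonies=50, dilution_factor=1e6, plated_volume_ml=0.01)
prey_out = cfu_per_ml(colonies=5, dilution_factor=1e4, plated_volume_ml=0.01)

ci = competitive_index(attacker_in, prey_in, attacker_out, prey_out)
print(f"input attacker:prey ratio = {attacker_in / prey_in:.2f}")
print(f"competitive index = {ci:.1f}")
```

      Reporting the input ratio alongside the index addresses the reviewer's first concern (equivalent starting amounts), while the output attacker CFU/ml for each strain addresses the second (comparable attacker growth).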

      Minor comments:

      Line 40: should read "...demonstrate that the effector itself..." Ans: The sentence has been rephrased (line 40) .

      Line 41: "...we propose..." instead of "...we proposed..." since present tense makes more sense for this statement.

      Ans: Amended (line 42).

      Line 51: "Each specialized protein secretion system" instead of "Each of...." Ans: Amended (line 52).

      Line 76: "A glycine zipper structure..."

      Ans: Amended (line 83).

      Line 79: "For example..."

      Ans: Amended (line 86).

      Lines 96-100: The present tense should be used here as the current usage of past tense implies that this has been done in previous work and not in the current study - eg "we revealed", "we showed" would be better as "we reveal", "we show".

      Ans: Thanks for the advice. We have made changes throughout the manuscript.

      Fig 5B - The competition assay serial dilution images look a bit blurry, are there images the authors could use that are not blurry?

      Ans: Amended. As indicated above, we now show quantitative data with statistical analysis (Fig. 5A, 5B).

      Reviewer #2 (Significance (Required)):

      This work is significant in that, while a great deal is known about how T6SS effectors cause toxicity, less is known about how these effectors are loaded onto the T6SS machinery, and very little is known about how T6SS effectors are able to translocate across the cytoplasmic membrane of target cells to reach a cellular component that is in the cytoplasm. This work would be of wide general interest to researchers in the T6SS field as well as those interested in bacterial secretion systems.

      Reviewer expertise key words: Molecular microbiology, T6SS, interbacterial competition

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      EVIDENCE, REPRODUCIBILITY AND CLARITY

      Summary:

      In this work, Ali et al. demonstrate that the N-terminal GxxxG motif of the T6SS DNase effector Tde1 of Agrobacterium tumefaciens is required for interbacterial intoxication. Using a combination of cell viability, reporter, and microscopy assays, the authors demonstrate that over-expression of the N-terminus of Tde1 results in inner membrane permeability. Moreover, the authors show that both the interaction between Tde1 and its adaptor Tap1 as well as the T6SS-mediated secretion of Tde1 are dependent on the N-terminus of Tde1. Finally, using a combination of in vitro and in vivo experiments, the authors determine that the N-terminal GxxxG motif is essential to Tde1-dependent interbacterial killing by enabling effector entry into competing bacterial cells.

      Major comments:

      If N-tde1 is 1-97 aa, the predicted size is 9 kDa, but it shows up as ~17 kDa? Can the authors comment on this? Does N-tde1 or tde1 dimerize? Ans: The theoretical Mw of N-Tde1-HA is 10.64 kDa, and it indeed migrated at a higher position (~17 kDa). Notably, full-length Tde1, with a theoretical size of 29.5 kDa, migrated more slowly in SDS-PAGE, with an observed size of ~36 kDa, as observed previously (Ma et al., 2014 doi:10.1016/j.chom.2014.06.002). Similarly, the full-length HA-tagged Tde1(M), with a theoretical size of 30.89 kDa, migrated at ~38 kDa. Since the protein samples analyzed by SDS-PAGE included reducing agent, we cannot exclude the possibility that Tde1 or N-Tde1 may form a dimer or oligomer that was disrupted by SDS-PAGE, but it does not appear to form a dimer on SDS-PAGE.

      I have many concerns with the data and conclusions drawn from the data in Fig. 3B. I recommend removing it since (1) the data are not accurately represented in the text and (2) it is difficult to ascertain whether biologically relevant conclusions can be drawn from what happens with Agrobacterium proteins in E. coli. Below is a summary of my concerns regarding this section: I disagree with the authors' statements in lines 191-198. Their pulldown with E. coli is not consistent with their pulldown in C58. In fact, given the expression problems of some of the constructs in E. coli, I believe the data shown in Fig. 3B is inconclusive. The amount of Tap1 that co-IP'ed with N-Tde1GLGL and Tde1(M) is very low even though the expression levels of N-Tde1GLGL and Tde1(M) were relatively strong. Therefore, I do not feel confident concluding that these proteins "interact". Secondly, Tde1(M)GLGL was not expressed in E. coli, so no conclusions can be drawn. Moreover, the C1 and C2 variants were also not expressed well, so I believe the authors' statement in line 191-192: "Similar to the results in A. tumefaciens, the N-Tde1 and Tde1(M) interacted with Tap1 but not the C-terminal variants", is unjustified. You cannot rule out that C1 and C2 do not interact with Tap1 because C1 and C2, like Tde1(M)GLGL, were not expressed well in E. coli. Ans: We agree with the reviewer that the E. coli co-IP result is not conclusive due to the low expression and instability of the proteins, mostly during cell lysis and purification, and that it provides little additional information beyond the co-IP data from A. tumefaciens. Because the interaction between Tde1 variants and Tap1 when expressed in E. coli is not physiologically relevant and not the main focus of this work, we have removed the E. coli co-IP results from this manuscript.

      Lines 211-214: It looks like C1-Tde1(M) inhibits T6SS secretion. I am aware that in Agrobacterium, it has been shown that effector loading is essential for secretion, but then why does the pTrc200 secrete Hcp? Also, in Fig. 4B, a strain expressing C1-Tde1(M) now secretes Hcp. Ans: Thanks for noting our previous finding that Tde loading is critical for secretion. Our data are indeed supportive of the effector loading in activating T6SS as only very low levels of Hcp secretion could be detected from the strain containing vector only or C1-Tde1(M). In our previous paper (Wu et al., 2020 https://doi.org/10.15252/embr.201947961), there is either little or no detection of Hcp secretions when effectors are not loaded, indicating that effector loading is important but not essential for Hcp secretion. Because overexpression of VgrG can also activate T6SS secretion in the absence of effector loading (Bondage et al., 2016 doi:10.1073/pnas.1600428113), we think the low level secretion under certain conditions could be caused by some cells with higher levels of VgrG protein concentration but more work is required to elucidate the underlying mechanisms.

      Minor comments:

      Fig. 2B could benefit from better labeling to indicate that most strains lack lacY. Also, why is BW25113 WT showing such a low OD420 if it has LacY? Or is WT without lacZ? Please clarify.

      Ans: We apologize for not labeling clearly. The BW25113 strain lacks lacZ, therefore all the ∆lacY strains were complemented with a plasmid encoding lacZ (pYTA-lacZ). We have now added the labels to avoid confusion (Fig. 2B).

      Reviewer #3 (Significance (Required)):

      SIGNIFICANCE

      It has been known for over a decade that T6SS effectors have both periplasmic and cytosolic targets (e.g., cell wall and DNA). However, it remains unclear (1) where within the target cell T6SS effectors are delivered and (2) once delivered, how effectors reach their intracellular target site. In this work, Ali et al. demonstrate that for Tde1, the N-terminal GxxxG motif is essential for Tde1 to reach its target (DNA). The authors identified Tde1 homologs in several bacteria, suggesting that this model may be relevant across a wide range of bacteria. Additional research is needed to (1) determine whether Tde1 is originally secreted into the periplasm and (2) understand how non-Tde1/non-GxxxG effectors reach their target site.

    1. Author Response

      Reviewer #2 (Public Review):

      This is an interesting study investigating the effects of sensory conflict on rhythmic behaviour and gene expression in the sea anemone Nematostella vectensis. Sensory conflict can arise when two environmental inputs (Zeitgeber) that usually act cooperatively to synchronize circadian clocks and behaviour, are presented out of phase. The clock system then needs to somehow cope with this challenge, for example by prioritising one cue and ignoring the other. While the daily light dark cycle is usually considered the more reliable and potent Zeitgeber, under some conditions, daily temperature cycles appear to be more prominent, and a certain offset between light and temperature cycles can even lead to a breakdown of the circadian clock and normal daily behavioural rhythms. Understanding the weighting and integration of different environmental cues is important for proper synchronization to daily environmental cycles, because organisms need to distinguish between 'environmental noise' (e.g., cloudy weather and/or sudden, within day/night temperature changes) and regular daily changes of light and temperature. In this study, a systematic analysis of different offsets between light and temperature cycles on behavioural activity was conducted. The results indicated that several degrees of chronic offset result in the disruption of rhythmic behaviour. In the 2nd part of the study the authors determine the effect of sensory conflict (12 hr offset that leads to robust disruption of rhythmic behaviour) on overall gene expression rhythms. They observe substantial differences between aligned and offset conditions and conclude a major role for temperature cycles in setting transcriptional phase. While the study is thoroughly conducted and represents an impressive amount of experimental and analytical work, there are several issues, which I think question the main conclusions.
The main issue being that temperature cycles by themselves do not seem to fulfil the criteria for being considered a true Zeitgeber for the circadian clock of Nematostella.

      Major points:

      Line 53: 'However, many of these studies did not compare more than two possible phase relationships.....'. Harper et al. (2016) did perform a comprehensive comparison of different phase relationships between light and temperature Zeitgebers (1 hr steps between 2 and 10 hr offsets), similar to the one conducted here. I think this previous study is highly relevant for the current manuscript and -- although cited -- should be discussed in more detail. For example, Harper et al. show that during smaller offsets temperature is the dominant Zeitgeber, and during larger sensory conflict light becomes the dominant Zeitgeber for behavioural synchronization. Only during a small offset window (5-7 hr) behavioural synchronization becomes highly aberrant, presumably because of a near breakdown of the molecular clock, caused by sensory conflict. Do the authors see something similar in Nematostella? Figure 3 suggests otherwise, at least under entrainment conditions, where behaviour becomes desynchronized only at 10 and 12 hr offset conditions. But in free-run conditions behaviour appears largely AR already at 6 hr offset, but not so much at 4 and 8 hr offsets (Table 2). So there seems to be at least some similarity to the situation in Drosophila during sensory conflict, which I think is worth mentioning and discussing.

      We have added a more detailed discussion of our results in the context of Harper et al. 2016 (L468-476).

      Line 111: The authors state that 14-26C temperature cycle is 'well within the daily temperature range experienced by the source population'. To me this is surprising, as I was not expecting that water temperature changes that much on a daily basis. Is this because Nematostella live near the water surface, and/or do they show vertical daily migration? Also, I do not understand what is meant by '...range of in situ diel variation (of temperature)'. I think a few explanatory words would be helpful here for the reader not familiar with this organism.

      In fact, one of our motivations for studying temperature is that Nematostella naturally experience extreme temperature variation. The data we cite (Tarrant et al. 2019) are from in-situ water measurements. Nematostella live in extremely shallow water (in salt marshes), and the local population in Massachusetts experience wide swings in temperature due to the temperate latitude.

      We have added this information to the Introduction (L88-90), and we also added a discussion of Nematostella’s ecology in the Discussion section (L591-654).

      Lines 114-117: I was surprised that clock genes can basically not be synchronized by temperature cycles alone. Only cry2 cycled during temperature cycles but not in free-run, so the cry2 cycling during temperature cycles could just be masking (response to temperature). Later the authors show robust molecular cycling during combined LD and temperature cycles (both aligned and out of phase), indicating that LD cycles are required to synchronize the molecular clock. Moreover, a previous study has demonstrated that LD cycles alone (i.e., at constant temperature) are able to induce rhythmic molecular clock gene expression (Oren et al. 2015). Similarly, the free running behaviour after temperature cycles does not look rhythmic to me. In Figure 2A, 14-26C there is at best one peak visible on the first day of DD, and even that shows a ~6 phase delay compared to the entrained condition. After the larger amplitude temperature cycle (8:32C) behaviour looks completely AR and peak activity phases in free-run appear desynchronized as well (Fig. 2B). Overall, I think the authors present data demonstrating that temperature cycles alone are not sufficient to synchronize the circadian clock of Nematostella. One way to test whether the clock can be entrained is to perform T-cycle experiments, so changing the thermoperiod away from 24 hr (e.g., 10 h warm : 10 h cold). If in a series of different T-cycles the peak activity always matches the transition from warm to cold (as in 12:12 T-cycles shown in Fig. 1A) this would speak against entrainment and vice versa.

      Thank you for these thoughtful comments and constructive suggestions. We have conducted an additional experiment, which provides further evidence that temperature cycles can, in fact, synchronize the circadian clock. To do this, we measured the behavior of animals entrained in cycles with a short (12h) period, half the length of a circadian period. This takes advantage of a phenomenon called “frequency demultiplication”, in which organisms in 12h environmental cycles display both 12h and 24h components--essentially, the clock perceives every other cycle as a “day” (Bruce, 1960; Merrow et al., 1999). The important thing is that the 24h behavioral component can only occur if the signal is entraining a circadian clock—otherwise, we would only observe a directly-driven 12h behavior pattern.

      We first show that this phenomenon occurs with 6:6 LD cycles—which we expected, because we know light is a zeitgeber. We then show that animals entrained to a temperature cycle with a 12h period also display 24h behavioral rhythms—and in fact the 24h component is stronger than the 12h component. We believe this is strong evidence that temperature is a bona fide zeitgeber in this system. This experiment is now explained in the Results (L127-154) and in Figure 2–Figure supplement 1.
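      The logic of the frequency demultiplication test can be sketched numerically. The example below is only an illustration on synthetic activity traces: it compares the spectral power at the 12h and 24h periods using a single-frequency Fourier projection. The function name, sampling interval, and amplitudes are assumptions made for this sketch and are not the analysis pipeline or data used in the study.

```python
import math


def power_at_period(series, dt_h, period_h):
    """Squared amplitude of the sinusoidal component with the given period.

    Projects the mean-subtracted series onto a cosine and a sine at that
    period (a single-frequency discrete Fourier transform).
    """
    n = len(series)
    mean = sum(series) / n
    w = 2 * math.pi / period_h
    re = sum((x - mean) * math.cos(w * i * dt_h) for i, x in enumerate(series))
    im = sum((x - mean) * math.sin(w * i * dt_h) for i, x in enumerate(series))
    return (2 / n) ** 2 * (re * re + im * im)


dt = 0.5  # hypothetical sampling interval in hours
hours = [i * dt for i in range(int(96 / dt))]  # four days of recording

# Directly driven behaviour: activity simply follows the 12h environmental cycle.
driven = [math.cos(2 * math.pi * t / 12) for t in hours]

# Demultiplied behaviour: a 12h component plus a stronger 24h component,
# as expected if the 12h cycle is entraining a circadian clock.
demultiplied = [
    math.cos(2 * math.pi * t / 12) + 2 * math.cos(2 * math.pi * t / 24)
    for t in hours
]

for name, trace in (("driven", driven), ("demultiplied", demultiplied)):
    p12 = power_at_period(trace, dt, 12)
    p24 = power_at_period(trace, dt, 24)
    print(f"{name}: power at 12h = {p12:.2f}, power at 24h = {p24:.2f}")
```

      In this toy setting, only the second trace carries power at 24h, which mirrors the argument that a 24h component under a 12h cycle cannot arise from a purely driven response.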

      In terms of our original data, the reviewer is correct that the statistically detectable free-running rhythms were weak and not visually obvious. Our confidence in thermal entrainment came from the fact that some individual animals had 24h rhythmicity in free-run, even if the signal was weak in the mean time series; this suggested that temperature must be at least capable of synchronizing internal clocks. It is also important to note that even light-entrained rhythms are “noisy” in cnidarians, which is why we were not surprised that the signal was weak. We have added a discussion of this observation in L601-612.

      Lines 210-226: As mentioned above, I think it is not clear that temperature alone can synchronize the Nematostella clock and it is therefore problematic to call it a Zeitgeber. Nevertheless, Figure 3A, B, D show that certain offsets of the temperature cycle relative to the LD cycle do influence rhythmicity and phase in constant conditions. This is most likely due to a direct effect of temperature cycles on the endogenous circadian clock, which only becomes visible (measurable) when the animals are also exposed to certain offset LD cycles. My interpretation of the combined results would be that temperature cycles play only a very minor role in synchronizing the Nematostella clock (after all, LD and temperature cycles are not offset in nature), perhaps mainly supporting entrainment by the prominent LD cycles.

      With our new data (see previous point), we believe we can safely say that temperature is a zeitgeber. We are not totally clear on what is meant by “a direct effect of temperature cycles on the endogenous circadian clock.” We argue that, because we see changes in free-running behavior during certain offsets, the timing of temperature cycles must affect the internal clock in a way that persists during constant conditions—it can’t just be a direct (clock-independent) effect of temperature.

      Gene expression part: The authors performed an extensive temporal transcriptomic analysis and comparison of gene expression between animals kept in aligned LD and temperature cycles and those maintained in a 12 hr offset. While this was a tremendous amount of experimental work that was followed by sophisticated mathematical analysis, I think that the conclusions that can be drawn from the data are rather limited. First of all, it is known from other organisms that temperature cycles alone have drastic effects on overall gene expression and importantly in a clock independent manner (e.g., Boothroyd et al. 2007). Temperature therefore seems to have a substantially larger effect on gene expression levels compared to light (Boothroyd et al. 2007). In the current study, except for a few clock gene candidates (Figure 2C), the effects of temperature cycles alone on overall gene expression have not been determined. Instead the authors analysed gene expression during aligned and 12 h offset conditions making it difficult to judge which of the observed differences are due to clock independent and clock dependent temperature effects on gene expression. This is further complicated by the lack of expression data in constant conditions. I think the authors need to address these limitations of their study and tone down their interpretations of 'temperature being the most important driver of rhythmic gene expression' (e.g., line 401). At least they need to acknowledge that they cannot distinguish between clock independent, driven gene expression and potential influences of temperature on clock-dependent gene expression rhythms. Moreover, in their comparison between their own data and LD data obtained at constant temperature (taken from Oren et al. 2015), they show that temperature has only a very limited effect (if any) on core clock gene expression, further questioning the role of temperature cycles in synchronising the Nematostella clock. 
Nevertheless, I noted in Table 3 that there is a 1.5 to 3 hr delay when comparing the phase of eight potential key clock genes between the current study (temperature and LD cycles aligned) and LD constant temperature (determined by Oren et al.). To me, this is the strongest argument that temperature cycles at least affect the phase of clock gene expression, but the authors do not comment on this phase difference.

      We agree with these points about the limitations of our study, and have revised the manuscript to phrase our conclusions more carefully. We still think it is reasonable to observe that temperature was a stronger driver of gene expression than light in our study, but this may not be true in other contexts.

      In terms of the comparison with Oren et al. 2015, we didn’t want to over-interpret these results because there are other differences between the studies (L1181-1185), including the use of a different source population. In addition, we would prefer denser sampling (2h time points rather than 4h) and larger sample sizes to make claims about phase differences.

      Network analysis: This last section of the results was very difficult to read and follow (at least for me). For example, do the colours in Figure 6A correspond to those in Figure 6B, C? A legend for each colour, i.e., which GO terms are included in each colour would perhaps be helpful. As mentioned above, I also do not think we can learn a lot from this analysis, since we do not know the effects of temperature cycles alone and we have no free-run data to judge potential influence on clock controlled gene expression. Under aligned conditions genes are expressed at a certain phase during the daily cycle (either morning to midday, or evening to midnight), which interestingly, is very similar to temperature cycle-only driven genes in Drosophila (Boothroyd et al. 2007). Inverting the temperature cycle has drastic effects on the peak phases of gene expression, but not so much on overall rhythmicity. But since no free-run data are available, we do not know to what extent these (expected) phase changes reflect temperature-driven responses, or are a result of alterations in the endogenous circadian clock.

      We have revised and streamlined this section and Fig. 6, including removing panel 6C. The colors do correspond across panels in the figure. For space, GO terms of select modules are included in Fig. 6, and GO results for all modules are included in the Supplemental Data and discussed in the Results.

      It is true that we can’t distinguish temperature-driven versus clock effects here, and it does seem like many modules simply follow the temperature cycle (which we say in this section). The most interesting finding from this section is probably that the co-expression structure (correlations between rhythmic genes) is substantially weakened during SC, and we do discuss certain modules of genes that lose or gain rhythmicity. We have revised this section to focus on the main points and have cut several of the less pertinent results.

      Reviewer #3 (Public Review):

      This article reflects a significant effort by the authors and the results are interesting.

      For the third set of experiments, are temperature and light really out of synch? While peak in temperature no longer occurs along with lights on, we do still have two 24 hour cycles where changes in the environmental cues still occur simultaneously (lights on with peak in temperature, lights off with min in temperature). I wonder what would happen if light remained at a 24 hour cycle and temperature became either sporadic (randomly changing cycles) or was placed on a longer cycle altogether (temperature taking 20 hours to increase from min to max, and then another 20 hours to go from max to min).

      Thank you for your interesting suggestions for future experiments. This point is addressed in our revisions responding to Reviewer #1, who requested a discussion of the phrase “sensory conflict.” We agree that the binary “in-sync vs. out-of-sync” may be too simplistic. Our original conception of sensory conflict was a situation in which light and temperature provide different phase information, as informed by experiments with only light (prior literature) or only temperature (this work).

      In our revised manuscript, we discuss the idea that “sensory conflict” is not always a useful framework because there are many possible relationships between light and temperature. Although our 12h offset is certainly less “natural” than our aligned time series, it may be useful to think of them simply as 2 different possible light and temperature regimes in which the two signals interact, rather than abstract ideals of “aligned” or “misaligned.”

      An area that could significantly benefit a broader readership would be to improve overall clarity of figures and rethink if all the results are necessary to convey the key findings of the paper. As written, the results section is somewhat confusing.

      We have revised Figs. 1 and 6 for clarity, and we have also shortened the network analysis portion of the Results.

    1. Evaluation 3


      Ratings and predictions

      Ratings (1-100)

      • Overall assessment: 65 Confidence: Medium
      • Advancing knowledge and practice: 70 Confidence: Medium
      • Methods: Justification, reasonableness, validity, robustness: Not qualified
      • Logic & communication: 80 Confidence: Medium-to-high
      • Open, collaborative, replicable: Not qualified
      • Relevance to global priorities: 80 Confidence: High

      Journal predictions (1-5)

      • What ‘quality journal’ do you expect this work will be published in? 3.5 Confidence: Medium
      • On a ‘scale of journals’, what tier journal should this be published in? 3.5 Confidence: Medium

      Written report

      I am a political scientist specializing in science policy (i.e., how expertise and knowledge production influences the policymaking process and vice-versa), with a focus on “decision making under conditions of uncertainty,” R&D prioritization, and the governance of systemic and catastrophic risk. With respect to the various categories of expertise highlighted by the authors, I can reasonably be considered a “policy analyst.”

      Potential conflict of interest/source of bias: one of the authors (Dr. Anders Sandberg) is a friend and former colleague. He was a member of my PhD dissertation committee.

      A quick further note on the potential conflict of interest/bias of the authors (three of the four are associated with ALLFED, which, as the authors note, could stand to benefit financially from the main implication of their analysis - that significant funding be allocated to resilient food research in the short-term). In my opinion, this type of “self-advocacy” is commonplace and, to some extent, unavoidable. Interest and curiosity (and by extension, expertise) on a particular topic motivates deep analysis of that topic. It’s unlikely that this kind of deep analysis (which may or may not yield these sorts of “self-confirming” conclusions/recommendations) would ever be carried out by individuals who are not experts on - and often financially implicated in - the topic. I think their flagging of the potential conflict of interest at the end of the paper is sufficient - and exercises like this Unjournal review further increase transparency and invite critical examinations of their findings and “positionality.”

      I am unqualified to provide a meaningful evaluation of several of the issues “flagged” by the authors and editorial team, including: the integration of the sub-models, sensitivity analysis, and alternative approaches to the structure of their Monte Carlo analysis. Therefore, I will focus on several other dimensions of the paper.

      Context and contribution

      This paper has two core goals: (1) to explore the value and limitations of relative long-term cost effectiveness analysis as a prioritization tool for disaster risk mitigation measures in order to improve decision making and (2) to use this prioritization tool to determine if resilient foods are more cost effective than AGI safety (which would make resilient food the highest priority area of GCR/X-risk mitigation research). As I am not qualified to directly weigh in on the extent to which the authors achieved either goal, I will reflect on the “worthiness” of these goals within the broader context of work going on in the fields of X-risk/GCR, long-termism, science policy, and public policy - and the extent to which the authors’ findings are effectively communicated to these audiences.

      Within this broader context, I believe that these are indeed worthy (and urgent) objectives. The effective prioritization of scarce resources to the myriad potential R&D projects that could (1) reduce key uncertainties, (2) improve political decision-making, and (3) provide solutions that decrease the impact and/or likelihood of civilization-ending risk events is a massive and urgent research challenge. Governments and granting agencies are desperate for rigorous, evidence-based guidance on how to allocate finite funding across candidate projects. Such prioritization is impeded by uncertainty about the potential benefits of various R&D activities (partially resulting from uncertainty about the likelihood and magnitude of the risk event itself - but also from uncertainty about the potential uncertainty-reducing and harm/likelihood-reducing “power” of the R&D). Therefore, the authors’ cost-effectiveness model, which attempts to decrease uncertainty about the potential uncertainty-reducing and harm/likelihood-reducing “power” of resilient food R&D and compare it to R&D on AGI safety, is an important contribution. It combines and applies a number of existing analytical tools in a novel way and proposes a tool for quantifying the relative value of (deeply uncertain) R&D projects competing for scarce resources.

      Overall, the authors are cautious and vigilant in qualifying their claims - which is essential when conducting analysis that relies on the quasi-quantitative aggregation of the (inter)subjective beliefs of experts and combines several models (each with their own assumptions).

      Theoretical/epistemic uncertainty

      I largely agree with the authors’ dismissal of theoretical/epistemic uncertainty (not that they dismiss its importance or relevance - simply that they believe there is essentially nothing that can be done about it in their analysis). Their suggestion that “results should be interpreted in an epistemically reserved manner” (essentially a plea for intellectual humility) should be a footnote in every scholarly publication - particularly those addressing the far future, X-risk, and value estimations of R&D.

      However, the authors could have bolstered this section of the paper by identifying some potential sources of epistemic uncertainty and suggesting some pathways for further research that might reduce it. I recognize that they are both referring to acknowledged epistemic uncertainties - which may or may not be reducible - as well as unknown epistemic uncertainties (i.e., ignorance - or what they refer to as “cluelessness”). It would have been useful to see a brief discussion of some of these acknowledged epistemic uncertainties (e.g., the impact of resilient foods on public health, immunology, and disease resistance) to emphasize that some epistemic uncertainty could be reduced by exactly the kind of resilient food R&D they are advocating for.

      Presentation of model outputs

      When effectively communicating uncertainties associated with research findings to multiple audiences, there is a fundamental tradeoff between the rigour demanded by other experts and the digestibility/usability demanded by decision makers and lay audiences. For example, this tradeoff has been well-documented in the literature on the IPCC’s uncertainty communication framework (e.g., Janzwood 2020). What fellow-modelers/analysts want/need is usually different from what policymakers want/need. The way that model outputs are communicated in this article (e.g. 84% confidence that the 100 millionth dollar is more cost-effective) leans towards rigour and away from digestibility/usability. A typical policymaker who is unfamiliar with the modeling tools used in this analysis may assume that an 84% probability value was derived from historical frequencies/trials in some sort of experiment - or that it simply reflects an intersubjective assessment of the evidence by the authors of the article. Since the actual story for how this value was calculated is rather complex (it emerges from a model derived from the aggregation of the outputs of two sub-models, which both aggregate various types of expert opinions and other forms of data) - it might be more useful to communicate the final output qualitatively.

      This strategy has been used by the IPCC to varying levels of success. These qualitative uncertainty terms can align with probability intervals. For example, 80-90% confidence could be communicated as “high confidence” or “very confident.” >90% could be communicated as “extremely confident.” There are all sorts of interpretation issues associated with qualitative uncertainty scales - and some scales are certainly more effective than others (again, see Janzwood 2020) but it is often useful to communicate findings in two “parallel tracks” - one for experts and one for a more lay/policy-focused audience.
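
      As a minimal sketch of the “parallel tracks” idea, the two probability bands named above can be mapped to their qualitative terms in a few lines of Python. The function name `confidence_label` is hypothetical, and no label is assigned below 80% because this report does not propose one.

```python
from typing import Optional


def confidence_label(probability: float) -> Optional[str]:
    """Map a model probability to the qualitative terms suggested above.

    Hypothetical helper: only the two bands named in the text are covered.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability > 0.90:
        return "extremely confident"
    if probability >= 0.80:
        return "high confidence"
    # Bands below 80% are not specified in the text, so no label is invented.
    return None


# The 84% figure is the model output quoted earlier in this report.
print(confidence_label(0.84))
```

      Reporting both the number and the label (for example, “84%, high confidence”) would keep the expert track intact while giving policy readers a term they can act on.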

      Placing the article’s findings within the broader context of global priorities and resource allocation

      Recognizing the hard constraints of word counts - and that a broader discussion of global priorities and resource allocation was likely “out of scope” - this article could be strengthened (or perhaps simply expanded upon in future work) by such a discussion. The critical piece of context is the scarcity of resources and attention within the institutions making funding decisions about civilization-saving R&D (governments, granting organizations, private foundations, etc.). There are two dimensions worth discussing here. First, R&D activities addressing risks that are generally considered low-probability/high-impact with relatively long timelines (although I don’t think the collapse of global agriculture would qualify as low-risk - nor is the likely timeline terribly long - but those are my priors) are competing for scarce funding/attention against R&D activities addressing lower-impact risks believed to be shorter-term and more probable (e.g., climate change, the next pandemic, etc.). I think most risk analysts - even hardcore “long-termists” - would agree that an ideal “R&D funding portfolio” be somewhat diversified across these categories of risk. It is important to acknowledge the complexity associated with resource allocation - not just between X-risks but between X-risks and other risks.

      Second, there is the issue of resource scarcity itself. On the one hand, there are many “high value” candidate R&D projects addressing various risks that societies can invest in - but only a finite amount of funding and attention to allocate between them. So, these organizations must make triage decisions based on some criteria. On the other hand, there are also a lot of “low” or even “negative value” R&D activities being funded by these organizations - in addition to other poor investments - that are providing little social benefit or are actively increasing the likelihood/magnitude of various risks. I believe that it is important in these sorts of discussions about R&D prioritization and resource scarcity to point out that the resource pool need not be this shallow - and to identify some of the most egregious funding inefficiencies (e.g., around fossil fuel infrastructure expansion). It should go without saying, but ideally, we could properly resource both resilient food and AGI safety research.


      Evaluator details

      1. Name: Scott Janzwood
      2. How long have you been in this field? 10 years
      3. How many proposals and papers have you evaluated? ~25 proposals, ~10 papers
    2. Evaluation 1


      Ratings and predictions

      Ratings (1-100)

      • Overall assessment: 40 CI: 20-60
        • Comment: See main review
      • Advancing knowledge and practice: 30 CI: 20-60
        • Comment: The paper itself makes an important argument about resilient foods, but I don’t know if the additional element of AGI risk adds much to Denkenberger & Pearce (2016)
      • Methods: Justification, reasonableness, validity, robustness: 50 CI: 40-60
        • Comment: Very major limitations around the survey method, and implementation of certain parts of the parameter sensitivity analysis. However, many elements are of a high standard
      • Logic & communication: 60 CI: 40-75
        • Comment: Major limitations around the logic and communication of the theoretical model of cost-effectiveness used in the paper. Minor limitations of readability and reporting which could have been addressed before publication (such as reporting 95% CIs without medians, and not reporting overall cost and benefit estimates)
      • Open, collaborative, replicable: 70 CI: 40-75
        • Comment: Provided models are shared with any reader who asks, I couldn’t ask for more here. Limitations of survey replicability (particularly E model) prevent perfect score
      • Relevance to global priorities: 90 CI: 60-95
        • Comment: I’d be surprised if I ever read a paper with more relevance to global priorities, although as mentioned there are a few versions of this argument circulating such as Denkenberger & Pearce (2016)

      Journal predictions (1-5)

      • What ‘quality journal’ do you expect this work will be published in? 2 CI: 1-2
      • On a ‘scale of journals’, what tier journal should this be published in? 2 CI: 1-2

      Written report

      This is a very interesting paper on an important and neglected topic. I’d be surprised if I ever again read a paper with such potential importance to global priorities. The authors motivate the discussion well, and should be highly commended for their clear presentation of the structural features of their model, and the thoughtful nature in which uncertainty was addressed head-on in the paper.

      Overall, I suspect the biggest contribution this paper will make is contextualising the existing work done by the authors on resilient food into the broader literature of long-termist interventions. This is a significant achievement, and the authors should feel justifiably proud of having accomplished it. However, the paper unfortunately has a number of structural and technical issues which should significantly reduce a reader’s confidence in the quantitative conclusions which aim to go beyond this contextualisation.

      In general, there are three broad areas where I think there are material issues with the paper:

      1. The theoretical motivation for their specific philosophy of cost-effectiveness, and specifically whether this philosophy is consistent throughout the essay
      2. The appropriateness of the survey methods, in the sense of applying the results of a highly uncertain survey to an already uncertain model
      3. Some specific concerns with parameterisation

      None of these concerns touch upon what I see as the main point of the authors, which I take to be that ‘fragile’ food networks should be contextualised alongside other sources of existential risk. I think this point is solidly made, and important. However, they do suggest that significant additional work may be needed to properly prove the headline claim of the paper, which is that in addition to being a source of existential risk the cost-effectiveness of investing in resilient food is amongst the highest benefit-per-cost of any existential risk mitigation.

      Structure of cost-effectiveness argument

      One significant highlight of the paper is the great ambition it shows in resolving a largely intractable question. Unfortunately, I feel this ambition is also something of a weakness of the paper, since it ends up difficult to follow the logic of the argument throughout.

      • Structurally, the most challenging element of this paper in terms of argumentative flow is the decision to make the comparator for cost-effectiveness analysis ‘AGI Catastrophe’ rather than ‘do nothing’. My understanding is that the authors make this decision to clearly highlight the importance of resilient food – noting that, “if resilient foods were more cost effective than AGI safety, they could be the highest priority [for the existential risk community]” (since the existential risk community currently spends a lot on AGI Risk mitigation). So roughly, they start with the assumption that AI Risk must be cost-effective, and argue that anything more cost-effective than this must therefore also be cost-effective. The logic is sound, but this decision causes a number of problems with interpretability, since it requires the authors to compare an already highly uncertain model of food resilience against a second highly uncertain model of AGI risk.
      • The biggest issue with interpretability this causes is that I struggle to understand what features of the analysis are making resilient food appear cost-effective because of some feature of resilient food, and which are making resilient food appear cost-effective because of some feature of AI. The methods used by the authors mean that a mediocre case for resilient food could be made to look highly cost-effective with an exceptionally poor case for AI, since their central result is the multiplier of value on a marginally invested dollar for resilient food vs AI. This is important, because the authors’ argument is that resilient food should be funded because it is more effective than AI Risk management, but this is motivated by AI Risk proponents agreeing AI Risk is important – in scenarios where AI Risk is not worth investing in then this assumption is broken and cost effectiveness analysis against a ’do nothing’ alternative is required. For example, the authors do not investigate scenarios where the benefit of the intervention in the future is negative because “negative impacts would be possible for both resilient foods and AGI safety and there is no obvious reason why either would be more affected”. While this is potentially reasonable on a mathematical level, it does mean that it would be perfectly possible for resilient foods to be net harmful and the paper not correctly identify that funding them is a bad idea – simply because funding AI Risk reduction is an even worse idea, and this is the only given alternative. If the authors want to compare AGI risk mitigation and resilient foods against each other without a ‘do nothing’ common comparator (which I do not think is a good idea), they must at the very least do more to establish that the results of their AI Risk model map closely to the results which cause the AI Risk community to fund AI Risk mitigation so much. As this is not done in the paper, a major issue of interpretability is generated.
      • A second issue this causes is that the authors must make an awkward ‘assumption of independence’ between nuclear risk, food security risk and AI risk. Although the authors identify this as a limitation of their modelling approach, the assumption does not need to be made if AI risk is not included as a comparator in the model. I don’t think this is a major limitation of the work, but an example of how the choice of comparator has an impact on structural features of the model beyond just the comparator.
      • More generally, this causes the authors to have to write up their results in a non-natural fashion. As an example of the sort of issues this causes, conclusions are expressed in entirely non-natural units in places (“Ratio of resilient foods mean cost effectiveness to AGI safety mean cost effectiveness” given $100m spend), rather than units which would be more natural (“Cost-effectiveness of funding resilient food development”). I cannot find expressed anywhere in the paper a simple table with the average costs and benefits of the two interventions, although a reference is made to Denkenberger & Pearce (2016) where these values were presented for near-term investment in resilient food. This makes it extremely hard for a reader to draw sensible policy conclusions from the paper unless they are already an expert in AGI risk and so have an intuitive sense of what an intervention which is ‘3-6 times more cost-effective than AGI risk reduction’ looks like. The paper might be improved by the authors communicating summary statistics in a more straightforward fashion. For example, I have spent some time looking for the probability the model assigns to no nuclear war before the time horizon (and hence the probability that the money spent on resilient food is ‘wasted’ with respect to the 100% shortfall scenario) but can’t find this – that seems to be quite an important summary statistic but it has to be derived indirectly from the model.
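
      The missing summary statistic mentioned in the last point can be approximated by hand. The sketch below assumes a constant annual probability that is independent across years, which is a simplification and not necessarily how the authors’ model treats it. The annual value of 7x10^-3 is the authors’ estimate as quoted in the Methods discussion of this report; the 50-year horizon and the name `prob_no_event` are hypothetical and chosen only for illustration.

```python
def prob_no_event(annual_probability: float, years: int) -> float:
    """Probability that an event with a constant, independent annual
    probability does not occur at all within `years` years."""
    if not 0.0 <= annual_probability <= 1.0:
        raise ValueError("annual_probability must lie in [0, 1]")
    if years < 0:
        raise ValueError("years must be non-negative")
    return (1.0 - annual_probability) ** years


ANNUAL_P_WAR = 7e-3  # annual probability of nuclear war, as quoted in this report
HORIZON_YEARS = 50   # hypothetical horizon, NOT taken from the paper

p_no_war = prob_no_event(ANNUAL_P_WAR, HORIZON_YEARS)
print(f"P(no nuclear war within {HORIZON_YEARS} years) = {p_no_war:.2f}")
```

      Under these assumed inputs the probability is about 0.70, which is the share of scenarios in which spending aimed at the 100% shortfall case would go unused. A different horizon or a time-varying annual probability would change this figure.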

      Fundamentally, I don’t understand why both approaches were not compared to a common scenario of ‘do nothing’ (relative to what we are already doing). The authors’ decision to compare AGI Risk mitigation to resilient foods directly would only be appropriate if the authors expect that increasing funding for resilient food decreased funding for AI safety (that is to say, the authors are claiming that there is a fixed budget for AI-safety-and-food-resilience, and so funding for one must come at the expense of the other). This might be what the authors have in mind as a practical consequence of their argument, as there is an implication that funding for resilient foods might come from existing funding deployed to AGI Risk. But it is not logically necessary that this is the case, and so it creates great conceptual confusion to include it in a cost-effectiveness framework that requires AI funding and resilient food funding to be strictly alternatives. To be clear, the ‘AI subunit’ is interesting and publishable in its own right, but in my opinion simply adds complexity and uncertainty to an already complex paper.
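
      To make the comparator point concrete, here is a toy sketch in Python. The net-benefit numbers are invented purely for illustration and are not drawn from the paper; both function names are hypothetical. It shows how a head-to-head comparison can favour an option that a ‘do nothing’ baseline would reject.

```python
def pairwise_winner(food: float, agi: float) -> str:
    """Head-to-head comparison with no common baseline."""
    return "resilient food" if food > agi else "AGI safety"


def best_with_baseline(food: float, agi: float) -> str:
    """Same comparison, but 'do nothing' (net benefit 0) is an option."""
    options = {"do nothing": 0.0, "resilient food": food, "AGI safety": agi}
    return max(options, key=options.get)


# Hypothetical net benefits relative to 'do nothing' (arbitrary units):
# both interventions are net harmful, AGI safety more so.
food_net, agi_net = -1.0, -5.0

print(pairwise_winner(food_net, agi_net))     # resilient food
print(best_with_baseline(food_net, agi_net))  # do nothing
```

      The pairwise result is only informative about funding decisions if at least one of the two options is already known to beat the baseline.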

      Continuing on from this point, I don’t understand the conceptual framework that has the authors consider the value of invested dollars in resilient food at the margin. The authors’ model of the value of an invested dollar is an assumption that it is distributed logarithmically. Since the entire premise of the paper hinges on the reasonability of this argument, it is very surprising there is no sensitivity analysis considering different distributions of the relationship between intervention funding and value. Nevertheless, I am also confused as to the model even on the terms the authors describe; the authors’ model appears to be that there is some sort of ‘invention’ step where the resilient food is created and discovered (this is mostly consistent with Denkenberger & Pearce (2016), and is the only interpretation consistent with the question asked in the survey). In which case, the marginal value of the first invested dollar is zero because the ’invention’ of the food is almost a discrete and binary step. The marginal value per dollar continues to be zero until the 86 millionth dollar, where the marginal value is the entire value of the resilient food in its entirety. There seems to be no reason to consider the marginal dollar value of investment when a structural assumption made by the authors is that there is a specific level of funding which entirely saturates the field, and this would make presenting results significantly more straightforward – it is highly nonstandard to use marginal dollars as the unit of cost in a cost-effectiveness analysis, and indeed is so nonstandard I’m not certain fundamental assumptions of cost-effectiveness analysis still hold. I can see why the authors have chosen to bite this bullet for AI risk given the existing literature on the cost of preventing AI Catastrophe, but there seems to be no reason for it when modelling resilient food and it departs sharply from the norm in cost-effectiveness analysis.
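
      The contrast between the two readings can be sketched numerically. The snippet below is a rough illustration under stated assumptions: `log_value` is a hypothetical logarithmic returns curve in arbitrary units (the paper’s exact functional form is not reproduced here), and `step_value` encodes the discrete ‘invention’ reading, with the 86 million dollar threshold taken from the paragraph above and total value normalised to 1.

```python
import math

THRESHOLD = 86_000_000  # funding level at which the 'invention' occurs (from the text)
TOTAL_VALUE = 1.0       # value of the invention, normalised (assumption)


def log_value(dollars: float) -> float:
    """Hypothetical logarithmic returns to funding, arbitrary units."""
    return math.log1p(dollars)


def step_value(dollars: float) -> float:
    """All value arrives at once when funding reaches the threshold."""
    return TOTAL_VALUE if dollars >= THRESHOLD else 0.0


def marginal(value_fn, nth_dollar: int) -> float:
    """Value added by the n-th dollar under a given value function."""
    return value_fn(nth_dollar) - value_fn(nth_dollar - 1)


for n in (1, THRESHOLD, 100_000_000):
    print(n, marginal(log_value, n), marginal(step_value, n))
```

      Under the step reading, every dollar except the 86 millionth has zero marginal value, including the 100 millionth dollar that the paper’s headline result refers to. Under the logarithmic reading, marginal value is positive everywhere and declines smoothly. The two readings therefore give very different answers to the same marginal question.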

      Finally, I don’t understand the structural assumptions motivating the cost-effectiveness of the 10% decline analysis. The authors claim that the mechanism by which resilient foods save lives in the 10% decline analysis is that “the prices [of non-resilient food] would go so high that those in poverty may not be able to afford food” with the implication that resilient foods would be affordable to those in poverty and hence prevent starvation. However, the economic logic of this statement is unclear. It necessitates that the production costs of resilient food are less than the production costs of substitute non-resilient food at the margin, which further implies that producers of resilient food can command supernormal profits during the crisis, which is to say the authors are arguing that resilient foods represent potentially billions of dollars of value to their inventor within the inventor’s lifetime. It is not clear to me why a market-based solution would not emerge for the ‘do nothing’ scenario, which would be a critical issue with the authors’ case since it would remove the assumption that ‘resilient food’ and ‘AGI risk’ are alternative uses of the same money in the 10% scenario, which is necessary for their analysis to function. The authors make the further assumption that preparation for the 100% decline scenario is highly correlated with preparation for the 10% decline scenario, which would mean that a market-based solution emerging prior to nuclear exchange would remove the assumption that ‘resilient food’ and ‘AGI risk’ are alternative uses of the same money in the 100% decline scenario. A supply and demand model might have been a more appropriate model for investigating this effect. Once again, I note that the supply and demand model alone would have been an interesting and publishable piece of work in its own right.

      Overall, I think the paper would have benefitted from more attention being paid to the underlying theory of cost-effectiveness motivating the investigation. Decisions made in places seem to have multiplied uncertainty which could have been resolved with a more consistent approach to analysis. As I highlighted earlier, the issues only stem from the incredible ambition of the paper and the authors should be commended for managing to find a route to connect two separate microsimulations, an analysis of funding at the margin and a supply-and-demand model. Nevertheless, the combination of these three approaches weakens the ability to draw strong conclusions from each of these approaches individually.

      Methods

      With respect to methods, the authors use a Monte Carlo simulation with distributions drawn from a survey of field experts. The use of a Monte Carlo technique here is an appropriate choice given the significant level of uncertainty over parameters. The model appears appropriately described in the paper, and functions well (I have only checked the models in Guesstimate, as I could not make the secondary models in Analytica function). A particular highlight of the paper is the figures clearly laying out the logical interrelationship of elements of the model, which made it significantly easier to follow the flow of the argument. I note the authors use ‘probability more effective than’ as a key result, which I think is a natural unit when working in Guesstimate. This is entirely appropriate, but a known weakness of the approach is that it can bias in favour of poor interventions with high uncertainty. The authors could also have presented a SUCRA analysis which does not have this issue, but they may have considered and rejected this approach as unnecessary given the entirely one-sided nature of the results which a SUCRA would not have reversed.
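
      The known weakness of ‘probability more effective than’ can be shown with a small Monte Carlo sketch using only Python’s standard library. The two interventions and all of their numbers are invented for illustration and have no connection to the paper’s models: A is a certain, modest intervention, while B is usually slightly better but occasionally strongly harmful.

```python
import random
from statistics import mean


def compare(n_draws: int = 100_000, seed: int = 0):
    """Return (P(B > A), mean of A, mean of B) for two toy interventions."""
    rng = random.Random(seed)
    # Intervention A: a certain value of 1.0 (arbitrary units).
    a_draws = [1.0] * n_draws
    # Intervention B: 60% chance of 1.1, 40% chance of -5.0 (net harmful).
    b_draws = [1.1 if rng.random() < 0.6 else -5.0 for _ in range(n_draws)]
    prob_b_better = sum(b > a for a, b in zip(a_draws, b_draws)) / n_draws
    return prob_b_better, mean(a_draws), mean(b_draws)


prob_b_better, mean_a, mean_b = compare()
print(f"P(B > A) = {prob_b_better:.2f}, mean A = {mean_a:.2f}, mean B = {mean_b:.2f}")
```

      Here B beats A in roughly 60% of draws even though its expected value is negative, so ranking by ‘probability more effective than’ alone would prefer the worse intervention. Reporting means or a rank-based summary alongside the probability guards against this.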

      The presentation of the sensitivity analysis as ‘number of parameters needed to flip’ is nonstandard, but a clever way to intuitively express the level of confidence the authors have in their conclusions. Although clever, I am uncertain if the approach is appropriately implemented; the authors limit themselves to the 95% CI for their definition of an ‘unfavourable’ parameter, and I think this approach hides massive structural uncertainty with the model. For example, in Table 5 the authors suggest their results would only change if the probability of nuclear war per year was 4.8x10^-5 (plus some other variables changing) rather than their estimate of 7x10^-3 (incidentally, I think the values for S model and E model are switched in Table 5 – the value for pr(nuclear war) in the table’s S model column corresponds to the probability given in the E model). But it is significantly overconfident to say that risk of nuclear war per year could not possibly be below 4.8x10^-5, so I think the authors overstate their certainty when they say “reverting [reversing?] the conclusion required simultaneously changing the 3-5 most important parameters to the pessimistic ends”; in fact it merely requires that the authors have not correctly identified the ‘pessimistic end’ of any one of the five parameters, which seems likely given the limitations in their data which I will discuss momentarily. I personally would have found one- and two-dimensional threshold analysis a more intuitive way to present the results, but I think the authors have a reasonable argument for their approach. As described earlier, I have some concerns that an appropriate amount of structural sensitivity analysis was undertaken, but the presentation of uncertainty analysis is appropriate in its own terms (if somewhat nonstandard).

      Overall, I have no major concerns about the theory or application of the modelling approach. However, I have a number of concerns with the use of the survey instrument:

      First, the authors could have done more to explain the level of uncertainty their survey instrument contains. They received eight responses, which is already a very low number of responses for a quantitative survey. In addition, two of the eight responses were from authors of the paper. The authors discuss ‘response bias’ and ‘demand characteristic bias’ which would not typically be applied to data generated by an approximately autoethnographic process – it is obvious that the authors of a survey instrument know what purpose the instrument is to be used for, and have incentives to make the survey generate novel and interesting findings. It might have been a good sensitivity analysis to exclude responses from the authors and other researchers associated with ALLFED since there is a clear conflict of interest that could bias results here.

      Second, issues with survey data collection are compounded by the fact that some estimates which are given in the S Model are actually not elicited with the survey technique – they are instead cited to Denkenberger & Pearce (2016) and Denkenberger & Pearce (2017). This is described appropriately in the text, but not clearly marked in the summary Table 1 where I would expect to see it, and the limitation this presents is not described clearly. To be explicit, the limitation is that at least two key parameters in the model are based on a sample of the opinions of two of the eight survey respondents, rather than the full set of eight respondents. As an aside on presentation, the decision to present lower and upper credible intervals in Table 1 rather than median is non-standard for an economics paper, although perhaps this is a discipline-specific convention I am unaware of. Regardless, I’m not sure it is appropriate to present the lowest of eight survey responses as the ‘5th percentile’, as it is actually the 13th percentile and giving 95% confidence intervals implies a level of accuracy the survey instrument cannot reach. While I appreciate the 13th percentile of 8 responses will be the same as the 5th centile of 100 samples drawn from those responses, this is not going to be clear to a casual reader of the paper. ‘Median (range)’ might be a better presentation of the survey data in this table, with better clarity on where each estimate comes from. Alternatively, the authors could look at fitting a lognormal distribution to the survey results using e.g. method of moments, and then resample from the new distribution to create a genuine 95% CI. Regardless, given the low number of responses, it might have been appropriate simply to present all eight estimates for each relevant parameter in a table.
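      The lognormal suggestion above can be sketched in a few lines of Python. This is only an illustration of a method-of-moments fit under stated assumptions: the eight `responses` are made-up values, not the survey data, and the function names are hypothetical.

```python
import math
import random
import statistics


def lognormal_mom(samples):
    """Method-of-moments fit: match the sample mean and variance to a lognormal."""
    m = statistics.fmean(samples)
    v = statistics.variance(samples)  # sample variance (n - 1 denominator)
    sigma2 = math.log(1.0 + v / (m * m))
    mu = math.log(m) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


def interval_95(mu, sigma):
    """Analytic 2.5th and 97.5th percentiles of the fitted lognormal."""
    z = statistics.NormalDist().inv_cdf(0.975)
    return math.exp(mu - z * sigma), math.exp(mu + z * sigma)


# Hypothetical survey responses (all values must be positive).
responses = [0.001, 0.002, 0.004, 0.005, 0.008, 0.01, 0.02, 0.05]

# The lowest of n responses sits at the 100/n-th percentile of the raw sample.
lowest_rank = 100.0 / len(responses)

mu, sigma = lognormal_mom(responses)
low, high = interval_95(mu, sigma)

# Resampling from the fitted distribution gives an empirical 95% interval.
rng = random.Random(0)
resampled = sorted(rng.lognormvariate(mu, sigma) for _ in range(10000))
emp_low, emp_high = resampled[249], resampled[9749]

print(f"lowest response is the {lowest_rank:.1f}th percentile of the raw sample")
print(f"analytic 95% interval: {low:.5f} to {high:.5f}")
print(f"resampled 95% interval: {emp_low:.5f} to {emp_high:.5f}")
```

      With only eight responses the fitted parameters are themselves very uncertain, so an interval produced this way should still be read with caution.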

      Third, the authors could have done more to make it clear that the ‘Expert Model’ was effectively just another survey with an n of 1. Professor Sandburg, who populated the Expert Model, is also an author on this paper and so it is unclear what if any validation of the Expert Model could reasonably have been undertaken – the E model is therefore likely to suffer from the same drawbacks as the S model. It is also unclear if Professor Sandburg knew the results of the S Model before parameterising his E Model – although this seems highly likely given that 25% of the survey’s respondents were Professor Sandburg’s co-authors. This could be a major source of bias, since presumably the authors would prefer the two models to agree and the expert parameterising the model is a co-author. I also think more work needs to be done to establish the Expert’s credentials in the field of agricultural R&D (necessary for at least some of the parameter estimates); although I happily accept Professor Sandburg is a world expert on existential risk and a clear choice to act as the parameterising ‘expert’ for most parameters, I think there may have been alternative choices (such as agricultural economists) who may have been more obviously suited to giving some estimates. There is no methodological reason why one expert had to be selected to populate the whole table, and no defence given in the text for why one expert was selected - the paper is highly multidisciplinary and it would be surprising if any one individual had expert knowledge of every relevant element. Overall, this limitation makes me extremely hesitant to accept the authors’ argument that the fact that S model and E model are both robust means the conclusion is equally robust.

      Generally, I am sympathetic to the authors’ claim that there is unavoidable uncertainty in the investigation of the far future. However, the survey is a very major source of avoidable uncertainty, and it is not a reasonable decision of the authors to present the uncertainty due to their application of survey methods as the same kind of thing as uncertainty about the future potential of humanity. There are a number of steps the authors could have taken to improve the validity and reliability of their survey results, some of which would not even have required rerunning the survey (to be clear however, I think there is a good case for rerunning the survey to ensure a broader panel of responses). With the exception of the survey, however, methods were generally appropriate and valid.

      Parameter estimates

      Notwithstanding my concerns about the use of the survey instrument, I have some object level concerns with specific parameters described in the model.

      • The discount rate for both costs and benefits appears to be zero, which is very nonstandard in economic evaluation. Although the authors make reference to “long termism, the view that the future should have a near zero discount rate”, the reference for this position leads to a claim that a zero rate of pure time preference is common, and a footnote observing that “the consensus against discounting future well-being is not universal”. To be clear, pure time preference is only one component of a well-constructed discount rate and therefore a discount rate should still be applied for costs, and probably for future benefits too. Even notwithstanding that I think this is an error of understanding, it is a limitation of the paper that discount rates were not explored, given they seem very likely to have a major impact on conclusions.
      • A second concern I have relating to parameterisation is the conceptual model leading to the authors’ proposed costing for the intervention. The authors explain their conceptual model linking nuclear war risk to agricultural decline commendably clearly, and this expands on the already strong argument in Denkenberger & Pearce (2016). However, I am less clear on their conceptual model linking approximately $86m of research to the widescale post-nuclear deployment of resilient foods. The assumption seems to be (and I stress this is my assumption based on Denkenberger & Pearce (2016) – it would help if the authors could make it explicit) that $86m purchases the ‘invention’ of the resilient food, and once the food is ‘invented’ then it can be deployed when needed with only a little bit of ongoing training (covered by the $86m). This seems to me to be an optimistic assumption; there seems to be no cost associated with disseminating the knowledge, or any raw materials necessary to culture the resilient food. Moreover, the model seems to structurally assume that distribution chains survive the nuclear exchange with 100% certainty (or that the materials are disseminated to every household which would increase costs), and that an existing resilient food pipeline exists at the moment of nuclear exchange which can smoothly take over from the non-resilient food pipeline.

      I have extremely serious reservations about these points. I think it is fair to say that an economics paper which projected benefits as far into the future as the authors do here without an exploration of discount rates would be automatically rejected by most editors, and it is not clear why the standard should be so different for existential risk analysis. A cost of $86m to mitigate approximately 40% of the impact of a full-scale nuclear war between the US and a peer country seems prima facie absurd, and the level of exploration of such an important parameter is simply not in line with best practice in a cost-effectiveness analysis (especially since this is the parameter on which we might expect the authors to be least expert). I wouldn’t want my reservations about these two points to detract from the very good and careful scholarship elsewhere in the paper, but neither do I want to give the impression that these are just minor technical details – these issues could potentially reverse the authors’ conclusions, and should have been substantially defended in the text.
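      To make the scale of the discounting concern concrete, the following Python sketch shows how the present value of a fixed future benefit shrinks with the time horizon. The rates and horizons are illustrative assumptions, not values used or proposed in the paper.

```python
def discount_factor(rate, years):
    """Standard exponential discount factor 1 / (1 + r)^t."""
    return 1.0 / (1.0 + rate) ** years


def present_value(benefit, rate, years):
    """Present value of a single benefit received `years` from now."""
    return benefit * discount_factor(rate, years)


# Illustrative annual discount rates and time horizons (assumptions).
for rate in (0.0, 0.015, 0.035):
    for years in (10, 100, 1000):
        pv = present_value(100.0, rate, years)
        print(f"rate={rate:.3f}  years={years:>4}  PV of 100 units = {pv:.6g}")
```

      At a zero rate every horizon is valued equally, whereas even modest positive rates make benefits beyond a century or so nearly negligible, which is why the choice seems likely to drive the conclusions.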

      Conclusions

      Overall, this is a novel and insightful paper which is unfortunately burdened with some fairly serious conceptual issues. The authors should be commended for their clear-sighted contextualisation of resilient foods as an issue for discussion in existential risk, and for the scope of their ambition in modelling. Academia would be in a significantly better place if more authors tried to answer far-reaching questions with robust approaches, rather than making incremental contributions to unimportant topics.

      Where the issues of the paper lie are structural weaknesses with the cost-effectiveness philosophy deployed, methodological weaknesses with the survey instrument and two potentially conclusion-reversing issues with parameterisation which should have been given substantially more discussion in the text. I am not convinced that the elements of the paper which are robust are sufficiently robust to overcome these weaknesses – my view is that it would be premature to reallocate funding from AI Risk reduction to resilient food on the basis of this paper alone. The most serious conceptual issue which I think needs to be resolved before this can happen is to demonstrate that ‘do nothing’ would be less cost-effective than investing $86m in resilient foods, given that the ‘do nothing’ approach would potentially include strong market dynamics leaning towards resilient foods. I agree with the authors that an agent-based model might be appropriate for this, although a conventional supply-and-demand model might be simpler.

      I really hope the authors are interested in publishing follow-on work, looking at elements which I have highlighted in this review as being potentially misaligned to the paper that was actually published but which are nevertheless potentially important contributions to knowledge. In particular, the AI subunit is novel and important enough for its own publication.


      Evaluator details

      1. Name: Alex Bates
      2. How long have you been in this field? In the field of cost-effectiveness analysis, 10 years. I wouldn’t consider myself to be in the field of x-risk
      3. How many proposals and papers have you evaluated? I’ve lost count, but probably mid double figures - perhaps 50?
    1. So as a schematic one might think of different Christmases overlaying each other: A “Franco-German-European Christmas”, freighted with cultural and historical weight. Let us call that the “politico-sentimental Christmas”. And there is an “Anglo-American Christmas” fashioned by 19th-century bourgeois culture and 20th-century mass commercialism and mass production - “organized Fordist Christmas”, which now extends through its supply chains around the world. And, by the late 20th century, we have “global Christmas”. The majority of people celebrating Christmas today may not even be in the North Atlantic world, its original cradle. Christmas is now a global commercial event.

      Hm. Sure a political-religious Christmas was in there somewhere? Still useful

    1. Author Response

      eLife assessment:

      This study addresses whether the composition of the microbiota influences the intestinal colonization of encapsulated vs unencapsulated Bacteroides thetaiotaomicron, a resident micro-organism of the colon. This is an important question because factors determining the colonization of gut bacteria remain a critical barrier in translating microbiome research into new bacterial cell-based therapies. To answer the question, the authors develop an innovative method to quantify B. theta population bottlenecks during intestinal colonization in the setting of different microbiota. Their main finding that the colonization defect of an acapsular mutant is dependent on the composition of the microbiota is valuable and this observation suggests that interactions between gut bacteria explains why the mutant has a colonization defect. The evidence supporting this claim is currently insufficient. Additionally, some of the analyses and claims are compromised because the authors do not fully explain their data and the number of animals is sometimes very small.

      Thank you for this frank evaluation. Based on the Reviewers’ comments, the points raised have been addressed by improving the writing (apologies for insufficient clarity), and by the addition of data that to a large extent already existed or could be rapidly generated. In particular, the following data have been added:

      1. Increase to n>=7 for all fecal time-course experiments

      2. Microbiota composition analysis for all mouse lines used

      3. Data elucidating mechanisms of SPF microbiome/ host immune mechanisms restriction of acapsular B. theta

      4. Short- versus long-term recolonization of germ-free mice with a complete SPF microbiota and assessment of the effect on B. theta colonization probability.

      5. Challenge of B. theta monocolonized mice with avirulent Salmonella to disentangle effects of the host inflammatory response from other potential explanations of the observations.

      6. Details of all inocula used

      7. Resequencing of all barcoded strains

      Additionally, we have improved the clarity of the text, particularly the methods section describing mathematical modeling in the main text. Major changes in the text, particularly those replying to the reviewers’ comments, have been highlighted here and in the manuscript.

      Reviewer #1 (Public Review):

      The study addresses an important question - how the composition of the microbiota influences the intestinal colonization of encapsulated vs unencapsulated B. theta, an important commensal organism. To answer the question, the authors develop a refurbished WITS with extended mathematical modeling to quantify B. theta population bottlenecks during intestinal colonization in the setting of different microbiota. Interestingly, they show that the colonization defect of an acapsular mutant is dependent on the composition of the microbiota, suggesting (but not proving) that interactions between gut bacteria, rather than with host immune mechanisms, explains why the mutant has a colonization defect. However, it is fairly difficult to evaluate some of the claims because experimental details are not easy to find and the number of animals is very small. Furthermore, some of the analyses and claims are compromised because the authors do not fully explain their data; for example, leaving out the zero values in Fig. 3 and not integrating the effect of bottlenecks into the resulting model, undermines the claim that the acapsular mutant has a longer in vivo lag phase.

      We thank the reviewer for taking time to give this detailed critique of our work, and apologize that the experimental details were insufficiently explained. This criticism is well taken. Exact inoculum details for each experiment are now present in each figure (or as a supplement when multiple inocula are included). Exact microbiome composition analysis for OligoMM12, LCM and SPF microbiota is now included in Figure 2 – Figure supplement 1.

      Of course, the models could be expanded to include more factors, but I think this comment is rather based on the data being insufficiently clearly explained by us. There are no “zero values missing” from Fig. 3 – this is visible in the submitted raw data table (Excel file Source Data 1), but the points are fully overlapped in the graph shown and therefore not easily discernable from one another. Time-points where no CFU were recovered were plotted at the detection limit (50 CFU/g) and are included in the curve-fitting. However, on re-examination we noticed that the curve fit was carried out on the raw data and not the log-normalized data, which resulted in over-weighting of the higher values. Re-fitting this data does not change the conclusions but provides a better fit. These experiments have now been repeated such that we now have >=7 animals in each group. This new data is presented in Fig. 3C and D and Fig. 3 Supplement 2.

      Limitations:

      1) The experiments do not allow clear separation of effects derived from the microbiota composition and those that occur secondary to host development without a microbiota or with a different microbiota. Furthermore, the measured bottlenecks are very similar in LCM and Oligo mice, even though these microbiotas differ in complexity. Oligo-MM12 was originally developed and described to confer resistance to Salmonella colonization, suggesting that it should tighten the bottleneck. Overall, an add-back experiment demonstrating that conventionalizing germ-free mice imparts a similar bottleneck to SPF would strengthen the conclusions.

      These are excellent suggestions and have been followed. Additional data is now presented in Figure 2 – figure supplement 8 showing short- versus long-term recolonization of germ-free mice with an SPF microbiota and recovering very similar values of beta to our standard SPF mouse colony. These data demonstrate a larger total niche size for B. theta at 2 days post-colonization which normalizes by 2 weeks post-colonization. Independent of this, the colonization probability is already equivalent to that observed in our SPF colony at day 2 post-colonization. Therefore, the mechanisms causing early clonal loss are very rapidly established on colonization of a germ-free mouse with an SPF microbiota. We have additionally demonstrated that SPF mice do not have detectable intestinal antibody titers specific for acapsular B. theta (Figure 2 – figure supplement 7), such that this is unlikely to be part of the reason why acapsular B. theta struggles to colonize at all in the context of an SPF microbiota. Experiments were also carried out to detect bacteriophage capable of inducing lysis of B. theta and acapsular B. theta from SPF mouse cecal content (Figure 2 – figure supplement 7). No lytic phage plaques were observed. However, plaque assays are not sensitive for detection of weakly lytic phage, or phage that may require expression of surface structures that are not induced in vitro. We can therefore conclude that the restrictive activity of the SPF microbiota is a) reconstituted very fast in germ-free mice, b) is very likely not related to the activity of intestinal IgA and c) cannot be attributed to a high abundance of strongly lytic bacteriophage. The simplest explanation is that a large fraction of the restriction is due to metabolic competition with a complex microbiota, but we cannot formally exclude other factors such as antimicrobial peptides or changes in intestinal physiology.

      2) It is often difficult to evaluate results because important parameters are not always given. Dose is a critical variable in bottleneck experiments, but it is not clear if total dose changes in Figure 2 or just the WITS dose? Total dose as well as n0 should be depicted in all figures.

      We apologize for the lack of clarity in the figures. We have added panels depicting the exact inoculum for each figure legend (or a supplementary figure where many inocula were used). Additionally, the methods section describing how barcoded CFU were calculated has been rewritten and is hopefully now clearer.

      3) This is in part a methods paper but the method is not described clearly in the results, with important bits only found in a very difficult supplement. Is there a difference between colonization probability (beta) and inoculum size at which tags start to disappear? Can there be some culture-based validation of "colonization probability" as explained in the mathematics? Can the authors contrast the advantages/disadvantages of this system with other methods (e.g. sequencing-based approaches)? It seems like the numerator in the colonization probability equation has a very limited range (from 0.18-1.8), potentially limiting the sensitivity of this approach.

      We apologize for the lack of clarity in the methods. This criticism is well taken, and we have re-written large sections of the methods in the main text to include all relevant detail currently buried in the extensive supplement.

      On the question of the colonization probability and the inoculum size, we kept the inoculum size at 10^7 CFU/mouse in all experiments (except those in Fig. 4, where this is explicitly stated), only changing the fraction of spiked barcoded strains. We verified the accuracy of our barcode recovery rate by serial dilution over 5 logs (new figure added: Figure 1 – figure supplement 1). “The CFU of barcoded strains in the inoculum at which tags start to disappear” is by definition closely related to the colonization probability, as this value (n0) appears in the calculation. Note that this is not the total inoculum size – this is (unless otherwise stated in Fig. 4) kept constant at 10^7 CFU by diluting the barcoded B. theta with untagged B. theta. Again, this is now better explained in all figure legends and the main text.

      We have added an experiment using peak-to-trough ratios in metagenomic sequencing to estimate the B. theta growth rate. This could be usefully employed for wildtype B. theta at a relatively early timepoint post-colonization where growth was rapid. However, this is a metagenomics-based technique that requires the examined strain to be present at an abundance of over 0.1-1% for accurate quantification such that we could not analyze the acapsular B. theta strain in cecum content at the same timepoint. These data have been added (Figure 3 – figure supplement 3). Note that the information gleaned from these techniques is different. PTR reveals relative growth rates at a specific time (if your strain is abundant enough), whereas neutral tagging reveals average population values over quite large time-windows. We believe that both approaches are valuable. A few sentences comparing the approaches have been added to the discussion.

      The actual numerator is the fraction of lost tags, which is obtained from the total number of tags lost across the experiment, summed over all mice, over the total number of tags (number of mice times the number of tags used). Very low tag recovery (less than one per mouse) starts to stray into very noisy data, while close to zero loss is also associated with a low information-to-noise ratio. Therefore, the size of this numerator is necessarily constrained by us setting up the experiments to have close to optimal information recovery from the WITS abundance. Robustness of these analyses is provided by the high “n” of between 10 and 17 mice per group.
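      As a rough illustration of how a pooled lost-tag fraction can be computed and turned into a per-cell colonization probability, consider the Python sketch below. The tag counts are invented, and the conversion step is an assumption made here for illustration (each of the n0 tagged cells is treated as founding a lineage independently with probability beta, so a tag is lost with probability of roughly exp(-beta * n0)); it is not taken from the manuscript’s supplement.

```python
import math


def pooled_lost_fraction(lost_per_mouse, tags_per_mouse):
    """Tags lost across all mice divided by the total number of tags given."""
    total_lost = sum(lost_per_mouse)
    total_tags = len(lost_per_mouse) * tags_per_mouse
    return total_lost / total_tags


def colonization_probability(lost_fraction, n0):
    """Assumed model (not from the manuscript): P(tag lost) = exp(-beta * n0)."""
    if not 0.0 < lost_fraction < 1.0:
        # With no tags lost or every tag lost, beta cannot be estimated.
        raise ValueError("lost_fraction must lie strictly between 0 and 1")
    return -math.log(lost_fraction) / n0


# Hypothetical experiment: 10 mice, 6 barcoded tags each, about 30 CFU per tag.
lost_per_mouse = [2, 3, 1, 4, 2, 3, 2, 3, 2, 2]
fraction = pooled_lost_fraction(lost_per_mouse, tags_per_mouse=6)
beta = colonization_probability(fraction, n0=30)
print(f"fraction of tags lost = {fraction:.2f}, estimated beta = {beta:.4f}")
```

      Under this assumed model the estimate is undefined when no tags or all tags are lost, which matches the point that informative experiments need an intermediate level of tag loss.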

      4) Figure 3 and the associated model is confusing and does not support the idea that a longer lag-phase contributes to the fitness defect of acapsular B.theta in competitive colonization. Figure 3B clearly indicates that in competition acapsular B. theta experiences a restrictive bottleneck, i.e., in competition, less of the initial B. theta population is contributed by the acapsular inoculum. There is no need to appeal to lag-phase defects to explain the role of the capsule in vivo. The model in Figure 3D should depict the acapsular population with less cells after the bottleneck. In fact, the data in Figure 3E-F can be explained by the tighter bottleneck experienced by the acapsular mutant resulting in a smaller acapsular founding population. This idea can be seen in the data: the acapsular mutant shedding actually dips in the first 12-hours. This cannot be discerned in Figure 3E because mice with zero shedding were excluded from the analysis, leaving the data (and conclusion) of this experiment to be extrapolated from a single mouse.

      We of course completely agree that this would be a correct conclusion if only the competitive colonization data is taken into account. However, we are also trying to understand the mechanisms at play generating this bottleneck and have investigated a range of hypotheses to explain the results, taking into account all of our data.

      Hypothesis 1) Competition is due to increased killing prior to reaching the cecum and commencing growth: Note that the probability of colonization for single B. theta clones is very similar for OligoMM12 mouse single-colonization by the wildtype and acapsular strains. For this hypothesis to be the reason for outcompetition of the acapsular strain, it would be necessary that the presence of wildtype would increase the killing of acapsular B. theta in the stomach or small intestine. The bacteria are at low density at this stage and stomach acid/small intestinal secretions should be similar in all animals. Therefore, this explanation seems highly unlikely.

      Hypothesis 2) Competition between wildtype and acapsular B. theta occurs at the point of niche competition before commencing growth in the cecum (similar to the proposal of the reviewer). It is possible that the wildtype strain has a competitive advantage in colonizing physical niches (for example proximity to bacteria producing colicins). On the basis of the data, we cannot exclude this hypothesis completely and it is challenging to measure directly. However, from our in vivo growth-curve data we observe a similar delay in CFU arrival in the feces for acapsular B. theta on single colonization as in competition, suggesting that the presence of wildtype (i.e., initial niche competition) is not the cause of this delay. Rather it is an intrinsic property of the acapsular strain in vivo.

      Hypothesis 3) Competition between wildtype and acapsular B. theta is mainly attributable to differences in growth kinetics in the gut lumen. To investigate growth kinetics, we carried out time-courses of fecal collection from OligoMM12 mice single-colonized with wildtype or acapsular B. theta, i.e., in a situation where we observe identical colonization probabilities for the two strains. These data, now shown in Figure 3 C and D and Figure 3 – figure supplement 2, show that also without competition, the CFU of acapsular B. theta appear later and with a lower net growth rate than the wildtype. As these single-colonizations do not show a measurable difference between the colonization probability for the two strains, it is not likely that the delayed appearance of acapsular B. theta in feces is due to increased killing (this would be clearly visible in the barcode loss for the single-colonizations). Rather the simplest explanation for this observation is a bona fide lag phase before growth commences in the cecum. Interestingly, using only the lower net growth rate (assumed to be a similar growth rate but increased clearance rate) produces a good fit for our data on both competitive index and colonization probability in competition (Figure 3, figure supplement 5). This is slightly improved by adding in the observed lag-phase (Figure 3). It is very difficult to experimentally manipulate the lag phase in order to directly test how much of an effect this has on our hypothesis and the contribution is therefore carefully described in the new text.

      Please note that all data was plotted and used in fitting in Fig 3E, but “zero-shedding” is plotted at a detection limit and overlaid, making it look like only one point was present when in fact several were used. This was clear in the submitted raw data tables. To shore up these observations we have repeated all time-courses and now have n>=7 mice per group.

      5) The conclusions from Figure 4 rely on assumptions not well-supported by the data. In the high fat diet experiment, a lower dose of WITS is required to conclude that the diet has no effect. Furthermore, the authors conclude that Salmonella restricts the B. theta population by causing inflammation, but do not demonstrate inflammation at their timepoint or disprove that the Salmonella population could cause the same effect in the absence of inflammation (through non-inflammatory direct or indirect interactions).

      We of course agree that we would expect to see some loss of B. theta in HFD. However, for these experiments the inoculum was ~10^9 CFU per 100 μL dose of untagged strain spiked with approximately 30 CFU of each tagged strain. Decreasing the number of each WITS below 30 CFU leads to very high variation in the starting inocula from mouse to mouse, which massively complicates the analysis. To clarify this point, we have added in a detection-limit calculation showing that the neutral tagging technique is not very sensitive to population contractions of less than 10-fold, which is likely in line with what would be expected for a high-fat diet feeding in monocolonized mice for a short time-span.

      This is a very good observation regarding our Salmonella infection data. We have now added the fecal lipocalin 2 values, as well as a group infected with a ssaV/invG double mutant of S. Typhimurium that does not cause clinical grade inflammation (“avirulent”). This shows 1) that the attenuated S. Typhimurium is causing intestinal inflammation in B. theta colonized mice and 2) that a major fraction of the population bottleneck can be attributed to inflammation. Interestingly, we do observe a slight bottleneck in the group infected with avirulent Salmonella which could be attributable either to direct toxicity/competition of Salmonella with B. theta or to mildly increased intestinal inflammation caused by this strain. As we cannot distinguish these effects, this is carefully discussed in the manuscript.

      6) Several of the experiments rely on very few mice/groups.

      We have increased the n to over 5 per group in all experiments (most critically those shown in Fig 3, Supplement 5). See figure legends for specific number of mice per experiment.

      Reviewer #2 (Public Review):

      The goal of this study was to understand population bottlenecks during colonization in the context of different microbial communities. Capsular polysaccharide mutants, diet, and enteric infection were also used, paired with short-term monitoring of overall colonization and the levels of specific strains. The major strength of this study is the innovative approach and the significance of the overall research area.

      The first major limitation is the lack of clear and novel insight into the biology of B. theta or other gut bacterial species. The title is provocative, but the experiments as is do not definitively show that the microbiota controls the relative fitness of acapsular and wild-type strains or provide any mechanistic insights into why that would be the case. The data on diet and infection seem preliminary. Furthermore, many of the experiments conflict with prior literature (i.e., lack of fitness difference between acapsular and wild-type strain and lack of impact of diet) but satisfying explanations are not provided for the lack of reproducibility.

      In line with suggestions from Reviewer 1, the paper has undergone quite extensive re-writing to better explain the data presented and its consequences. Additionally, we now explicitly comment on apparent discrepancies between our reported data and the literature – for example the colonization defect of acapsular B. theta is only published for competitive colonizations, where we also observe a fitness defect so there is no actual conflict. Additionally, we have calculated detection limits for the effect of high-fat diet and demonstrate that a 10-fold reduction in the effective population size would not be robustly detected with the neutral tagging technique such that we are probably just underpowered to detect small effects, and we believe it is important to point out the numerical limits of the technique we present here. Additionally for the Figure 4 experiments, we have added data on colonization/competition with an avirulent Salmonella challenge giving some mechanistic data on the role of inflammation in the B. theta bottleneck.

      Another major limitation is the lack of data on the various background gut microbiotas used. eLife is a journal for a broad readership. As such, describing what microbes are in LCM, OligoMM, or SPF groups is important. The authors seem to assume that the gut microbiota will reflect prior studies without measuring it themselves.

      All gnotobiotic lines are bred as gnotobiotic colonies in our isolator facility. This is now better explained in the methods section. Additionally, 16S sequencing of all microbiotas used in the paper has been added as Figure 2 – figure supplement 1.

      I also did not follow the logic of concluding that any differences between SPF and the two other groups are due to microbial diversity, which is presumably just one of many differences. For example, the authors acknowledge that host immunity may be distinct. It is essential to profile the gut microbiota by 16S rRNA amplicon sequencing in all these experiments and to design experiments that more explicitly test the diversity hypotheses vs. alternatives like differences in the membership of each community or other host phenotypes.

      This is an important point. We have carried out a number of experiments to potentially address some issues here.

      1) We carried out B. theta colonization experiments in germ-free mice that had been colonized by gavage of SPF feces either 1 day prior to colonization or 2 weeks prior to colonization. While the shorter pre-colonization allowed B. theta to colonize to a higher population density in the cecum, the colonization probability was already reduced to levels observed in our SPF colony in the short pre-colonization. Therefore, the factors limiting B. theta establishment in the cecum are already established 1-2 days post-colonization with an SPF microbiota (Figure 2 - figure supplement 8). 2) We checked for the presence of secretory IgA capable of binding to the surface of live B. theta, compared to a positive control of a mouse orally vaccinated against B. theta (Fig. 2, Supplement 7), and could find no evidence of specific IgA targeting B. theta in the intestinal lavages of our SPF mouse colony. 3) We isolated bacteriophage from the intestine of SPF mice and used this to infect lawns of B. theta wildtype and acapsular in vitro. We could not detect any plaque-forming phage coming from the intestine of SPF mice (Figure 2 – figure supplement 7).

      We can therefore exclude strongly lytic phage and host IgA as dominant driving mechanisms restricting B. theta colonization. It remains possible that rapidly upregulated host factors such as antimicrobial peptide secretion could play a role, but metabolic competition from the microbiota is also a very strong candidate hypothesis. The text regarding these experiments has been slightly rewritten to point out that colonization probability inversely correlates with microbiota complexity, and the mechanisms involved may involve both direct microbe-microbe interactions as well as host factors.

      Given the prior work on the importance of capsule for phage, I was surprised that no efforts are taken to monitor phage levels in these experiments. Could B. theta phage be present in SPF mice, explaining the results? Alternatively, is the mucus layer distinct? Both could be readily monitored using established molecular/imaging methods.

      See above: no plaque-forming phage could be recovered from the SPF mouse cecum content. The main replicative site that we have studied here, in mice, is the cecum which does not have true mucus layers in the same way as the distal colon and is upstream of the colon so is unlikely to be affected by colon geography. Rather mucus is well mixed with the cecum content and may behave as a dispersed nutrient source. There is for sure a higher availability of mucus in the gnotobiotic mice due to less competition for mucus degradation by other strains. However, this would be challenging to directly link to the B. theta colonization phenotype as Muc2-deficient mice develop intestinal inflammation.

      The conclusion that the acapsular strain loses out due to a difference of lag phase seems highly speculative. More work would be needed to ensure that there is no difference in the initial bottleneck; for example, by monitoring the level of this strain in the proximal gut immediately after oral gavage.

      This is an excellent suggestion and has been carried out. At 8h post-colonization with a high inoculum (allowing easy detection) there were identical low levels of B. theta in the upper and lower small intestine, but more B. theta wildtype than B. theta acapsular in the cecum and colon, consistent with commencement of growth for B. theta wildtype but not the acapsular strain at this timepoint. We have additionally repeated the single-colonization time-courses using our standard inoculum and can clearly see the delayed detection of acapsular B. theta in feces even in the single-colonization state when no increased bottleneck is observed. This can only be reasonably explained by a bona fide lag-phase extension for acapsular B. theta in vivo. These data also reveal a decreased net growth rate of acapsular B. theta. Interestingly, our model can be quite well-fitted to the data obtained both for competitive index and for colonization probability using only the difference in net growth rate. Adding the (clearly observed) extended lag-phase generates a model that is still consistent with our observations.

      Another major limitation of this paper is the reliance on short timepoints (2-3 days post colonization). Data for B. theta levels over 2 weeks or longer is essential to put these values in context. For example, I was surprised that B. theta could invade the gut microbiota of SPF mice at all and wonder if the early time points reflect transient colonization.

      It should be noted that “SPF” defines microbiota only on missing pathogens and not on absolute composition. Therefore, the rather efficient B. theta colonization in our SPF colony is likely due to a permissive composition, and this is unlikely to be reproducible between different SPF colonies (a major confounder in the reproducibility of mouse experiments between institutions; in contrast, the gnotobiotic colonies are highly reproducible). We do consistently see colonization of our SPF colony by wildtype B. theta out to at least 10 days post-inoculation (latest time-point tested) at similar loads to the ones observed in this work, indicating that this is not just transient “flow-through” colonization. Data included below:

      For this paper we were very specifically quantifying the early stages of colonization, also because the longer we run the experiments for, the more confounding features of our “neutrality” assumptions appear (e.g., host immunity selecting for evolved/phase-varied clones, within-host evolution of individual clones etc.). For this reason, we have used timepoints of a maximum of 2-3 days.

      Finally, the number of mice/group is very low, especially given the novelty of these types of studies and uncertainty about reproducibility. Key experiments should be replicated at least once, ideally with more than n=3/group.

      For all barcode quantification experiments we have between 10 and 17 mice per group. Experiments for the in vivo time-courses of colonization have been expanded to an “n” of at least 7 per group.

    1. Author Response

      Reviewer #2 (Public Review):

      Susswein et al. analyze a fine-scale, novel data stream of human mobility, openly available from Safegraph, based on the usage of mobile apps with GPS and sampled from over 45 million smartphone devices. They define a metric $\sigma_{it}$, properly normalized, that quantifies the propensity for visits to indoor locations relative to outdoor locations in a given county $i$ at week $t$. For each pair of counties $i$ and $j$, they compute the Pearson correlation coefficient $\rho_{ij}$ between the corresponding $\sigma$ metrics. This generates a correlation matrix that can be interpreted as the adjacency matrix of a network. They then perform community detection on this network/matrix, effectively clustering together time series that are correlated. This identifies three main clusters of counties, characterized geographically as either in the north of the country, in the south of the country, and possibly in tourism active areas. They then show, via a simple model, how including over-simplified models of seasonality may affect infectious disease models.

      This work is very interesting for the infectious disease modeling community, as it addresses a complex problem introducing a new data stream.

      This work builds on several strengths, among which:

      It is the first analysis of the Safegraph dataset to capture seasonality in indoor behavior.

      It provides a simple metric to quantify indoor activity, that thanks to the dataset can be computed with a high level of spatial detail.

      It aims at characterizing clusters of counties with a similar pattern of indoor activity.

      It aims at quantifying the impact of neglecting finer-scale patterns of seasonality, for example considering seasonality to be homogeneous at the US level.

      We thank the reviewer for the positive review of our work.

      At the same time, it presents several weaknesses that should be addressed to improve the methodology, its results, and the implication:

      There is no quantitative comparison of the newly introduced metric for indoor activity with other proxies of seasonality (e.g. temperature or relative humidity). The (dis)similarity with other proxies may help in assessing the importance of this metric, showing why it can not be exchanged with other data sources (like temperature data) that are widely available and are not affected by sampling issues (more on that later).

      We have now added supplementary figures (Figure S3) to illustrate how indoor activity seasonality compares with temperature and humidity. We have also added text to the Results and the Discussion to discuss this point.

      A major flaw of the analysis is to perform community detection on a network defined by the correlation between time series with an algorithm that is based on modularity optimization. As explained in MacMahon et al. [1], all modularity optimization methods rely on null assumptions that in the case of correlation between time series are violated. Therefore, there is a very strong potential bias in their results that is not accounted for. Possible solutions could be to proceed via the methodology presented in [1] or via a different type of algorithm (e.g. Infomap [2]). In both cases, as the network is thresholded (considering only a correlation larger than 0.9), a more quantitative assessment of the impact of the threshold value should be included.

      References

      [1] Mel MacMahon and Diego Garlaschelli Phys. Rev. X 5, 021006 (2015).

      [2] Martin Rosvall and Carl T. Bergstrom PNAS 105, 1118 (2008).

      We thank the reviewer for making this excellent point. We have now added Supplementary Figures S13 and S14. In Figure S13, we demonstrate the robustness of our clustering results with different correlation thresholds. (We have also corrected a typo in our original Methods section which mistakenly stated our correlation threshold as 0.9 rather than the 90th percentile which is what we used.) In Figure S14, we show the clustering results using a different clustering algorithm. In an effort to test a non-network-based clustering approach, we use a hierarchical clustering approach and find a consistent partition of the US to our main results.
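
      The thresholding step discussed in this exchange can be sketched as follows: compute the Pearson correlation between every pair of county time series, then keep as network edges only the pairs at or above a chosen percentile of those correlations. This is a minimal illustration on made-up sinusoidal series, not the authors' pipeline; the county names, the linear-interpolation percentile convention, and the choice to take the percentile over all pairwise correlations are assumptions.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def threshold_network(series, q=90.0):
    """Return pairwise correlations, the percentile cutoff, and the retained edges."""
    rho = {(i, j): pearson(series[i], series[j]) for i, j in combinations(series, 2)}
    cutoff = percentile(list(rho.values()), q)
    edges = {pair for pair, r in rho.items() if r >= cutoff}
    return rho, cutoff, edges


# Hypothetical weekly indoor-activity series: two "winter-peaking" counties
# and two "summer-peaking" counties, each pair slightly out of phase.
weeks = range(52)
series = {
    "county_A": [math.cos(2 * math.pi * t / 52) for t in weeks],
    "county_B": [math.cos(2 * math.pi * (t - 2) / 52) for t in weeks],
    "county_C": [-math.cos(2 * math.pi * t / 52) for t in weeks],
    "county_D": [-math.cos(2 * math.pi * (t - 3) / 52) for t in weeks],
}
rho, cutoff, edges = threshold_network(series)
print(f"cutoff = {cutoff:.3f}, edges = {sorted(edges)}")
```

      Rerunning `threshold_network` with different values of `q` is one simple way to probe how sensitive the retained edges, and hence any downstream clustering, are to the threshold.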

      It is not clear what is the added value of the data on indoor activity, as no fitting to real data is performed. Although this may be considered beyond the scope of this paper, I think it would be crucial to quantify how much a data-informed model would better describe real epidemic data (for example in the case of COVID-19). For now, only the impact of neglecting heterogeneity in indoor activity is shown, comparing a model with region-average parameters vs a model with county-level average parameters. Given that the dataset comes with potential bias in sampling (more on this later) it would be good to assess its goodness in predicting real epidemic spread. When showing results from different models, no visible errors are shown on the plot. How have the errors been estimated?

      We appreciate this point by the reviewer, and agree that future work will have to consider how indoor activity seasonality affects our ability to capture observed transmission trends. However, such work would additionally need careful characterization of other seasonal factors hypothesized to drive transmission (including environmental and other behavioral factors), and is beyond the scope of our work. Instead, in Figure 4 we aim to (a) provide the infectious disease modeling community with empirically-inferred parameters for a simple sinusoidal model which is commonly used in infectious disease models to capture transmission seasonality; and (b) demonstrate the implications of ignoring geographic heterogeneity in transmission seasonality in theoretical models of disease dynamics, which are commonly used for scenario analysis and model-based intervention design. As we demonstrate, transmission seasonality described by such sinusoidal models, even when they are empirically characterized as in our case, can lead to meaningfully different epidemic dynamics when transmission seasonality varies from the assumptions.

      Additionally, there is no uncertainty included in Figure 4B because transmission seasonality is either based on empirical data point per time step, or on the fitted sinusoidal model (where the estimated parameters have negligible standard errors).
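
      To make the role of a sinusoidal seasonality term concrete, the sketch below forces the transmission rate of a very simple SIR model with a sine wave and compares two regions whose seasonal phases differ. It is an illustration only: the parameter values, the function names, and the specific form beta(t) = beta0 * (1 + amplitude * sin(2*pi*t/period + phase)) are assumptions, not the fitted model or parameters reported by the authors.

```python
import math


def seasonal_beta(t, beta0, amplitude, phase, period=52.0):
    """Transmission rate at week t under sinusoidal seasonal forcing."""
    return beta0 * (1.0 + amplitude * math.sin(2.0 * math.pi * t / period + phase))


def run_sir(beta0, amplitude, phase, gamma=0.5, i0=1e-4, weeks=104, steps_per_week=10):
    """Forward-Euler SIR with seasonal beta; returns the infected fraction over time."""
    s, i = 1.0 - i0, i0
    dt = 1.0 / steps_per_week
    infected = []
    for step in range(weeks * steps_per_week):
        t = step * dt
        beta = seasonal_beta(t, beta0, amplitude, phase)
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        infected.append(i)
    return infected


# Same mean transmission rate and amplitude, opposite seasonal phase.
in_phase = run_sir(beta0=1.0, amplitude=0.2, phase=0.0)
anti_phase = run_sir(beta0=1.0, amplitude=0.2, phase=math.pi)
print(f"peak infected fraction, phase 0:  {max(in_phase):.4f}")
print(f"peak infected fraction, phase pi: {max(anti_phase):.4f}")
```

      Because only the phase differs between the two runs, any difference in the trajectories comes from when the seasonal peak falls relative to the outbreak, which is the kind of geographic heterogeneity the response argues should not be averaged away.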

      The dataset is presented as representative of the US population. However, this has not been assessed over time. As adherence to social distancing is influenced by several socio-economic determinants the lack of representativity in certain strata of the population at a given time may introduce an important bias in the dataset. Although this is an inherent limitation of the dataset, it should be discussed in the paper more thoroughly.

      We agree with the reviewer that this is a limitation. However, we do not have any way of assessing demographic representation in the dataset over time. We have instead included an additional sentence into the Discussion section acknowledging this point.

      In conclusion, I think that the methodology should be revised to account for the fact that the analysis is performed on a correlation matrix. Capturing seasonal patterns of indoor activity can help in tackling the crucial problem of seasonality in human behavior. This could help in identifying effective strategies of disease containment able to curb disease spread at a lower societal cost than fully-fledged lockdowns.

      We thank the reviewer again for their helpful suggestions.

    1. (00:20:34) The atman, as I said, it's the witness, the agent, the enjoyer. Most importantly, it's distinct from our body and mind. It's their, uh, it's their owner, and it's a permanent, continuous thing, unlike our bodies and minds, which are changing from moment to moment. So they've got this kind of momentary impermanence, but also, as you may know, they each come to an end: we die. Um, but the idea is that the self just persists and goes on and on. Um, and most importantly, most, most importantly, when we identify the atman, we're identifying what you are, your essence or your core. And so we might think: I change a lot, my thoughts change, my political preferences change, my food preferences change, my friends change, but I remain the same as a self.

      !- explanation of : Atman -the thing that remains the same while everything else changes

    1. Author Response

      Reviewer #1 (Public Review):

      In the current study, the authors reanalyze a prior dataset testing effects of D2 antagonism on choices in a delay discounting task. While the prior report, using standard analysis, showed no effects, the current study used a DDM to examine more carefully possible effects on different subcomponents of the decision process. This approach revealed contrasting effects of D2 blockade on the effect of reward size differences and bias. Effects were uncorrelated, suggesting perhaps separate mechanisms. The authors speculate that these opposing effects explain the variability in effects across studies, since they mean that effects would depend on which of these factors is more important in a particular design. Overall the study is novel and well-executed, and the explanation offers interesting insight into neural processes.

      We thank the reviewer for judging our study as interesting and well-executed.

      Reviewer #2 (Public Review):

      The authors aim to test the hypothesis that dopamine mediates the evaluation of temporal costs in intertemporal choice in humans, with a specific goal of synthesizing the competing accounts and previous results regarding whether dopamine increases or decreases evaluation of delays in comparing differently delayed future rewards. To do this, they computationally dissect the impact of the drug amisulpride, a D2R antagonist, using a variant of a sequential sampling model, the drift-diffusion model (DDM), that is well established in decision-making literature as a cognitive process model of choice. This model allows the dissociation of starting bias from the rate at which decision evidence is integrated ('drift'), which the authors map to different accounts of the role of dopamine: the temporal proximity of an outcome is proposed to impact bias, while the cost of a delay to impact the drift rate of evidence evaluation/accumulation. Consistent with previous results, and perhaps integrating conflicting findings, the authors find that D2R blockade impacts both bias and drift rate in a cohort of 50 participants, demonstrating dopaminergic action at this receptor is implicated in dissociable components of intertemporal choice, with D2R block reducing the bias towards sooner, more temporally proximate rewards as well as enhancing the contrast between reward magnitudes irrespective of delay, effectively diminishing the effect of delay in the drug condition. These effects are consistent across a small subset of alternative models, confirming the multiple cognitive mechanisms through which D2R block impacts intertemporal choice is a robust feature of decisions on this task.
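
      The separation between starting bias and drift rate described above can be illustrated with the textbook closed-form choice probability of a Wiener diffusion between two absorbing boundaries. The sketch below is not the hierarchical DDM fitted in the study; the parameter names, the unit noise level, and the labelling of the upper boundary as the larger-later option are assumptions made for illustration.

```python
import math


def p_upper(drift, boundary, start_frac, noise=1.0):
    """Probability of absorbing at the upper boundary.

    The process starts at start_frac * boundary (0 < start_frac < 1),
    drifts at rate `drift`, and has diffusion noise `noise`.
    """
    if abs(drift) < 1e-12:
        # Without drift, the choice probability equals the relative starting point.
        return start_frac
    k = 2.0 * drift / noise**2
    return (1.0 - math.exp(-k * start_frac * boundary)) / (1.0 - math.exp(-k * boundary))


# Upper boundary = larger-later option (assumed labelling).
baseline = p_upper(drift=0.5, boundary=2.0, start_frac=0.5)
biased_start = p_upper(drift=0.5, boundary=2.0, start_frac=0.4)  # bias toward the sooner option
weaker_drift = p_upper(drift=0.2, boundary=2.0, start_frac=0.5)  # less evidence for the later option
print(f"baseline:      {baseline:.3f}")
print(f"biased start:  {biased_start:.3f}")
print(f"weaker drift:  {weaker_drift:.3f}")
```

      Both manipulations lower the probability of choosing the upper option, which is why choice proportions alone cannot tell them apart; it is the response-time distributions that let a DDM attribute an effect to bias versus drift.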

      Overall, this study is a detailed dissection of the specific effects of amisulpride on a type of future-oriented, hypothetical intertemporal choice, and provides consistent evidence integrating conflicting accounts that implicate dopaminergic signaling on evaluation of the cognitive costs, such as a delay, on choice. However the specificity of the empirical intervention and the task design limits the interpretation of the broader dopaminergic mechanisms at play in intertemporal choice, especially given the complexity of receptor specificity of this drug, dopamine precursor availability and individual differences and the specifics of the intertemporal choice in this task. As it stands, the results contribute an interesting, synthesized account of how D2R manipulation can impact evaluation of delays in multiple ways, that will likely be useful for motivating future studies and more detailed computational assessments of the cognitive process-level components of intertemporal choice more generally.

      We thank the reviewer for the positive overall evaluation of our study. We revised the manuscript according to the reviewer’s comments, addressing also the receptor specificity of amisulpride and the specifics of the administered intertemporal choice task, which further improved the quality of the manuscript.

      The focus of this study is important, and delineating the role of DA in intertemporal choice is of high relevance given DA dysfunction is prevalent in many psychiatric disorders and a key target of pharmacological treatment. While the hypotheses of the current study are framed with respect to "costs", the task used by the authors reduces these to evaluation of a hypothetical delay, one which the participants do not necessarily experience in the context of the task. In some respects this is reasonable, given the prevalence of this task paradigm in testing temporal aspects of choice in humans in an economic sense. However, humans are also notoriously subject to framing effects and the impact of instructions in cognitive tasks like these, which can limit the generality of the conclusions, and in particular the specific ways in which a delay can be interpreted as costly (e.g., cost as loss of potential earnings, cost as effortful waiting, cost as computational/simulation cost in future evaluation). Given the hypothesis recruits the idea of cost in assessing the role of dopamine, testing for generality in the effects of amisulpride in related but differently framed tasks seems critical for making this link in a general sense, and in connecting it to the previous studies in the literature the authors point to as demonstrating conflicting effects.

      We agree that it is important to discuss whether our findings for delay costs can be generalized to other cost types as well, such as risk, social costs, effort, or opportunity costs. Based on a recent literature review (Soutschek, Jetter, & Tobler, 2022), we speculate that dopamine may moderate proximity effects also for risk and social costs but not for effortful rewards, though we emphasize that these hypotheses still require more direct empirical evidence. We also discuss the issue that delays can be perceived as costly in different ways. While in some tasks participants actually experience the waiting time until reward delivery, such that delayed rewards are associated with opportunity costs, in our current task paradigm delayed rewards were virtually free of opportunity costs as participants could engage in other reward-related behaviors during the waiting time. Previous studies suggest that lower tonic dopamine levels reduce the sensitivity to opportunity costs (Niv et al., 2007), which seems in line with our finding that amisulpride decreases the influence of delays on the starting bias parameter. Nevertheless, we emphasize that further evidence is needed to decide whether dopamine shows similar effects for experienced and non-experienced waiting costs. In the revised manuscript, we discuss the cost specificity of our findings on p.22:

      “An important question refers to whether our findings for delay costs can be generalized to other types of costs as well, including risk, social costs (i.e., inequity), effort, and opportunity costs. In a recent review, we proposed that dopamine might also moderate proximity effects for reward options differing in risk and social costs, whereas the existing literature provides no evidence for a proximity advantage for effort-free over effortful rewards (Soutschek et al., 2022). However, these hypotheses need to be tested more explicitly by future investigations. Dopamine has also been ascribed a role for moderating opportunity costs, with lower tonic dopamine reducing the sensitivity to opportunity costs (Niv et al., 2007). While this appears consistent with our finding that amisulpride (under the assumption of postsynaptic effects) reduced the impact of delay on the starting bias, it is important to note that choosing delayed rewards did not involve any opportunity costs in our paradigm, given that participants could pursue other rewards during the waiting time. Thus, it needs to be clarified whether our findings for delayed rewards without experienced waiting time can be generalized to choice situations involving experienced opportunity costs.”

      Further, while the study aims to test the actions of dopamine broadly, the empirical manipulation is limited to the action of amisulpride, a D2R antagonist. There is little to no discussion of, or control for, the relationship between dopaminergic action at D2 receptors (the site of amisulpride effects) and wider mechanisms of dopaminergic action at other sites, e.g., D1-like receptors, and the interplay between activation at these two receptor types alongside baseline levels of dopamine concentration. This is necessary for a comprehensive account of dopamine effects on intertemporal choice as the authors aim to test, as opposed to a specific test of the role of the D2 receptor, which is what the study achieves. On a related note, in some preparations at least, amisulpride also acts at some of the 5-HT receptors, raising the possibility of a non-dopaminergic mechanism by which this drug might impact intertemporal decisions. This possibility, while it would not be expected to act without dopaminergic effects as well, is consistent with established effects of serotonin on waiting behaviors and patience. Granted, the limits of pharmacology in humans mean this cannot necessarily be controlled for, but it should be kept in mind with a systemic manipulation such as this.

      We agree with the reviewer that it is important to distinguish between the contributions of D1 and D2 receptors to decision making, given that these receptor families are hypothesized to have dissociable functional roles. We therefore re-analyzed also data on the impact of a D1 agonist on intertemporal decision making (previous findings for this data set were published in Soutschek et al., 2020, Biological Psychiatry). This analysis provided no evidence for significant effects of D1R stimulation on parameters from a drift diffusion model. This suggests that D2R, rather than D1R, activation mediates the impact of proximity on intertemporal choices.

      In the revised manuscript, we report the findings for the D1 agonist study on p.16:

      “To assess the receptor specificity of our findings, we conducted the same analyses on the data from a study (published previously in Soutschek et al. (2020)) testing the impact of three doses of a D1 agonist (6 mg, 15 mg, 30 mg) relative to placebo on intertemporal choices (between-subject design). In the intertemporal choice task used in this experiment, the SS reward was always immediately available (delay = 0), contrary to the task in the D2 experiment where the delay of the SS reward varied from 0-30 days. Again, the data in the D1 experiment were best explained by DDM-1 (DIC$_{\text{DDM-1}}$ = 19,657) compared with all other DDMs (DIC$_{\text{DDM-2}}$ = 20,934; DIC$_{\text{DDM-3}}$ = 21,710; DIC$_{\text{DDM-5}}$ = 21,982; DIC$_{\text{DDM-6}}$ = 19,660; note that DDM-4 was identical with DDM-1 for the D1 agonist study because the delay of the SS reward was 0). Neither the best-fitting nor any other model yielded significant drug effects on any drift diffusion parameter (see Table 4 for the best-fitting model). Also model-free analyses conducted in the same way as for the D2 antagonist study revealed no significant drug effects (all HDI$_{95\%}$ included zero). There was thus no evidence for any influence of D1R stimulation on intertemporal decisions.”
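
      The model comparison quoted above amounts to selecting the model with the lowest DIC and inspecting how far the others fall behind. The short sketch below does this with the DIC values reported in the passage (DDM-4 is omitted because it is stated to be identical to DDM-1 in that study); the variable names are ours.

```python
# DIC values for the D1 agonist study, as quoted in the passage above.
dic = {
    "DDM-1": 19657,
    "DDM-2": 20934,
    "DDM-3": 21710,
    "DDM-5": 21982,
    "DDM-6": 19660,
}

# Lower DIC indicates a better trade-off between fit and complexity.
best = min(dic, key=dic.get)
deltas = {name: value - dic[best] for name, value in dic.items()}

print(f"best model: {best}")
for name, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{name}: delta DIC = {delta}")
```

      Laid out this way, it is easy to see that DDM-6 trails DDM-1 by only 3 DIC units, whereas the remaining models are more than a thousand units behind.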

      We discuss the specificity of D2 receptors for moderating the proximity bias on p.17: “This finding represents first evidence for the hypothesis that tonic dopamine moderates the impact of proximity (e.g., more concrete versus more abstract rewards) on cost-benefit decision making (Soutschek et al., 2022; Westbrook & Frank, 2018). Pharmacological manipulation of D1R activation, in contrast, showed no significant effects on the decision process. This provides evidence for the receptor specificity of dopamine’s role in intertemporal decision making (though as caveat it is worth keeping the differences between the tasks administered in the D1 and the D2 studies in mind).”

      We also agree that amisulpride acts also on 5-HT7 receptors, such that it remains unclear whether also such effects contribute to the observed result pattern. We discuss this limitation in the revised manuscript on p.21:

      “Lastly, while the actions of amisulpride on D2/D3 receptors are relatively selective, it also affects serotonergic 5-HT7 receptors (Abbas et al., 2009). Because serotonin was related to impulsive behavior (Mori, Tsutsui-Kimura, Mimura, & Tanaka, 2018), it is worth keeping in mind that amisulpride effects on serotonergic, in addition to dopaminergic, activity might contribute to the observed result pattern.”

      Overall the modeling methods are robust and appropriate for the specific test of decision impacts of D2R blockade, and include several prima facie viable alternative models for comparison. Some caution is warranted, since there are not many trials per subject, and some trials are discarded as well as outliers, which raises the question of power. Given the models are fit hierarchically, which gives both group-level and individual-level parameter estimates, the elements are there to probe more deeply into individual differences, and to test how reliably this approach can dissociate the dual effects of bias and drift rate at the individual level, and perhaps correlate it with other informative subject measures of either dopamine activity/capacity or other dopamine-dependent behaviors. Alternative DDMs might also capture some of this individual variation, with meaningful differences potentially in model comparison at the individual level. It should be noted that the scope of these models does not exhaust the ways in which proximity (here, temporal) of rewards and contrast between choice options might be incorporated into a cognitive process model account of choice; all alternatives here rest on the same implicit 2-alternative forced choice assumption of the DDM, and the assumptions of this model are not here tested against other accounts of choice, for example the linear ballistic accumulator (LBA) and its derivatives. Further, the concept of proximity as a global feature of a trial (on average, how soon are these options overall?) is never tested on my read of the alternative models.

      We thank the reviewer for these interesting suggestions. First, to explore whether measures of dopaminergic activity correlate with individual differences in drug effects on DDM parameters, we now report correlations between DDM parameters and performance in the digit span backward task as a proxy for dopamine synthesis capacity (Cools et al., 2008). None of these correlation analyses showed significant results. In the revised manuscript, we report these analyses on p.13:

      “However, we observed no evidence that individual random coefficients for the drug effects on the drift rate or on the starting bias correlated with body weight, all r < 0.22, all p > 0.10. There were also no significant correlations between DDM parameters and performance in the digit span backward task as proxy for baseline dopamine synthesis capacity (Cools, Gibbs, Miyakawa, Jagust, & D'Esposito, 2008), all r < 0.17, all p > 0.22. There was thus no evidence that pharmacological effects on intertemporal choices depended on body weight as proxy of effective dose or working memory performance as proxy for baseline dopaminergic activity.”

      Regarding model comparisons on the individual level, we note that the hierarchical Bayesian modelling approach allows (to the best of our knowledge) computing indices of model fit like DIC only on the group, not the individual level (while accounting for individual differences). However, we agree with the reviewer that theoretically different models might work best in different individuals (depending, for example, on the individual sensitivity to proximity). While such fine-grained model comparisons on the individual level are beyond the scope of the current study (and might not yield robust results given the limited number of trials for each participant), we now discuss this limitation in the revised manuscript (p.17-18):

      “We note that the hierarchical modelling approach allowed us to compare models on the group level only, such that in some individuals behavior might better be explained by a different model than DDM-1. Such model comparisons on the individual level, however, were beyond the scope of the current study and might not yield robust results given the limited number of trials per individual.”

      Likewise, linear ballistic accumulator (LBA) models represent a further class of process models with different assumptions about the mechanisms underlying the choice process than DDMs. In LBAs, evidence is accumulated separately for each choice alternative, whereas DDMs assume only one accumulation process which integrates attributes from two choice options, limiting the use of DDMs to two-alternative forced-choice scenarios. Nevertheless, proximity effects might also be incorporated in LBA models by modulating the starting point of the option-specific accumulators as a function of proximity. To the best of our knowledge, there is no built-in function in JAGS that allows estimating LBA models in a hierarchical Bayesian fashion (in contrast to, e.g., STAN), such that in the context of the current study it is difficult to directly compare our DDM-based approach with LBA models. It is important to emphasize, however, that similar to other studies we do not make any claims about whether the choice process per se is best explained by DDMs or LBA models; instead, we focus on how rewards and delay costs affect different components of the decision process within a class of decision models. Nevertheless, we discuss such alternative modelling approaches in the revised manuscript on p.18:

      “We also emphasize that alternative process models like the linear ballistic accumulator (LBA) model make different assumptions than DDMs, for example by positing the existence of separate option-specific accumulators rather than only one as assumed by DDMs. However, proximity effects as investigated in the current study might be incorporated in LBA models as well by varying the starting points of the accumulators as function of proximity.”
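
      The idea of option-specific accumulators whose starting points depend on proximity can be sketched in a few lines. This is a minimal illustration and not the authors' model: the function names, the `proximity_gain` parameter, the units of `delay_diff` and all numerical values are hypothetical, and the head start given to the sooner option is only one of several ways proximity could enter an LBA.

```python
import random

def lba_trial(rng, drift_ss, drift_ll, delay_diff,
              threshold=1.0, start_max=0.5, proximity_gain=0.02, drift_sd=0.3):
    """Simulate one two-accumulator LBA trial and return the winning option.

    Hypothetical assumption: a larger delay difference raises the start
    point of the sooner (SS) accumulator, giving it a head start.
    """
    # Option-specific start points: uniform noise, plus a proximity head start for SS.
    start_ss = rng.uniform(0.0, start_max) + proximity_gain * delay_diff
    start_ss = min(start_ss, 0.99 * threshold)  # keep the start below threshold
    start_ll = rng.uniform(0.0, start_max)

    finish_times = {}
    for name, start, mean_drift in (("SS", start_ss, drift_ss),
                                    ("LL", start_ll, drift_ll)):
        rate = rng.gauss(mean_drift, drift_sd)  # trial-wise drift rate
        # Linear, noise-free rise to threshold; a non-positive rate never finishes.
        finish_times[name] = (threshold - start) / rate if rate > 0 else float("inf")
    # The accumulator that reaches threshold first determines the choice.
    return min(finish_times, key=finish_times.get)

def ss_choice_rate(delay_diff, n_trials=5000, seed=1):
    """Proportion of SS choices when both options have the same mean drift."""
    rng = random.Random(seed)
    wins = sum(lba_trial(rng, 1.0, 1.0, delay_diff) == "SS" for _ in range(n_trials))
    return wins / n_trials

rate_no_proximity = ss_choice_rate(delay_diff=0.0)
rate_with_proximity = ss_choice_rate(delay_diff=20.0)
```

      With equal mean drifts, the SS choice rate in this sketch departs from chance only through the start-point head start, which is the LBA analogue of a starting bias in the DDM.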

      Lastly, we thank the reviewer for the interesting suggestion to assess whether the starting bias parameter is affected by the overall proximity of offers (sum of delays) instead of by the difference in proximity between the options. We ran a further DDM to test this hypothesis, but this model explained the data worse (DIC = 9,492) than the original DDM (DIC = 9,478). Nevertheless, the overall proximity DDM also yielded a significant amisulpride effect on the impact of reward magnitude on the drift rate, HDImean = 0.83, HDI95% = [0.04; 1.75], underlining the robustness of this effect. In the revised manuscript, we report this analysis on p.12:

      “In a further model (DDM-4), we explored whether the starting bias is affected by the overall proximity of the options (sum of delays, Delaysum) rather than the difference in proximity (Delaydiff; see Table 3 for an overview over the parameters included in the various models). Importantly, our original DDM-1 (DIC = 9,478) explained the data better than DDM-2 (DIC = 9,481), DDM-3 (DIC = 10,224), or DDM-4 (DIC = 9,492). Nevertheless, amisulpride moderated the impact of Magnitudediff on the drift rate also in DDM-2, HDImean = 0.86, HDI95% = [0.18; 1.64], and DDM-4, HDImean = 0.83, HDI95% = [0.04; 1.75], and amisulpride also lowered the impact of Delaydiff on the starting bias in DDM-3, HDImean = -0.02, HDI95% = [-0.04; -0.001]. Thus, the dopaminergic effects on these subcomponents of the choice process are robust to the exact specification of the DDM.”
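
      The model comparison reported above can be tabulated directly from the stated DIC values (lower DIC indicates the preferred model). The snippet only re-expresses the four numbers quoted in the response as differences from the best model; the variable names are illustrative.

```python
# DIC values as reported in the response (lower is better).
dic = {"DDM-1": 9478, "DDM-2": 9481, "DDM-3": 10224, "DDM-4": 9492}

best_model = min(dic, key=dic.get)

# Difference of each model's DIC from the best model, sorted from best to worst.
delta_dic = {
    model: value - dic[best_model]
    for model, value in sorted(dic.items(), key=lambda item: item[1])
}

for model, delta in delta_dic.items():
    print(f"{model}: DIC = {dic[model]}, delta = {delta}")
```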

      Reviewer #3 (Public Review):

      Soutschek and Tobler provide an intriguing re-analysis of inter-temporal choice data on amisulpride versus placebo which provides evidence for an as-yet untested hypothesis that dopamine interacts with proximity to bias choices.

      The modeling methods are sound with a robust and reasonably exhaustive set of models for comparison, with good posterior predictive checks at the single subject level, and decent evidence of parameter recoverability. Importantly, they show that while there is no main effect of drug on the proportion of larger, later (LL) versus smaller, sooner (SS) choices, this obscures conflicting-directional effects on drift rate versus starting point bias which are under-the-hood, yet anticipated by the hypothesis of interest.

      We thank the reviewer for judging our findings as intriguing and the modelling approach as robust and convincing.

      While I have no major concerns about methodology, I think the Authors should consider an alternative interpretation - albeit an interpretation which would actually support the hypothesis in question more directly than their current interpretation. Namely, the Authors should re-consider the possibility that amisulpride's effects are mediated primarily by acting at pre-synaptic receptors. If the D2R antagonist were to act pre-synaptically, it would drive more versus less post-synaptic dopamine signaling.

      There are multiple reasons for this inference. First, the Authors observe that the drug increases sensitivity to differences in the relative offer amounts (in terms of effects on the drift rate). With respect to the canonical model of dopamine signaling in the direct versus indirect pathway, greater post-synaptic signaling should amplify sensitivity to reward benefits - which is what the Authors observe.

      Second, the Authors also observe an effect on the starting bias which may also be consistent with an increase in post-synaptic dopamine signaling. Note that according to the Westbrook & Frank hypothesis, a proximity bias in delay discounting should favor the SS over the LL reward, yet the Authors primarily observe a starting bias in the direction of the LL reward. This contradiction can be resolved with the ancillary assumption that, independent of any choice attribute, participants are on average predisposed to select the LL option. Indeed, the Authors observe a reliable non-zero intercept in their logistic regression model indicating that participants selected the LL more often, on average. As such, the estimated starting point may reflect a combination of a heightened predisposition to select the LL option, opposed by a proximity bias towards the sooner option. Perhaps the estimated DDM starting point is positive because the predisposition to select the LL option has a larger effect on choices than the proximity bias towards sooner rewards does in this data set. To the extent that amisulpride increases post-synaptic dopamine signaling (by antagonizing pre-synaptic D2Rs) it should amplify the proximity bias arising from the differences in delay, shifting the starting bias towards the SS option. Indeed, this is also what the Authors observe.

      Note that it remains unclear why an increase in post-synaptic dopamine signaling would amplify one kind of proximity bias (towards sooner over later rewards) without amplifying the other (towards a predisposition to select the LL option). Perhaps the cognitive / psychological nature of the sooner bias is more amenable to interacting with dopamine signaling than the latter. Or maybe proximity bias effects are most sensitive to dopamine signaling when they are smaller, and the LL predisposition bias is already at ceiling in the context of this task. These assumptions would help explain why a potential increase in post-synaptic dopamine signaling both amplified the proximity effect of delay when it was smallest (when the differences in delay were smaller), and also failed to amplify the predisposition to select the LL option (which may already be maxed out). More importantly, the assumption that there are opposing proximity biases would also help explain why there is a negative effect of delay magnitude on the estimated starting point on placebo. Namely - as the delay gets larger, the psychological proximity of sooner over later rewards grows, counteracting the proximity bias arising from choice predisposition / repetition.

      We thank the reviewer for suggesting this alternative interpretation of our data. We agree that the administered dose of 400 mg amisulpride can show both postsynaptic (reducing D2R activation) and presynaptic effects (enhancing D2R activation), which in many studies makes it difficult to decide whether the observed behavioral effects are caused by presynaptic or postsynaptic mechanisms.

      The reviewer suggests that the observed stronger influence of reward magnitudes on drift rates under amisulpride compared with placebo speaks in favor of presynaptic effects, because according to theoretical accounts higher dopamine levels should increase reward seeking (e.g., Frank & O’Reilly, 2006). On the other hand, Figure 2C suggests that amisulpride (compared with placebo) increased the preference only for relatively high, above-average rewards. If the difference between reward magnitudes was below average, amisulpride reduced rather than increased the preference for the larger reward. In our view, this is consistent with the hypothesis that D2R activation implements a cost control, with higher D2R activation increasing the attractiveness of costly rewards and lower D2R activation reducing it. In other words, under low dopamine levels individuals should decide for the costlier reward only if the magnitude of the costlier reward is sufficiently large compared with the lower, less costly reward. In fact, this is exactly what we find in our data according to Figure 2C. In our view, the amisulpride effect on drift rates is thus compatible with both presynaptic and postsynaptic mechanisms of action, depending on the underlying conceptual account of dopamine, as we now discuss in the revised manuscript.

      According to the reviewer, the observed influence of amisulpride on the starting bias also speaks in favor of increased rather than reduced dopamine levels. We agree with the reviewer that the result pattern for the starting bias is somewhat complex and seems to combine the effects of two different biases: a general tendency to choose LL over SS rewards (intercept of starting bias where the difference in delays is close to zero), and a shift towards the SS option under placebo if one option has a strong (temporal) proximity advantage over the other. Amisulpride shows opposite effects on the two different biases, as it shifts the intercept of the starting bias further away from the LL option but also reduces the proximity advantage of the SS over the LL reward for larger differences in delay. The reviewer writes that “To the extent that amisulpride increases post-synaptic dopamine signaling (by antagonizing pre-synaptic D2Rs) it should amplify the proximity bias arising from the differences in delay, shifting the starting bias towards the SS option. Indeed, this is also what the Authors observe.” In contrast to that statement, in our study amisulpride reduced rather than increased the starting bias arising from delay (as in Figure 2K the regression line is flatter under amisulpride compared with placebo, despite the differences regarding the intercept). We believe that the amisulpride effects on both the intercept and the delay-dependent slope can be explained via postsynaptic effects: First, the shift of the intercept of the starting bias (small differences in proximity) from the LL towards the SS option under amisulpride is consistent with the assumption that lower dopamine reduces the preference for larger reward (e.g., Beeler & Mourra, 2018; Salamone & Correa, 2012). Second, the finding that amisulpride weakens the proximity advantage of SS over LL rewards (delay-dependent slope) is consistent with the proximity account by Westbrook & Frank (2018) according to which lower tonic dopamine should reduce proximity effects. Thus, if we assume that the result pattern for the starting bias parameter is driven by dopaminergic effects on two separate decision biases (as suggested by the reviewer), we believe that both effects can better be explained by pharmacologically reduced rather than increased dopamine levels.

      In the revised manuscript, we extensively discuss the question as to whether the observed drug effects are caused by postsynaptic versus presynaptic effects. We clarify that the amisulpride effect on drift rates seems consistent with both presynaptic and postsynaptic effects (depending on the underlying conceptual account). We moreover discuss that the starting bias effects may reflect the interaction between two different bias types, and the drug effects on both bias types can more easily be reconciled with postsynaptic than presynaptic effects. On balance, we believe that the observed effects are more likely to reflect lower as compared to higher dopamine levels, but the extended discussion of this issue gives all readers the opportunity to weigh the arguments for and against these alternatives. If the reviewer should not agree with some aspects of our argumentation as outlined above, we would of course be happy to modify the discussion according to the reviewer’s advice.

      In the revised manuscript, we modified the discussion of presynaptic versus postsynaptic effects as follows (p.20-21):

      “While higher doses of amisulpride (as administered in the current study) antagonize post-synaptic D2Rs, lower doses (50-300 mg) were found to primarily block pre-synaptic dopamine receptors (Schoemaker et al., 1997), which may result in amplified phasic dopamine release and thus increased sensitivity to benefits (Frank & O'Reilly, 2006). At first glance, the stronger influence of differences in reward magnitude on drift rates under amisulpride compared with placebo might therefore speak in favor of presynaptic (higher dopamine levels) rather than postsynaptic mechanisms of action in the current study. On the other hand, one could argue that amisulpride reduced the preference for the LL reward if the gain from the costlier LL option compared with the SS option was small (as suggested by Figure 2C), which is consistent with the cost control hypothesis of dopamine (Beeler & Mourra, 2018). The impact of amisulpride on the drift rate thus appears ambiguous regarding the question of pre- versus postsynaptic effects. The result pattern for the starting bias parameter, in turn, suggests the presence of two distinct response biases, reflected by the intercept and the delay-dependent slope of the bias parameter (see Figure 2K), which are both under dopaminergic control but in opposite directions. First, participants seem to have a general bias towards the LL option in the current task (intercept), which is reduced under amisulpride compared with placebo, consistent with the assumption that dopamine strengthens the preference for larger rewards (Beeler & Mourra, 2018; Salamone & Correa, 2012; Schultz, 2015). Second, amisulpride reduced the proximity advantage of SS over LL rewards with increasing differences in delay, as predicted by the proximity account of tonic dopamine (Westbrook & Frank, 2018). On balance, the current results thus appear more likely under the assumption of postsynaptic rather than presynaptic effects. Unfortunately, the lack of a significant amisulpride effect on decision times (which should be reduced or increased as consequence of presynaptic or postsynaptic effects, respectively) sheds no additional light on the issue.”

      Regardless of the final interpretation, showing that pharmacological intervention into striatal dopamine signaling can simultaneously modify a starting point bias and drift rate (in opposite directions - thus having systematic effects on choice biases without altering the average proportion of LL choices) provides crucial first evidence for the hypothesis that dopamine and proximity interact to influence decision-making. These results thereby enrich our understanding of the neuromodulatory mechanisms influencing inter-temporal choice, and take an important step towards resolving prior contradictions in this literature. They also have implications for how striatal dopamine might impact decision-making in diverse domains of impulsivity beyond inter-temporal choice, ranging from cognitive neuroscience (e.g. in numerous cognitive control tasks) to psychiatry (treating diverse disorders of impulse control).

      We thank the reviewer for highlighting the importance of the current findings for understanding dopamine’s role in decision making.
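
      The point that opposite shifts in drift rate and starting bias can leave the overall proportion of LL choices unchanged can be illustrated with the closed-form choice probability of a simple diffusion process. The sketch below is a simplification and not the fitted model: it uses a constant drift with unit diffusion noise, codes the upper boundary as the LL option, and all parameter values (labelled "placebo" and "drug") are hypothetical.

```python
import math

def p_upper(drift, boundary, start_frac):
    """Probability of absorption at the upper boundary (coded here as LL).

    Diffusion with constant drift and unit noise between 0 and `boundary`,
    starting at start_frac * boundary.
    """
    if drift == 0:
        return start_frac
    z = start_frac * boundary
    return (1 - math.exp(-2 * drift * z)) / (1 - math.exp(-2 * drift * boundary))

def start_frac_for(p_target, drift, boundary):
    """Invert p_upper: relative starting point that gives p_target at this drift."""
    z = -math.log(1 - p_target * (1 - math.exp(-2 * drift * boundary))) / (2 * drift)
    return z / boundary

# Hypothetical "placebo" setting: moderate drift, start biased towards LL.
p_placebo = p_upper(drift=0.5, boundary=2.0, start_frac=0.6)

# Hypothetical "drug" setting: stronger drift towards LL, compensated by a
# starting point shifted towards SS so that the choice proportion is unchanged.
w_drug = start_frac_for(p_placebo, drift=0.8, boundary=2.0)
p_drug = p_upper(drift=0.8, boundary=2.0, start_frac=w_drug)
```

      In this toy setting the two parameter sets predict the same proportion of LL choices even though the starting point has moved towards the SS boundary, which is why a null effect on mean choice proportions does not rule out effects on the underlying components.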

  9. Dec 2022
    1. Reviewer #3 (Public Review):

      A problem in synthetic ecology is that one can't brute-force complex community design because combinatorics make it basically impossible to screen all possible communities from a bank of possible species. Therefore, we need a way to predict phenomena in complex communities from phenomena in simple communities. This paper aims to improve this predictive ability by comparing a few different simple models applied to a large dataset obtained with the use of the author's "kchip" microfluidics device. The main question they ask is whether the effect of two species on a focal species is predicted from the mean, the sum, or the max of the effect of each single "affecting" species on the focal species. They find that the max effect is often the best predictor, in the sense of minimizing the difference between predicted effect and measured effect. They also measure single-species trait data for their library of strains, including resource niche and antibiotic resistance, and then find that Pearson correlations between distance calculations generated from these metrics and the effect of added species are weak and unpredictive. This work is largely well-done, timely and likely to be of high interest to the field, as predicting ecosystem traits from species traits is a major research aim.
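
      The three candidate rules compared in the paper (mean, sum, or strongest single-species effect) can be written out explicitly. The sketch below is illustrative only: the effect values are invented toy numbers chosen to exercise the code and carry no evidential weight, "strongest" is assumed to mean the single effect with the largest absolute value, and nRMSE is assumed to be the RMSE divided by the range of the observed effects, which may differ from the normalization used in the paper.

```python
import math

def predict_pair_effect(effect_a, effect_b, rule):
    """Predict the joint effect of two species on a focal species."""
    if rule == "mean":
        return (effect_a + effect_b) / 2
    if rule == "sum":
        return effect_a + effect_b
    if rule == "strongest":
        # Assumed definition: the single effect with the larger magnitude.
        return effect_a if abs(effect_a) >= abs(effect_b) else effect_b
    raise ValueError(f"unknown rule: {rule}")

def nrmse(predicted, observed):
    """RMSE normalized by the observed range (an assumed normalization)."""
    mse = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    return math.sqrt(mse) / (max(observed) - min(observed))

# Toy data: (single effect of A, single effect of B, observed effect of A+B).
toy_data = [
    (-0.8, -0.2, -0.90),
    (-0.5, 0.1, -0.45),
    (-0.3, -0.3, -0.35),
    (0.4, -0.6, -0.50),
    (-1.0, -0.7, -1.10),
]
observed = [obs for _, _, obs in toy_data]

scores = {
    rule: nrmse([predict_pair_effect(a, b, rule) for a, b, _ in toy_data], observed)
    for rule in ("mean", "sum", "strongest")
}
best_rule = min(scores, key=scores.get)
```

      Computing `scores` separately per focal species, rather than pooled, is the analysis that would expose the species-specific differences discussed in the criticism that follows.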

      My main criticism is that the main take-home from the paper (fig 3B)-that the strongest effect is the best predictor-is oversold. While it is true that, averaged over their six focal species, the "strongest effect" was the best overall predictor, when one looks at the species-specific data (S9), we see that it is not the best predictor for 1/3 of their focal species, and this fraction grows to 1/2 if one considers a difference in nRMSE of 0.01 to be negligible.

      The same criticism applies to the result from figure 2-that pairs of affecting species have more negative effects than single species. Considered across all focal species this is true (though minor in effect size, Fig 2A). But there is only a significant effect within two individual species. Again, this points to the effects being focal-species-specific, and perhaps not as generalizable as is currently being claimed.

      Another thing that points to a focal-species-specific response is Fig 2D, which shows the distributions of responses of each focal species to pairs. Two of these distributions are unimodal, one appears bimodal, and three appear tri-modal. This suggests to me that the focal species respond in categorically different ways to species addition.

      These differences occur even though the focal bacteria are all from the same family. This suggests to me that the generalizability may be even less when a more phylogenetically dispersed set of focal species are used.

      Considering these points together, I argue that the conclusion should be shifted from "strongest effect is the best" to "in 3 of our focal species, strongest effect was the best, but this was not universal, and with only 6 focal species, we can't know if it will always be the best across a set of focal species".

      My second main criticism is that it is hard to understand exactly how the trait data were used to predict effects. It seems like it was just Pearson correlation coefficients between interspecies niche distances (or antibiotic distances) and the effect. I'm not very surprised these correlations were unpredictive, because the underlying measurements don't seem to be relevant to the environment tested. What if, rather than using niche data across 20 nutrients, only the growth data on glucose (the carbon source in the experiments) was used? I understand that in a field experiment, for example, one might not know what resources are available, and so measuring niche across 20 resources may be the best thing to do. Here though it seems imperative to test using the most relevant data.

      Additionally and relatedly, it would be valuable to show the scatterplots leading to the conclusion that trait data were uninformative. Pearson's r only works on an assumption of linearity. But there could be strong relationships between the trait data and effect that are monotonic but not linear, or even that are non-monotonic yet still strong (e.g. U-shaped). For the first case, I recommend switching to Spearman's rho over Pearson's r, because it only assumes monotonicity, not linearity. If there are observable relationships that are not monotonic, a different test should be used.
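
      The difference between the two coefficients is easy to demonstrate on synthetic data. The sketch below uses only made-up values, not the paper's trait data: an exponential relationship that is perfectly monotonic but strongly nonlinear, and a U-shaped relationship that neither coefficient detects. Spearman's rho is computed as Pearson's r on average ranks.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

x = [i / 2 for i in range(1, 21)]             # 0.5, 1.0, ..., 10.0
y_monotonic = [math.exp(v) for v in x]        # monotonic but nonlinear
y_u_shaped = [(v - 5.25) ** 2 for v in x]     # symmetric U around the mean of x

r_monotonic = pearson(x, y_monotonic)
rho_monotonic = spearman(x, y_monotonic)
r_u = pearson(x, y_u_shaped)
rho_u = spearman(x, y_u_shaped)
```

      For the exponential case Spearman's rho is 1 while Pearson's r is clearly lower; for the U-shaped case both are near zero despite a deterministic relationship, which is why inspecting the scatterplots matters.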

      In general, I think the analyses using the trait data were too simplistic to conclude that the trait data are not predictive.

    1. Author Response

      Reviewer #2 (Public Review):

      I am not a specialist in cryo-EM, so cannot comment on the technicalities of the structure reconstruction or methods used. I thus focus on the conclusions and observations that the authors provide in the manuscript and their relevance to functional photosynthesis.

      The authors attempt to resolve the structure of PSII from Dunaliella and noticed that three types of PSII could be identified: two conformational states, and a stacked configuration. There is no doubt that these structures add to our current knowledge of PSII and that they exist in abundance upon solubilisation of the sample. My main issue however is the relevance to in vivo conditions, and the efforts to exclude the possibility that pigment loss and conformational states and stacking are a reflection of ex-vivo manipulations.

      Our compact model contains 202 Chl molecules, while the stretched conformation contains 206. All of the differences in Chl binding are attributed to CP29. We have compiled a table enumerating the different CP29 structures currently available from plants and green algae at a resolution similar to that of our work (Supplementary table 2). In the larger plant complexes (C2S2M2), CP29 contains 14 chls, while CP29 in smaller C2S2 complexes contains 10-13 chls, so it appears that some chl loss from CP29 is associated with the release of LHCIIM. In the green algal structures, CP29 contains fewer chls in general and shows a similar trend. The currently published structure most relevant to our work (6KAC) contains 8 chls, somewhat fewer than both the compact and stretched models (9 and 11 chls, respectively). The stretched orientation, which is the closest match to the known PSII core arrangement, therefore contains more chls than comparable models. While the in vivo configuration is not known, in the sense that it could contain more chls, the current structure is apparently the closest representation of it.

      The presence of CP29 with lower chl content in the Chlamydomonas C2S2 (6KAC, which is in a stretched orientation) supports the conclusion that pigment loss from CP29 alone is not sufficient to trigger the stretched-to-compact transition, although it is associated with it. In general, the precise orientation of CP29 is variable and seems to depend on the binding of additional LHCII; it is possible that some chl loss accompanies these changes in vivo.

      I see a number of questions pertaining to this work. Starting from the two conformations of PSII, compact and stretched, the authors say that both are highly active based on oxygen measurements at a saturating light intensity. In the meantime, they report large variations in the chl content and positions of the chlorophyll molecules in these structures (also compared to other known PSIIs). This gives the impression that one can lose two chlorophylls, and freely modify the distance between others without losing efficiency, certainly a risky conclusion. Are the samples highly active also in light-limiting conditions? It is thought that even tiny movements and alterations in chl-chl distances alter their coupling and spectral properties, how come the variations in this report are so huge? In other words, the assay tests the charge separation activity of the PSII RC in the preps, but not the light-harvesting efficiency.

      The chl content differences reported in this work amount to 2%. In our opinion this represents quite a low variation in pigment content, of the kind that exists in virtually any experiment involving large complexes. We agree that measurements of activity in limiting light conditions are interesting; however, this goes beyond the scope of the current work. Light harvesting efficiency in PSII is known to vary substantially as a result of additional mechanisms (NPQ in some of its forms), not associated with chl loss or gain. While the formation of quenching centers is attributed to small structural changes within specific pigment protein complexes, what we are showing in this work are structural changes between pigment protein complexes. These can affect transfer rates between the different complexes but are distinct from the structural changes thought to accompany the formation of quenching centers within specific pigment protein complexes.
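
      The 2% figure follows from the chlorophyll counts given earlier in this response (202 in the compact model, 206 in the stretched model, 9 and 11 per CP29). The short check below uses only those numbers; the one added assumption, flagged in the code, is that each C2S2 dimer carries two copies of CP29.

```python
compact_chls = 202     # compact model, from the response
stretched_chls = 206   # stretched model, from the response
cp29_compact = 9       # chls per CP29 in the compact model
cp29_stretched = 11    # chls per CP29 in the stretched model

difference = stretched_chls - compact_chls
percent_difference = 100 * difference / stretched_chls

# Assumption: two CP29 copies per C2S2 dimer (one per monomer).
cp29_copies = 2
difference_from_cp29 = cp29_copies * (cp29_stretched - cp29_compact)

print(f"difference: {difference} chls ({percent_difference:.2f}% of the stretched model)")
print(f"accounted for by CP29: {difference_from_cp29} chls")
```

      Under that assumption the CP29 counts account for the whole difference between the two models, in line with the statement that all differences in Chl binding are attributed to CP29.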

      How does one ascertain that the lost chlorophyll molecules in CP29 are not a preparation error? Does slightly increasing the detergent concentration impact the proportion of stretched:compact forms?

      The effect of detergent concentration on the proportion of the different forms was not tested directly. However, we do not detect many differences in the content of lipids or bound detergent molecules between the two conformations, suggesting that for these “ligands” the differences are not substantial. We can only distinguish these two forms at the very last stages of data processing; given the present cost and time availability of cryoEM, mapping the effect of detergent concentration on the different orientations is outside our reach.

      On a similar note, how do the authors exclude that a certain interaction with this type of grid impacts the distribution of these complexes? Is it identical to a biologically separate preparation of algae? In case of discoveries of this type, it is of high importance to exclude as many possibilities of non-native conditions or influences on the structure.

      It’s hard to completely exclude grid and sample preparation issues. However, we employed relatively standard grids and vitrification conditions. The observed complexes are embedded in vitrified ice and do not interact with the grid directly. The differences we observed are mainly in the orientations of the PSII cores; all the interactions between PSII subunits within each core are preserved and agree with previously published structures. Since the interactions within the core and between cores involve the same physical principles, we think it’s fairly conservative to conclude that the observed core orientations are not an artefact of sample preparation.

      I would further like to encourage the authors to elaborate on the CP29 phosphorylation. What is the proportion of PSIIcomp that are phosphorylated? I assume it is not 100%, as in this case, the authors would propose that this is the effect that modulates between compact and stretched architectures.

      It’s difficult to estimate the proportion of observed phosphorylation/sulfinylation. To be detected in maps, most of the residues (above 50%) are probably modified. We attempted to estimate this by refining the atom occupancies of the Pi molecule on Ser84 and the oxygens attached to Cys218; both values suggested that about 70% of the complexes are modified. With regard to the possibility that these modifications can promote the formation of the compact state, we think that this is certainly a possibility, since these modifications were detected in this state and are in close proximity to each other. However, this can also result from the resolution differences of the maps, and the structural implications of both modifications are hard to predict. At this point we prefer to note their existence without further interpretation.

      In line 290, the authors highlight the structural heterogeneity within the two groups' PSII conformations. I would like to see what the distribution looks like for all the structures together: are the two (stretched and compact) specifically forming two heterogeneous distributions? Or is it possible that the distribution between the two is quasi-continuous? In other words, if the structures are not perfectly defined, how do the authors decide that two, and not more or fewer, subtypes exist?

      We went back and refined the initial particle group (containing both compact and stretched orientations) using multibody refinement with masks defining the two PSII monomers. This analysis showed the expected two peaks only in the first principal component (PC), which accounted for ~38% of the variance in the dataset.

      Multibody refinement carried out on the combined particle dataset shows one very large PC accounting for about 38% of the variance and the presence of two distinct peaks in the particle distribution of the first PC.

      From this analysis it’s clear that there are two distinct classes in this particle set (as expected). As none of the other PCs shows any sign of multiple peaks, this analysis suggests that two distinct models are the best representation of this eukaryotic PSII. Whether these are quasi-continuous or distinct is more complex. There is continuity in this representation (particle distributions along the PC); a different picture may appear if characters such as the CP29 state are considered, but the size of CP29 and the remaining heterogeneity do not provide enough signal to carry out this classification at the moment.
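
      The logic of looking for two peaks along the leading principal component can be sketched on synthetic data. This is not the multibody refinement output and uses no cryo-EM software: the "particles" are simulated three-dimensional points drawn from two hypothetical states, the leading PC is found by power iteration on the covariance matrix, and bimodality is summarized with Sarle's bimodality coefficient, for which values above 5/9 are a common rule of thumb for a non-unimodal distribution.

```python
import math
import random

def covariance(data):
    """Return the centered data and its sample covariance matrix."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    return centered, cov

def leading_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of a covariance matrix (power iteration)."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
    return v, eigenvalue

def bimodality_coefficient(values):
    """Sarle's coefficient (skewness^2 + 1) / kurtosis; above 5/9 hints at two modes."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return (skewness ** 2 + 1) / kurtosis

# Synthetic particles: two hypothetical states separated along the first axis.
rng = random.Random(0)
particles = [[2.0 * rng.choice([-1, 1]) + rng.gauss(0, 0.5),
              rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(2000)]

centered, cov = covariance(particles)
pc1, eigenvalue = leading_component(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
scores_pc1 = [sum(r[j] * pc1[j] for j in range(len(pc1))) for r in centered]

bc_pc1 = bimodality_coefficient(scores_pc1)
bc_noise_axis = bimodality_coefficient([r[1] for r in centered])
```

      A bimodality coefficient only flags departure from a single peak; it does not by itself decide between two discrete states and a continuum sampled unevenly, which is the distinction the response describes as more complex.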

      Considering the stacked PSII, I also have a few concerns. Contrary to previous studies, the authors do not assign a functional role to the stacking beyond the structural aspect. This could be better backed by a discussion of the closest chlorophyll a molecules across the stacked PSII, whose separation, given the rather large distance shown in Fig. 4L, seems too large for any EET across the stromal gap.

      The closest chl-chl distance that we can measure in the stacked PSII dimer is ~54 Å, with most distances in the ~70 Å range, making EET between stacked complexes very slow. We have added a statement clarifying this to our manuscript. In our opinion a structural role for the stacked PSII dimer is more likely.
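
The effect of distance on transfer rate can be sketched with the inverse sixth-power dependence of Förster-type transfer. The snippet below compares the ~54 Å and ~70 Å separations quoted above with a reference separation of 15 Å, which is a hypothetical value chosen only for illustration and does not come from the manuscript. It also assumes that the orientation factor and spectral overlap are identical in all cases, which is a simplification.

```python
def relative_transfer_rate(distance, reference_distance):
    """Förster-type rate at `distance` relative to the rate at `reference_distance`.

    Uses only the r**-6 distance dependence; orientation factor and spectral
    overlap are assumed equal for both separations.
    """
    return (reference_distance / distance) ** 6


# Hypothetical short pigment-pigment separation used as the comparison point (Å).
REFERENCE_DISTANCE = 15.0

# Separations quoted in the response for the stacked PSII dimer (Å).
for distance in (54.0, 70.0):
    slowdown = 1.0 / relative_transfer_rate(distance, REFERENCE_DISTANCE)
    print(f"{distance:.0f} Å: about {slowdown:,.0f}-fold slower than at {REFERENCE_DISTANCE:.0f} Å")
```

Under these assumptions the transfer across the stromal gap comes out thousands of times slower than transfer over the reference separation, in line with the statement that such EET would be very slow.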

      There is a report that suggests the presence of some density between the stacked PSII - could the authors comment on the differences between it and their work? Are the angles and positions conserved between these types of stacks? https://doi.org/10.1038/s41598-017-10700-8

      We referred to Albanese et al. in our manuscript. We isolated the C2S2 complex from a green alga, whereas the analysis in Albanese et al. was done on C2S2M1 complexes from pea, and this can account for some of the differences. At any rate, our conclusion that we do not find any evidence for protein linkers in the stacked complex is stated clearly. The angles described in Albanese et al. are consistent with our analysis.

      Line 387, the authors state that due to the transient nature of the interactions across the stromal gap, the stacks could be "under-detected" in cryo-ET data. This statement is, in my opinion, misformulated. For one, the transient interaction argument would apply the same (if not more, due to changing conditions induced by the purification process) to the single particle analysis performed in this paper. Second, tomographic volumes detect hundreds of PSII in a suspended state. Any transient interaction that adds up to 25% of the particle population in a steady-state cell should be clearly visible, while the in situ data suggest no more than random cross-stromal-gap orientations. Of course, this can be a specificity of Chlamydomonas or a particular growth condition. The statement used by the authors could indeed be converted into: the PSII stacks are over-detected in vitro, and this is certainly a simpler explanation for their presence. It is also important to mention that PSII stacking alone is not the only reason for grana architecture - stacking with the antenna of larger complexes, absent in the authors' preparation, could also contribute to grana maintenance; and auxiliary proteins such as CURT help with this issue as well. Here a recent demonstration of the importance of minor antenna should probably also be cited: https://doi.org/10.1101/2021.12.31.474624

      We used the term “flexible” rather than “transient” to describe the interactions within the stacked PSII dimer. Our data (and tomographic data) do not contain any temporal component. When we used the term “under-detected”, we referred to the fact that PSII is mainly detected by the luminal extrinsic subunits. The flexibility detected in our analysis may affect the concurrent visibility of these features in the PSII complexes making up an individual PSII stack. Specifically, Wietrzynski et al. mainly analyze C2S2M2L2 complexes, while our analysis only contained C2S2 complexes. It is likely that the different amounts of bound LHCII affect PSII stacking as well. For example, Wietrzynski et al. show some overlap between LHCII complexes and little overlap between cores in the larger complexes they analyzed. We observe mainly core-to-core overlap with little LHCII overlap in the smaller C2S2, although we did not observe any states where LHCs were not included in what appears to be the binding interface. We agree with the reviewer on the relevance of the Lhcb and CURT contributions to stacking but prefer to focus on what was directly demonstrated in our data. We clearly note that we are discussing in vitro results.

      Taking these last thoughts, I would like to finish by mentioning one more thing - almost philosophical. The authors are certainly at the forefront of the booming cryoEM revolution in biology which is profoundly changing the way we understand the living. There is absolutely zero doubt that this powerful technique is of the highest interest. But a growing number of structures of photosynthetic complexes remain puzzling, in particular with regard to their abundance in vivo (such as the PSII stacks) and functional relevance. How do we ascertain that these interactions are not due to in vitro preparation (isolation from cells, solubilisation)? Which ways can we use to try to exclude this (simple) hypothesis? I suggest that at least a small extent of biological replicas - experiments performed on separate batches, in different technical conditions, with slightly altered solubilization conditions, and so on - could shed light on the nature of these structures and their occurrence in vivo. Technical reps of the freezing+analysis pipeline could also be tried to see the variability. This would strongly reinforce this manuscript and its conclusions, and while not completely unequivocal (the stacked PSII, for example, could form upon each purification), a quantification of the effects would be of high interest.

      We certainly share the reviewer’s hope of being able to conduct cause-and-effect cryoEM experiments covering a complete set of experimental parameters. This is still beyond reach in terms of time and cost. Within each cryoEM experiment, however, all the analysis is consistent and, more importantly, transparent with regard to image analysis, which is the most important factor in our opinion. Preparation artefacts are always a possibility but, in our opinion, cryoEM is not affected by them differentially compared to other techniques. As we mentioned above, the particles are observed suspended in vitreous ice; this is no different from, and arguably better than, numerous low-temperature spectroscopic observations on samples suspended in a glass state or on crystals obtained in the presence of high concentrations of various agents. One thing that validates structural studies is the chemical detail (bond lengths, angles, etc.) underlying every model, which is consistent with known values to close tolerances.

      Reviewer #3 (Public Review):

      In this manuscript, Caspy et al. present a detailed structural analysis of eukaryotic photosystem II (PSII) isolated from the green alga Dunaliella salina. By combining single-particle cryo-EM with multibody refinement, the authors not only reveal a high-resolution (2.4Å) structure of the eukaryotic PSII, but also demonstrate alternate conformations and intrinsic flexibility of the overall complex. Stretched and compact conformations of the PSII dimer were readily identified within the single-particle dataset. From this structural analysis, the authors propose that excitation energy transfer properties may be modulated by changes in transfer distance between key chlorophyll molecules observed in different conformational states of the PSII dimer. Due to the high resolution of the maps obtained, the authors identify post-translational modifications and a sodium binding site based on the observed cryo-EM maps. Additionally, the authors analyze PSII complexes in stacked and unstacked configurations, and find that compact and stretched states also exist within the stacked PSII complexes. From their cryo-EM maps, the authors demonstrate that there is no direct protein-protein interaction between stacked PSII complexes, and rather propose a model wherein long-range electrostatic interactions mediated by divalent cations such as magnesium, can facilitate PSII stacking.

      The conclusions and models presented in the manuscript are mostly well justified by the data. The cryo-EM maps are high quality and the models appear generally well refined. However, some aspects of data processing and analysis, as well as the resultant conclusions need to be clarified.

      1) In general, it is not clear from the cryo-EM processing workflow (suppl. Fig 1) or the methods section when exactly symmetry was applied during 3D classification and refinement. In the case of C2S2 unstacked particles, when was symmetry first applied in the overall processing workflow? To identify the compact and stretched configurations of C2S2, did the 3D classification without alignment (and/or the refinement preceding this classification) have C2 symmetry applied? If so, have you considered the possibility that some particles may actually be asymmetric in some regions?

      We modified Figure S1 to clearly indicate the use of symmetry and particle expansion. In general, we refined most of the particle sets without symmetry (C1). At the final processing stage of the unstacked PSII sets, after we separated both conformations, we used C2 symmetry to expand the data; this was followed by multibody refinement. No symmetry or symmetry expansion was used for the stacked PSII particle sets.

      2) Following multibody refinement in Relion individual maps and half-maps for each body will be generated. There is no mention in the methods of how these individual maps for each C2S2 "monomer" were combined to produce an overall map of the dimer following multibody refinement. There are several methods currently used to combine such maps, including taking the maximum or average of the two maps or using a model-based approach in phenix. The authors should be explicit about the method they used, any potential artifacts that may develop from this map combination process, and/or the interface between masks used in multibody refinement.

      We used phenix.combine_focused_maps to combine the maps. This is now indicated in the Methods section.

      3) In addition to the point raised above, following multibody refinement there will be an individual FSC curve and resolution for each body. However, in supplemental figure 2 and supplemental table 1, only a single FSC curve and resolution are reported. Are these FSC curves/resolutions only reported for the better of the two bodies? If not, how was a single resolution calculated for the overall map of combined bodies?

      Both FSC curves were calculated and were highly similar, as expected following C2 expansion. This can also be evaluated from the local resolution maps, which are highly similar between the two bodies. The reported resolutions are all taken from the displayed FSC curves generated through RELION PostProcess.

      4) One of the major conclusions from the 3D classification and multibody refinement is that conformational changes and inherent flexibility of the PSII dimers have the potential to change distances between cofactors in the complex, ultimately leading to altered excitation energy transfer. However, it is unclear whether or not the authors believe one conformation over another may more readily support the evolution of oxygen. It would be nice if the authors could elaborate slightly upon this topic in the discussion.

      As discussed above, the structural changes associated with the formation of quenching centers are not expected to be detected in the current work. The changes we observe can, however, affect the transfer to such centers and by doing so can play an important part in PSII biology. We do not detect any changes around the OEC and we do not find any reason to think the two conformations differ with respect to their ETC.

      5) Along the lines of point 4 above, on line 95 the authors claim that "the high specific activity of 816 umol O2/ (mg Chl * hr) suggest that" both the C2S2 compact and stretched conformation are highly active. However, it is not clear to me why this measure of specific activity would suggest that both PSII conformations should have "high" activity. Maybe a reference here would help guide readers to previous measures of specific activity?

      Looking at specific activity from previously published structural studies on eukaryotic PSII, we find that Sheng et al., 2019 reported a specific activity of 272 µmol O2/(mg Chl * hr). This difference can stem partially from the presence of larger complexes in their preparation, and their value is comparable to the activity that we measured in our As fraction (276 µmol O2/(mg Chl * hr), Figure 1-figure supplement 9). Reported specific activity values from plants (Pisum sativum) are also similar: Su et al. reported a maximal value of 288 µmol O2/(mg Chl * hr), again for larger complexes, which can explain some of the difference. However, the specific activity measured for the C2S2 PSII isolated in the current study is 2.8 times higher than this value, more than the difference in chl content, which ranges from 1.5 to 2 times in favor of the larger complexes. If either one of the conformations were less active, the other conformation would have to display even higher specific activity, which seems less likely. In addition, we find no difference around the oxygen evolution center or in the peripheral luminal subunits in either shape or map strength, so both orientations show highly similar structures around these regions, which determine the oxygen evolution activity.
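
The arithmetic behind this argument can be laid out in a short sketch. The activity values are the ones quoted in this exchange, taken in the reviewer's units of µmol O2/(mg Chl * hr); the function names and the 50% inactive fraction are hypothetical and serve only to illustrate the reasoning.

```python
# Specific activities quoted in the exchange, in µmol O2/(mg Chl * hr).
C2S2_ACTIVITY = 816.0
PUBLISHED_ACTIVITIES = {
    "Sheng et al., 2019": 272.0,
    "As fraction (this study)": 276.0,
    "Su et al. (Pisum sativum)": 288.0,
}


def fold_difference(activity, reference):
    """How many times higher `activity` is than `reference`."""
    return activity / reference


def required_activity(observed, inactive_fraction):
    """Activity the active population must have if a fraction of complexes is inactive.

    The observed bulk value is treated as an average over active and fully
    inactive complexes, which is an illustrative simplification.
    """
    if not 0.0 <= inactive_fraction < 1.0:
        raise ValueError("inactive_fraction must be in [0, 1)")
    return observed / (1.0 - inactive_fraction)


for source, value in PUBLISHED_ACTIVITIES.items():
    print(f"{source}: C2S2 is {fold_difference(C2S2_ACTIVITY, value):.1f} times higher")

# Hypothetical case: half of the complexes (one conformation) contribute nothing.
print(required_activity(C2S2_ACTIVITY, 0.5))
```

In the hypothetical case where one conformation made up half of the complexes and was fully inactive, the other would need twice the measured bulk activity, which is the scenario described above as less likely.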

      6) It is claimed that "more than 2100 water molecules were detected in the C2S2 compressed model", and the water distribution is shown in Figure 3. Obtaining resolutions capable of visualizing waters with cryo-EM is still a significant challenge. Upon visual inspection of the map supplied, it appears that several of the waters that were built into the atomic model simply do not have supporting peaks in the coulomb potential map above the level of noise. While some of the modeled waters are certainly supported by the map, in my opinion, there are many waters that simply are not, or at best are questionable. What method or tool was originally used to build waters into the model, and how were these waters subsequently validated during structure refinement?

      We followed standard methods for water placement and refinement in the preparation of the model, in addition to manually curating the water structure. However, in light of the reviewer’s comment, we undertook additional rounds of refinement and inspection of the water molecules in the model. We removed a few hundred water molecules so that the total number of water molecules is now around 1700. All the water molecules in the present model should be well supported at map values higher than 2.5 sigma, and in our opinion the current water model should be regarded as conservative, underestimating the number of bound water molecules. This also led to some improvements in additional validation statistics of the model, which are listed in Table 1. The new model has been deposited in the PDB and the new PDB validation report is included in our resubmission.

      7) The authors claim to identify several unique map densities during model building. One of these is a sodium ion close to the OEC, which is coordinated by D1-His337, several backbone carbonyls, and a water molecule. When looking closely at the cryo-EM map supplied, it appears that the coulomb potential map is quite weak for this sodium, and is only visible at quite low contour levels. In fact, the features for the coordinating water, and chloride ions located ~7-9A away are much stronger than the sodium. Do the authors have any explanation for why the cryo-EM map is significantly weaker for the sodium compared to the coordinating water or chloride ions in the same general vicinity? Similar to what they did for the other post-translational modifications, the authors should consider showing the actual cryo-EM map for the bound sodium in supplemental Figure 10 a,b.

      Our main support for the placement of a Na+ ion in this location stems from the analysis of Wang et al. Our maps show a density which is discernible at 4 σ, with an elongated shape suggesting the presence of multiple atoms/waters. Although in principle positive ions should have very strong densities in cryoEM maps due to their interactions with electrons, other factors such as occupancy, coordination and B-factor also play a role, making the distinction between water and sodium complicated and case-specific. The sodium peak is not observed in unsharpened maps (as is the case for most of the water molecules which occupy conserved positions).

        We collected a few examples from comparable cases (cryo-EM maps of similar resolution ranges) where the presence of sodium ions is highly probable based on additional evidence. These map densities highlight the factors we discussed above. In cases ‘a’ (dual oxidase 1 prepared in high sodium conditions) and ‘b’ (human voltage-gated sodium channel), Na+ is observed in a highly coordinated state and, especially in ‘a’, shows the expected increase in density values compared to water molecules. However, cases ‘d’ (human Na+/K+ P-type ATPase) and ‘e’ (voltage-gated sodium channel) appear very similar to the proposed Na+ assignment in PSII. We conclude that map density alone is not enough to distinguish between Na+ and water molecules, and we rely on the additional experiments described by Wang et al., which show increased PSII activity at elevated Na+ levels in basic conditions.

      8) The cryo-EM maps showing CP29-Ser84 phosphorylation and CP47-Cys218 sulfinylation are quite convincing. However, it is interesting that these modifications are only observed in the compact conformation, and not in the stretched conformation. Can the authors elaborate on whether or not they believe the compact and stretched conformations could be a result of these posttranslational modifications, or vice versa?

      This is an interesting suggestion. In our opinion it is less likely that the modifications themselves trigger the transition between compact and stretched states. It is not clear how these modifications would stabilize the compact over the stretched state. It is equally likely that these modifications are somehow triggered by the structural change. We cannot be certain that these modifications are not also present in the stretched orientation but remain unobserved due to resolution differences. The correlation between the states and the post-translational modifications should be verified before any discussion of their possible roles in the transitions.

      9) Do the authors believe that PSII dimers in the solution can readily interconvert between compact and stretched conformations? Or is the relative ratio of these conformations fixed at the time of membrane solubilization with decyl-maltoside?

      We think that it is more probable that the transition between these states occurs in the membrane phase. The main reason is that pigment loss and structural transitions in CP29 are more likely to occur in the membrane than in aqueous/micelle environments.

      10) The model proposed for divalent cation-mediated stacking of PSII dimers is compelling, and seems to be in agreement with previous investigations that observed a lack of stacked dimers in cryo-EM preparations lacking calcium/magnesium. However, my understanding from reading the methods section is that the observed lack of density between the stacked PSII dimers was inferred from maps obtained after multibody refinement. Based on the way the masks to define bodies were created for multibody refinement (Fig. 4A), the region between stacked dimers would be highly prone to map artifacts following multibody refinement. Have the authors looked closely at the interfacial region between stacked dimers following conventional 3D classification/refinement to ensure that there are indeed no features observed in the interfacial region even at low contour levels?

      We made several attempts to resolve differences in the space between the stacked PSII dimers. These include focused classification with masks containing selected volumes from this region and masks that include only one of the stacked PSII dimers to avoid signal subtraction in this region. None of these revealed any discernible features in this region. In addition, any stable binding of a bridging protein across the stacked dimer would probably be at least partially visible as additional density over the unstacked PSII. We searched for such features and found none.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper shows that nuclear pore complex components are required for Kras/p53 driven liver tumors in zebrafish. The authors previously found that nonsense mutation in ahctf1 disrupted nuclear pore formation and caused cell death in highly proliferative cells in vivo. In the absence of this gene, there are multiple mitotic functions involving the nuclear pore that are defective, leading to p53 dependent cell death. Heterozygous fish are viable but have reduced kras/p53 liver tumor growth, and this is associated with multiple nuclear and mitotic defects that lead to cancer cell death/lack of growth. This therapeutic window suggests targetability of this pathway in cancer. I think the data are robust, rigorous, and clearly presented. I believe this in vivo work will encourage therapeutic targeting of NPCs in cancer.

      We are pleased that this reviewer believes that our data are robust, rigorous, and clearly presented and that our in vivo work will encourage therapeutic targeting of NPCs in cancer.

      Reviewer #2 (Public Review):

      Overall this is a very interesting and important paper that demonstrates a novel synthetic interaction between nucleoporin inhibition and oncogene-driven hyperproliferation. This work is especially significant because of the paucity of effective treatments for hepatocellular carcinoma (HCC). The authors' demonstration that the Nup inhibitor Selinexor decreases larval liver size in KRAS-overexpressing zebrafish but does not cause toxicity in wild-type animals lays the groundwork for exploiting this class of drugs in HCC treatment. This paper represents an elegant demonstration of the utility of zebrafish models in cancer studies. The relevance of this work to human cancer is supported by the authors' studies using TCGA data, wherein they demonstrate that decreased NUP expression is associated with increased survival in HCC.

      Other major strengths of the paper include beautiful pictures demonstrating that ahctf1+/- decreases the density and volume of nuclear pores in TO(kras) larvae and increases the rate of multipolar spindle formation, misaligned chromosomes, and anaphase bridges. The experiments are very well-controlled, including detailed analysis of the effects of ahctf1 heterozygosity and Selinexor on wild-type animals. The inclusion of distinct methods for disrupting nucleoporins (ranbp2 heterozygosity and drug treatment) bolsters the authors' conclusion that this represents a viable drug target in HCC.

      My major concerns are as follows:

      1) The authors state that "the beneficial effect of ahctf1 heterozygosity to reduce tumour burden persists in the absence of functional Tp53, due to compensatory increases in the levels of tp63 and tp73". However, tp63 and tp73 appear similarly upregulated in ahctf1 heterozygotes regardless of tp53 status. The authors do not provide enough evidence that tp63 and tp73 are compensating for tp53 loss. An alternative possibility based on the data presented is that the effects of ahctf1+/- are independent of tp53 family members, and the effects on apoptosis go through a different pathway.

      We agree with this reviewer that we did not provide enough evidence that tp63 and tp73 are compensating for tp53 loss. Accordingly, we have addressed this issue comprehensively.

      2) The authors state in multiple locations that nucleoporin inhibition decreases tumor burden. In my opinion, this is not strictly correct. The TO(kras) model clearly results in HCC in adults, but it's a little unclear whether the larval liver overgrowth is truly HCC or not based on the original paper by Nguyen et al. (2012 Dis Model Mech).

      We agree with these comments and accordingly, we performed several new experiments in adult fish.

      Reviewer #3 (Public Review):

      The nuclear transport machinery is aberrantly regulated in many cancers in a context-dependent fashion, and mounting evidence with cultured cell and animal models indicates that reducing the activity or expression of certain nuclear transport proteins can selectively kill cancer cells while sparing nontransformed cells. Here the authors further explore this concept using a zebrafish model for hepatocellular carcinoma (HCC) induced by liver-specific transgenic expression of oncogenic krasG12V. The transgene causes greatly increased liver size by day 7 in larvae, associated with a gene expression profile that resembles early-stage human HCC. This study focuses on Ahctf1, a nuclear pore complex (NPC) protein known to be essential for postmitotic NPC assembly. Using the krasG12V background, the authors analyze animals that are heterozygous for a recessive mutation in the ahctf1 gene that leads to ~50% reduction in ahctf1 mRNA (and likely the encoded protein). The authors show that the ~4-fold increase in liver volume of krasG12V animals is reduced by ~1/3 in the ahctf1 heterozygous mutants. This is associated with increased apoptosis, decreased DNA replication, up-regulation of pro-apoptotic and cdk-inhibitor genes, and down-regulation of an anti-apoptotic gene. These effects were found to be substantially Tp53-dependent. Consistent with previous Ahctf1 depletion studies, hepatocytes of ahctf1 heterozygotes show decreased NPC density at the nuclear surface, elevated levels of aberrant mitoses and increased DNA damage/double-stranded breaks. Finally, the authors show that combining the ahctf1 heterozygous mutant with a heterozygous mutation in another NPC protein, RanBP2, or treating animals with a chemical inhibitor of exportin-1 (Selinexor) can further reduce liver volume. Overall they suggest that combinatorial targeting of the nuclear transport machinery can provide a therapeutic approach for targeting HCC.

      This is an interesting study that bolsters the notion that reduction in the levels of discrete nucleoporins (and/or inhibiting specific nuclear transport pathways) can result in cancer cell-selective killing. Moreover, the work extends previous studies involving cultured cell and mouse xenografts to a new cancer model (HCC) and nucleoporin (Ahctf1). Whereas the authors describe multiple aberrant cellular phenotypes associated with the dosage reduction in ahctf1, the exact causes for reduction in liver size in the krasG12V model remain unclear. Although it would be desirable to parse effects of Ahctf1 related to NPC number, aberrant mitoses, licensing of DNA replication and chromatin regulation, this is a tall order at present, given the limited understanding of Ahctf1. However, useful insight on these and related questions could be gained with further analysis of the system as outlined below.

      We are pleased this reviewer thinks this is an interesting study that bolsters the notion that reduction in the levels of discrete nucleoporins (and/or inhibiting specific nuclear transport pathways) can result in cancer cell-selective killing. This reviewer also suggests that useful insight on these and related questions could be gained with further analysis of the system as outlined below:

      1) In the krasG12V model, it would be helpful to distinguish the contribution of increased cell death vs decreased cell proliferation to the change in liver size seen with heterozygous ahctf1. Is this predominantly due to decreased proliferation?

      We think this question is difficult to address, because the relative contributions of the two processes may vary with time. Our data show definitively that by 7 dpf, the impact of ahctf1 heterozygous mutation has disrupted multiple cellular processes, leading to a 40% increase in the number of hepatocytes expressing Annexin 5 (dying cells), and a 40% decrease in the number of hepatocytes incorporating EdU over a 2 h incubation (fewer cells in S-phase). Both responses are likely to contribute to the reduction in liver volume observed in response to ahctf1 heterozygosity. It is worth stating that in our experiments, we captured snapshots of apoptosis and DNA replication in the livers of larvae at 7 days post-fertilisation after 5d of dox treatment/KrasG12V expression. To answer the Reviewer’s question properly, we would need to monitor the behaviour of individual cells over time. If such experiments were technically possible, we think that some cells that undergo growth arrest in response to dox treatment might ultimately succumb to apoptosis (unless dox treatment is withdrawn) while other cells might enter into a state of prolonged senescence. However, given the technical challenges, we did not attempt to test this in the current manuscript.

      2) It would be good to know whether the heterozygous ahctf1 state blunts the overall level of Ras activity in krasG12V animals.

      We have addressed this interesting question thoroughly in new Fig. 1g, h. To do this, we used a commercial RAS-RBD pulldown kit followed by western blot analysis to determine the levels of activated GTP-bound Kras protein. Our results demonstrate that the levels of GTP-bound Kras protein, expressed as a proportion of total Kras protein, do not change in response to ahctf1 heterozygosity. We conclude from these data that the potentially therapeutic value of reduced ahctf1 expression in a cancer setting is not caused by inhibiting Kras activity.

      3) Notwithstanding the analysis of Tp53 target genes presented in this study, it would be helpful to see detailed transcriptional profiling of hepatocytes in the krasG12V model with the heterozygous ahctf1 mutation, and to assess the effects of Selinexor. GSEA type analysis offers a way to start untangling the effects of these pathways. Moreover this analysis could provide insight on the relevance of this model to human HCC.

      We used RNAseq to address the relevance of our larval model to human HCC. Specifically, we performed differential gene expression analysis to identify up- and downregulated genes in cohorts of ahctf1+/+ (WT) larvae versus dox-treated ahctf1+/+(WT);krasG12V larvae. We used gene set enrichment analysis to compare these differentially regulated transcripts with the gene expression signature of 369 patient samples in the Liver hepatocellular carcinoma (LIHC) dataset versus healthy liver samples in the TCGA. These analyses revealed a significant association between the patterns of gene expression in our larval model of zebrafish HCC and those of human HCC (Fig. 1-figure supplement 1c, d).

      The genetic experiments we report in Figures 4, 5, 6 show that WT Tp53 is required for the effects on liver enlargement (Fig. 4), apoptosis (Fig. 5) and DNA replication (Fig. 6) that occur in response to ahctf1 heterozygosity in dox-treated krasG12V larvae. We also used RT-qPCR to show that a Tp53-mediated transcriptional program was activated in these ahctf1 heterozygous livers (Fig. 5). Similarly, in adult livers, ahctf1 heterozygosity triggered the upregulation of Tp53 target genes, including pro-apoptotic genes (pmaip1, bbc3, bim and bax) and cell cycle arrest genes (cdkn1a and ccng1) (new Fig. 6-figure supplement 1). These results show that obtaining the full potential of ahctf1 heterozygosity in reducing growth and survival of KrasG12V-expressing hyperplastic hepatocytes requires activation of WT Tp53. This is an important conclusion from our paper that is likely to be relevant in a clinical setting, for instance in patient selection, if ELYS inhibitors are developed for the treatment of HCC in which the KRAS/MAPK pathway is activated.

      Also, one reviewer mentions performing genome-wide transcriptional profiling of hepatocytes in the krasG12V model in response to ahctf1 heterozygosity and the presence and absence of Selinexor treatment. While these are potentially interesting experiments, they are substantial in nature and not crucial for the main messages of our paper. Therefore, we respectfully contend that they are beyond the scope of the current manuscript.

      4) Functions of Ahctf1 in regard to chromatin regulation could be compromised in this model. Scholz et al (Nat Gen 2019) report that Ahctf1 is involved in increasing Myc expression via a gene gating mechanism. It would be good to know what the effects are in this system.

      The Scholz, 2019 and Gondor, 2022 papers from the same group are very interesting in that they demonstrate a novel role for the ELYS protein in addition to the ones we pursued in our paper. The authors showed that in HCT-116 cells, a human colorectal cancer cell line in which proliferation is driven by aberrant WNT/CTNNB1 signalling, the longevity of nascent MYC mRNA was increased by accelerating its movement from the nucleus to the cytoplasm, thereby preventing its degradation by nuclear surveillance mechanisms. The authors showed that siRNA knockdown of AHCTF1 in HCT-116 cells reduced the rate of nuclear export of MYC transcripts without changing the transcriptional rate of the MYC gene. They proposed a mechanism that depended on the formation of a complex chromatin architecture comprising transcriptionally active MYC and CCAT1 alleles plus proteins including β-Catenin, CTCF and ELYS. Together these interacting components guided nascent MYC mRNA molecules to nuclear pores and enhanced their export to the cytoplasm for translation, resulting in activation of a MYC transcriptional program that induced expression of pro-proliferation genes.

      In theory, this role of ELYS in protecting MYC from nuclear degradation might extrapolate to other cancer settings where MYC expression is elevated. While interplay between MYC and mutant KRAS to enhance cancer growth has been previously reported, most work on this interaction to date has focused on the role of mutant KRAS in increasing the stability of the MYC protein, for example via RAS effector protein kinases (ERK1/2 and ERK5) that stabilise MYC by phosphorylation at S62 (Farrell and Sears, 2014: https://doi.org/10.1101/cshperspect.a014365; Vaseva and Blake, 2018: https://doi.org/10.1016/j.ccell.2018.10.001). While we appreciate the novelty of the recent papers, the current findings are limited to β-Catenin-activated HCT-116 cells and may not be relevant to our zebrafish model of mutant Kras-driven HCC. Accordingly, we have not allocated a high priority to following this up in our current manuscript.

      6) The synthetic lethality argument pressed in this manuscript seems exaggerated. Standard anti-cancer treatments typically target several cellular pathways, and nucleoporins directly affect a multiplicity of pathways besides nuclear transport.

      While we do not disagree that standard anti-cancer treatments may target several cellular pathways, we believe our data are consistent with the accepted definition of a synthetic lethal interaction whereby single mutations in two separate genes (kras and ahctf1) cooperate to cause cell death, whereas cells harbouring just one of these mutations are spared.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      The manuscript reports effects on brood size, lifespan and healthspan upon manipulation of C. elegans genes encoding RagA, TOR and Pol III orthologs, as well as other well-characterized lifespan-affecting genes. The results point to complex relationships among TOR and Pol III that are not fully resolved, suggest a role for rpc-1 Pol III that is additive with well-characterized lifespan pathways, indicate a late-life requirement for rpc-1 Pol III to limit lifespan, and, contrary to a previous publication, suggest a muscle requirement for rpc-1 Pol III for lifespan limitation.

      Major comments regarding key conclusions:

      The work demonstrates that brood size is reduced upon rpc-1 Pol III RNAi feeding from the L4 stage. However, no further analysis is provided to show how later aspects of reproduction impair brood. Minimally, ruling out effects on spermatogenesis would be important since sperm number limits self-fertile brood size. It is also unclear from the methods whether the brood size results include embryonic lethality (post-reproduction). Internal hatching, if it occurred, could also affect interpretation of the results. A change in the reproductive period should be noted if it occurred.

      The reviewer is correct that it is important to address the role of Pol III more thoroughly in relation to reproduction.

      • The brood size experiments we present simply record the number of hatched progeny. To develop this analysis further we will present the age-specific fecundity data that we generated whilst doing these assays to demonstrate the impact of Pol III on the reproductive period. In addition, we will quantify and present data on the total brood size (dead eggs and hatched progeny) to address whether Pol III also impacts embryonic development.
      • At 25 °C (the temperature at which we did these experiments) very few animals suffered internal hatching and those that did were taken out of the analysis; this is therefore unlikely to skew the results.
      • The question as to whether Pol III limits egg or sperm function (or later developmental roles) is also interesting and is not yet addressed. To examine this we will quantify brood size (dead eggs and hatched progeny) in C. elegans +/- Pol III RNAi that have been exposed to males during the reproductive period, compared to those that reproduce solely as hermaphrodites.

      The authors claim that, similar to the relationship previously concluded from aging studies, rpc-1 acts downstream of TORC1. However, this claim is not well supported. In an effort to circumvent early lethality caused by loss of let-363 ("CeTOR"), they use a mutation in raga-1 RagA and demonstrate a further reduction in brood with rpc-1 RNAi. If raga-1(ok386) were a null this result would demonstrate a relationship that is at least partially parallel, not linear. By contrast, double RNAi with let-363 was "non-additive", suggesting a more linear relationship. However, interpretation of these experiments requires (1) that the raga-1 mutation is null and affects only TORC1 signaling, (2) evidence that the double RNAi worked well (e.g., qPCR; see Ahringer et al. 2006 review regarding issues with multi-RNAi), and (3) consideration of alternative effects of loss of let-363 (e.g., TORC2). Negative results with RNAi are particularly problematic in the absence of convincing evidence that the RNAi worked well. Moreover, results in Figure 1G are difficult to interpret since the initial values are low. Here and elsewhere the genetics descriptions are unconventional, hampering interpretation. For example, what is meant by a mutation being "incomplete"? That it acts as a hypomorph?

      We understand the concerns of the reviewer:

      For reference, this has been used in several other studies, e.g. doi.org/10.7554/eLife.49158

      • We agree that double RNAi can be challenging. Appropriate controls were used here, e.g. each RNAi was diluted 50:50 with control RNAi in the single treatments, and phenotypes were observed in each case (either brood size or lifespan). However, to address the precise knockdown of rpc-1 and let-363 obtained with RNAi we will perform qPCR in response to single and double RNAi treatment (both in WT and raga-1 mutant C. elegans).
      • In addition, we will attempt to measure S6 Kinase phosphorylation, a downstream readout of TORC1 signalling, in response to raga-1 mutation or let-363 RNAi treatment with and without rpc-1 RNAi. A phospho-S6 Kinase antibody is commercially available and has been used successfully in C. elegans - doi.org/10.7554/eLife.31268
      • Our apologies that the nomenclature was confusing. The CeTOR RNAi nomenclature was ’borrowed’ from other papers describing this tool, e.g. doi.org/10.7554/eLife.31268 and doi: 10.1371/journal.pgen.1000972. Here, to make our work clearer, we will change CeTOR to let-363 TOR RNAi and raga-1 to raga-1 RagA in the manuscript, as suggested by the reviewer (see below). The description of ‘incomplete’ mutations will also be amended, and informed by our proposed qPCR analysis.

      Another claim is that rpc-1 Pol III limits adult lifespan downstream of TOR. These results are not convincing. The two treatments (raga-1 mutation as "embryonic" and L4 stage "CeTOR" let-363 RNAi as late) are not directly comparable for reasons noted above, and the double RNAi problem hampers interpretation.

      Our lifespan data indicate that the longevity increase upon Pol III knockdown is additive with TOR/let-363, suggesting a mechanism independent of TOR. Owing to the lack of ideal reagents, we had to use the double RNAi knockdown approach for TOR/let-363 and Pol III/rpc-1. To make the data easier to interpret, and to rule out confounding background RNAi effects as far as possible, we have included appropriate RNAi controls. Wherever double RNAi has been used, the effect on the phenotype of a 50% dilution of the target RNAi with empty-vector control has also been shown independently and used for the statistical comparison with the combinatorial RNAi. Our results show that diluting let-363 RNAi and rpc-1 RNAi each to 50% is enough to extend lifespan when initiated from the L4 stage.

      The nomenclature might be easier to follow if the authors state the actual C. elegans genes manipulated (e.g., let-363 TOR versus raga-1 RagA) rather than using "CeTOR" as a catch-all since these genes are not identical in action.

      Thank you for this suggestion. We will implement this in the manuscript where appropriate.

      Based on genetic interactions (rsks-1, ife-2, ppp-1, daf-2 and germline loss) they show that rpc-1 RNAi further extends the long lifespan conferred by each of the mutant alleles tested, as well as germline loss induced by two different mutant conditions. These results, though negative, are important. The statement that rpc-1 does not affect global protein synthesis is somewhat overstated without additional experimental support.

      We thank the reviewer for supporting our inclusion of ‘negative data’. We agree that our statement relating to protein synthesis is overstated given the data presented. We will soften this to: “rpc-1 does not seem to affect the lifespan incurred by reducing global protein synthesis, although this does not rule out the possibility that Pol III affects protein synthesis by other mechanisms”.

      Extending and challenging their own previous work showing an intestinal focus of activity for rpc-1 in limiting longevity (Filer et al., 2017), and noting that RPC-1::GFP detection can be knocked down by RNAi in several tissues, they use a tissue restricted rde-1 expression approach (or sid-1 for neurons) to test the contribution of intestine, hypodermis, neurons, muscle and germline. This new analysis points to a role for the muscle. This result is intriguing and warrants further experiments. To shore up tissue-specific claims the authors could consider (1) additional drivers for intestine and muscle rde-1 in the RNAi experiments, or, ideally, a different approach such as tissue-specific protein degradation (again with multiple drivers), (2) a sufficiency experiment for muscle (wild-type muscle expression in the mutant to demonstrate reversal of the phenotype, or rescue of RNAi defects with an RNAi-insensitive reagent expressed in muscle).

      Thank you for appreciating the work we have done here and suggesting further experiments. To take your points one at a time: (1) We have already used the most robust tissue-specific alleles generated and reported in the C. elegans literature so far. It would be a significant amount of work to generate new rde-1 driven tissue-specific alleles to double check the Pol III levels/rpc-1 knockdown response in certain tissues, and we feel this is beyond the scope of this project. Suggestion (2) is interesting and would require us to generate a muscle-specific rpc-1 strain. However, there are issues with this approach. Firstly, it would require that we have an rpc-1 mutant to rescue, which we don’t as it is embryonically lethal, and secondly it would not be possible to do this experiment using RNAi as the RNAi would then knock down the muscle construct.

      The possible explanation for the differences in rde-1 results from the previous work should not be buried in the legends of Figure 3 and Figure S3. Perhaps this leaky background hypothesis should be directly tested (e.g., using the RPC-1::GFP to examine whether residual expression exists in ne219 but not in ne300)? In any case, legend to Figure S3 needs editing: The ne219 background is not itself "intestine-specific", as implied, and the last sentence of Figure S3 legend should be "Thus, the rde-1(ne219)...", right?

      The differences between the tissue-specific strains are interesting. On reflection we agree with the reviewer that this should be included in the main text. We will describe the differences between the two rde-1 alleles ne219 and ne300 in the appropriate section in the manuscript and state our results.

      Finally, they show that late-adult rpc-1 RNAi extends lifespan over control RNAi and that, by several movement assays, healthspan is improved upon L4 rpc-1 RNAi, even when RNAi is active in muscle (based on WM118).

      The most significant new results are that rpc-1(RNAi) affects brood size, can extend lifespan (though modestly) after day 5 of adulthood, and that muscle may be involved rather than intestine.

      Additional comments:

      Text throughout should clarify TOR vs presumed TORC1. Methods are insufficient. Important aspects of the lifespan methods and raw data are missing - e.g. exact numbers of worms censored. Exact information regarding statistical analysis is lacking (e.g., which tests, corrections for multiple testing). References should be given for all strains. For the rde-1 strains, it would be helpful to include, in addition to the transgene alleles, the actual promoters used to claim tissue specificity. Note, worms do not have "skeletal" muscle, as implied in the discussion. Figure 5 was not helpful for this reviewer. Figure legend to S3A is confusing: the intestinal signal appears stronger or at least equal, not weaker, in the rpc-1 RNAi background. Were these images collected using the exact same exposure settings?

      To address this we will:

      • Standardise genetic notation throughout the manuscript (see specific comments above)
      • Provide more detail on the transgenic alleles used e.g. promoters driving rde-1.
      • The majority of strains were obtained from the CGC but wherever appropriate we will also supply a reference.
      • Expand and revise the Materials and Methods section to appropriately describe all the statistical analyses performed.
      • Revise lifespan methods to include censoring detail and lifespan Tables to include information on censored animals.
      • Remove the reference to ‘skeletal muscle’ and replace with ‘body wall muscle’.
      • Once we have generated new data on the specific knockdowns and downstream targets achieved with let-363 TOR RNAi and raga-1 RagA mutation, as well as on the brood size/dead eggs effects, we will incorporate this information into Fig. 5A for better clarity and readability.
      • We can see on reflection that Figure S3A is confusing, mainly due to the gut autofluorescence in both the control and rpc-1 RNAi conditions. We will amend this figure to make this clear and include a selection of close up images of each tissue to make it easier to see the tissue specific knockdown by RNAi.

      Reviewer #1 (Significance):

      See above. Study will be of interest to aging community.

      Reviewer #2 (Evidence, reproducibility and clarity):

      The study by Malik and Silva et al describes results of the study investigating the role of RNA Polymerase III in regulating fecundity and lifespan in C. elegans. The authors show that knockdown of Pol III, similar to mTOR suppression, is detrimental for reproduction. Likewise, suppression of either Pol III or mTOR in adult animals extends lifespan via apparently the same pathway. In contrast, Pol III knockdown has an additive effect on lifespan in combination with other established genetic lifespan-extending approaches suggesting that they are working via different mechanisms. Furthermore, using the tissue-specific knockdown of Pol III the authors found that suppression of Pol III expression in the muscle, but not other major worm tissues, is sufficient for its lifespan extending effect. Finally, the lifespan extension is also observed when Pol III knockdown is initiated late in adulthood. The overall conclusion is that suppression of Pol III expression late in animal life, particularly in the muscle, is a potential strategy to extend life- and health-span. Overall, the study is well-designed, the tools and results are robust and analysed appropriately. The data presentation is excellent, and the manuscript is clearly written. Addressing the points below will help to improve the clarity further.

      We thank the reviewer for their very positive response to our study and are pleased that they found the data convincing. We are extremely pleased that the reviewer agrees with the design and tools used in this study. We can address all of the reviewer’s comments, as discussed below.

      Major:

      Significant amount of GFP signal is still present in RNAi treated animals, what is the tissue that maintains particularly high levels of expression (Fig. 3A) and how does it affect the conclusions? What is the level of Pol III reduction in different tissues? Could different efficiency of knockdown explain the tissue-specific effect of Pol III downregulation on lifespan? It would be important to show (and, if possible, to quantify) the knockdown efficiency in different tissues using the available reporter

      • This experiment had originally been done to test the efficiency of the RNAi, particularly in tissues where rpc-1 RNAi did not impact lifespan. The reviewer is right though, and this information could be analysed further to enhance our study. Figure 3A shows C. elegans expressing the rpc-1::3xflag::gfp reporter. This was used to a) determine the expression pattern of RPC-1 and b) determine the effect of rpc-1 RNAi on this. We noted that RPC-1::GFP is expressed in a wide range of tissues and that, when the reporter strain is treated with rpc-1 RNAi, it is decreased in all tissues. The ‘green’ observed in the RNAi treatment is unfortunately attributable to autofluorescence generated by lysozymes in the C. elegans intestine and masks some of the effects we saw by eye.
      • To establish the tissue-specific efficiency of Pol III knockdown and also address the confounding issue of the autofluorescence we will now use a combination of quantitative and qualitative fluorescence microscopy to measure the percentage RPC-1::GFP knockdown in each tissue relevant to this study.

      Minor:

      Fig. S3B is not cited in the text and the legend for the figure is somewhat confusing, potentially containing errors, this needs to be clarified.

      We thank the reviewer for pointing this out. The legend for this figure will be re-written as a result of the analysis described above and we will cite it in the main text.

      Reviewer #2 (Significance):

      This is the first thorough study of Pol III knockdown as a lifespan extending strategy in C. elegans. In addition to the different laboratory model (previous study of Pol III in ageing primarily focused on Drosophila), this manuscript also offers several novel insights into consequences of Pol III perturbation at phenotypic, as well as mechanistic level in terms of interaction with other longevity pathways. The study will be of interest to those interested in processes underlying longevity and ageing. Considering that this topic is currently in fashion the publication will probably attract the attention not only of specialists but also of the general public.

      We are extremely pleased that the reviewer shares our enthusiasm for this study and that they find the experimental evidence compelling.

      Reviewer #3 (Evidence, reproducibility and clarity):

      Summary: The paper by Yasir Malik et al investigates the genetic interrelationship between TOR signalling and Pol III expression regarding fecundity and longevity in C. elegans. Based on a previous study that defined a role of Pol III downstream of TOR in longevity across various species, this study looks particularly at the relative timing and tissue requirements for TOR and Pol III inhibition in longevity. Data indicate that Pol III acts downstream of TOR in regulating fecundity while there are additive effects regarding survival. The Pol III effect on longevity is based on its role in the muscle. Finally, health-span parameters mirror the survival data.

      Major comments: This is a nice study that relies on genetic interaction to ask how TOR and Pol III interact. I find the observation that Pol III inhibition extends survival when initiated at day 5 of adulthood very exciting. In general, the study would benefit from additional data that back up the genetic observations.

      We thank the reviewer for appreciating the study and the novel insights it provides about the TOR-Pol III inter-relationship. We can address the reviewer’s comments with a few limited experiments, as discussed below.

      In Fig. 1, experiments are done to inhibit TOR to varying degrees in order to perform epistasis experiment. Of course these are difficult to interpret without the use of full KOs/loss of function. So while this is a good solution, it would be important to quantify the level to which TOR signalling is inhibited, optimally with biochemical experiments.

      We fully appreciate the reviewer’s point. A similar concern was raised by reviewer 1. We propose to address this in two ways: 1) by using qPCR to quantify let-363 mRNA levels in response to let-363 TOR RNAi; and 2) by determining the extent of TORC1 activity using a biochemical readout of the pathway’s activity, S6 Kinase phosphorylation, measured by Western blotting as described here: doi.org/10.7554/eLife.31268

      2. General brood size is very low in the WT worms. Normally, one would expect 250-300 offspring per adult worm. It would be helpful if the authors could address this.

      Indeed, as pointed out by the reviewer, WT worms have a brood size of 250-300 eggs when kept at 20 °C. However, brood size in C. elegans depends on temperature and declines as temperature increases. The experiments shown here were carried out at 25 °C, where C. elegans produce fewer offspring. Our observation is in agreement with other studies of a similar nature, e.g. doi:10.1371/journal.pone.0112377 and doi.org/10.1371/journal.pone.0145925

      1. Why were lifespan assays performed at 25C? The standard temperature for the worm is 20C and here I think this is very relevant as the TOR pathway is responsive to suboptimal conditions. I wonder if the results are also true for lower temperatures.

      The reviewer raises an interesting point. This study follows from the previous study of Filer et al., Nature 2017, which demonstrated the role of Pol III in ageing. During that study we found and reported that there was a high proportion of intestinal bursting when lifespans were carried out at 20 °C, which was ameliorated by carrying out the experiments at 25 °C. This was quantified in the original manuscript. To maintain consistency, we continued carrying out Pol III lifespans at this slightly higher temperature. Due to this limitation it is not possible to test the impact of TOR signalling on Pol III at lower temperatures.

      Minor comments: 1. It would help to better delineate the rationale for the experiments in Fig. S1. Experiments here are aimed to find mediators of TOR effects distinct from Pol III. Such distinct mediators would be additive to Pol III (as is the case in the figure) and downstream of TOR.

      Interpreting epistasis analysis is challenging. We were looking for interactors of Pol III using this targeted genetic approach and working on the premise that if two genes interacted then their effects would be non-additive. However, the reviewer is correct that if two genes are doing the same thing independently then their effects may be additive. Although our data do not suggest that these mediators interact with Pol III in the same pathway, they do not rule out the other possibility. When we re-work the manuscript we will explain our rationale more clearly and outline the two scenarios.

      Reviewer #3 (Significance):

      Strengths: The study advances our knowledge regarding the timing of the Pol III targeting intervention for survival effects.

      Limitations: The study relies only on genetic data and not all of it is conclusive.

      This study will be interesting for the geroscience community with an eye on TOR inhibition and is relevant to worm biology. I work with C. elegans as a genetic model and I am interested in protein homeostasis, metabolism, health, and longevity.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2022-01480

      Corresponding author(s): Ananda, Sarkar

      1. General Statements

      We are thankful to the Review Commons platform, through which our manuscript was critically reviewed and received very constructive and valuable feedback. This gave us the opportunity to do the experiments accordingly and significantly improve the manuscript. We are hopeful that this platform will help our manuscript get published in a journal of repute.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript entitled "LDL1 and LDL2 histone demethylases interact with FVE to regulate flowering in Arabidopsis" characterized that LDL1 regulates flowering by binding on the chromatin of MAF4 and MAF5 to repress their expression. Further the authors proposed LDL1/LDL2-FVE model. Here are some comments for this manuscript.

      Major problems: 1. This experiment is still not testing or showing/concluding that the whole complex forms on the MAF4 and MAF5

      Response: We understand the reviewer’s concern regarding the complex. Previously, FVE was shown to be part of a co-repressor complex including HDA6, HDA5 and FLD that regulates the expression of FLC and its clade members during floral transition (refs. 1-3). We showed that LDL1 binds directly to the chromatin of MAF4 and MAF5 to suppress their expression (Figure 1 and 2). Furthermore, we discovered that LDL1 and LDL2 interact with FVE to influence floral transition (Figure 8 and 9). Hung et al., 2018 reported the interaction of LDL1 and LDL2 with HDA6 to regulate circadian rhythm (ref. 4), and we found that the expression of MAF4 and MAF5 was more strongly upregulated in ldl1ldl2hda6 than in ldl1ldl2 (Figure 5C and 5D). Therefore, our experimental data, together with previously reported data, make it evident that LDL1 and LDL2 are part of a co-repressor complex through their interaction with FVE and HDA6, which we concluded here. We agree with the reviewer that an additional experiment, such as a complex pull-down, would be helpful, but in our opinion it would only provide additional confirmatory evidence.

      2. It is not shown that LDL1/LDL2 repress MAF4 and MAF5 by removing H3K4me2 activity. It would be useful to test whether the methylation level of MAF4 and MAF5 has been altered in ldl1/ldl2 mutant.

      Response: We found altered methylation levels in MAF4 and MAF5 chromatin during floral transition in ldl1 and ldl1ldl2 mutants (Figure 6 and 7). We observed that the absence of LDL1, or of both LDL1 and LDL2, disturbs the shift in H3K4 methylation status on MAF4 and MAF5 during floral transition and results in a more active (enriched in H3K4me3 marks) chromatin state at 19 days. This result, taken together with the increased MAF4 and MAF5 expression in ldl1 and ldl1ldl2 double mutants (Figure 5C and 5D), indicates that LDL1/LDL2 repress MAF4 and MAF5 by altering H3K4 methylation.

      3. I suggest that further research is required to provide conclusive evidence concerning the physiological function of LDL1/LDL2-FVE, such as the expression pattern of LDL1/LDL2 and the methylation level of MAF4 and MAF5 before or after floral transition.

      Response: Taking this suggestion into account, we quantified the rosette leaves and flowering time of fvec, ldl1fvec and ldl2fvec along with WT, ldl1 and ldl2 (Figure 9). We also observed decreased expression of the floral activator genes FT and SOC1 (targets of MAF4 and MAF5) in fvec, ldl1fvec and ldl2fvec in comparison to the WT (Supplementary Figure 10C), which corresponds to their late flowering phenotype.

      To understand the role of LDL1 and LDL2 during floral transition, we first analyzed the expression of LDL1 and LDL2 during floral transition (Supplementary Figure 8). We observed that the expression of LDL1 and LDL2 peaks at 16 days and remains stable until 19 days. We then checked the enrichment of H3K4me1, H3K4me2 and H3K4me3 on MAF4 and MAF5 chromatin in ldl1 and ldl1ldl2 plants with respect to the WT at 16 days (before floral transition) and 19 days (after floral transition). We found an increase in the conversion of H3K4me1 to H3K4me3 when LDL1 and LDL2 were not present (Figure 6 and 7).

      Reviewer #1 (Significance (Required)):

      The manuscript provides some evidence of how LDL1 is involved in flowering through epigenetic regulation.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Mahima and colleagues investigated LDL1/LDL2-MAF4/MAF5 in Arabidopsis flowering time control. The manuscript contains some interesting observations. From my point of view, however, the data need to be consolidated to support conclusions drawn in the manuscript.

      • Title: it does not correctly reflect the manuscript content. Data in relation with FVE were limited to Fig 6, where the data themselves appear preliminary.

      Response: We agree with the reviewer that our title didn’t reflect the manuscript content precisely and are happy to take this criticism into consideration. We have revised the title to “LDL1 and LDL2 affect the dynamics of H3K4 methylation on the chromatin of MAF4 and MAF5 to allow floral transition in Arabidopsis”. Additionally, we have provided the quantification data for fvec, ldl1fvec and ldl2fvec with respect to WT, ldl1 and ldl2 plants (Figure 9).

      • Abstract: most conclusions are over-stated. The current data shown in the manuscript cannot support such strong conclusions.

      Response: We have rigorously revised the abstract and toned down the overstated conclusions.

      • Introduction: It is necessary to make clear that the role of the LDL1 and LDL2 genes in flowering time control had been well established in previous studies, including their repression of transcription of FLC, MAF4 and MAF5 (Berr et al., 2015, Plant J 81:316).

      Response: We have revised the introduction to include the previously known roles of LDL1 and LDL2 in regulating flowering time.

      • Results:

      Regarding LDL1-overexpression lines, 'Relative expression' in Supplementary Fig 2B referred to normalization to WT? The phenotype of plants needs to be shown.

      Response: Yes, the level of upregulation of LDL1 expression in different T1 plants (after selection from Hygromycin) was calculated with respect to the WT.

      Regarding flowering time, have the observation and measures been performed in the same experiments for the ldl1, ldl1 flc, ldl1 maf4 and ldl1 maf5 mutants (Fig 3 and Supplementary Fig 1)? The late-flowering phenotype of ldl1 shown in Fig 3D-F is much severe than the same mutant shown in the other Figs, any explanation? What's the interpretation that ldl1 is epistatic to flc, maf4 and maf5?

      Response: We agree with the reviewer’s observation. The following quantifications were taken at different points during the study:

      flc, ldl1, and ldl1flc (Supplementary Figure 1)

      WT, ldl1, and ldl1maf4 (Figure 3A, 3B and 3C)

      WT, ldl1, and ldl1maf5 (Figure 3D, 3E and 3F)

      The rosette leaf numbers and flowering time of the plants in Figure 3D-3F are more severe than in the other figures because seeds were sprinkled directly onto the soil in this phenotyping, whereas in the previous phenotypings plants were grown on 1/2 MS plates before being transferred to soil. However, all the components of a single experiment were grown under the same conditions. We appreciate your observation; the present data do suggest that ldl1 is epistatic to flc, maf4 and maf5.

      The in vitro test of LDL1 for its enzyme activity (Fig 4) appears preliminary and fragmented. The quantification data in Fig 4C-D need repeats. Have other histone methylation types (e.g. H3K4me3, H3K27me3, H3K36me3) been tested? The only two types (H3K4me2 and H3K9me2) shown are both down-regulated by LDL1-GST. Can H3K9 demethylation also play a role in flowering time control? In any case, the current in vitro data only are not sufficient to draw the strong conclusions as those appeared in the manuscripts.

      Response: Before concluding that LDL1 has H3K4me2 and H3K9me2 demethylase activity, we confirmed it several times. Please refer to the PDF file for “response to reviewers” for supporting data.

      We analyzed the western band intensity by calculating the area under the curve with ImageJ software. Because this value varies between experiments depending on the band intensities, we plotted the ratio of LDL1-GST/GST from three independent experiments in Figure 4B rather than the absolute band intensities. We did perform a preliminary experiment to see if LDL1 has demethylation activity against other methylation marks, such as H3K4me1, H3K4me3, H3K9me1, and H3K9me3 (1=GST, 2=LDL1-GST), but there was no significant change in these marks in the presence of LDL1. Please refer to the PDF file for “response to reviewers” for supporting data.
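      The ratio-based normalization described above can be sketched as follows. This is a minimal illustration and not the authors' analysis script: the function name and the intensity values are hypothetical placeholders, and the inputs are assumed to be area-under-the-curve band intensities exported from ImageJ, with each LDL1-GST band paired to the GST control band from the same experiment.

```python
from statistics import mean, stdev


def band_ratios(ldl1_gst, gst):
    """Return the LDL1-GST/GST intensity ratio for each experiment.

    Both lists hold one band intensity per independent experiment,
    in the same order, so each treated band is divided by its own control.
    """
    if len(ldl1_gst) != len(gst):
        raise ValueError("each LDL1-GST band needs a paired GST band")
    return [treated / control for treated, control in zip(ldl1_gst, gst)]


# Hypothetical intensities (arbitrary units) from three experiments.
ldl1_gst_bands = [400.0, 900.0, 150.0]
gst_bands = [1000.0, 2000.0, 300.0]

ratios = band_ratios(ldl1_gst_bands, gst_bands)
print("per-experiment ratios:", ratios)
print("mean ratio:", mean(ratios), "SD:", stdev(ratios))
```

      Dividing within each experiment removes blot-to-blot differences in overall exposure, which is why the ratio is comparable across experiments even when the absolute intensities are not.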

      H3K9 methylation is a repressive chromatin mark, and its removal would suggest gene activation. Upregulation of FLC, MAF4, and MAF5 in ldl1 and ldl2 mutants suggests that LDL1 and LDL2 remove H3K4me2 marks during flowering. However, JMJ28, a Jumonji C (JmjC) domain-containing histone demethylase, has been shown to positively regulate flowering by removing repressive H3K9me2 marks from the chromatin of CONSTANS (CO) (ref. 5).

      In the manuscript, it is saying that LDL1 binds on the chromatin of MAF4 and MAF5. However, I cannot find any data shown to support this conclusion.

      Response: We would like to refer to Figure 2A and B where we have provided this information.

      Protein-protein interactions, e.g. LDL1/LDL2-FVE in Fig 6A and LDL1-LDL2 and LDL1-HDA5 in Supplementary Fig 5, are examined in yeast two-hybrid assay. Other independent assays would be required.

      Response: We have confirmed the interaction of LDL1 and LDL2 with FVE using co-immunoprecipitation assay (Figure 8B). Since Co-IP is a confirmatory experiment, we have done it for positive interactions found through Y2H only. Moreover, in the current manuscript our focus has not been on HDA5, so we didn’t proceed with further experiments.

      The study of genetic interaction between fve and ldl1/ldl2 (Fig 6B-D) looks very preliminary. It is unclear how ldl1 fve and ldl2 fve were obtained: by crosses or by CRISPR-Cas9 using ldl1 and ldl2? The phenotypes need more investigations and some molecular data regarding flowering regulatory genes (e.g. MAF4/5) are necessary. In any case, the current title and the related conclusions drawn in the manuscript are over-stated.

      Response: We performed the quantification of the genetic interaction between fve and ldl1/ldl2. The binary vector pHSE401-FVE was transformed into ldl1 and ldl2 to produce ldl1fvec and ldl2fvec, respectively. We previously mentioned this in the Materials and Methods, but we have now also added it to the Results section to avoid confusion.

      Following the suggestions, we have scored the phenotype (Figure 9) and checked the expression of flowering regulatory genes (Supplementary Figure 10C).

      Fig 7 showed data about MAF5-FLC, MAF5-SVP and MAF5-MAF5 interactions in yeast two-hybrid and about transcriptional repressor activity assay in tobacco leaves using the LUC-reporter. Again, the data need to be confirmed and reproducibility of experiments needs to be shown. In addition to proFT:LUC, it is also necessary to have an internal normalization reference construct. In any case, the data are currently far from supporting a strong conclusion such as that drawn in the manuscript, namely that MAF5 interacts with FLC and SVP and represses FT to delay floral transition.

      Response: We have confirmed the interaction of MAF5-FLC, MAF5-SVP and MAF5-MAF5 using co-immunoprecipitation (Figure 10B). We quantified the firefly luciferase activity under proFT using Renilla luciferase under pro35S as an internal control; the LUC/REN ratio represents the activity of the FT promoter (Figure 10C).
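      The LUC/REN normalization in the response above can be illustrated with a short sketch. The function names, the "reporter alone" versus "with effector" labels, and all luminescence counts are hypothetical and chosen only to show the arithmetic; they are not taken from the manuscript.

```python
from statistics import mean


def luc_ren_ratios(firefly, renilla):
    """Return the firefly/Renilla (LUC/REN) ratio for each paired reading."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and Renilla readings must be paired")
    return [luc / ren for luc, ren in zip(firefly, renilla)]


def relative_activity(sample_ratios, control_ratios):
    """Mean sample LUC/REN expressed relative to the mean control LUC/REN."""
    return mean(sample_ratios) / mean(control_ratios)


# Hypothetical luminescence counts from three infiltrated leaves each.
reporter_alone = luc_ren_ratios([8000.0, 10000.0, 12000.0],
                                [1000.0, 1000.0, 1000.0])
with_effector = luc_ren_ratios([4000.0, 5000.0, 6000.0],
                               [1000.0, 1000.0, 1000.0])

activity = relative_activity(with_effector, reporter_alone)
print("relative promoter activity:", activity)
```

      Because the Renilla signal comes from a constitutive promoter on the same construct set, dividing by it corrects for leaf-to-leaf differences in infiltration and expression efficiency.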

      Reviewer #2 (Significance (Required)):

      Topic is interesting, but data are poor to support the conclusions drawn.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      LDL1 and LDL2 histone demethylases interact with FVE to regulate flowering in Arabidopsis

      Summary: This work studies the role in flowering time of LDL1 and LDL2, two Arabidopsis homologs of the histone demethylase LSD1. Although this phenotype was previously described, the authors explore whether LDL1 and LDL2 regulate other genes in addition to the floral repressor FLC. In fact, mRNA expression experiments and genetic analyses suggest that LDL1 modulates flowering by regulating the expression of MAF4 and MAF5, two FLC-like genes that have been less characterized. They also provide some in vitro biochemical evidence of the demethylase activity of LDL1 protein and yeast two-hybrid data showing the interaction with FVE, another chromatin regulator involved in flowering time.

      Major comments 1. Lines 116-117. Please rephrase these lines and remove panels C, D and E from figure 1 (these could be supplementary material). The flowering time phenotype of MAF4 and MAF5 in Col background is very well documented and was described before, see Gu et al Nat. Comm., 2013 (10.1038/ncomms2947) and Kim et al. Plant Cell, 2013 (10.1105/tpc.112.104760)

      Response: As per the suggestion, we have modified the discussion and moved the panels 1C, 1D and 1E to the supplementary.

      Lines 128-130 and Fig Sup3. The proLDL1:LDL1-GUS line cannot be described as fully functional because its flowering time and LDL1 mRNA expression levels have not been compared to the wild-type plant. The line flowers earlier than the ldl1 mutant but it may only partially complement the flowering phenotype.

      Response: We have provided an additional experiment showing that the transgene is functional in proLDL1:LDL1-GUS (ldl1) with respect to the WT plants (Supplementary Figure 5A).

      Line 135 and Figure 2. How was the ChIP data normalized? What are you comparing in your statistical significance tests? Only two regions of each gene were analysed; to assess the binding of LDL1 to MAF4 and MAF5 loci more regions must be analysed.

      Response: The ChIP data were normalized, and the significance of LDL1 enrichment was calculated, with respect to the fold enrichment in the empty vector control (EV (ldl1)) plants. We examined only the promoter and exon 1 of MAF4 and MAF5 for LDL1 enrichment because the study by Hung et al., 2019 (ref. 6) demonstrated that LDL1 is enriched on the promoter and exon 1 of downstream protein-coding genes. However, to check for methylation marks during flowering, we employed different primer sets at various positions between the promoter and exon 1 on MAF4 and MAF5 chromatin.
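      One common way to express ChIP-qPCR enrichment relative to a control line is sketched below. The response does not state the exact calculation used, so this is an assumption for illustration only: it uses the widely used percent-input method followed by a ratio over the empty vector control, and every Ct value, the 1% input fraction, and the function names are hypothetical.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction):
    """Percent of input recovered in the IP, from qPCR Ct values.

    The input Ct is first adjusted for the fraction of chromatin kept
    as input (for example 0.01 for a 1% input sample).
    """
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def fold_over_control(sample_pct, control_pct):
    """Enrichment of the tagged line relative to the empty vector control."""
    return sample_pct / control_pct


# Hypothetical Ct values for one primer pair.
tagged = percent_input(ct_ip=24.0, ct_input=25.0, input_fraction=0.01)
empty_vector = percent_input(ct_ip=26.0, ct_input=25.0, input_fraction=0.01)

fold = fold_over_control(tagged, empty_vector)
print("tagged % input:", tagged)
print("EV % input:", empty_vector)
print("fold enrichment over EV:", fold)
```

      With equal input Ct values, a two-cycle difference between the IP samples corresponds to a fourfold enrichment over the empty vector control.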

      Figures 6C and 6D. The genetic analysis of ldl mutant with fve-c line is preliminary and incomplete. The epistasis cannot be evaluated as no quantitative flowering time data is provided. A questionable picture of one lonely plant cannot sustain the conclusions of lines 207-208.

      Response: We have modified the picture and quantified the flowering time data to show genetic interaction of ldl1 and ldl2 with fvec mutant plants (Figure 9).

      METHODS. Please clarify the mutant alleles used for LDL1, LDL2, MAF4, MAF5 and FLC; if they have been previously described; if they are full knock-outs; and, consequently, use the appropriate allele name across the manuscript.

      Response: As per the suggestion, we have clarified the different mutant alleles used in the study.

      Minor points: 6. I think the title does not describe the work - the interaction with FVE is very relevant but it is not the central theme of the article.

      Response: We have changed the title of the study to “LDL1 and LDL2 affect the dynamics of H3K4 methylation on the chromatin of MAF4 and MAF5 to allow floral transition in Arabidopsis”.

      It would be very informative to have short-day flowering time data of the genetic combinations of ldl mutants with flc, maf4 and maf5 mutations.

      Response: We absolutely agree that an elaborate SD experiment may open interesting avenues for studying LDL1-mediated regulation of flowering, which would be valuable in future studies. However, ldl1ldl2 shows late flowering, while maf4 and maf5 exhibit the early-flowering phenotype irrespective of day length (refs. 7, 8).

      I found the Discussion section rather too long.

      Response: We have shortened the discussion to make it more focused.

      Reviewer #3 (Significance (Required)):

      Although it is clear that LDL proteins regulate MAF4 and MAF5, I found that the manuscript lacks a general overview of flowering time regulation. In the end, it is not clear how LDL proteins regulate flowering time, because they regulate FLC, FWA, MAF4 and MAF5: What is more important? What is the main role of each protein? Are they redundant or do they have specialized functions? In a nutshell, this study is an interesting piece of work for the flowering time field. However, in my opinion, some of the presented data are redundant with previous works and the manuscript may not be relevant for a general audience.

      1. Yu, C.-W. et al. HISTONE DEACETYLASE6 Interacts with FLOWERING LOCUS D and Regulates Flowering in Arabidopsis. Plant Physiology 156, 173-184 (2011).
      2. Luo, M. et al. Regulation of flowering time by the histone deacetylase HDA5 in Arabidopsis. The Plant Journal 82, 925-936 (2015).
      3. Yu, C.-W., Chang, K.-Y. & Wu, K. Genome-wide analysis of gene regulatory networks of the FVE-HDA6-FLD complex in Arabidopsis. Frontiers in Plant Science 7, 555 (2016).
      4. Hung, F.-Y. et al. The Arabidopsis LDL1/2-HDA6 histone modification complex is functionally associated with CCA1/LHY in regulation of circadian clock genes. Nucleic Acids Research 46, 10669-10681 (2018).
      5. Hung, F.-Y. et al. The Arabidopsis histone demethylase JMJ28 regulates CONSTANS by interacting with FBH transcription factors. The Plant Cell 33, 1196-1211 (2021).
      6. Hung, F.-Y. et al. The expression of long non-coding RNAs is associated with H3Ac and H3K4me2 changes regulated by the HDA6-LDL1/2 histone modification complex in Arabidopsis. NAR Genomics and Bioinformatics 2 (2020).
      7. Berr, A. et al. The trxG family histone methyltransferase SET DOMAIN GROUP 26 promotes flowering via a distinctive genetic pathway. The Plant Journal 81, 316-328 (2015).
      8. Kim, D.-H. & Sung, S. Coordination of the vernalization response through a VIN3 and FLC gene family regulatory network in Arabidopsis. The Plant Cell 25, 454-469 (2013).
      
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We are grateful to the reviewers for their efforts in critically reading our work. Their meaningful input led us to make the revisions detailed below in our “point-by-point” answers to the reviewer’s comments. The insightful comments have helped us significantly improve the manuscript, allowing us to more accurately quantify and convey our data – we are thankful for that.

      Reviewer #1

      1. The Figure 1 legend indicates that the BirA tagged strains are mated with ~6000 AviTag yeast strains but results in Figure 2 pie chart account for 4812 total readouts. Presumably 1000 or more strains could not mate or did not produce viable diploids with the BirA tagged strains? It would be helpful to explain this differential.

      We thank Reviewer #1 for pointing out this gap, which occurred exactly as they have interpreted. We have now corrected the figure legend to say exactly how many strains were in the library (5330) and have clearly stated the attrition of strains.

      If possible, suggest including more of the raw data (in supplementary) that supports the pie chart in Figure 2. Table S1 shows the 111 proteins that display preference for Ssh1 (out of 586 total interactors?) and the fold change (in rank order) for interaction preference. At a minimum, similar data on Sec61 preference and the list of positive interactors should be included. There may also be useful information in the relative biotinylation signal for each BirA and AviTag combination when significantly above background. This is presumably a readout of AviTag protein abundance, dwell time and orientation to BirA activity. The data could be useful to other investigators.

      This is a very good suggestion. We have now added a supplementary table (Supplementary Table S2) with the interaction results for proteins that preferred Sec61 and proteins that did not show any preference.

      The authors might want to be more cautious in interpreting impact of the UPR on ssh1 phenotypes in the results and discussion. The Wilkinson et al 2002 paper referenced used very different conditions to detect UPR in ssh1 deletions strains. Jonikas et al 2009 does not detect a chronic UPR in ssh1 deletion cells and the conditions used in the current study seem more similar to the 2009 report. It seems more likely that deficits in translocating/localizing specific proteins causes the observed phenotypes instead of chronic UPR due to reduced ER levels of PDI.

      *We agree that, as a result of the different conditions, it is difficult to compare our data to the Wilkinson et al 2002 paper. We have therefore adjusted the text to remove this interpretation.*

      Reviewer #2

      1. Why was BirA used to study transient interactions? Biotinylation through BirA is slow (that is why it takes several hours to label proximity proteins) and thus it may not be suitable for capturing transient interactions. Instead, TurboID would be more suitable as the biotinylation reaction is faster than BirA. A reasonable explanation using BirA is required.

      We thank the reviewer for this comment, which indeed also reflects our own “process” of thinking. Originally, we did try to use TurboID to identify potential cargo proteins. We now have a very robust methodology to look at protein substrates by TurboID (see: https://www.biorxiv.org/content/10.1101/2022.04.27.489741v1), so this would obviously have been the easier and faster method. However, using this approach we mainly observed ribosome subunits and cytosolic proteins for Sec61 and very few, mostly cytosolic, proteins for Ssh1. Our interpretation is that, since all biotinylation in TurboID strains occurs in parallel, there is “competition” from the highly abundant and strong interactors, and this leaves no possibility of detecting the low-abundance and even more transient interactions that we would like to measure. It seems that although BirA/AviTag is much slower, its specificity and its single ligation site, which should also be exposed in co-translational translocation events, are more suitable for this specific experimental setup. We have now explained this also in the text.

      One key question is whether biotinylated proteins identified by this method are substrates or proteins just proximal to Sec61 or Ssh1 due to close cellular localization (e.g. ER membrane) or same protein complex members. An experiment or analysis would be required to confirm that the proteins they identified are indeed potential substrates.

      *This is indeed an extremely important point and we have now carefully addressed it in the text. We are certain that the reviewer is right and that many of the biotinylated proteins are members of the same complex or cytosolic components that happen to be in proximity (maybe regulators?), just as the reviewer suggested. We now clearly write this in the results section. This is why we focused on signal peptide-containing proteins. These proteins CANNOT be complex members, nor can they be biotinylated simply due to a proximal location on the ER membrane. This is because they reside inside the lumen of the ER if they are soluble, or are inserted (if they also contain a transmembrane domain) with their tagged N-terminus facing the lumen of the ER (so-called Type I proteins). The only way such proteins could be biotinylated by the slow BirA on the cytosolic surface is if they passed through the pore of the translocon.*

      Along the same line, if proteins identified by this approach are bona fide substrates of Sec61 and Ssh1, proteins having signal peptides should be enriched in the candidate list of substrates. However, it does not look like that according to Figure 2A where the secretome proteins/total proteins ratio appears to be similar among the 4 categories (e.g., Ssh1 preferring, No preference, and Not interacting or excluded). The authors should comment on this.

      *We thank Reviewer #2 for highlighting this point, which was not clear from our text and figures. There is definitely an enrichment of Signal Peptide (SP)-containing proteins amongst the proteins that we think are bona fide substrates; however, this was not visualized clearly. To highlight this point we have modified Figure 2 and added a bar graph showing the distribution of SP and TMD proteins within the potential secretome. This graph now highlights the enrichment of SP-containing proteins in the groups of proteins that preferred Sec61 or Ssh1 in comparison to the group that did not show a preference.*

      *We also now add a citation from a new manuscript from the Hegde lab that suggests that indeed soluble SP containing proteins are the key clients for the translocon pore (https://pubmed.ncbi.nlm.nih.gov/36261528/). We have also added a section to the discussion as to why we do not see differential enrichment of SRP or its receptor on either pore although in the past this was suggested to be the key difference between the two translocons.*

      Figures 1-2: They should comment on the reproducibility of the method. How many independent experiments were performed? If performed, how was reproducibility of results?

      Thank you for highlighting that this was not clarified enough. We have now extended the materials and methods section to make all of the above issues clear and apparent. In short, we performed 3 biological repetitions for each experiment, and for each biological repeat we performed 3 technical repeats, so that our results altogether rely on 9 repeats. We then excluded proteins in two cases:

      1. If strains were missing in either of the collections (so there was no complete set to compare them) – this caused us to drop 661 strains.
      2. In cases where the proteins were expressed at very low levels or extracted poorly in our assay (we defined this as the signal being ten standard deviations or more lower than the rest of the signals on the same membrane), which caused us to lose an additional 93 strains.

      Importantly, the SD between all 9 repeats never rose above 3 (see the graph below, which shows all strains arranged by their order in the library and the SD between all 9 repeats; this is also now stated clearly in the text). Hence, we think that our method is highly reproducible.
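      The low-signal exclusion rule in point 2 can be sketched as follows. This is one possible reading of the criterion, not the authors' pipeline: a strain is flagged when its signal lies at least ten sample standard deviations below the mean of the other signals on the same membrane. The strain names and signal values are hypothetical.

```python
from statistics import mean, stdev


def low_signal_strains(signals, n_sd=10.0):
    """Return strains whose signal is n_sd SDs (or more) below the others.

    `signals` maps strain name to its signal on one membrane. Each strain
    is compared with the mean and sample SD of all remaining strains.
    """
    flagged = []
    for name, value in signals.items():
        others = [v for other, v in signals.items() if other != name]
        if len(others) < 2:
            continue  # an SD needs at least two reference signals
        cutoff = mean(others) - n_sd * stdev(others)
        if value <= cutoff:
            flagged.append(name)
    return flagged


# Hypothetical signals (arbitrary units) from one membrane.
membrane = {
    "strain_a": 100.0,
    "strain_b": 101.0,
    "strain_c": 99.0,
    "strain_d": 100.5,
    "strain_e": 50.0,
}

excluded = low_signal_strains(membrane)
print("excluded strains:", excluded)
```

      Leaving the tested strain out of the reference statistics keeps a single very weak signal from inflating the SD and masking itself.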

      Figure 3: It is important to know the overlap of proteins commonly identified in both the interaction screening and protein localization assay. A Venn diagram that compares results between the two high-throughput assays would be useful.

      *We have indeed considered making this Venn diagram (see below). However, since the connection between the screens is not direct, because protein localization is downstream of translocation events or may be completely independent of them, we found that the number of specific proteins that are in both screens is low. There is, however, a much larger overlap in shared processes and functions; therefore we decided to make the grouping shown in Figure 4B. We would prefer not to show this figure in the manuscript; however, we leave this to editorial decision.*

      Figure 4A (GO term): The authors mentioned that "the most consistent and repeating GO term group was those related to budding and polarity process. These include: 'Establishment or maintenance of cell polarity'; 'Development process involved in reproduction'; 'Bipolar cellular bud site selection'; 'Cell budding' and 'Structural constituent of cell wall'". Are protein sets in these functional categories similar or different? I am asking because GO enrichment analysis often provides apparently different functional categories but similar protein sets are included.

      Indeed, the reviewer is totally correct, and this is also the case here to some extent. There is some level of overlap between the GO terms. However, in our case this overlap is quite small: out of the 77 genes that have one of those GO terms assigned, only 2 have all 4, 9 have 3 and 4 have 2 of the GO terms. We therefore believe that in this case the GO terms are largely non-redundant despite the hierarchy and assignment issues. We are happy to highlight this in the figure or text if this is deemed to be important.
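      The overlap count reported above amounts to tallying, for each gene, how many of the selected GO terms it carries. A minimal sketch of that tally is given below; the gene identifiers and their annotations are hypothetical placeholders, and only the GO term names come from the reviewer's comment.

```python
from collections import Counter


def term_count_distribution(annotations, terms_of_interest):
    """Count genes by how many of the selected GO terms they carry.

    `annotations` maps gene name to the GO terms assigned to it.
    Genes carrying none of the selected terms are ignored.
    """
    selected = set(terms_of_interest)
    distribution = Counter()
    for terms in annotations.values():
        shared = len(selected & set(terms))
        if shared:
            distribution[shared] += 1
    return dict(distribution)


go_terms = [
    "Establishment or maintenance of cell polarity",
    "Bipolar cellular bud site selection",
    "Cell budding",
    "Structural constituent of cell wall",
]

# Hypothetical annotations for four placeholder genes.
annotations = {
    "gene_1": ["Cell budding"],
    "gene_2": ["Cell budding", "Bipolar cellular bud site selection"],
    "gene_3": ["Structural constituent of cell wall"],
    "gene_4": ["Unrelated term"],
}

distribution = term_count_distribution(annotations, go_terms)
print(distribution)
```

      A distribution dominated by genes carrying a single term indicates that the enriched categories are driven by largely distinct protein sets.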

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      1. General Statements

              We greatly appreciate the valuable comments from the referees, which have generally been very positive and constructive. The three referees have emphasized the significance of our study that opens a new direction of research regarding the role of RNA modification in viral defense. In addition, the reviewers confirm our view that the audience of our work would be broad.
      
              The major concerns of the reviewers are limited to four main points:
      
      i) to be clearer in our description on the effect of the m6A-YTHDF axis on the viral infectivity and avoid making assumptions on effects on replication (ref. #1 and #3);

      ii) reviewer 1 finds that the title and conclusion of this manuscript defining YTHDF proteins (ECTs) as "direct effectors of antiviral immunity" is misleading. Nonetheless, as detailed below, Reviewer 1 confuses mere knowledge of effects of m6A with those conferred by YTHDF proteins binding to m6A, and indeed overlooks nearly all evidence presented in the paper for how m6A in AMV confers antiviral resistance (i.e. mechanistic insight);

      iii) the discussion on the relative importance of antiviral RNA silencing and m6A-YTHDF against AMV;

      iv) to establish more clearly whether the phase separating capability of IDRs in the reading proteins correlates with the antiviral activity (reviewer 2).

      We have already completed substantial experimental work to address several of these points. Nonetheless, we find it prudent to ask for an extension of the revision time beyond four weeks to allow for repeats of a few of the infection experiments in question. In the following section, we specify a plan of action for the revisions.

      2. Description of the planned revisions

      • Regarding the four major concerns raised by the reviewers, we will experimentally address the last two, whereas we think the first two do not need any further experimental work, as explained in section 4. Thus, the working plan for points #3 and #4 will be as follows:

      iii) the discussion on the relative importance of antiviral RNA silencing and m6A-YTHDF against AMV and related viruses

      As we mention in the manuscript (discussion, first chapter), AMV *“is one of only very few studied plant RNA viruses for which no anti-RNAi effector has been identified. In addition, prunus necrotic ringspot virus (PNRSV), a virus genetically and functionally closely related to AMV (Pallas et al, 2013), does not induce easily detectable siRNAs, unlike nearly all other studied plant RNA viruses (Herranz et al, 2015)”*.

      Thus, we do not come up with a strong judgment on whether RNAi is more or less important than m6A-YTHDFs for AMV resistance.

      In any case, although these indirect observations seem to be quite solid, we agree with the reviewer that conclusive evidence to discard RNAi as a defense layer against AMV, at least at the time when ECTs are acting, is lacking. Thus, we plan to evaluate how the absence of the main components of the RNAi machinery affects AMV infection and whether this ‘universal’ defense layer interferes or overlaps with the ECT-mediated antiviral defense observed here. Realistically, this will take us 8-10 weeks. The experiments within this topic are based on established and published methods and thus on solid experience. We do not expect to need any fallback solution, and the results will be conclusive in this sense. We also note that the very time-consuming part of constructing mutants defective in both RNAi and m6A-ECT components (in this case, ect2/ect3/rdr6), as well as a first round of infection assays, has already been completed at this point.

      iv) To establish more clearly whether the phase separating capability of IDRs in the reading proteins correlates with the antiviral activity (Reviewer 2).

      We agree with Reviewer 2 that this is an interesting and important question. Hence, we have teamed up with the group of Prof. Kresten Lindorff-Larsen, expert in molecular simulations of protein folding and interaction. The Lindorff-Larsen group has recently published a powerful computational approach to simulate phase separation behavior of intrinsically disordered proteins (IDPs) or regions of proteins (IDRs) (Tesei et al., 2021, Accurate model of liquid-liquid phase behavior of intrinsically disordered proteins from optimization of single-chain properties, PNAS 118, (44) e2111696118). Applying this simulation method to the Arabidopsis ECT proteins establishes two facts that we will incorporate into a revised version:

      • The IDR of ECT2 shows marked phase separation propensity, in agreement with the experimental evidence published in Arribas-Hernández et al., 2018, Plant Cell.
      • The deletion mutant of ECT2 (ΔN5) with defective antiviral activity, yet unaffected ability to accelerate growth of leaf primordia shows markedly reduced phase separation propensity driven, in the main, by the many tyrosine residues in the region deleted in the mutant. These results suggest that phase separation capability indeed correlates with antiviral activity.

      Since not only ECT2, but also ECT3, ECT5 and, to some extent, ECT4, participate in AMV resistance, we plan further simulation work on these proteins during the first two weeks of January 2023 before submission of a revised version of the manuscript.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

        • All the minor concerns raised by the three reviewers have been addressed and we have incorporated all of their suggestions in this intermediate version.

      4. Description of analyses that authors prefer not to carry out

      • As previously mentioned, we believe that points 1 and 2 do not require an experimental approach to be addressed for the following reasons:

      i) to be clearer in our description on the effect of the m6A-YTHDF axis on the viral infectivity and avoid making assumptions on effects on replication (ref. #1 and #3)

      We agree with the reviewer that the term 'inhibition of viral replication' was not very appropriate because the idea that was intended to be conveyed was that of viral accumulation. Hence, we will change this use of language, and we thank the reviewer for pointing out this inaccurate description.
      

      When it comes to differences between effects on infection in inoculated and non-inoculated leaves, there may be a slight misunderstanding, perhaps because we were not clear enough in our originally submitted version. In reality, there are some differences even in inoculated leaves between wild type and ect mutants, especially in the triple mutant, but the slightly higher accumulation in ect mutants is not clearly observed in every experiment and hence, does not always rise to the level of significance. Although it is possible that, at local level, ALKBH9B-mediated m6A would have other ECTs-independent effects, similar to what has been described for some animal viruses (Baquero-Pérez et al., 2021. Viruses), we think that the most likely explanation for this phenomenon is a combination of infection titers and ECT redundancy.

      The suggestion to use protoplasts is very accurate, but it would not resolve any doubt in this scenario, because ECTs are mainly expressed in mitotically active cells (Arribas-Hernández et al, 2020, 2018) and, since mature tissues make up the better part of the leaves used to isolate protoplasts, only few of the isolated cells would be useful. In addition, we previously showed that AMV accumulation is reduced in alkbh9b protoplasts compared to WT (Martínez-Pérez et al., 2021. Front. Microbiol.), which suggests that m6A levels of vRNAs are critical for the first stages of the infection, but in that case no problems with the expression pattern of the demethylase were expected.

      ii) The title and conclusion of this manuscript defined YTHDF proteins (ECTs) as "direct effectors of antiviral immunity", which is misleading. Effector molecules of an antiviral immunity cannot be identified when the effector mechanism is unknown;

      In this regard, we have a very different view from the one the reviewer proposes. We believe that it is not correct to say that the effector molecules of an antiviral immunity cannot be identified until its mechanism is demonstrated. In fact, RNA silencing effectors were discovered long before their mechanism was elucidated in detail. One molecular interpretation of Flor’s seminal gene-for-gene model, in terms of receptor/effector recognition, is that specific interaction between the receptor and its recognized (cognate) effector protein triggers resistance.

      Furthermore, we strongly believe that we provide enough arguments to propose a model, although, as we comment in the end of the discussion, “we view this model as a conceptual framework of value in the design of future experiments to test its validity”. The reasoning that we show here is the following:

      1. The m6A binding proteins are necessary for the antiviral response.
      2. At least ECT2 recognizes AMV RNAs in vivo, and its m6A-binding capacity is necessary for its role in AMV infection.
      3. Simply losing methylase activity – with the same developmental defects as ect2/3/ – does not lead to the same degree of loss of resistance, and AMV resistance can be affected without affecting the developmental functions of ECT2.

      Altogether, these observations justify the proposal that m6A exerts antiviral effects by acting as binding sites of ECT proteins in viral RNA, which we consider a clear mechanistic advance.

      Bearing in mind that m6A-modified vRNAs might concentrate in replication complexes and that MeRIP-seq methodology to map m6A revealed site multiplicity in the genome of some RNA viruses (Gokhale et al., 2016. Cell Host&Microb; Martínez-Pérez et al., 2017; Lichinchi et al., 2016. Nat Microbiol; Lichinchi et al., 2016. Cell Host&Microb; Marquez-Molins et al, 2022), our results recalled the previously proposed model in which m6A site multiplicity causes the phase separation of these RNAs through the interaction of the IDRs of the YTH proteins (Ries et al, 2019; Fu & Zhuang, 2020; Gao et al, 2019). Now, with the new simulations of phase separation behavior, we have better evidence that the antiviral effect is related to LLPS of ECT-bound viral RNA, although this remains a model that requires further experimental tests. Therefore, we firmly believe that our title conceptually reflects the basic concepts of resistance induction in virus-plant interactions.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer 1

      Although this is an interesting, and generally well-performed study, it is primarily observational and there are few mechanistic insights provided into how MUC13 modulates barrier function. The authors propose a presumably direct interaction between MUC13 and PKC, which apparently sequesters PKC, preventing this kinase from triggering PKC-dependent increases in TJ barrier function; however, there is no evidence that a MUC13-PKC interaction occurs, that MUC13 is phosphorylated by PKC, or that phosphorylation of MUC13 has any impact on its function or overall barrier function. Thus, the hypothesis is not directly tested and all observations in this manuscript are generally correlative in nature.

      While the MUC13 cytoplasmic tail contains a putative PKC-binding motif, we indeed do not show a direct interaction between MUC13 and a member of the PKC family in this manuscript. Unfortunately, we have so far not been able to successfully perform (co-)immunoprecipitation of MUC13 with our current anti-MUC13 antibodies.

      To provide more insights into the possible MUC13-PKC interaction, we plan to perform several experiments.

      • First, we will determine the expression levels of the different PKC isotypes (PKC alpha, beta, gamma, delta, epsilon, and zeta) in the HRT18 cell lines by western blot.
      • Next, we will determine the localization of the relevant PKC isoforms and MUC13 by immunofluorescence microscopy. We are curious to see if we can find a colocalization between MUC13 and a PKC member on the lateral or apical membrane. If we can demonstrate a colocalization, we could follow up with a proximity ligation assay, but this would require the MUC13 antibody directed against the cytoplasmic tail (which only detects the lateral population) and might therefore be challenging.
      • Furthermore, since PKC delta protein levels were upregulated in the total lysate of ∆MUC13 cells, we will test a PKC delta-specific inhibitor in the TEER assay.

        Consider quantifying all blots (Fig. 5C, Fig. 6B).

      As suggested, we will quantify both blots.

      Consider using dot-plots for all quantified data.

      The graphs will be altered to include individual measurement points.

      Reviewer 2

      Fig2E showed two bands with different size in the two MUC13 WT control cell lines. They hypothesized that this could be the consequences of glycosylation different patterns. A sample with untransfected HRT18 might be included in the western blot panel. Additionally, what is the 100kDa band?

      Mucin blots are notoriously difficult and these MUC13 blots are the result of a lot of trial and error. We repeated the Western Blot with the original HRT18 cell line, as well as the two CRISPR control cells used in the study (WT 1 and WT 2) and one of the full-length MUC13 knockout cells. The higher band was absent from the MUC13 knockout cells, but a small shift in the MUC13 band size can be noted in the WT 1 cells compared to the original and the WT 2 cell lines, possibly indicating a change in the glycosylation pattern. The 100 kDa band remains detectable in all cell lines, including the ∆MUC13 cell line; therefore, we consider this to be a nonspecific background band of the MUC13 antibody. We will add a more extensive Western Blot analysis to the manuscript.

      Did the transfection of the inducible GFP-MUC13 plasmid induce any decrease of Claudin1/3/4 in HRT18 or Caco2 cells? Same question regarding PKCdelta.

      These are indeed interesting questions. We will perform these experiments with our MUC13-overexpression HRT18 cells.

      Reviewer 3

      Moreover, the authors should determine if MUC13∆CT localize to TJs, as suggested by the working model in Figure 7C. The subcellular localization of MUC3∆CT could give critical clues for its function, but Figure 2G fails to provide any information and the authors do not present any additional data concerning the localization of MUC13∆CT. Detection of MUC13 in membrane fractions of WT, MUC13∆CT and cells lacking the mucin domain could be a feasible strategy forward.

      We will perform additional immunofluorescence experiments to determine the subcellular localization of MUC13-∆CT more accurately. However, detection of the extracellular domain by western blot, as suggested, is not possible due to the incompatibility of the extracellular MUC13-directed hybridoma antibody with the western blot technique. We currently do not have a suitable antibody that recognizes the ED and can be used for western blot.

      The authors introduce an inducible MUC13-GFP fusion protein into WT and ∆MUC13 cells and show that it reverses the enhanced TEER upon MUC13 deletion. Unfortunately, the "Materials and Methods" section lacks adequate information on how this fusion protein was designed. Critical questions are the position of the GFP tag within MUC13, whether the fusion protein is correctly processed in HRT18 cells, and if it localizes to the apical or apico-lateral membrane domains? Figure 2H is of low magnification and fails to provide information on the subcellular localization of the MUC13-GFP fusion protein.

      The materials and methods section will be adjusted to describe all the design details of the fusion protein. The GFP tag was added to the MUC13 C-terminus with a GGGS linker sequence in between. Processing of the fusion protein seems correct, as we observed MUC13-GFP localization to both lateral and apical membranes and no excess intracellular build-up. As suggested by the reviewer, we will add more detailed immunofluorescence pictures to the manuscript.

      Figures 6B-C suggest that PKCdelta levels increase in ∆MUC13 cells, which correlates with higher enrichment of Claudins in membrane fractions. The authors then inhibited PKCdelta and observed reduced recruitment of Claudins to membrane fractions. Since the family of Claudins are differentially regulated by phosphorylation (PMID: 29186552), the authors should investigate the TEER phenotype of WT, ∆MUC13 and MUC13∆CT upon PKC inhibition.

      We must clarify that figures 6C-D are done using the PKC inhibitor targeting all conventional PKCs (alpha, beta, gamma) as well as delta (https://www.tocris.com/products/gf-109203x_0741). We recently obtained a PKCdelta-specific inhibitor which we will test in the TEER build-up experiments.

      Moreover, the authors predict phosphorylation sites in MUC13CT and suggest a link between PKC and MUC13 (Figure. 6A), however no evidence is presented to support this hypothesis. The authors should either determine if PKC phosphorylates MUC13 and if this modification has implication on MUC13 localization and TJ function, or remove statements regarding MUC13 phosphorylation. The data provided suggest that PKC regulates TJ proteins independent of MUC13.

      We will adjust the manuscript to put less emphasis on the putative PKC motifs in the MUC13 cytoplasmic tail. For further details on how we will proceed regarding the possible MUC13-PKC interaction see question 1 from reviewer #1.

      Figure 5C. Quantification of at least 3 independent experiments is required.

      These data will be added to the manuscript.

      Figure 6B. Quantification of at least 3 independent experiments is required.

      These data will be added to the manuscript.

      Reviewer 4

      OPTIONAL: MUC13 is expressed both, in the basolateral membranes and in the apical membrane of intestinal epithelial cells (IECs). Does the authors check the relevance of MUC13 in the formation of microvilli in IECs? Are microvilli different (microvilli staining, number of positive cells to microvilli, length, width or distribution of microvilli) in ΔMUC13 and in MUC13-ΔCT? How the glycocalyx looks like in these cells genetically modified for MUC13?

      HRT18 cells do not seem to develop microvilli. However, we plan to stain these cells with a microvilli-specific antibody (ACTUB). The HRT18 cells express mostly MUC13 and relatively low levels of the larger TM mucin MUC1. To study changes in the glycocalyx, we will stain using a MAL-II antibody which targets α-2,3 sialic acids, which are abundantly present in mucins. In this way, we will determine any big changes in the total glycocalyx that may occur in response to the removal of MUC13.

      In the figure 1D would be nice to represent the co-localization of MUC13 together with occludin in a graph in each Z-stack so you can visualize in which part of the cell is maximum colocalization of these both components.

      These data will be provided.
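
      A per-slice colocalization profile of the kind requested above could be computed along the following lines. This is only a minimal sketch: the rebuttal does not say which colocalization metric will be used, so the Pearson correlation coefficient is an assumption, as are the function names and the toy intensity values (two hypothetical channels, each a z-stack given as a list of 2D slices).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long intensity lists (NaN if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def colocalization_by_slice(channel_a, channel_b):
    """Return one Pearson coefficient per z-slice for two aligned z-stacks."""
    profile = []
    for slice_a, slice_b in zip(channel_a, channel_b):
        flat_a = [pixel for row in slice_a for pixel in row]
        flat_b = [pixel for row in slice_b for pixel in row]
        profile.append(pearson(flat_a, flat_b))
    return profile


# Hypothetical two-slice stacks (e.g. MUC13 and occludin channels); values are made up.
muc13_stack = [[[1, 2], [3, 4]], [[1, 2], [3, 4]]]
occludin_stack = [[[2, 4], [6, 8]], [[4, 3], [2, 1]]]
profile = colocalization_by_slice(muc13_stack, occludin_stack)
```

      Plotting such a profile against slice depth would show where along the apical-basal axis the two signals overlap most.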

      In the figure 1E, would be great to compare between the two different MUC13 antibodies the apical fraction stained in HRT18 and Caco-2. Specially in the HRT18 cell line since the first antibody did not label apical MUC13 expression meanwhile the second antibody detects the apical expression in these cells. How much lateral stain the C terminal antibody compare with the extracellular antibody for MUC13 and how much stain apically the C terminal antibody compare with the extracellular antibody? Would be nice to see some comparative results using the intensity by Z-stack and plotting in a graph.

      This is a good suggestion as it is quite intriguing that both MUC13 antibodies seem to target (partially) different MUC13 populations. We will perform co-staining with both MUC13 antibodies to provide information on which MUC13 populations are detected by each antibody (apical vs lateral membrane).

      Manuscript would be improved if in the figure 2H to compare within the same cell line the number of MUC13 positive cells in the WT, number of MUC13 positive cells in WT+pMUC13 and the number of MUC13 positive cells in the ΔMUC13+pMUC13

      We will quantify the percentage of MUC13-GFP positive cells in both the WT and ΔMUC13 backgrounds by either microscopy or flow cytometry.

      In figure 5C would be helpful to plot in a graph the normalized expression of each TJ protein and compare between the different cells used (WT, ΔMUC13 and MUC13-ΔCT) as you did in figure 5A

      We will provide the quantification data of three independent experiments.

      Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer 1

      In addition, this model does not explain why all kinase inhibitors tested reverse the increase in TER observed in deltaMUC13 cell lines. Does this reflect the lack of inhibitor specificity or the likelihood that many kinases are involved?

      As stated in the manuscript, we think that MLCK, ROCK, and PKC are all essential for TER buildup in the ∆MUC13 cells. Because the roles of MLCK and ROCK are well established, we chose to follow up on the PKC results. We adjusted the text to clarify this point.

      The authors do observe that there is an increase in expression of several tight junction-associated proteins, including the claudins, in deltaMUC13 cells. Affected CLDNs include 1, 2, 3, 4, 7, 12. (1) While it appears the authors are arguing that this increased claudin expression results in increased barrier function, they do not sufficiently highlight the well-known role that CLDN2 has in cation transport, and both CLDN-4 and -7 have also been implicated in paracellular ion flux (although this is apparently cell-type specific). These observations would seem to argue against a simple correlation between claudin expression and tight junction barrier function.

      The reviewer is right about the different functions of claudins. Claudin-2, -4 and -7 have (potentially) pore-forming properties, while the other claudins restrict paracellular passage. It has been previously demonstrated that the magnitude of paracellular ion and water flux is reflected by the specific repertoire of claudin family members (Shashikanth et al., 2022). In this paper, overexpression of claudin-4 was shown to mobilize and affect polymeric strands of claudin-2, thus blocking its channel activity. Our mass spectrometry data demonstrated a striking increase in claudin-1, -2, -3, -4, -7, and -12 in the MUC13 knockout membranes compared to WT. We hypothesize that the claudin repertoire in the MUC13 knockout cells leads to a more restricted paracellular route (as observed in the TEER and tracer experiments). The pore-forming claudins may be subject to “interclaudin interference” therefore leading to restriction of the total paracellular ion and water flux. We have adjusted the text of the manuscript to clarify this point.

      We attempted to investigate claudin-2 expression levels in isolated membranes by Western Blot but were unsuccessful as the antibody did not detect any protein while claudins-1 and -4 could be detected with the same method.

      Furthermore, the authors should note the disconnect between paracellular ion flux mediated by claudins and the flux of markers such as dextrans and lucifer yellow, which can be dissociated from claudin function.

      We acknowledge that the flux of larger particles (the leak pathway) is not regulated by claudins (which regulate the pore pathway). We aimed to assess both the pore and the leak paracellular pathways by using different techniques, including TEER, small solutes (Lucifer Yellow CH), and larger molecules (4 and 70 kDa FITC-Dextrans). HRT18 wild type cells are already very restrictive to the passage of larger molecules (FITC-Dextrans) but are more permeable to smaller solutes such as Lucifer Yellow (400 Da). We observed that removal of the MUC13 cytoplasmic tail did not affect the TEER, but reduced the paracellular passage of Lucifer Yellow, demonstrating that manipulation of MUC13 can affect both the pore and leak pathways. We adjusted the text to include this point.

      The increased expression of claudins in the nominally tail-minus MUC13 without a corresponding change in TER would again seem to argue against a simple correlation;

      MUC13-dCT cells showed consistently increased levels of claudins-1 and -2, but not the other claudins. This claudin repertoire (with high claudins-1 and -2, but lower claudin-3, -4, -7, and -12) is apparently not enough to increase TEER. We think that this again reflects the importance of the total claudin composition for the control of the paracellular pathway.

      Watch the use of decimal points instead of commas (lines 253 and 256).

      Corrected.

      Line 543: MilliQ is not a washing agent (or is it?).

      (Line 535) We use MilliQ as a final step before mounting the glass slides to remove any possible salt deposition that would affect the visualization by microscopy.

      We have specified this in the text.

      Line 553: TER is the product of total resistance times the area. The units are ohms times area.

      Indeed, we have corrected this mistake (line 545).
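
      To make the unit correction concrete, the calculation the reviewer describes can be sketched as below. This is an illustrative sketch only: subtracting the resistance of a blank (cell-free) insert is a common convention that we assume here, and the function name and numerical values are hypothetical rather than taken from the manuscript.

```python
def teer_ohm_cm2(total_resistance_ohm, blank_resistance_ohm, membrane_area_cm2):
    """Unit-area resistance: (measured minus blank resistance) multiplied by membrane area.

    The result is in ohm * cm^2, i.e. resistance times area, not resistance per area.
    """
    if membrane_area_cm2 <= 0:
        raise ValueError("membrane area must be positive")
    return (total_resistance_ohm - blank_resistance_ohm) * membrane_area_cm2


# Hypothetical reading: 1200 ohm measured, 120 ohm blank insert, 0.33 cm^2 membrane.
example_teer = teer_ohm_cm2(1200.0, 120.0, 0.33)  # about 356.4 ohm * cm^2
```

      Multiplying by the area (rather than dividing) is what makes values comparable between inserts of different sizes.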

      Line 630: Please provide the transfer conditions (voltage, amp, watts?) and transfer buffer when describing the Western blot protocol.

      For immunoblotting of MUC13, protein lysates were transferred to 0.2 µm PVDF membranes using the Trans-Blot Turbo Transfer system (Biorad). The transfer was run using the High MW protocol, which consisted of running for 10 min at 25 volts (V) and 1.3 amperes (A). These experimental details were added to the manuscript.

      Reviewer 2

      My main concern about this manuscript is that the authors analyzed MUC13 role in intestinal homeostasis and function using colorectal cancer cells. As helpful as cancer cells are, we should always be cautious about extrapolating roles in normal intestinal epithelium or IBD pathology. Obviously, these finding are also interesting in a cancer context. Using GEPIA (http://gepia.cancer-pku.cn/), I observed that MUC13 is overexpressed in colorectal cancer COAD-TCGA dataset (compared to normal colon from GTEX). Similar results were obtained previously by Gupta et al. (ref #10). I am aware that this would be difficult to confirm the main findings in a non-cancerous intestinal cell line but this limit (normal intestine using cancer cells) should be at least discussed in the manuscript.

      We appreciate the reviewer’s comments and are aware of the downsides of using cancer-derived cell lines. We have performed the GEPIA analysis ourselves and have an ongoing project about the possible role of MUC13 in colorectal cancer progression. In a separate project, we are collaborating with the Gaultier Laboratory at the University of Virginia, which has generated a MUC13 knockout mouse. This model will allow us to study the role of MUC13 in non-cancerous tissue. We recently received intestinal biopsies from these mice, which will be stained with MUC13 and claudin antibodies to determine localization in healthy tissue. These experiments will reveal if MUC13 colocalizes with claudin on the lateral membrane in the healthy mouse intestinal tract. In future experiments, we will also address MUC13 localization and function in human intestinal organoids. We have adjusted the discussion to refer to the limitations of using cancer cell lines.

      Massey et al (Micro 2021, PMC7014956) previously showed that MUC13 overexpression increased rigidity in PDAC cells and discussed involvement MUC13 link with EMT. MUC13-Her2 interaction was also associated with decrease of E-cadherin suggesting an EMT phenotype. This should be included in the discussion section.

      The discussion has been adjusted to include the link with EMT.

      The authors performed mass spectrometry analysis. Results are deposited on ProteomeXchange but are not yet publicly released. Among the 1189 membrane protein identified. Did the authors observed alteration of EMT proteins? (decrease of vimentin for example). In the discussion section (lane 347), the authors mentioned the relationship between other membrane bound mucins such as MUC1, MUC4, MUC16 or MUC17 and AJ/TJproteins. Did the authors observed any alteration of these mucin in the mass spectrometry data?

      The mass spec analysis was performed on membrane fractions, therefore our dataset will not contain true cytosolic proteins. One of the key EMT proteins, Vimentin, is a cytosolic protein, and indeed it was not found in our dataset. Other EMT-related proteins are shown in the following table. TGF beta 1 was slightly decreased, while E-cadherin and Integrin beta 6 were slightly increased in the ∆MUC13 cells compared to WT cells.

      Gene name                     | Mean WT | Mean ∆MUC13 | Mean MUC13-∆CT
      TGFBI (TGF beta 1)            | 20.54   | 16.48       | 18.83
      CDH1 (E-cadherin)             | 22.69   | 24.57       | 24.24
      ITGB6 (Integrin beta 6)       | 18.86   | 21.74       | 19.19
      Vimentin (cytosolic)          | -       | -           | -
      CDH2 (Cadherin-2, N-cadherin) | -       | -           | -

      Mucins are large proteins comprised of densely O-glycosylated mucin domains, which makes them extremely challenging to study by mass spectrometry (MS) (Rangel-Angarita et al., 2021). We did not specifically employ mucin-directed technologies in this dataset, which makes the detection of mucins difficult. No mucins other than MUC13 were detected. For MUC13, two peptides were detected, corresponding to the EGF-like domains in the extracellular domain, a region that is less densely glycosylated. We added a sentence to the description of the mass spec results to include the EMT proteins and other mucins.

      Minor points:

      Lane 126: HRT18 and Caco2 colon cancer cells instead of intestinal epithelial cells

      Corrected.

      Lane 181 and lane 514: add "full length" MUC13 DNA sequence

      Corrected.

      Lane 234: TEER was measured every 12h. How the authors did observed the largest increase at 42h? Was it 48h? Please clarify.

      We aimed to measure every 12 h; however, the exact measurements were done at 18 h, 24 h, and 42 h post-infection. We have corrected this in the manuscript.

      Reviewer 3

      Line 43 and 46. "Enterocytes" should be replaced with "intestinal epithelial cells", since enterocytes are themselves a distinct subpopulation of IECs.

      We have changed it in the manuscript.

      Lines 58-60. References in support of the statements should be added.

      We added a reference to this sentence.

      Lines 188-190. Authors comment on "roundness" of different cell lines. If the parameter is critical for the manuscript, the authors should quantify this phenotype.

      The parameter is not critical for the manuscript. We removed the sentence.

      Figure 3A. Staining of cell lines should include panels showing localization of MUC13.

      Co-staining of MUC13 with occludin in HRT18 cell lines can be found in figure 1D, and MUC13 with E-cadherin in supplementary figure 1.

      Lines 323-327 and 390-392. Sentences on these lines contradict each other. The sentences should describe/discuss quantified data presented in Figure 6D.

      The reviewer is right that we should be discussing the quantified data in 6D. We adjusted the sentences in lines 323-327.

      Proteomic data sets should be made publicly available on data depositories.

      All proteomics raw data were deposited to the ProteomeXchange Consortium with the dataset identifier PXD029606.

      Reviewer 4

      OPTIONAL: In the figure 2E, is the extracellular antibody still detecting the MUC13-ΔCT?

      No, unfortunately the antibody directed against the MUC13 ED is not compatible with western blot.

      In the figure 2G, would be nice to comment possible reasons why the deletion in the first cell line of the MUC13-CT you can still detect with the extracellular antibody some lateral expression of MUC13 meanwhile in the second cell line, the same deletion (MUC13-CT) you cannot see any lateral MUC13 staining with the extracellular antibody.

      Yes, this is indeed a puzzling finding, especially because the CRISPR deletion is the same in both cell lines. We will add a sentence about possible reduced stability of the MUC13 without CT domain that leads to a different outcome in both cell lines.

      It would be nice that the results from Figure 3H are better explained since it is difficult to follow.

      We adjusted the text to explain the experiment in more detail.

      2. Description of analyses that authors prefer not to carry out

      Reviewer 1

      The authors may be overly reliant on TER measurements. Epithelial cells have two parallel resistive pathways: transcellular and paracellular. TER measure the contribution of both. Thus, an increase in TER could result from a decrease in transcellular ion transport. The authors need to measure transcellular ion flow or selectively measure the junctional resistance in a select set of experiments to rule this possibility out.

      The reviewer is right that TEER reflects the resistance of both the transcellular and paracellular pathways. However, due to the high resistance of cell membranes, the current predominantly travels via the paracellular route (Elbrecht et al., 2016). For this reason, TEER measurement is a widely accepted technique for assessing ion passage through the paracellular pathway (Shen et al., 2011).
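
      The argument above can be illustrated with the standard formula for two resistances in parallel, which is how the reviewer describes the transcellular and paracellular routes. The sketch below is not from the manuscript; the function name and the resistance values are hypothetical and chosen only to show that the lower (paracellular) resistance dominates the total.

```python
def parallel_resistance(r_transcellular, r_paracellular):
    """Total resistance of two parallel pathways: 1/R = 1/R_trans + 1/R_para."""
    return 1.0 / (1.0 / r_transcellular + 1.0 / r_paracellular)


# Hypothetical values: a high transcellular and a lower paracellular resistance.
baseline = parallel_resistance(10_000.0, 500.0)              # about 476
tighter_junctions = parallel_resistance(10_000.0, 1_000.0)   # about 909
tighter_membranes = parallel_resistance(20_000.0, 500.0)     # about 488
```

      With these assumed numbers, doubling the paracellular resistance nearly doubles the total, whereas doubling the transcellular resistance changes it by only a few percent.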

      Reviewer 3

      Figure 1C. Caco2 and HRT18 cells exhibit distinct MUC13 expression patterns when probed with an antibody against the MUC13 CT; MUC13 localizes almost exclusively to lateral cell junction in HRT18 cells, while a higher portion of MUC13 is present on the apical surface of Caco2 cells. This observation has two possible explanations: 1) the two cell lines express distinct forms of MUC13, or 2) the two cell lines carry distinct machineries for anchoring MUC13 to apical versus apico-lateral membranes. Thus, The authors should take the opportunity to determine the impact of MUC13 deletion on TEER and TJ function in Caco2 cells. Proteomic analysis and functional assays in Caco2 cells may provide more a general mechanism for how MUC13 regulates TJ proteins.

      Yes, this would be a great line of investigation. However, we attempted to knock out MUC13 in Caco-2 cell lines (with the same CRISPR/Cas9 protocol as for the HRT18 cells) but were unable to obtain Caco-2 knockout clones. We think this might be a consequence of the poor capability of Caco-2 cells to grow as single colonies (a required step in the protocol). Another possibility is that Caco-2 MUC13 knockout cells have reduced viability.

      The authors generate cell lines that either lack MUC13 or express MUC13 lacking the cytoplasmic domain. Loss of MUC13 cells resulted in enhanced TEER and increased recruitment of TJ proteins to membrane fractions. MUC13∆CT cells show moderate recruitment of TJ proteins to membranes and no increase in TEER but inhibit paracellular diffusion of Luciferase Yellow across monolayers. Figure 3A suggests that Occludin redistributes to tricellular junctions in ∆MUC13 cells, whereas it is found more laterally in WT and MUC13∆CT cells. These finding suggest that full-length MUC13 interferes with TJ protein complexes. However the impact of the extracellular and intracellular (CT) domains is not fully elucidated. Does the O-glycosylated mucin domain interfere with the extracellular domains Occludin and Claudins? The authors should clarify the contribution of the mucin domain to the observed phenotype, for example by performing the described experiments in a cell line expressing MUC13 lacking the mucin domain.

      Mucins are type I membrane proteins with the N-terminal part of the protein on the extracellular side. Therefore, a CRISPR method to specifically remove the glycosylated domain but leave the remainder of the protein in frame is challenging. An additional difficulty is that the ED contains a lot of repeats, complicating the design of specific guide RNAs. To specifically address the contribution of the glycosylated domain, we could complement the MUC13 knockout cell with a construct lacking the ED. However, this would not be comparable to the endogenous MUC13∆CT cell line presented in this manuscript. In future studies, we will strive to address the functions of the different MUC13 domains in more detail.

      Figure 5A. Turnover of TJ proteins in membrane fractions occurs faster than over a period of 1-3 days (PMID: 18474622). The authors should determine TJ protein turnover over a period of minutes and hours.

      We acknowledge the findings in this interesting paper concerning the continuous remodeling of tight junctions. However, the readout of our biotinylation assay is degradation and the timeframe of degradation turns out to be days and not hours. Within this timeframe remodeling is taking place but it cannot be captured in the total lysate.

      Reviewer 4

      OPTIONAL: The authors show that the probiotic Lactobacillus plantarum increase epithelial barrier independently of MUC13. Have the authors considered to use other probiotics as Lactobacillus paracasei (10.3389/fcimb.2015.00026), Akkermansia muciniphila (10.1038/emm.2017.282) or some metabolic products from intestinal microbiota as short-chain fatty acids (SCFAs) (10.3389/fphys.2021.650313) to check what is the role of MUC13 and if it is related with other microbe or microbiota metabolite?

      Thank you for the suggestion. We have an ongoing project in which we investigate the impact of different probiotic bacteria and plan to investigate whether they have an impact on the epithelial barrier function in a MUC13-dependent manner. This study will lead to a separate publication.

      OPTIONAL: The authors successfully delete MUC13 in IECs, both, full length and the cytosolic tail. Have the authors considered targeting the deletion of the PTS domain in MUC13? Could affect that something different from paracellular trafficking as the extracellular detection of microbes and microbial products?

      As described above, removing a region within the extracellular domain of MUC13 with CRISPR is challenging because mucins are type I membrane proteins, the ED contains many repeats, and the deletion could cause a frameshift.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Official Revision Plan Document:

      Manuscript number: #RC-2022-01681

      Corresponding author(s): Nicholas, Leigh

      1. General Statements

      We sincerely appreciate these positive and helpful reviews. We are grateful for the constructive comments and we outline our responses below. Addressing these comments will further broaden the impact of the work and increase the power, reliability, and application of single cell approaches while reducing cost and labor-intensive collection steps.

      As single cell sequencing approaches have entered the mainstream, we are still finding flaws and artifacts from these methods. A major limitation of widely used collection approaches is a difficulty in obtaining biological replicates, which are required to generate robust sequencing datasets. In general, a lack of biological replicates has been a major oversight in the vast majority of single cell studies, and any technique that can facilitate biological replicate collection should be widely applied. The elegance of SNP-based demultiplexing lies in the fact that it can be applied regardless of any external label, applied to previously collected data, and the data are already collected for every sample sequenced. We were pleased to have the reviewers agree and identify the many conceptual advances in this manuscript, with one major critique being noted by one reviewer as a lack of novelty.

      Regarding the lack of novelty, we appreciate that SNP-based demultiplexing was not developed as a method within this manuscript, but disagree that a broad benchmarking and validation study that opens the doors to the use of SNP-based demuxing in any species with sufficient between-animal genetic heterogeneity lacks novelty. To address this concern, we will now further emphasize the drawbacks and artifacts that can arise in the currently common practice of pooling samples and choosing not to demultiplex, while improving our explanation of our discoveries in this manuscript. The lack of biological replicates in single cell sequencing studies is rampant and needs to be addressed with approaches such as those demonstrated here. We also want to emphasize the importance of validating and benchmarking bioinformatic approaches with orthogonal, previously established approaches (e.g., wet-lab based methods), which had not previously been done for SNP-based demultiplexing outside of human samples. The inbred nature of common lab animals and the broad range in quality and availability of genomic resources make this a major step forward in bringing SNP-demultiplexing to all labs. We believe that our paper broadly extends, benchmarks and, most importantly, validates the advantages and limitations of SNP-based demuxing across various species.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      “Cardiello et al tested if souporcell (https://pubmed.ncbi.nlm.nih.gov/32366989/) can be used to demultiplex samples for some model organisms, based on identified SNPs. For this, they used synthetic multiplexed data, publicly available datasets and some new datasets, spanning samples from five model organisms. Their analysis indicates that souporcell could be used to demultiplex scRNA-seq experiments for multiple species, which offers a cost-beneficent approach.

      The manuscript reads well and shows this approach can work for different model organisms. However, unfortunately, I am confused about the amount of novelty in this manuscript. The method, souporcell, is already published. The authors indicate souporcell is not validated in non-human samples, but the original paper states that their method works with malaria parasite data (Fig 3b, FigS4). Adapting and using an available tool for different model organisms is good and groups working on different model organisms may find this manuscript useful, but the same could be said for the original article. Due to these reasons, I am not sure whether this manuscript has novelty sufficient for publication.”

      __Our response:__ We appreciate this constructive criticism that helped us realize that our novelty was not clearly stated in the first version of the manuscript. We need to improve our Introduction and our verbiage as to what has been previously performed and how this current manuscript provides novel insight into multiple previously unanswered questions which broadly extend the utility of SNP-based demultiplexing. To address this comment, we will revamp our Introduction, Results, and Discussion to more clearly highlight the novelty of this work.

      Planned revisions:

      Defining “validation”. We define validation as establishing the accuracy or validity of a method. Therefore, validation of SNP-based demultiplexing for use in non-human species requires comparison to an already proven, orthogonal method, such as a wet-lab based demultiplexing approach. The souporcell paper does not validate (i.e., confirm with an orthogonal wet-lab method) the results from souporcell in any species but humans. This lack of validation for SNP-based demultiplexing in samples from non-human species made it unclear how and if these approaches would work in other species. Human samples are expected to perform exceptionally well in this approach due to their extremely high genetic diversity and wealth of available genomic resources. Thus, while it was exciting that the original souporcell authors chose to try applying their algorithm to a non-human (e.g., malarial parasite) dataset, the paper left many unanswered questions about potential uses and accuracy. In addition to validating the accuracy of souporcell results in many species, we demonstrated that souporcell shows a relatively poor ability to call doublets in many non-human vertebrates. In addition to highlighting a novel drawback of the method, this demonstrates the need to validate the accuracy of different aspects of tools like souporcell when applied to new systems rather than use souporcell or other SNP-demuxers prior to validation. 
      Highlighting other novel findings in this work. These include our assessment of which genomic resources are required for SNP-based demultiplexing in different species; whether the approach can be applied to lab animals that are likely inbred to various degrees (and, to address other reviewers' comments, the level of inbreeding permitted); the accuracy of SNP-based demultiplexing in species with alignment references of varying quality (i.e., only a de novo transcriptome) and genomes of varying sizes (up to 30 Gb, 10 times larger than that of human, which can be extremely computationally intensive); and the exploration of pooling and demultiplexing multiple species in a single library. We will also make clear how we made the necessary adjustments to the original souporcell pipeline to successfully apply it to datasets with the various resources available in these species.

      (Reviewer #1): I also wrote down two minor points below:

      1- Doublets assigned by souporcell compared to the fluor-based assignment look random. In Fig 2 doublet recovery rate looks smaller, and in fig 3 doublet rate prediction looks more random. This is a bit confusing. Is there any explanation for this?”

      __Our response:__ We agree and thus noted in the manuscript that the detection of doublets in these datasets by souporcell is not very reliable.

      Planned revisions:

      We will expand our Discussion to include brief hypotheses for factors that likely contributed to poor doublet detection by souporcell in these analyses. In the Discussion we will clearly suggest complementary approaches for improving the detection and removal of doublets in pooled scRNA-seq experiments by applying external gene expression-based doublet detection programs. We will also attempt to use these programs on at least one of our datasets to see how well independent doublet detection methods complement souporcell on pooled datasets. A full benchmarking of these doublet detection methods already exists and will be referenced in our Discussion.

      Reviewer #1: “2- The authors discussed the immune system cells might show some variability in their discussion (referring to fig 3), but this is not clearly shown in the figures as data. Having a percentage bar graph could make it clearer for the readers.”

      __Our response:__ This is a valid point that we plan to address with the addition of a new figure as well as some clarifications in the text.

      Planned revisions:

      We will make a supplemental figure for Figure 3 in which we clearly demonstrate animal-to-animal variability (a bar plot of the absolute number of cells from each individual animal in each cell cluster, as requested). In the new supplemental figure we will also include a new UMAP plot of fluorescently assigned cell identities belonging only to one of the three animals, which makes it easier to visualize the difference in numbers of cells from each animal present in each individual cell cluster. We will also cite papers that have already demonstrated the phenomenon of animal-to-animal variability in scRNA-seq datasets. We will further emphasize that, even in the absence of animal-to-animal variability in co-clustering, demultiplexing pooled datasets is important because differential expression analysis is greatly enhanced with biological replicates.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Major comments:

      “1. SNP-based demultiplexing performed well on some species, such as zebrafish and Africa green monkey, from which over 90% of the cells analyzed were correctly identified. However, this accuracy decreases in Pleurodeles samples when a common SNPs VCF file is absent (Fig.3). It showed that cell identity can be more precisely defined with the increase of average read depth (Fig 3B). So, I am wondering whether the mis-defined cells shown in Fig. 3E, actually are cells with lower reads. It is better if the authors can test such a correlation between the cell identity and the depth of reads using the data from Fig. 3E.”

      __Our response:__ We are thankful to reviewer #2 for raising such a great point. We do see the accuracy of the benchmarking results for this experiment increase with increasing sequence depth/cell quality. However, the reasons for this are potentially more complex than just higher accuracy of souporcell in higher quality cells: the fluorescent-based demultiplexing that is being used for “ground truth” in benchmarking souporcell for this figure is more accurate in cells with higher read depth because more fluorescent gene reads are likely to be captured. Therefore, analyzing the accuracy of souporcell relative to fluorescent-based demultiplexing over varying read depths can be confusing because it is possible that both methods improve in accuracy with higher read depth. Figure 3B attempts to illustrate this concept, and to demonstrate why we chose to benchmark only the cells with sufficient read depth (read depth between 5K and 40K, and >1 fluorescent gene read per cell). We plan to complement our manuscript with additional figures and text that will make this clearer.

      Planned revisions:

      We will produce a plot similar to Figure 3B, but with a Y axis that is the percent agreement between the two methods. For Figure 4 we will also make a plot showing percent agreement between demux methods versus read depth. This plot will be a useful comparison to investigate whether scRNA read depth is directly affecting the quality of souporcell’s SNP-based demux results. Plotting this comparison for a dataset in which CellPlex/cell hashing is the benchmarking demux method is a fairer test of the effect of sequencing depth on the souporcell demux results because CellPlex results rely on reads from the CellPlex library, which is a sequencing library independent of the scRNA reads. We will investigate whether the use of a common VCF file or lack thereof affects souporcell accuracy. To test this, we will try repeating souporcell demux of one dataset with and without a common VCF file input to see if the VCF file inclusion affects the accuracy of souporcell results.
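
      The planned agreement-versus-depth comparison can be sketched as follows. This is only a minimal illustration, not the pipeline used in the manuscript: the function name `agreement_by_depth`, the input layout (one tuple per cell holding its read depth and the label from each demux method), and the bin edges are all assumptions made for the example.

```python
from collections import defaultdict


def agreement_by_depth(cells, bin_edges):
    """Percent agreement between two demux methods per read-depth bin.

    cells: iterable of (read_depth, label_method_a, label_method_b).
    bin_edges: ascending edges; bin i covers [bin_edges[i], bin_edges[i + 1]).
    Cells whose depth falls outside every bin are ignored.
    """
    tallies = defaultdict(lambda: [0, 0])  # bin index -> [agreeing, total]
    for depth, label_a, label_b in cells:
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= depth < bin_edges[i + 1]:
                tallies[i][1] += 1
                if label_a == label_b:
                    tallies[i][0] += 1
                break
    return {
        (bin_edges[i], bin_edges[i + 1]): 100.0 * agree / total
        for i, (agree, total) in sorted(tallies.items())
    }


# Hypothetical toy input: (read depth, reference demux label, souporcell label).
example_cells = [
    (6000, "animal1", "animal1"),
    (7000, "animal1", "animal2"),
    (20000, "animal2", "animal2"),
    (50000, "animal3", "animal3"),  # outside the bins, so not counted
]
example_result = agreement_by_depth(example_cells, [5000, 10000, 40000])
```

      Plotting the returned percentages against the bin midpoints would give the kind of curve described above; the choice of bins is left to the analyst.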

      Reviewer #2:

      2. Please discuss limitations of this approach in the manuscript. (1) To which extent, when SNPs are roughly present in the individuals of same species, SNP-based demultiplexing can be applied, e.g., individuals from an inbred strain (c57bl6 mice) would not work.(2) The authors experimentally tested two newt species using SNP-based demultiplexing. When multiple species are experimentally applied, may the cell/nuclei size variation cause problem?”

      __Our response:__ We agree with Reviewer 2 that this paper brings up many technical questions about the limits to which SNP-based demultiplexing will succeed. These limitations should be addressed more thoroughly in our Discussion section.

      Planned revisions:

      We will expand our Discussion to more fully discuss the predicted limits for SNP-based demuxing for separating pooled cells from genetically similar individuals. We referenced the single paper previously published which reported that Freemuxlet, a similar approach to souporcell, did not succeed when applied to cells pooled from multiple animals within an inbred mouse strain, but did succeed across mouse strains (though without any validation of results). We will expand this Discussion to address the expected effects of genetic diversity on the success of SNP-based demultiplexing methods. We will also note in this expanded Discussion that SNP-based demuxing worked in this paper on siblings (some of the Xenopus, some of the zebrafish), and other SNP-based demuxers have been used successfully for demuxing cells from closely related individuals including human siblings (scSplit) and human maternal/fetal pairs (souporcell). We will expand our Discussion to address the potential drawbacks of pooling cells from different species or tissue types including the possibility of a bias in scRNA-seq sample preparation methods. We expect that variations in cell or nuclei sizes between species could cause biases in cell capture depending on the scRNA-seq library preparation method, especially with microfluidic based scRNA-seq preparation methods. We will search for a dataset that would allow for synthetic pooling of inbred mouse data and, if available, put this through our synthetic pooling and demuxing pipeline. While other papers have reported that this does not work with other SNP demux tools, and comments on the souporcell GitHub (https://github.com/wheaton5/souporcell/issues/154) suggest that it does not seem to be working there either, we feel this would be a nice test/reference for showing the limitations for SNP-based demuxing in highly genetically similar individuals.

      (Reviewer #2)

      “3. What is the upper limit number of samples when using this model. Please make some estimation or discussion about it.”

      __Our response:__ We think this is a pressing question for the future of SNP-based demuxing and deserves further discussion in this manuscript. This is directly addressed by the authors of souporcell in a GitHub thread with regard to human samples (worked on 21 human samples, may work in up to 40). At this point, we have no reason to believe that the limit on sample numbers should be different in other species.

      Planned revisions:

      We will include discussion about potential limits for the maximum number of samples that can be pooled and demuxed using this approach. As discussed below in response to reviewer 3, we will quantify the genetic differences in pooled datasets in this manuscript in order to give readers an improved prediction of how well SNP-based demuxers are likely to work on their animals of interest. We will look for a previously published pooled dataset from zebrafish that includes multiple dozens of samples and attempt to SNP-demultiplex this pool. While we will be unable to validate the accuracy, given how well SNP-based demuxing has performed we can at least determine if cell origins are assigned.

      Reviewer #2: Minor comments:

      “1. Please add an algorithm principle of this model.”

      __Our response:__ Thanks for the suggestion; we will do so.

      Planned revision:

      We will direct readers to the algorithm principle of souporcell in the original paper and include a flowchart of our workflow for running souporcell piece by piece as we have done in the manuscript. As mentioned above, we will make clear how we made the necessary adjustments to the original souporcell pipeline to successfully apply it to datasets with various resources available in these species.

      Reviewer #2:

      “2. Give a clear definition of doublets including the ground truth and Souporcell result.”

      __Our response:__ We appreciate this recommendation. For the purposes of this paper our definition of a ‘doublet’ is a dataset represented by a single cell barcode that actually contains more than one cell. However, true doublets can be identified with absolute certainty only in our synthetically pooled datasets, because no demultiplexing approach used for benchmarking is 100% accurate. Therefore, ‘true doublet’ will refer to known doublets based on synthetically pooled dataset ground truths. Further, for our experimental datasets we will also use ‘confirmed doublet’ to refer to cells that were called doublets by both the ground truth and souporcell, and we will use ‘contested doublet’ to refer to cells in which the experimentally derived ground truth and souporcell result disagree about a potential doublet.

      Planned revision:

      We will insert a clear definition of doublets used in this paper as described above, including the complexity in identifying which doublets are real given the relationship between ground truth and the souporcell results for each experiment.
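
      The doublet terminology described in our response can be written down as a small decision rule. The sketch below is illustrative only: the function name, the boolean inputs, and the fallback label "not a doublet" are assumptions introduced for the example and are not terms from the manuscript.

```python
def classify_doublet_call(ground_truth_doublet, souporcell_doublet, synthetic=False):
    """Label one cell barcode using the doublet terminology of this response.

    ground_truth_doublet: True if the benchmarking ground truth calls a doublet.
    souporcell_doublet: True if souporcell calls a doublet.
    synthetic: True for synthetically pooled data, where the ground truth is
        known with certainty and souporcell's call does not change the label.
    """
    if synthetic:
        # Only synthetic pools allow doublets to be known with certainty.
        return "true doublet" if ground_truth_doublet else "not a doublet"
    if ground_truth_doublet and souporcell_doublet:
        return "confirmed doublet"  # both methods agree on a doublet
    if ground_truth_doublet != souporcell_doublet:
        return "contested doublet"  # the two methods disagree
    return "not a doublet"  # hypothetical label for barcodes neither method flags


confirmed = classify_doublet_call(True, True)
contested = classify_doublet_call(True, False)
known = classify_doublet_call(True, False, synthetic=True)
```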

      Reviewer #2:

      “3. Authors should indicate the time cost of running one round of such analysis, the minimal computational requirements?”

      __Our response:__ This is an important point and will be helpful to readers.

      Planned revision:

      We will add to the manuscript information on the required time, RAM consumption, and computational requirements for running various setups for souporcell.

      Reviewer #3: Major comments:

      “The manuscript makes a convincing case for the ability of a preexisting SNP-based demultiplexing tool, called souporcell, to demultiplex pooled samples. The study uses three methods for validation: 1. In silico data pooling; 2. Pooling of transgenic lines; 3. Pooling of cells tagged with CMOs (10x genomics). The results are consistent across experiments.

      The authors propose that souporcell is a solution for demultiplexing pooled samples whenever sample tagging methods are not feasible. Although the authors test this approach in several species and conditions, the validation does not cover all possible cases and situations, obviously. Indeed, the authors recommend potential users to run pilot validation experiments with a secondary demultiplexing methods.

      However, the manuscript would become more useful if the following points are addressed:

      First, what is the genetic relatedness of the individuals pooled in the experiments? What is the SNP frequency in the samples analyzed, and how does that compare to SNP frequency in mouse strains? (The number of SNPs in the VCF is reported in a supplementary table but not discussed in the main text). This point is extremely important: as the authors mention, it is not possible to demultiplex samples from the same mouse strain. Inbreeding is relatively common in laboratory species, even unconventional ones; therefore, information on genetic relatedness and SNP rate would help readers assess whether SNP-based demultiplexing has a good chance to work in their systems. Addressing this point does not require any additional experiments, and computing from the single-cell reads how many SNPs distinguish the individuals pooled here should be straightforward.”

      __Our response:__ We appreciate the comments raised by reviewer #3. These are valuable critiques and will greatly improve the manuscript.

      Planned revisions:

      We will expand our Discussion with a paragraph on the limits for genetic differences required for SNP-based demuxing to work, as mentioned in response to Reviewer 2. This will include references to Table 1 values on SNP numbers utilized in each analysis, and hypotheses on the absolute limits for genetic relatedness. We will expand Table 1B to include green monkey. As mentioned in response to Reviewer 2, if previously published data are available, we will also try applying souporcell to data from an inbred mouse line as a test of an extreme case of applying SNP-based demuxing to data from very inbred animals. We will more clearly annotate the known relationship between individuals in our experiments, and will discuss this within our Discussion. We will contact the zebrafish and axolotl authors and ask if these animals were siblings. We will identify and apply a method for quantifying the genetic relationship between individuals in each scRNA-seq experiment in this study, to enable us to provide readers with a quantitative measure of genetic diversity present in each experiment. This analysis should shed some light on the requirements for genetic variability in order for SNP-based demultiplexers to succeed.
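
      One simple way to express how many SNPs distinguish two pooled individuals is sketched below. The method for quantifying genetic relationships has not yet been chosen, so this is an assumption for illustration only: the function name, the site keys, and the genotype strings are hypothetical, and a real analysis would start from the genotype calls produced during demultiplexing.

```python
def count_distinguishing_snps(genotypes_a, genotypes_b):
    """Count SNP sites at which two individuals carry different genotypes.

    genotypes_a, genotypes_b: dict mapping a site identifier to a genotype
    string such as "0/0", "0/1" or "1/1". Only sites called in both
    individuals are compared.
    Returns (number of differing sites, number of shared sites).
    """
    shared_sites = genotypes_a.keys() & genotypes_b.keys()
    differing = sum(1 for site in shared_sites if genotypes_a[site] != genotypes_b[site])
    return differing, len(shared_sites)


# Hypothetical toy genotypes for two pooled animals.
animal_1 = {"chr1:100": "0/0", "chr1:200": "0/1", "chr2:50": "1/1"}
animal_2 = {"chr1:100": "0/1", "chr1:200": "0/1", "chr3:10": "0/0"}
n_differing, n_shared = count_distinguishing_snps(animal_1, animal_2)
```

      Dividing the first value by the second gives a per-pair fraction that could be compared across experiments, keeping in mind that it depends on which sites were covered by reads.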

      Reviewer #3:

      “Moreover, the relatively limited number of samples pooled does not validate the use of souporcell with a larger number of samples. For example: in developmental studies, often dozens of embryos are collected and pooled. What are the potential caveats of using souporcell for demultiplexing larger number of samples? The Discussion would be a good place to warn potential users of the limitations of the approach.”

      __Our response:__ We agree this could still be a limitation, and for developmental studies with multiple dozens of samples, further exploration of optimal demultiplexing methods or the combination of computational and wet-lab based demux methods may be required.

      Planned revision:

      We will expand our Discussion on predicted limits for SNP-based demuxing of high sample pools, as discussed in response to Reviewer 2. We agree that developmental projects often involve pooling large numbers of samples, so it is worth clearly outlining the benefits and risks of planning to use SNP-based demultiplexing on such high sample pools, and to outline the limits as discussed by the developer of souporcell. As stated above, we will work to identify a previously published pooled zebrafish dataset with multiple dozens of samples and run souporcell on it. While this will not provide any validation it will at the least determine if we are able to assign cell origins, which have thus far been very reliable when assignments have been made.

      Reviewer #3: Minor comments:

      “- is the accuracy of doublet detection rate a function of number of samples? This can be tested by repeating the monkey in silico experiment with three individuals.”

      __Our response:__ This is a good question. We do not think that the number of samples substantially affects the accuracy of doublet detection by souporcell, but we will test this.

      Planned revision:

      As suggested, we will repeat the monkey analysis with 3 samples to see how this changes doublet detection. Overall, due to the low quality of doublet detection by souporcell found in this manuscript, we will expand our Discussion of doublet detection to propose some potentially useful recommendations for making conservative doublet calls with programs external to souporcell (addressed above in response to Reviewer 2). We expect that the more substantial filtering of the monkey datasets relative to the zebrafish dataset prior to pooling contributed to this question. To make these differences more obvious we will more deliberately emphasize the differences in dataset filtering for each experiment.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      4. Description of analyses that authors prefer not to carry out

      From Reviewer 1:

      “More generally, showing more direct evidence for the variability of different cell types (not just the immune system) could be informative for scRNA-seq users.”

      __Our response:__ We do not plan to conduct extensive analyses of other published single cell datasets to provide a further reason for why it is important to have biological replicates for single cell experiments. When building this manuscript, we chose not to pursue the option of publishing an analysis of published single cell datasets in which we could identify artifactual results and animal-to-animal variability, because we worried that this would be harmful to future open science efforts and, therefore, counterproductive. Further, past papers have already demonstrated the issue of batch effects and animal-to-animal variability in scRNA-seq datasets, and the requirement for biological replicates to facilitate differential expression analysis. As mentioned above, we will do a better job citing the papers that address these points.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript "An Sfi1-like centrin interacting centriolar plaque protein affects nuclear microtubule homeostasis" by Wenz and co-authors describes the detection and analysis of the Sfi1-like protein in apicomplexan parasite Plasmodium falciparum. The authors examined the protein localization and function in asexual stages during parasite replication in the red blood cells. The authors detected PfSlp in the PfCentrin1 pulldown, created PfSlp conditional knockdown strain, and evaluated growth and morphological deficiencies associated with the PfSlp deficiency. The study's primary finding is that PfSlp inhibits the extension of nuclear MTs.

      Major comments

      The key conclusion is appropriate but is poorly supported by experimental evidence. The transitional, experiment-to-experiment conclusions are preliminary and may require additional experiments. The authors did not present a convincing model of the PfSlp1 function in mitosis.

      We appreciate the reviewer’s evaluation that our key conclusions are appropriate, but also have taken some of the valid comments below into account and added some conclusive experimental data and partly modified the choice of words when interpreting the data. We are now fully convinced that our conclusions are appropriate and supported by experimental evidence. To understand the function of PfSlp, which was described for the first time in this study, precisely will require a more detailed model of the still very much understudied malaria parasite centrosome and will be the subject of future inquiries.

      If PfSlp inhibits the MT polymerization, then the PfSlp reduction should lead to an extension of the bipolar spindle, which is partly supported by longer MTs in the hemispindles. How is the excess of the nuclear MTs prevent the spindle resolution in anaphase?

      Intranuclear hemispindle microtubules are indeed elongated. Increased microtubule polymerization does not necessarily lead to an increased spindle length but could just as well promote the nucleation of multiple short microtubules or increase overlap between antiparallel microtubules. We, however, want to emphasize that our key conclusion is that PfSlp is implicated in the regulation of nuclear tubulin levels, rather than “inhibits extension of nuclear MT”. In our view this is an important distinction since microtubule misorganization is merely a consequence of changing nuclear tubulin levels. At no point do we want to suggest that PfSlp somehow directly inhibits polymerization of microtubules, and we therefore did not provide any specific evidence. The fact that PfSlp and microtubules are in different compartments underlines this. Yet, we have noted that our abstract uses the word polymerization. Although we mention that it occurs as a consequence of increased tubulin concentration, which thermodynamically favors microtubule polymerization, we acknowledge that this could be misleading and removed this term (line 30). Concerning how the excess nuclear MTs prevent anaphase spindle resolution, we propose several explanations in the discussion (lines 381ff). All line numbers refer to the document with “tracked changes”.

      Fig 4C misrepresents mitotic phases: bipolar spindle should be broken into two in anaphase, while the drawing shows one elongated spindle connecting two poles.

      Indeed, we ourselves frequently observed anaphase spindles being “split” (Simon et al. LSA, 2021, Fig. 2A), although sometimes we would see one elongated spindle and sometimes more than two, as in Liffner et al. 2021, Fig. 3A. For simplicity we only drew one elongated interpolar microtubule bundle but have now corrected this for a more accurate representation.

      The authors should correct the use of terminology. Throughout the manuscripts, the parasite division stages are called life stages. Life stages are merozoites, gametocytes, ookinetes, sporozoites, etc. The division stages apply to a single life stage and, in the case of schizogony, are rings, trophozoites, and schizonts.

      We once falsely referred to life cycle in line 182 when we should have referred to the intraerythrocytic development cycle. The paragraph using the incorrect wording was removed in the revision.

      Please, note that schizogony does not follow the ring and trophozoite stages (line 119); it includes them as the distinctive morphological stages of one round of schizogony. The cell cycle terminology is incorrectly applied.

      We have the impression that the usage of the term schizogony is rather “fluid” in that it is occasionally also employed to describe just the phase where DNA replication, nuclear division, and cytokinesis occur (hence schizont stage), but we clearly note the more canonical use as an equivalent of the asexual intraerythrocytic development cycle as a whole. We modified the terminology accordingly (e.g. by employing “schizont stage”) in lines 43, 142, 184, 238, 265.

      What is the "mitotic spindle stage," "mitotic spindle nuclei, "or "mitotic spindle duration" (Fig. 4B)?

      It has now been conclusively demonstrated that nuclei go through independent nuclear cycles with different morphological stages (Simon et al. 2021 LSA, Klaus et al. 2022 Sci Advances). Hence, we use the term “mitotic spindle stage” to contrast it with the “hemispindle stage”, which can be morphologically distinguished using microtubules as a marker and occurs just prior to S-Phase. Consequently, “mitotic spindle nuclei” are nuclei in the “mitotic spindle stage”. “mitotic spindle duration” designates the time nuclei spend in that stage i.e. from hemispindle collapse until anaphase spindle elongation. We have adjusted and more accurately defined the terminology throughout the text and complemented Fig. 1A for clarity.

      Minor comments

      The PfSlp knockdown is inefficient: the 55% reduction at the RNA level translates into a minor change at the protein level (Fig.2 and S4). The evaluation of the protein changes should be done by western blot analysis with appropriate controls. The intensity of the IFA signal (used in the study) changes depending on the focal plane, as seen in Fig 1D.

      Due to the exceptionally large size of PfSlp (around 407 kDa) and its low expression levels, western blot analysis was not feasible in our hands. For quantification of the IFA signal we used image projections and background subtraction to integrate the signal of the full z-stack containing the entire cell, and our measurement was therefore independent of the focal plane. We have now described this a bit more thoroughly in the methods section (lines 620ff). The change in signal as measured by IFA is still clearly significant and shows a reduction of about 45%, which is consistent with the reduction of 55% found by RNA analysis and ultimately results in a specific phenotype.
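
      Why integrating over the full z-stack removes the dependence on the focal plane can be illustrated with a toy calculation. This is only a sketch under stated assumptions: it uses a sum projection and a single constant background value per pixel, whereas the projection type and background estimate actually used are those described in the methods section; the function name and the toy images are hypothetical.

```python
def integrated_signal(z_stack, background_per_pixel):
    """Background-subtracted signal integrated over a whole z-stack.

    z_stack: list of 2D planes (lists of rows), all with the same shape.
    background_per_pixel: assumed constant background intensity per pixel
        and per plane.
    """
    n_planes = len(z_stack)
    n_rows = len(z_stack[0])
    n_cols = len(z_stack[0][0])
    total = 0.0
    for r in range(n_rows):
        for c in range(n_cols):
            # Sum projection along z for this pixel, then remove background.
            projected = sum(plane[r][c] for plane in z_stack)
            total += projected - n_planes * background_per_pixel
    return total


# The same bright spot placed in the first or in the last plane of the stack.
dim_plane = [[1, 1], [1, 1]]
bright_plane = [[10, 1], [1, 1]]
spot_in_first_plane = [bright_plane, dim_plane, dim_plane]
spot_in_last_plane = [dim_plane, dim_plane, bright_plane]

signal_first = integrated_signal(spot_in_first_plane, background_per_pixel=1)
signal_last = integrated_signal(spot_in_last_plane, background_per_pixel=1)
```

      Both stacks give the same integrated value, whereas measuring a single plane would not.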

      Growth defects of the PfSlp KD: It is unclear what causes the reduced parasitemia of the GlcN untreated Slp parasites (Fig. 2C and D).

      A likely explanation is that the C-terminal tagging of PfSlp already slightly impairs the function of the protein causing a mild growth phenotype that is not observed in wild type although it is not statistically significant (Fig. 2C). Importantly, the reproduced analysis of parasite growth, shown as multiplication rate in Fig. 2C (and growth curve in Fig. S6) now more clearly demonstrates that when normalizing for GlcN treatment and GFP-glms tagging (“3D7 corr.”) the growth defect is still significant and can therefore be attributed to Slp KD and not to tagging or GlcN treatment addition, which on their own do not cause a significant phenotype.

      To conclude that the kinetics of DNA replication is affected, the authors will need to perform the real-time measurements of DNA replication forks.

      We thank the reviewer for pointing this out and removed the term “kinetics” (line 182, 269).

      The presented data supports that fewer S/M rounds were performed by PfSlp lacking parasites but gives no way to determine whether the S or the M phase was affected.

      We thank the reviewer for this valuable comment. Our data so far showed that the very first spindle extension, and therefore M-Phase, is clearly affected (Fig. 4A-B). If the first division fails all subsequent S phases and M phases might be affected at the population level. To test whether S-phase is affected we now acquired time lapse imaging of single cells labeled with the quantitative DNA dye 5-SiR-Hoechst and saw no difference in DNA signal increase for PfSlp KD parasites, while nuclear number was reduced, showing directly that M phase rather than S-Phase is affected (Fig. 4C, lines 280ff).

      DNA quantification graph (Fig. 2D) is confusing and does not correlate with the quantification of merozoites (Fig. 2E). Why is the DNA intensity of Slp- parasites lower than the DNA intensity of the Slp+ parasites, even though Slp deficient line produces less progeny? Is it possible that you missed the actual peak of DNA replication? Authors may consider more tight time courses with a few additional time points.

      This is a good point. We have repeated this experiment with a longer sampling time and shorter intervals. We now plot the fraction of cells with DNA content above 2N (also to exclude double infections and cells that arrest prior to the schizont stage) as a measure of how many cells are replicating (Fig. 2D, lines 175ff). Although the replication peak was, as observed before, shifted by GlcN treatment, we found no significant differences in height, although the lack of PfSlp tagging and GlcN treatment in the 3D7- control might favor slightly more productive replication. We complement this analysis by plotting the average DNA fluorescence intensity over time (Fig. S7A) and the area under the curve (see below), as an approximation of “total replication activity”, and still found no significant differences (Fig. S7B). The fact that the DNA fluorescence intensity peak does not correlate with the slightly reduced merozoite number observed in Fig. 2E is not very surprising, as fixed time point sampling for DNA quantification cannot distinguish cells that are slowing or even halting progression, which confounds the averages. This limitation of single-timepoint population analysis specifically highlights the importance of our time-resolved single-cell analysis presented later in Fig. 4, which clarifies the phenotype. Further, merozoite number counting does not give any insight into the ploidy of the individual merozoites. Considering the significant nuclear division defect we also show in Fig. 4, it is plausible that some merozoites in the Slp KD could be polyploid, while globally replication is not strongly affected.
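
      The two summary measures described in this reply (fraction of cells above 2N and area under the average-intensity curve) can be illustrated with a minimal sketch. It is not the analysis code used for the manuscript: the function names, the expression of DNA content in units of N, and the use of the trapezoidal rule for the area are assumptions made here for illustration.

```python
from typing import Sequence


def fraction_above_2n(dna_content: Sequence[float], threshold: float = 2.0) -> float:
    """Fraction of cells whose DNA content (assumed to be in units of N) exceeds the threshold."""
    if not dna_content:
        raise ValueError("no cells measured")
    return sum(1 for c in dna_content if c > threshold) / len(dna_content)


def area_under_curve(times: Sequence[float], values: Sequence[float]) -> float:
    """Trapezoidal area under a time course (assumed proxy for 'total replication activity')."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matching time points")
    return sum(
        (times[i + 1] - times[i]) * (values[i] + values[i + 1]) / 2
        for i in range(len(times) - 1)
    )
```

      Applied per condition and per replicate, both functions return a single number that can then be compared between conditions with whatever statistical test the study design calls for.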

      Given the main claim, the study lacks the spatial-temporal analysis of tubulin described only in words. The tubulin quantifications by WB (Fig. S6) are not convincing, as well as the resulting conclusion of the cell cycle retardation.

      We are not completely sure what the reviewer is indicating by a lack of spatial-temporal analysis of tubulin given that we show time-resolved imaging data of tubulin organization in dividing cells and quantify intranuclear tubulin levels. Those data (particularly Fig. 4A) clearly show a retardation in the mitotic spindle stage. We, however, acknowledge that the data on tubulin quantification via western blot could, as Reviewer 2 also points out, be improved through the addition of biological replicates. We have repeated those experiments twice and can now confirm by statistical analysis that total tubulin, aldolase, and centrin protein levels are not affected by Slp KD at 24, 30, and 36 hpi (Fig. 3E, Fig. S8, lines 232ff). This indicates that the increase in intranuclear tubulin is not a consequence of globally increased tubulin expression.

      It is unclear how the authors arrived at the conclusion that the mitotic spindle is deficient in PfSlp KD parasites. Fig. 3C does not show visible differences in GlcN treated and untreated parasites.

      PfSlp KD parasites show unusual microtubule protrusions branching off the main microtubule mass, which have never been observed in wild type parasites. This should have been indicated more clearly by adding an arrow in Fig. 3C. We further think that our observations that the tubulin content in mitotic spindles is almost three times higher on average than in wild type spindles (Fig. 3D) and that those spindles do not properly extend (Fig. 4A-B) justify this claim.

      How many nuclei are in the cells shown in figure 4 and supplemental movies? It seems as if GlcN treated Slp parasites form one long spindle.

      In a previous study (Simon et al. 2021, LSA, Fig. 1B) we have demonstrated that the number of distinct microtubule foci, i.e. mitotic spindles, observed in cells corresponds directly to the number of nuclei. Hence we can assume that prior to successful spindle extension in the PfSlpKD there is one nucleus or two nuclear masses that are in the process of separation. We have now added new time-lapse microscopy data of DNA- and tubulin-stained parasites that confirm that arrested Slp KD parasites fail to properly divide their nuclei (Fig. 4C, Mov. S4-5) and confirm our previously published findings about nuclear number.

      A majority of PfSlpKD parasites indeed seem to form one long spindle. However, this “long spindle” appears only after a significant time delay during which wild type parasites already have undergone multiple nuclear divisions and could be a downstream effect of this retardation through e.g. increase of total tubulin levels over time (Fig. 3E).

      The conclusion of anaphase block is unsupported: the authors need to demonstrate the accumulation of the metaphase nuclei with a bipolar spindle.

      Anaphase describes the phase of chromosome segregation and includes the full extension of the spindle, as discussed above, both of which fail in more than half of the PfSlpKD parasites (Fig. 4A, Mov. S3, S5). This is therefore interpreted as “failure to properly progress through anaphase” for the first time in the discussion (line 381). We currently cannot think of a more direct way to demonstrate this than by time-lapse imaging of the very first mitosis in individual parasites. Any analysis of populations at later time points or using fixed cells will be skewed by the phenotype occurring in the very early stages of nuclear division.

      Reviewer #1 (Significance (Required)):

      The eukaryotic centrosome is a microtubule organizing center that guides the segregation of duplicated chromosomes. Despite being an essential regulator of the parasite division, the apicomplexan centrosome remains poorly understood. Recent studies in Toxoplasma gondii (Suvorova et al., 2015) and Plasmodium species (Simon et al., 2021) demonstrated high diversity of the centrosome organization making the studies of microtubule organizing centers in apicomplexans, particularly challenging. Examining the protein composition is one of the ways to uncover organelle function. The current study would help to understand the evolution of the MTOC and mechanisms of cell division in understudied eukaryotic models.

      The focus of my research is the apicomplexan cell cycle. I previously showed the bipartite organization of the Toxoplasma centrosome and identified and characterized several centrosomal constituents, including centrin partner Sfi1. Our most recent study presented evidence of the functional spindle assembly checkpoint in Toxoplasma tachyzoites.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Plasmodium falciparum parasites undergo several rounds of asynchronous nuclear divisions to produce daughter cells. This process is controlled by the centriolar plaque, a non-canonical centrosome that functions to organize intranuclear spindle microtubules. The organization and composition of this microtubule organizing center is not well understood. Here, Wenz et al. identify a novel centrin-interacting protein, PfSlp, that, following knockdown, leads to fewer daughter cells and aberrant intranuclear microtubule homeostasis and organization.

      Wenz et al. identify PfSlp via co-immunoprecipitation of P. falciparum 3D7 strain with an episomally expressed PfCen1-GFP, noting PfSlp as a gene of interest based on the presence of several centrin-binding motifs. The authors go forward to generate a transgenic 3D7 strain, equipping PfSlp with GFP and glmS ribozyme, to localize and evaluate the function of PfSlp in asexual blood stage parasites. PfSlp appears to, using immunofluorescence and STED microscopy, localize to the outer centriolar plaque in schizonts, based on its colocalization with PfCen3. The authors show, utilizing the inducible glmS ribozyme knockdown system, that PfSlp is required for proper parasite growth, noting a defect following addition of GlcN. This defect is noted to cause a delay in the initiation of nuclear division, or schizogony. Analysis of intranuclear microtubule dynamics reveal abnormal microtubule organization, specifically an increase in nuclear microtubule abundance and length following PfSlp knockdown. Together, these findings characterize the role of a novel protein, PfSlp, that contributes to nuclear tubulin homeostasis and organization during schizogony.

      Major comments:

      The major claims made by Wenz et al. are largely convincing with the data provided.

      1. One area that requires additional attention is the following: Wenz et al. claim PfSlp and centrin to be interacting partners based on 1) co-immunoprecipitation (without prior protein crosslinking), 2) the presence of centrin-binding motifs in PfSlp and 3) colocalization of PfSlp and PfCen3. This interaction is not interrogated fully and claims specific to this point need to be clarified and described as preliminary. As it is written, Wenz et al. claim PfSlp is required for centrin recruitment to the centriolar plaque but this is not investigated fully. The data show lower levels of endogenous centrin at the centriolar plaque in PfSlp knockdown parasites but centrin protein levels are similar in wildtype and knockdown PfSlp parasites. As is, the phenotype attributed to PfSlp knockdown could be attributed to PfSlp or aberrant centrin recruitment to the centriolar plaque. Experiments manipulating PfSlp centrin-binding motifs would strengthen these claims and elucidate the role of PfSlp apart from centrin. If not included, less emphasis should be placed here.

      We agree with the reviewer that additional evidence to demonstrate the direct interaction between PfSlp and centrin would be adequate. Due to the presence of multiple widely spaced centrin binding motifs in PfSlp, which would require multiple highly challenging rounds of genome editing to be modified, we have opted for reciprocal co-IP using PfSlp-GFP (line 139, Fig. S3, see below). The exceptionally large size of PfSlp of 407 kDa and low expression prevented us from detecting it directly on the western blot, but we found a clear centrin band in the Slp IP that was absent in the control.

      We have also further qualified our formulation about centrin recruitment depending on PfSlp (lines 138, 146). Finally, we agree that there are many factors downstream of PfSlp that can contribute to the observed phenotype, which might include centrins and will be subject of future investigations.

      The 3.5 mM glucosamine has some toxicity in the parental 3D7. Is it possible to use a lower concentration so the growth of 3D7 is unaffected but the grow of the Slp-GFP GlmS parasites is still reduced?

      We acknowledge that the Glucosamine concentration used is at the higher end of the classically used range. The slight toxicity of Glucosamine is dose-dependent and only vanishes at submillimolar concentrations. During initial experiments we found that 3.5 mM generated a robust phenotype and decided to carry out all experiments at this concentration. We think that the added effect of PfSlpKD over GlcN treatment alone is sufficiently shown, as e.g. the merozoite number phenotype (Fig. 2E) and the mitotic delay (Fig. 4B) only occur in Slp+ parasites.

      Fig 3E - the quantification of tubulin levels requires biological replicates to have means and error bars.

      We fully agree with reviewer 2 (and reviewer 1 who commented along the same lines) and now generated two more biological replicates that allow us to confirm by statistical analysis that total tubulin, aldolase, and centrin protein levels are not affected by Slp KD at 24, 30, and 36 hpi (Fig. 3E, Fig. S8, lines 235ff).

      The use of "centrin" is somewhat imprecise throughout. The authors should specify which centrin (PfCentrin1 or PfCentrin3 or others) they are referring to each time in the text.

      Thank you for requesting this clarification. We have used “centrin” on purpose but have failed to properly explain our terminology in the text. For the detection of endogenous centrin we use a polyclonal antibody raised against PfCentrin3 (Simon et al. 2021). Due to the very high sequence identity between PfCentrin1-4 we can’t exclude cross-reactivity of any polyclonal antibody. Throughout the field so far, polyclonal antibodies raised against Chlamydomonas centrin and Toxoplasma centrin 1 have been successfully used to label the centrin pool at the centriolar plaque. Since we can’t distinguish with certainty which of the centrins (PfCen1-4) is targeted, we chose the general description “centrin”. We were, however, able to show that all four centrins (PfCen1-4) colocalize at the centriolar plaque (Voss et al. biorxiv, /10.1101/2022.07.26.501452). That Plasmodium centrins interact with each other was demonstrated previously (Roques et al. 2019), while the interaction between PfCen1 and PfCen3 was shown in this study. Therefore, this will not limit our conclusions. We now explain this better in the text (lines 132ff) and have adjusted the labeling in Fig. 1E.

      The mention of the cell cycle checkpoint is an interesting and appropriate point in the discussion. However, the discussion of it in the last sentence of the introduction is less appropriate. It should be removed from line 92-93.

      We are excited by the prospects of this study to finally be able to investigate the presence of checkpoint induced delays using time-lapse microscopy, but absolutely agree with the reviewer and have removed the statement in the introduction.

      Minor comments:

      1. Line 50 - "are remaining unclear" should be "remain unclear"

      Has been corrected.

      Line 65 - "players" is quite informal. A better word should be selected.

      Was replaced with “factors”.

      Line 223 - "were" should be "where"

      Has been corrected.

      The delay in schizogony which is observed following addition of GlcN (Figure S5) may be made more convincing if the experiment is performed hours post invasion rather than hours post treatment. The synchronization of the parasites is in question as it is described in the methods.

      We had included this data from our initial exploratory analyses and, since it was not central to our argumentation, we chose to add it as a supplemental figure. After producing further data, we came to realize that the classical morphological characterization using Giemsa staining partly misrepresents the relevant transition from the pre-mitotic to mitotic stages, as the onset of first spindle formation and DNA replication can’t be detected. Previous studies have also indicated that parasites which were drug-arrested at the trophozoite to schizont transition were morphologically similar to mid- to late schizonts (Naughton and Bell, 2007). In a context that investigates nuclear division phenotypes we feel that this analysis might rather be misleading and that the provided growth assays, DNA replication quantification, and time-lapse movies are significantly more informative. Therefore, we have decided to remove the figure altogether. However, we have moved Fig. S7 to Fig. 4 to show the results of the 3D7+GlcN movie quantification in the context of the Slp+/-GlcN results.

      In general, data presentation is clear and readable. The growth defect observed following GlcN treatment (Figure 2C) could be made more clear with data normalization to emphasize that which can be attributed to PfSlp knockdown and not GlcN.

      This is a good suggestion and we have reproduced the initial dataset (Fig. 2C, Fig. S6, see below) and normalized the 3D7 multiplication rate, which shows the effect more directly than the growth curves displayed before, for Slp-tagging and GlcN treatment (“3D7 corr.”). We still found Slp +GlcN to be the only condition to have a significant reduction in multiplication rate in the first cycle after treatment (24-72hpi) with respect to 3D7 control as well as the normalized 3D7 value (“3D7 corr”).

      Line 276 - Why is nuclear tubulin homeostasis more relevant for closed mitosis? This is difficult to understand. It should be phrased differently or provided with additional explanation.

      We thank the reviewer for the comment and agree that this is poorly formulated. We meant to express that in e.g. mammalian cells the nuclear envelope is disassembled during mitosis, which removes the need to regulate import of tubulin into the nucleus for spindle assembly. This is a self-evident statement and has been removed for clarity.

      Line 316 - "were" should be "was"

      Has been corrected.

      The identity, source, and dilution for each antibody must be reported for each use in the methods.

      We noticed that we had not fully referenced Table S3, where we listed all used antibodies and dilutions, which we have now done throughout the methods section.

      Reviewer #2 (Significance (Required)):

      The mechanisms by which intranuclear microtubule dynamics are regulated by Plasmodium falciparum parasites are not well understood. Furthermore, the proteins that are present near the centriolar plaque remain mostly unknown. Understanding the role of the Plasmodium centriolar plaque and its members is critical to describing these dynamics and contributes to our growing understanding of schizogony, an atypical mode of cell division with several rounds of nuclear division lacking cytokinesis. Therefore, the identification and initial characterization of PfSlp1 is useful for the malaria parasite cell division community.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The work by Wenz and Simon approaches the function of a novel component of the malaria parasite centriolar plaque, a structure whose complexity has begun to be unraveled only recently, greatly by the same group. The authors identify a homolog of Sfi1, a centrin binding protein highly conserved in eukaryotes. Sfi1 homologues usually co-localize with centrioles.

      As a tool to characterize its function, the authors use a conditional knockdown strategy, based on GlcN addition, to downregulate PfSfi1-like protein (PfSlp). The authors analyze the impact of PfSlp downregulation on cell division progression, and go on to characterize in detail the progression of mitotic nuclear division. In sum, the study finds that expression of Slp1 is required for proper progression of cell division in Plasmodium parasites.

      The study is well conducted, and the manuscript clearly written. In general terms I found the data shown to support the authors' claims. However, I do have a few points of concern to raise, particularly pertaining to overinterpretation of the data, and points that need clarification before the manuscript is fit for publication. In particular the authors should explain more clearly how the data based on fluorescence intensity quantifications were acquired and processed, and how this information is intertwined with the expected kinetics of the structures measured along the cell cycle.

      We appreciate the positive feedback and the constructive comments made by the reviewer and now adapted our interpretation of the data or provide additional experimental data to strengthen our argumentation as outlined below. Further we have added some detail to the description of our experimental approaches in the methods section.

      I outline below major and minor points that require attention,

      Major Points

      The manuscript stems off the premise that PfSlp interacts with PfCen1. Despite the fact that Sfi1 is a known interactor of centrin, that the identified protein in Plasmodium has centrin binding motifs, and these proteins co-localize, the support for the direct interaction between the two proteins is based solely on the IP/MS result. No reciprocal IP results are shown.

      We thank the reviewer for the suggestion and have now added the reciprocal co-IP, which shows a specific interaction between PfSlp and centrin without need for cross-linking (Fig. S3, see also reply to comment 1 by reviewer 2).

      Line 118 specifies that co-localization of Slp-GFP with centrin "corroborates their direct interaction." Co-localization most certainly does not show direct interaction. In addition, Figure 1D shows co-localization with Cen3, not with Cen1, which was the only protein shown to have a physical interaction with Slp via immunoprecipitation. Hence, the claim is unplaced and this section should be reworded for clarity.

      The reviewer is correct to point out that co-localization even at STED nanoscale resolution does not demonstrate interaction. We have reworded this statement. Cen3 was the only other specific protein found in the Cen1 immunoprecipitation (Table S1) and the interaction between the four centrins Cen1-4 was shown in an earlier study in P. berghei (Roques et al. 2019). However, as Reviewer 2 also indicated, we did not clearly communicate what the targets of our centrin antibody are. We indeed used an antibody raised against PfCen3. Due to the very high sequence identity between centrins it is, however, unrealistic to exclude cross-reactivity between centrins for a polyclonal antibody (as explained in more detail in our response to Reviewer 2). We have added an explanatory statement in the main text (lines 132ff). Our recent finding that GFP-tagged PfCen1-4 all colocalize at the same position in the centriolar plaque (Voss et al. biorxiv, /10.1101/2022.07.26.501452) and our previously published study of the centriolar plaque (Simon et al. 2021) give us additional confidence that the antibody specifically labels the compartment of interest.

      I was surprised to see how little recovery of PfCen1-GFP the authors obtained from their IP experiments. Whilst I understand that a western blot is not quantitative, I wonder, were the amounts of protein loaded onto each lane normalized for comparative purposes in any way? Please comment on this at least in the figure legend so the reader can gage whether the little PfCen1-GFP recovery was a consequence of the IP experiment, or whether the WB is not representative of the actual IP results but rather show a fraction of the recovered material.

      We did not determine the total protein concentration (by e.g. Bradford assay) and therefore did not normalize for protein amounts per lane. Instead, we determined the number of infected red blood cells per ml before Saponin lysis of the red blood cells and loaded protein lysate equivalent to 1 x 10^7 cells per lane. We now explain this more clearly in the legend for Fig. S1. During the IP, much of the total protein might have been lost during the washing steps, which might explain the weak Centrin1-GFP band and the absence of a protein signal in the eluate lane by Ponceau staining (neither a signal for Centrin1-GFP nor unspecific protein signal in the Ponceau). We would conclude that the WB, or at least the lane with the eluate, shows a fraction of the recovered material.
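
      The cell-equivalent loading described in this reply comes down to simple arithmetic, sketched here for illustration. The function name, the parameter names, and the example numbers are hypothetical and are not taken from the manuscript; only the target of 1 x 10^7 cells per lane comes from the text.

```python
def lysate_volume_per_lane_ul(
    irbc_per_ml: float,
    culture_volume_ml: float,
    lysate_volume_ul: float,
    cells_per_lane: float = 1e7,
) -> float:
    """Volume of lysate (in microliters) that corresponds to `cells_per_lane` infected cells.

    Assumes all counted infected red blood cells end up in the final lysate volume.
    """
    total_cells = irbc_per_ml * culture_volume_ml
    cells_per_ul = total_cells / lysate_volume_ul
    return cells_per_lane / cells_per_ul


# Hypothetical example: 5e7 infected cells/ml in 10 ml of culture, lysed into 500 µl.
volume = lysate_volume_per_lane_ul(5e7, 10, 500)
```

      Loading by cell equivalents keeps lanes comparable in terms of input cells, but it does not correct for protein lost during later steps such as IP washes.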

      If the WB is indeed representative of the actual PfCen1-GFP recovery rates, I suggest you discuss the possible outcomes of having pulled down so little from the total cell lysate - could it be that the recovered proteins are representative of interactions happening only for a subset of soluble PfCen1 molecules? Can the little protein recovery be explained by Cen1 interactions with insoluble cell components such as the cytoskeleton?

      As described above, the eluate lane likely does not represent the actual amount of Cen1-GFP that was pulled down and therefore the WB is not representative of the PfCentrin1-GFP recovery rates. Based on our previous studies we are not aware of any cellular PfCen1 pool besides the cytoplasm and the centriolar plaque, although other pools might exist below the detection limit. The reviewer raises an interesting hypothesis, but we don’t have sufficient data to assume an association with the cytoskeleton, and verifying this would require extensive further studies.

      Were other IP conditions tested? Were the same results obtained?

      We carried out three PfCen1-GFP IPs. Once without cross-linking as shown in the study and twice with cross-linking. The two IPs with crosslinking had different amounts of targets identified (24 vs 162). While we did not detect PfSlp in the one with the low number of peptides we detected PfSlp in the second IP. In both IPs we additionally detected PfCen2 and PfCen3.

      Do you get the same interactors if the IP is done using anti-Centrin instead of anti-GFP?

      We did not test an anti-Centrin antibody for IPs as the protocol from the Brochet group was optimized for the highly specific bead-coupled anti-GFP antibody.

      Please define how you identified "specific hits." This is, please describe your criteria for determining "specificity." Was it an all or nothing selection approach? Are Cen1, Cen3 and PfSlp significantly enriched? And if so, how did you define "enriched for" in the context of your experiment?

      We thank the reviewer for giving us the chance to clarify our candidate selection. We specifically selected the Cen1-GFP IP targets without cross-linking since it produced a short list of hits detected by mass spectrometry. We used an all-or-nothing approach in that we subtracted from that list any protein that was ever identified in a GFP control IP analysis by the Brochet lab using the same protocol (Balestra et al. 2021). This left only three proteins, Cen1, Cen3, and Slp, as our “specific” hits. We have modified the text to explain our selection criteria more explicitly (lines 112ff) while avoiding the term “enrichment”, since this is an all-or-nothing selection.
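
      The all-or-nothing selection described in this reply is a set subtraction rather than an enrichment score, as the following minimal sketch shows. The function name and the contaminant placeholders are hypothetical; the actual protein lists are those in Table S1 and the cited control data sets.

```python
from typing import Iterable, List


def specific_hits(ip_hits: Iterable[str], control_hits: Iterable[str]) -> List[str]:
    """Keep IP hits never seen in any control IP (all-or-nothing, no enrichment score)."""
    control = set(control_hits)
    return [protein for protein in ip_hits if protein not in control]


# Placeholder names only; "contaminant_*" entries stand in for unspecific binders.
ip = ["Cen1", "Cen3", "Slp", "contaminant_A", "contaminant_B"]
gfp_controls = ["contaminant_A", "contaminant_B", "contaminant_C"]
hits = specific_hits(ip, gfp_controls)
```

      Because a protein is dropped as soon as it appears once in any control, this filter is conservative: it can remove true interactors that are also common background binders.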

      I'm not at all suggesting here that you repeat this experiment. I understand that the focus of the manuscript is the description of PfSlp, and this stands regardless of the IP results. However, I suggest you include a lengthier discussion of the results shown in SFig1 and Fig1, and the limitations of the approach.

      We appreciate the assessment by the reviewer that the focus of the manuscript is otherwise and acknowledge that this is not an extensive analysis of PfCen1 interaction partners. We have, as requested, added a comment addressing this limitation in the discussion (lines 331ff).

      Line 123 mentions that Cen3 and Slp1 are recruited together only because they co-localize in most cells showcasing hemi-spindles. Please simply keep "simultaneously" here, as this is the only thing you can conclude from your quantification data. Being recruited "together" implicitly means by "the same mechanism", which is not shown by your data.

      We agree that simultaneously is more accurate and we have modified the text (line 146).

      Please specify which statistical test was used for determining significance in Figure S4, and what *** refers to in this case. It is hard to judge really how different these data sets are in light of the overlapping error bars. Also, what is quantified here? Integrated density from an immunofluorescence assay? How are the data normalized to be comparable? How many replicates did you quantify? Or are the data shown representative of a single experiment? I could not find these details in the M&M section or the figure legend.

      We have revisited all figure legends and now consistently define the p-value and number of replicates (usually N=3) and briefly explain the measurement. Further, we have extended the methods section to make our image quantification approach clearer.

      Also, on the interpretation of these data; If Slp1 causes a delay in cell cycle progression, and taking into account that the fluorescence intensity of Slp1 varies along the cell cycle, with Slp1 intensity increasing as cell cycle progresses from the ring stages onwards, are these comparable measurements? In other words, are you selecting the same stages whereby the same Slp1 intensities at the centriolar plaque would be expected?

      If I understand correctly these measurements are carried out at 55hs post GlcN addition (when the growth phenotype starts evidencing itself?). At this time point, the relative abundance of ring and trophozoite stages (stages at which Slp1 is not expected to be detectable at the CP) is considerably higher than that of the control condition, hence a reduction in Slp1 is expected, and a mechanistic claim about recruitment or stability would be incorrect. Please clarify.

      As the reviewer correctly points out, it is important to normalize for the stages when quantifying the PfSlp intensities. To achieve this, we only selected schizont stage parasites with a similar distribution of cells containing 3-10 nuclei between the conditions to ensure we are looking at comparable stages. We then quantified the integrated density at each individual centriolar plaque, designated by the presence of a centrin signal. Outside of centriolar plaques no PfSlp signal can be detected. As for ring and trophozoite stages, they do not have a discernible centriolar plaque, or at least not with the markers available in the field, and likely do not express PfSlp based on published transcriptomics data (Plasmodb.org). We have revisited the text to make our quantification strategy clearer (line 170, 621ff).
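
      The stage-matched quantification described in this reply can be sketched as follows. This is an illustration only and not the image analysis pipeline used in the study: the `Cell` structure, the representation of each centriolar plaque as a small pixel patch, and the omission of background subtraction are assumptions made here.

```python
from dataclasses import dataclass
from typing import List

Roi = List[List[float]]  # 2D patch of pixel intensities around one centriolar plaque


@dataclass
class Cell:
    n_nuclei: int
    plaque_rois: List[Roi]  # one patch per plaque, located via the centrin signal


def integrated_density(roi: Roi) -> float:
    """Sum of pixel intensities within the region of interest."""
    return sum(sum(row) for row in roi)


def plaque_densities(cells: List[Cell], min_nuclei: int = 3, max_nuclei: int = 10) -> List[float]:
    """Integrated density per plaque, restricted to cells in the chosen nuclear-number window."""
    return [
        integrated_density(roi)
        for cell in cells
        if min_nuclei <= cell.n_nuclei <= max_nuclei
        for roi in cell.plaque_rois
    ]
```

      Restricting both conditions to the same nuclear-number window is what makes the per-plaque values comparable; the distributions of nuclear number within that window would still need to be checked for similarity, as stated in the reply.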

      To understand the relative contribution of Slp1 to the growth delay phenotype, please include 3D7+GlcN control in the quantification of stages shown in Fig. S5. Please check how the data shown in Fig S5 was normalized; the 49 and 73hs bars in the -GlcN condition exceed 100%.

      As indicated in our reply to Reviewer 2, we only included this data from our initial exploratory analyses and, since it was not central to our argumentation, we chose to add it as a supplemental figure. After producing further data, we came to realize that the classical morphological characterization using Giemsa staining partly misrepresents the relevant transition from the pre-mitotic to mitotic stages, as the onset of first spindle formation and DNA replication can’t be detected. Previous studies have also indicated that parasites which were drug-arrested at the trophozoite to schizont transition were morphologically similar to mid- to late schizonts (Naughton and Bell, 2007). In a context that investigates nuclear division phenotypes we feel that this analysis might rather be misleading and that the provided growth assays, DNA replication quantification, and time-lapse movies are significantly more informative. Therefore, we have decided to remove the figure altogether. However, we have moved Fig. S7 to Fig. 4 to show the results of the 3D7+GlcN movie quantification in the context of the Slp+/-GlcN results.

      What is "centrin signal" shown in Figure 2B? Centrin1? Centrin 3? Please clarify which centrin protein you are referring to throughout the manuscript, or provide evidence that they could be interchangeably used for localization and intensity measurement experiments.

      We thank the reviewer for pointing out this vagueness. As explained above in the second major point and in the reply to reviewer 2 we use the term “centrin” to emphasize that we cannot be certain to which degree PfCen1,2,3 or 4 contribute to the signal. Our recent preprint (Voß et al. 2022) and Roques et al. 2019 and Simon et al. 2021 however suggest that all centrins co-localize and interact at the outer centriolar plaque. As mentioned we now discuss this in the text (lines 130ff).

      Line 149 outlines that Slp1 and centrin intensities are simultaneously reduced, and that this fact alone "affirms" they are part of one complex, and that this implies that Slp1 is somehow involved in centrin recruitment. This claim is not supported by the data shown. There are multiple possible explanations as to how the intensities of both proteins could simultaneously decrease without them forming the same structure, the same complex or even directly interacting. For example, if the centriolar plaque homeostasis is altered, or the "intensities" are simultaneously dependent on cell cycle progression, they will both be affected without necessarily ever interacting. In fact, if the centrin intensity monitored is that of Cen3, a direct interaction between Slp1 and Cen3 is not demonstrated at any time. At best, the authors could argue that both proteins are directly interacting with Cen1. Again, even this is no definitive proof that they form the same complex.

      The reviewer is correct to point out that there are multiple explanations for the decrease of centrin and Slp signal and we have phrased some of the relevant statements more carefully (lines 138, 146, 172). We, however, think that our new reciprocal co-IP data (Fig. S3) in combination with the already provided evidence now significantly strengthens our claim about the interaction between centrin and Slp.

      Measurements of DNA content, shown in Figure 2D, show that +GlcN Slp1 knockdown parasites exhibited reduced DNA amounts at 42hs post induction. These results are interpreted as "defects in nuclear division," however, 1. Nuclear division is not analyzed directly, but rather approximated by measuring DNA content. 2. Even in the presence of perfectly normal nuclear division, the DNA content reduction for these parasites at this time point is expected, as cell cycle progression is affected.

      Line 160 states that a reduction in merozoite number corroborates a defect in nuclear division. However, the data shown only quantifies merozoites per schizont. As mentioned above, nuclear division is not directly assayed.

      We thank the reviewer for emphasizing this important distinction (alongside Reviewer 1). Making the conclusion about nuclear division based on the reduced number of merozoites was premature and we have now phrased this more carefully (line 198). Even our data showing inhibition of spindle extension (Fig. 4A-B), although a strong indicator, do not strictly speaking observe nuclear division. Hence, we have added time-lapse imaging data of nuclear number in KD vs control conditions using the quantitative live cell DNA dye 5-SiR-Hoechst (Fig. 4C, Mov. 4-5). These data now clearly show that the nuclear division or M-phase is affected, while the increase of DNA signal, which represents replication, is not distinguishable from the control. This confirms that nuclear division is the initial and relevant phenotype.

      What the nuclear division defects observed are is unclear. Is there fusion, fission? loss of nuclear content? defects in mitosis completion? defects in DNA replication? A reduction in merozoites per schizont, with a concomitant reduction in overall DNA levels could also be explained by a general arrest in the final stages of division. Do other processes linked to nuclear division progress normally? For example, is there daughter cell formation during schizogony without the expected accompanying nuclear division? Are daughters forming in the correct number and position? Are there more daughter cells than nuclei? Or are parasites dying before completing schizogony and producing merozoites? These possibilities need to be carefully teased out before a nuclear division defect can be assigned as the sole causing factor of the division phenotypes observed.

      These are all very pertinent questions, some of which go beyond the scope of this very first characterization of PfSlp function, but we are keen to include those in our future analysis. Some of them we can answer, while we will try to offer an interpretation for the remaining ones:

      It isn’t fully clear to us what is meant by “Is there fusion, fission”. We will assume that the reviewer refers to the process of karyofission where the nuclear membrane is constricted and fused between the segregating chromatin masses. The field is still lacking a nuclear membrane marker, which makes a direct analysis of this question difficult. Under normal circumstances it has been demonstrated that mitosis is fully closed and the nuclei are completely surrounded by membrane right after division (Klaus et al. 2021). To maybe clarify further we use the term nuclear division to designate the formation of two physically distinct nuclei from one progenitor. We can’t and don’t comment on the integrity of the nuclear membrane and if we had to speculate, it is probably not affected.

      Our new data on DNA dynamics (Fig. 4C) show a delay in nuclear division while DNA replication seems unaffected in the early division stages. The failure to complete mitosis is also shown by the lack of proper spindle extension. It is possible that PfSlp KD affects final stages of division, but since we treat parasites at ring stages and detect a strong phenotype already at the very first division, which occurs only a couple of hours after centrin/Slp recruitment, one must assume that this is the defining phenotype, which likely has repercussions on later rounds of division. This makes it virtually impossible to clearly define late phenotypes. We actually have to assume that parasites that proceed to later stages of division do so because PfSlp KD was less efficient.

      Our data directly show that more than half of our PfSlp KD parasites “fail to properly divide their nucleus” in the first round of mitosis, and we therefore can’t construe this in any other way than to designate it as a “nuclear division phenotype”. We purposefully don’t comment on potential later phenotypes and an impact on cytokinesis (budding) but look forward to investigating this in the future.

      Minor Points

      • Line 49: consider "...mechanisms remain unclear" instead of "... mechanisms are remaining unclear"

      We have corrected this sentence as suggested.

      • Readers not familiar with Plasmodium cell division would benefit from having the different stages shown schematically in Figure 1A labeled (ring, merozoite, trophozoite, etc.)

      Good suggestion. We have expanded the labeling in Fig. 1A, but still choose to focus on the division stage, which is relevant for the presented data.

      • Figure 1 legend: Please specify that "centrin" staining is approximated by centrin 3 specifically. Figure 1E is missing a legend in Figure 1's legend.

      Thank you for pointing this out. We have expanded the figure legend accordingly.

      • To ease the reader's interpretation of the data, please consider using a different color for 3D7 +GlcN in the plots shown in Figure 2. It is difficult to distinguish the light magenta from the red color at first glance, especially when the lines are partially overlapping.

      We explored many different color combinations and consulted with several colleagues and concluded that the chosen color combination is most suitable to convey the logic of the strains (while accounting for green-red blindness).

      • Please clarify how long after GlcN addition are phenotypes assessed - ex. Microtubule cumulative length measurements shown in Figure 3.

      We mentioned in the previous Fig. 2 that we add GlcN at the ring stage preceding the schizont stage we analyze but failed to specify that we consistently do so for all experiments. We have added more information in the results (line 221) and to the methods section in more detail.

      • For Figure 3C please provide a separate image for the Slp channel alone. The overlay of the green centrin signal and the magenta from the tubulin staining render a yellow signal. It is difficult to appreciate the level of Slp knockdown in these cells. Moreover, in the inset, the label "zoom in" is on top of the centrin signal in green, precluding the proper assessment/observation of any yellow signal left over.

      Thank you for this remark. We have removed the centrin signal, which is clearly shown in the main panel, from the zoom ins to render the residual PfSlp signal clearly visible.

      • When describing Sf1 in T. gondii, please also cite PMID: 36009009 PMCID: PMC9406199 DOI: 10.3390/biom12081115

      When submitting our manuscript this study was not yet published, but we are happy to now include it in the introduction (line 92).

      The notion of "checkpoint" is mentioned in the introduction and revisited in the discussion. This is a topic under current discussion/evaluation in the field. As mentioned by the authors, demonstration of a checkpoint implies demonstrating reversibility of the putative checkpoint. Though the authors do not make claims about Slp1 or the phenotypes observed activating a specific checkpoint, the manuscript could be further strengthened if the authors showed that the anaphase arrest is reversible upon wash out of GlcN and restored levels of PfSlp expression. I'm including this comment under "minor points" because it is only a suggestion. I understand that carrying out these experiments is not within the scope of this work. However, if the authors decided to pursue this, it would certainly strengthen the manuscript.

      We highly appreciate the suggestion made by the reviewer and already considered ways to inactivate the putative spindle assembly checkpoint or reverse the phenotype. Wash out of GlcN would theoretically be an option although we are unsure that the kinetics of the subsequent protein synthesis would unfold on a short enough time scale. As suggested by Reviewer 2 we try to remain cautious about directly addressing the checkpoint issue, since e.g. PfSlp due to its localization can’t be a direct component of the checkpoint itself. The mention of “checkpoints” has also been removed from the introduction. We are, however, excited that using our time lapse microscopy protocols there now is a framework to investigate this in more depth in the future.

      Reviewer #3 (Significance (Required)):

      Plasmodium species lack centrioles, and display a divergent mitosis. It is therefore of interest and relevance to understand the peculiarities of the centriolar plaque, as it likely underlies the ability of Plasmodium to upscale its numbers.

      Our molecular understanding of the underpinning factors controlling nuclear and cell division in Plasmodium is limited to a few recent publications. The data presented herein is novel and contributes to the body of work with molecular insight and high resolution microscopy coming on for the malaria field.

      My expertise is in cell division in Apicomplexan parasites

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, the authors investigate the genes involved in the retention of eggs in Aedes aegypti females. They do so by identifying two candidate genes that are differentially expressed across the different reproductive phases and also show that the transcripts of those two genes are present in ovaries and in the proteome. Overall, I think this is interesting and impressive work that characterizes the function of those two specific protein-coding genes thoroughly. I also really enjoyed the figures. Although they were a bit packed, the visuals made it easy to follow the authors' arguments. I have a few concerns and suggested changes, listed below.

      1) These two genes/loci are definitely rapidly evolving. However, that does not automatically imply that positive selection has occurred in these genes. Clearly, you have demonstrated that these gene sequences might be important for fitness in Aedes aegypti. However, if these happen to be disordered proteins, then they would evolve rapidly, i.e., under fewer sequence constraints. In such a scenario, dN/dS values are likely to be high. Another possibility is that as these are expressed only in one tissue and most likely not expressed constitutively, they could be under relaxed constraints relative to all other genes in the genome. For instance, we know that average expression levels of protein-coding genes are highly correlated with their rate of molecular evolution (Drummond et al., 2005). Moreover, there have clearly been genome rearrangements and/or insertion/deletions in the studied gene sequences between closely related species (as you have nicely shown), thus again dN/dS values will naturally be high. Thus, high values of dN/dS are neither surprising nor do they directly imply positive selection in this case. If the authors really want to investigate this further, they can use the McDonald-Kreitman test (McDonald and Kreitman 1991) to ask if non-synonymous divergence is higher than expected. However, this test would require population-level data. Alternatively, the authors can simply discuss adaptation as a possibility along with the others suggested above. A discussion of alternative hypotheses is extremely important and must be clearly laid out.

      We agree with the reviewer’s point that rapid evolution is not the same as positive selection. We also agree with the reviewer’s point that the McDonald-Kreitman test (MK test) is more powerful than dN/dS analysis. We took advantage of a large population dataset from Rose et al. 2020. After filtering the data, we kept 454 genomes for MK tests. We found that the result was marginally non-significant for tweedledee (p = 0.068) and marginally significant for tweedledum (p = 0.048), even though these are small genes with low Pn values. This suggests that it is likely the genes evolve under positive selection.
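For readers unfamiliar with the MK test mentioned above, the following is a minimal sketch of the general technique, not the authors' pipeline. It compares non-synonymous and synonymous counts for fixed differences (Dn, Ds) and polymorphisms (Pn, Ps) in a 2x2 table and applies a two-sided Fisher's exact test. The function names and the counts are hypothetical and are not taken from the manuscript.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def mk_test(dn, ds, pn, ps):
    """Return (neutrality index, alpha, p-value); requires ds, ps, dn > 0."""
    ni = (pn / ps) / (dn / ds)   # NI < 1 is the direction expected under adaptation
    alpha = 1 - ni               # rough estimate of the adaptive fraction
    return ni, alpha, fisher_exact_two_sided(dn, ds, pn, ps)


# Hypothetical counts for illustration only (assumption, not manuscript data).
ni, alpha, p = mk_test(dn=20, ds=10, pn=5, ps=15)
print(f"NI={ni:.3f} alpha={alpha:.3f} p={p:.4f}")
```

With small genes and low Pn, as the response notes, the table cells are small and the test has limited power, which is why an exact test is used in this sketch rather than a chi-square approximation.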

      In line with the reviewer’s suggestion, we performed another analysis using a large amount of population data. We asked if the SNP frequencies of tweedledee and tweedledum are correlated with environmental variables. We found that when compared to a distribution of 10,000 simulated genes with randomly-sampled genetic variants, both tweedledee and tweedledum showed significant correlation to multiple ecological variables reflecting climate variability, such as mean diurnal range, temperature seasonality, and precipitation seasonality (p<0.05). These results are now incorporated into the manuscript in Figure 5 and Figure 5 – Figure supplement 1.
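The comparison against 10,000 simulated genes described above amounts to an empirical (permutation-style) p-value. The sketch below illustrates that idea under stated assumptions: the test statistic is taken to be an absolute correlation, the null values are drawn from a made-up distribution, and the "+1" correction is a common convention that the response does not confirm. All names and numbers are hypothetical.

```python
import random


def empirical_p_value(observed, null_values):
    """One-sided empirical p-value: share of null statistics >= observed.

    Uses the (k + 1) / (n + 1) correction so the estimate is never exactly 0.
    """
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)


rng = random.Random(0)  # fixed seed so the illustration is reproducible

# Hypothetical null distribution: |correlation| for 10,000 simulated genes.
null_stats = [abs(rng.gauss(0.0, 0.1)) for _ in range(10_000)]

# Hypothetical observed |correlation| between SNP frequency and one variable.
observed_stat = 0.35

p_emp = empirical_p_value(observed_stat, null_stats)
print(f"empirical p = {p_emp:.4f}")
```

In a real analysis each simulated gene would be built from randomly sampled genomic variants matched to the focal gene, and the same statistic would be computed for it.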

      2) The authors show that the two genes under study are important for the retention of viable eggs. However, as these genes are close to two other conserved genes (scratch and peritrophin-like gene), it is unclear to me how it is possible to rule out the contribution of the conserved genes to the same phenotype. Is it possible that the CRISPR deletion leads to the disruption of expression of one of the other important genes nearby (i.e., in a scratch or peritrophin-like gene) as the deleted region could have included a promoter region for instance, which is causing the phenotype you observe? Since all of these genes are so close to each other, it is possible that they are co-regulated and that tweedledee and tweedledum are expressed and translated along with the scratch and peritrophin-like gene. Do we know whether their expression patterns diverge and that scratch and peritrophin-like genes do not play a role in the retention of viable eggs?

      This is a fair criticism; however, we think the chance that the phenotypes are caused by interrupting nearby genes is very low. First, peritrophin-like acts in the immune response, and scratch is a brain-biased transcription factor. Neither gene shows expression in the ovary before or after blood feeding (genes with TPM < 1 or 2 are generally considered unexpressed, while scratch and peritrophin-like expression levels are overall lower than 0.1 TPM).

      This suggests that peritrophin-like and scratch are not likely to function in the ovary. Thus, although we cannot completely rule out that the gene knockout impacts regulation of very distant genes, it is unlikely. Given the mounting evidence we show in this manuscript that tweedledee and tweedledum are highly translated in the ovary after blood feeding, under the principle of parsimony, we expect that the phenotypes came from knocking out the highly expressed and translated genes.

      Reviewer #2 (Public Review):

      This manuscript is overall quite convincing, presenting a well-thought-out approach to candidate gene detection and systemic follow-ups on two genes that meet their candidate gene criteria. There are several major claims made by the authors, and some have more compelling evidence than others, but in general, the conclusions are quite sound. My main issues stem from how the strategy to identify genes playing a role in egg retention success has led to very particular genes being examined, and so I question some of the elements of the discussion focusing on the rapid evolution and taxon-uniqueness of the identified genes. In short, while I believe the authors have demonstrated that tweedledee and tweedledum play an important role in egg retention, I'm not sure whether this study should be taken as evidence that taxon-specific or rapidly evolving genes, in general, are responsible for this adaptation, or simply play an important role in it.

      We have revised the paper to make it clearer that the focus is indeed on these two genes and not on the greater question of taxon-specific or rapidly-evolving genes.

      First, the authors present evidence that Aedes aegypti females can retain eggs when a source of fresh water is lacking, confirming that females are not attracted to human forearms while retaining eggs and that up to 70% of the retained eggs hatch after retaining them for nearly a week. This ability is likely an important adaptation that allows Aedes aegypti to thrive in a broad range of conditions. The data here seem fairly compelling.

      Based on this observation, the authors reason that genes responsible for the ability to retain eggs must: 1) be highly expressed in ovaries during retention, but not before or after. 2) be taxon-specific (as this behavior seems limited to Aedes aegypti). While this approach to enriching candidate genes has proven fruitful in this particular case, I'm not sure I agree with the authors' rationale. First, even genes at a low expression in the ovaries may be crucial to egg retention. Second, while egg-laying behavior is vastly varied in insects, I'm not sure focusing on taxon-restricted genes is necessary. It is entirely possible that many of the genes identified in Figure 2E play a crucial role in egg retention evolution. These are minor issues, but they are relevant to some later points made by the authors.

      We regret framing the discovery of tweedledee and tweedledum in the original submission using this somewhat artificial set of filtering criteria. The reality is that the genes caught our attention for their novel sequence, tight genetic linkage, and interesting expression profile. That really is the focus of the paper, not these other peripheral questions that have been the focus of attention of the reviews. We really do apologize for all of the confusion about what this paper is about.

      Nonetheless, the authors provide very compelling evidence that the two genes meeting their criteria - tweedledee and tweedledum, play an important role in egg retention. The genes seem to be expressed primarily in ovaries during egg retention (some observed expression in brain/testes is expected for any gene), and the proteins they code seem to be found in elevated quantities in both ovaries and hemolymph during and immediately after egg retention. RNA for the genes is detected in follicles within the ovary, and CRISPR knockouts of both the genes lead to a large decrease in egg viability post retention.

      My earlier qualms about their search strategy relay into some issues with Figure 4, which describes how the two genes are 1) taxon-restricted and 2) have evolved very rapidly. Neither of the two statements is unexpected given the authors' search strategy. Of course, the genes examined precisely for their lack of homologs do not have any homologs. Similarly, by limiting themselves to genes that show a lack of homology (i.e. low sequence similarity) to other genes as well as genes with high expression levels in the ovaries, a higher rate of evolution is almost inevitable to infer (as ovary expressed genes tend to evolve more rapidly in mosquitoes). I agree with the authors that inferences of the evolutionary history of these genes are quite difficult because of their uniqueness, and I especially appreciate their attempts to identify homologs (although I really dislike the term "conceptualog").

      We have removed our term “conceptualog” and replaced it with the more conventional “putative ortholog”.

      This leads to my main (fairly minor) issue of the paper - the discussion on the evolutionary history of these genes and its implications (sections "Taxon-restricted genes underlie tailored adaptations in a diverse world" and "Evolutionary histories and catering to different natural histories"). As noted, inferring this history is very difficult because the authors have focused on two rapidly evolving, taxon-restricted genes. The analyses they have performed here definitely demonstrate that the genes play an important role in egg retention, however, they do not show that taxon-restricted genes play a disproportionate role in egg retention evolution. Indeed, the only data relevant to this point would be the proportion of genes in Figure 2E that are taxon-restricted (3/9), but I'm not sure what the null expectation for this proportion for highly expressed ovary genes is to begin with. Furthermore, the extremely rapid evolution of this gene makes it hard to judge how truly taxon-restricted it is. My own search of tweedle homologs identified multiple as previously having been predicted to be "Knr4/Smi1-like", and while no similar genes are located in a similar location in melanogaster, there is generally little synteny conservation in Drosophila (for instance Bhutkar et al 2008), so I'm unsure what can really be said about their evolutionary origins/lack of homologs in Drosophila.

      In short - the manuscript makes clear that tweedledee and tweedledum play an important role in egg retention in A. aegypti, nonetheless, it is not clear that this is a demonstration of how important taxon-restricted genes are to understanding the evolution of life-history strategies.

      Again, we should have never framed the paper the way we did in the original version. We make no claims whatsoever that taxon-restricted genes in general should play a role in this biology, only that the two candidate genes under study influence egg viability after extended retention. We hope that the framing is clearer in this revision.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      This manuscript reported the CryoEM ring-like structure of the full-length human E3 assembly ligase UBR5, showing its assembly into a tetramer. The authors identified critical determinants for antiparallel homodimer and tetrameric assembly. They further described AKIRIN2 as a UBR5 substrate and provided evidence of a preferential interaction and activity of UBR5 towards monoubiquitinated proteins. Based on these findings, they proposed UBR5 as a chain-elongating E3 ligase.

      CryoEM data are solid, and the model interpretation of the tetrameric structure provides a precise description of the domain composition of the protein that fits well with the biochemical data. Additional experiments are suggested to corroborate a few of the authors' statements. We believe they are realistic in terms of time and resources.

      1. Authors should address the importance of tetramerization by mutating SBB2 at the tetramerization interface and comparing the mutant with wild type in mass photometry and ubiquitination assays. In silico analysis of the interaction interfaces (e.g by using PISA software) could be useful to select amino acids to be mutated. The authors suggested a role for oligomerization in catalysis and mutants are needed in order to define the real "functional unit" of the enzyme.
      2. The authors used a sucrose gradient sedimentation assay to prove UBR5 and substrate interaction (Fig. 3). The control experiment showing UBR5 protein sedimentation in the presence of GFP only is instead in Supplementary Fig. 3D. Unfortunately, in that panel the signal of UBR5 is not visible. The main figure should be revised to show proper controls for the experiment.
      3. The authors need to better clarify the features of the AKIRIN-UBR5 interaction. According to the data, the enzyme is equally active on both AKIRIN-Ub and Securin-Ub, suggesting a Ub-specific engagement. What would be a correct explanation of these results? Is the UBA domain directly involved in this process? Testing the activity of a UBA-impaired mutant should help to solve this issue.
      4. The authors identified a 25 aa sequence, called Plug loop, preceding the HECT domain. In the structure it is inserted between N and C-lobe subdomains of the HECT and appears to lock the enzyme in an open L-conformation. These structural findings are interesting, but not supported by experimental data. What is the effect of the Plug loop deletion in a ubiquitination assay? Without further validation the last chapter of the results remains purely speculative and may better fit in the discussion.
      5. The datasets are clearly affected by preferential orientation as shown by the angular distribution and 2D classes (the reason why the authors correctly performed data collection with tilt). A comment on this is required in the experimental section. In addition, it is not clear whether the presented maps (Fig 1 and 2) derive from merging of the two datasets or only the model has been built using the two different datasets.
      6. As a general comment, authors should enlarge panels in which structural details are described, highlighting the side chain residues involved in binding interfaces. Fig. 5 and Fig. 6 are particularly small and incomplete. Most of the structural figures miss key labels needed for a proper understanding. E.g. among the others, numbering of the helix composing the armadillo domain.
      7. The overall organization of the figures is quite confusing. Pag. 7 Figure 2C should represent a "box stabilized by three zinc ions mediated by two histidine and seven cysteine residues" according to text citation, but none of these details is highlighted in the corresponding figure. The eye in Figure 1,2,4 does not mean much if a proper box is not linked to the actual site to be seen. In addition, the arrows indicating the rotation axis are hard to interpret. A few panels are missing a legend. Figure 1A and many other panels are not referenced in the text. More details below.

      Additional points:

      • Mass Photometry data need additional comments and labels. Please comment on the MP concentration used to analyze the samples. Being a dynamic system, you are probably seeing an equilibrium of species at 10 nM in MP. For better completeness of MP figures, labels that include counts, % of species and sigma should be added to the nice representation of oligomers. Which condition/fraction do the MP data shown in 1B represent?
      • If Alphafold models are mentioned and used for model building, it would be nice to provide at least a pLDDTscore and ptm score. Since some details of the AF model are described in the text, an additional superposition of the AF model with the final model derived by EM would be useful to the community.
      • A simple workflow describing the cryoEM data processing that includes how many particles have been used in each step is required, at least in the methods section. The authors need to show the cryoEM 2D classes of the dimer as well.
      • Please add the domain boundaries in Figure 1A and highlight the domains on the alignment included in Supplemental Table 1.
      • Pag. 8 please decide which abbreviation to use, either UBR or Ubr.
      • Page 8, line 192. I found it annoying to find the same sentence used by competitors who posted a bioRxiv paper 3 days before the one we are reviewing (doi.org/10.1101/2022.10.31.514604 page 4, line 135).
      • In supp. 1C legend, "high concentration of NaCl" is a bit vague
      • Complementary to Supp Fig 2A, a zoom in of the density map with traced model would be beneficial to show the actual map quality obtained.
      • Pag. 6 lines 133-134, the helix residues involved in homodimerization are cited in the text, but not highlighted in the Figure 1.
      • Figure 1 legend, panels H-I-J description are missing.
      • Figure 3, panel B, meaning of the asterisk is not reported in the figure legend.
      • Figure 4, 5 panels from A to E are cited in the text while figure reported only 4.

      Referees cross-commenting

      I think all the reviews are fairly consistent and agree with the comments raised by my colleagues with the one exception of Point 3 of Reviewer 1. The issue is certainly important yet the experiment suggested is not clear. I personally have trouble designing an informative experimental set-up.

      Significance

      This paper presents the intriguing Cryo-EM structure of the full-length HECT E3 ligase UBR5. As it stands, this work provides evidence of the existence of a tetrameric RING-like conformation that could represent the functional unit of the catalysis. Very little validation of the features identified in the Cryo-EM structure is given, thus the paper remains quite descriptive, but in any case interesting and informative for the ubiquitin field.

      Considering that UBR5 is a quite competitive subject in these days (e.g. at least one additional Cryo-EM structure was posted in BioRxiv, doi.org/10.1101/2022.10.31.514604), I would positively consider this manuscript for publication if the authors reply in full to the issues raised.

      My field of expertise: Ubiquitin regulation and interactions, biochemistry, biophysics and Cryo-EM.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      1. General Statements [optional]

      We would like to thank the reviewers for taking the time to review and comment on our paper. The comments were very constructive and conscientious, thanks to their expertise in the field. These comments and the revisions will surely make this paper a better and more robust contribution to the field.

      The comments were about clearer explanations, increasing the quality of the data and additional experiments for a stronger conclusion, all of which we are eager to accomplish. Now we have sorted out the problems and planned the experiments required in the revision, as detailed below.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary In this manuscript, Komori et al. examined the role of the LRRK2 substrate and regulator Rab29 in the lysosomal stress response. Briefly, in chloroquine (CQ)-treated HEK293 cells the authors observed an apparent LRRK2-independent increase in Rab29 phosphorylation which was accompanied by translocation of Rab29 to lysosomes. Intriguingly, the authors detected a similar increase in Rab29 phosphorylation when Rab29 was tethered to lysosomes in the absence of CQ treatment. Using mass spectrometry, mutagenesis and a phospho-specific antibody, the authors mapped the CQ-induced phosphorylation site to S185 and demonstrated its independence from LRRK2. Next, the authors found that PKCa was the kinase responsible for S185 phosphorylation and lysosomal translocation of Rab29. Lastly, the authors showed that in addition to PKCa the lysosomal translocation of Rab29 was also regulated by LRRK2. Overall, Komori and colleagues provide interesting new insights into the phosphorylation-dependent regulation of Rab29. However, there are a number of technical and conceptual concerns which should be addressed.

      Major points 1) Figure 1F: the localization of Rab29 to lysosomes is not convincing at all. The authors should either provide more representative image examples or image the cells at a higher resolution. The authors should also confirm the CQ-induced lysosomal localization of Rab29 in a different cell type (e.g., HEK293).

      We will replace Fig 1F pictures with slightly more magnified images with higher resolution. We will also include additional cell types (HEK293, and other cells that are predicted to express endogenous Rab29); Reviewer #2 also raised this point (see Reviewer #2 comment on Significance).

      Moreover, the authors should show that prenylation of Rab29 is required for its CQ-induced phosphorylation.

      We will test the effect of lovastatin, an HMG-CoA reductase inhibitor that causes the depletion of the prenylation precursor geranylgeranyl diphosphate from cells (Binnington et al., Glycobiology 2016, Gomez et al, J Cell Biol 2019), or 3-PEHPC, a GGTase II specific inhibitor that also causes the inhibition of Rab prenylation (Coxon FP et al, Bone 2005).

      2) The rapalog-induced increase in Rab29 phosphorylation in Figure 2D is not convincing since there is at least 2-3-fold more Rab29 in FRB-LAMP1 expressing cells compared to their FRB-FIS1 counterparts. An independent loading control is also missing. This is a key experiment and should be properly controlled and quantified. In addition, can CQ treatment drive 2xFKBP GFP-Rab29 from mitochondria to lysosomes (in the presence of rapalog and FRB-Fis1)?

      We will carefully examine another round of rapalog-induced phosphorylation of Rab29, with an independent loading control such as alpha-tubulin. The immunoblot quantification of p-Rab29 will be normalized to the intensity of non-p-Rab29. Our response to the latter question is given in Section 4 below.

      3) Figure 4A-C: Are these stable Rab29 expressing cells? If not, the quantification of "the size of largest lysosome in EACH cell" becomes very problematic. This analysis should be repeated with stable Rab29 variant cells in a background lacking endogenous Rab29. Furthermore, the LAMP1 signal is too dim to see any convincing colocalization (e.g., with WT) or the lack thereof (e.g., in the case of S185D).

      The cells shown in Figure 4 are HEK293 cells transiently expressing Rab29, and the issue of quantification is addressed in Section 3 below. We agree that the LAMP1 signal was dim; it turned out that the confocal microscope we used had a problem with the sensitivity of the red channel. We will acquire these images again with a new confocal microscope.

      Lastly, the authors should corroborate their findings with an ultrastructural analysis since the electron microscopy would definitively be more suitable for this type of measurements.

      We are planning to obtain electron microscopic images, according to this reviewer’s request. We plan to invite an expert in electron microscopy analysis as a co-author.

      4) The lysosomal colocalization of Rab29 in Figure 5C is again not convincing. This analysis needs to be repeated with high resolution imaging.

      Again, we will repeat this experiment with a new confocal microscope, with the hope that it would yield better images.

      5) The authors need to show the level of LRRK2 depletion (Figure 6). Given the role of LRRK2 in driving lysosomal Rab29 translocation, the importance of the LRRK2 independent pS185 for this process remains unclear.

      We will show the LRRK2 protein level upon knockdown; in our experience, LRRK2 knockdown consistently achieves more than 50% efficiency. Our response to the latter comment is given in Section 3 below.

      6) In general, the authors employ an alternative, biochemical assay (e.g., LysoIP) for the lysosomal translocation of Rab29. This would in particular help to clarify the effect of the Rab29 variants and LRRK2 inhibition.

      We have previously shown that the overexpressed Rab29 (and LRRK2) is enriched in the lysosomal fraction from CQ-treated cells, which was performed using dextran-coated magnetite (Eguchi et al, PNAS 2018). Using the same biochemical method, we will show the enrichment of endogenous Rab29 in the lysosomal fraction.

      Minor points

      9) Figure 2C is lacking the control IF staining for mitochondria (to which 2xFKBP-GFP-Rab29 is assumed to be recruited upon co-expression with FRB-FIS1).

      We will stain the cells with MitoTracker to ensure that anchoring away of 2xFKBP-GFP-Rab29 by FRB-Fis1 results in mitochondrial localization.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The data in the manuscript convincingly demonstrates that lysosomal overload by Chloroquine treatment induces Rab29 localisation to the lysosomes and that this membrane association is dependent on PKCalpha-dependent phosphorylation at Ser185.

      We have a number of rather minor comments listed below:

      Figure 2

      The increasing levels of non-phosphorylated Rab29 over the indicated time course of AP21967 treatment in Figure 2B are concerning. First, could you provide an explanation for this clear increase in both non-p-Rab29 and p-Rab29 in the phostag but not the normal gel? Second, could all quantifications of p-Rab29 be made relative to the non-p-Rab29?

      We will try another round of rapalog-induced phosphorylation of Rab29, with an independent loading control. The immunoblot quantification of p-Rab29 will be normalized to the intensity of non-p-Rab29. Reviewer #1 raised a similar concern on Figure 2D.
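
      As an illustration of the normalization described above, a minimal Python sketch follows. The function name and all band intensities are hypothetical placeholders chosen for the example, not measured values; the sketch only shows that expressing p-Rab29 relative to non-p-Rab29 removes the effect of unequal Rab29 expression between lanes, while the loading control reports that expression difference separately.

```python
def quantify_lane(p_rab29, non_p_rab29, loading_control):
    """Summarize one lane of a Phos-tag blot (hypothetical helper).

    p_rab29, non_p_rab29: densitometry values of the shifted and unshifted bands.
    loading_control: densitometry value of an independent control (e.g. alpha-tubulin).
    """
    if non_p_rab29 <= 0 or loading_control <= 0:
        raise ValueError("band intensities must be positive")
    return {
        # phosphorylation expressed relative to the non-phosphorylated band
        "p_over_non_p": p_rab29 / non_p_rab29,
        # total Rab29 relative to the loading control, to expose expression differences
        "total_over_loading": (p_rab29 + non_p_rab29) / loading_control,
    }


# Made-up intensities: the second lane carries more Rab29 overall.
lanes = {
    "FRB-FIS1": quantify_lane(100.0, 400.0, 1000.0),
    "FRB-LAMP1": quantify_lane(600.0, 1200.0, 1000.0),
}
fold_change = lanes["FRB-LAMP1"]["p_over_non_p"] / lanes["FRB-FIS1"]["p_over_non_p"]
print(fold_change)
```

      In this toy case the raw p-Rab29 band differs sixfold between lanes, whereas the normalized ratio differs only twofold, which is the kind of discrepancy both reviewers asked to rule out.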

      Figure 5

      To further demonstrate that PKCalpha phosphorylates endogenous Rab29 at Ser185, we recommend reperforming the Go3983/PMA treatment in figure B with the anti-p-Ser185 antibody. It may be sufficient to perform the treatment only at 4 or 8 hours, simply to provide stronger evidence regarding the phosphorylation of endogenous Rab29.

      We will give it a try, although none of the anti-phosphorylated protein antibodies that we have tried worked with phos-tag SDS-PAGE. With conventional western blotting, we should be able to perform this experiment.

      It is not clear whether the activity of PMA in the assay is due to inhibition of PKCalpha. Are the effects ablated by PKCalpha KD?

      We will test the knockdown of PKCalpha, beta, gamma and delta by siRNAs to further narrow down the effects of PKC-dependent phosphorylation of Rab29.

      Reviewer #2 (Significance (Required)):

      These cell biology findings are important in the field as both Rab29 and LRRK2 are implicated in the pathogenesis of Parkinson disease. The phosphorylation of Ser185 of Rab29 by PKCalpha is novel and contributes to our understanding of Rab29 and LRRK2 regulation. One limitation of the study is that it is conducted in only two cell types quite unrelated to the disease, so it is not clear how general and disease relevant the findings are. Most of the data are solid. There are two experiments whose results are difficult to interpret and a few controls missing. There are also a few issues with quantifications, all of which are described in detail above and will need to be fixed prior to publication. My expertise for this paper is in the cell biology of lysosomal function.

      The issue that only two cell types were analyzed was also raised by reviewer #1, so we will examine additional cell types, especially those that are predicted to express endogenous Rab29. Our responses to other issues raised are described elsewhere. Thank you for these insightful comments.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Figure 4A-C: Are these stable Rab29 expressing cells? If not, the quantification of "the size of largest lysosome in EACH cell" becomes very problematic. This analysis should be repeated with stable Rab29 variant cells in a background lacking endogenous Rab29. (Reviewer #1)

      As described in Section 2 above, the cells shown in Figure 4 are HEK293 cells transiently expressing Rab29. We are sorry that the description “the size of largest lysosome in each cell” was misleading. As we analyzed only cells overexpressing GFP-Rab29 that were marked with GFP fluorescence, we believe that transient expression should not be a problem. To avoid any misunderstanding, we have stated in the Figure 4 legend that only lysosomes in Rab29-positive cells (and all cells expressing Rab29) were included in the analysis of the largest lysosome of each cell.

      Regarding the effect of endogenous Rab29 in Figure 4 experiments, Reviewer #2 similarly raised the issue on whether Rab29 phosphomimetics are acting as dominant active, preventing lysosomal enlargement. On this point, we have previously reported that knockdown of endogenous Rab29 causes the enhancement of lysosomal enlargement upon CQ treatment (Figure 5I,J of Eguchi et al, PNAS 2018), suggesting that the lysosome-deflating effect by phosphomimetics is a dominant active effect rather than dominant negative suppressing endogenous Rab29. This point is considered significant, and thus has been explained in the results section (page 7, lines 168-171).

      Along similar lines: why not all cells in Figure 5E and Figure 5G show Rab29- and LRRK2-positive structures? How do the authors know which of these phenotypes is the prevalent one? (Reviewer #1)

      As for the ratio of cells with Rab29- and LRRK2-positive structures, it seems reasonable given that different cells have different levels of exposure to lysosomal stress and that the response is transient and does not occur simultaneously. The ratio of these positive cells may also vary depending on the cell culture conditions. Since Rab29- and LRRK2-positive structures are rarely seen in control cells, we think this would be a meaningful phenotype even if only 20-30% of cells show such structures. The result that the ratio of localization changes is not 100% is now noted in the results section explaining Figure 1G (page 4-5, lines 108-110) where the immunocytochemical data first appears.

      Given the role of LRRK2 in driving lysosomal Rab29 translocation, the importance of the LRRK2 independent pS185 for this process remains unclear. (Reviewer #1)

      Our data suggest that Rab29 is stabilized on lysosomes only when LRRK2-mediated phosphorylation and S185 phosphorylation both occur on the Rab29 molecule (as shown in the Figure 7 scheme), so we believe there is no contradiction. We have now described this notion more clearly at the end of the results section (page 9, lines 235-236).

      It is not clear what the authors mean by "lysosomal overload stress". Since mature lysosomal incoming pathways such as autophagy or endocytosis are disrupted by CQ, it is difficult to picture an overload. Maybe rephrasing would help to clarify this. (Reviewer #1)

      Chloroquine (CQ) is known as a lysosomotropic agent that accumulates within acidic organelles due to its cationic and amphiphilic nature, causing lysosome overload and osmotic pressure elevation, and this is what we call “lysosomal overload stress”. The well-known effects of CQ to disrupt lysosomal incoming pathways are ultimately caused by the above consequences. Also, we have previously reported that lysosomal recruitment of LRRK2 is caused by CQ but not by bafilomycin A1, the latter being an inducer of lysosomal pH elevation, or by vacuolin-1 that enlarges lysosomes without inducing lysosomal overload/pH elevation (Eguchi et al, PNAS 2018), and further found that not only CQ but also other lysosomotropic agents commonly induced LRRK2 recruitment (Kuwahara et al, Neurobiol Dis 2020). We thus have described the effect of CQ as “overload”. However, it is true that we have not provided a clear explanation for readers, so we have added some notes for lysosomal overload stress in the introduction section (page 3, lines 69-71).

      Which cell type is used for the IF analysis in Figure 2C? This information is in general quite sparse. The authors should clearly state the cell type for each experiment/Figure. (Reviewer #1)

      We have added the cell type information that was missing in several places in the manuscript. We apologize for the inconvenience. For clarification, HEK293 cells were used in Figure 2C.

      Are the images in figure 1F representative? i.e. does Rab29 always colocalise to such enlarged lysosomes upon CQ treatment and does CQ treatment always drastically alter the cellular distribution of Rab29? (Reviewer #2)

      The images in Figure 1F are representative of when Rab29 is recruited, but it is not seen in all cells, and the ratio of recruitment (~80%) is shown in Figure 1G. Reviewer #1 also asked why Rab29 recruitment is not seen in all cells, and we gave the same answer above. It may be reasonable to speculate that different cells have different levels of exposure to lysosomal stress and that the response is transient and does not occur simultaneously. The ratio of these positive cells may also vary depending on the cell culture conditions. For the readers’ clarity, we have added that the ratio of localization change of Rab29 is not 100% and is comparable to that of LRRK2 previously reported (page 4-5, lines 108-110).

      Considering that the "forced localisation technique" induces a non-physiological colocalization of non-endogenous Rab29 to lysosomes, it may be an overestimation to conclude just from these data that phosphorylation of Rab29 occurs on the lysosomal surface. This is also quite in contrast with the later finding that phosphorylation by PKCalpha promotes lysosome localization of Rab29. It seems more reasonable to conclude that Rab29 can be phosphorylated when localised at the lysosomes (as opposed to other organelles such as mitochondria). If the authors feel strongly about this point they might need to find a less non-physiological assay. (Reviewer #2)

      Yes, it could be an overestimation, and as we do not have better means to conduct a less non-physiological assay, we have modified the description from “occurred on the lysosomal surface” to “could occur on the lysosomal surface” (page 5, line 112 (subtitle) and line 128).

      Regarding the comparison with the later finding that phosphorylation by PKCalpha promotes lysosome localization of Rab29, these data (Figure 2 and 5) could be explained with a single speculation: phosphorylation of Rab29 on lysosomal membranes could retain Rab29 on the membranes for a longer time. It is not easy to decipher which comes first, association with membranes or phosphorylation of Rab29, in a physiological assay, but considering reports that show PKCalpha activation happens on membranes (Prevostel et al., J Cell Sci 2000), at least the data favor our conclusion over the idea of PKCalpha phosphorylating Rab29 in the cytoplasm and then promoting lysosomal localization. This point is now clearly described in the discussion (page 10, lines 248-251).

      It is not clear how the Rab29 phosphomimetics are acting as dominant active preventing lysosomal enlargement. Authors should speculate or repeat the experiments in absence of endogenous Rab29 to clarify the matter. (Reviewer #2)

      A similar concern about the effect of endogenous Rab29 was also raised by Reviewer #1 (see above). We have previously reported that knockdown of endogenous Rab29 causes the enhancement of lysosomal enlargement upon CQ treatment (Figure 5I,J of Eguchi et al, PNAS 2018), suggesting that the lysosome-deflating effect by phosphomimetics is a dominant active effect rather than dominant negative suppressing endogenous Rab29. This point is considered important and thus has been explained in the results section (page 7, lines 168-171).

      Overall, there is some missing information regarding repeats for Western blots, such as those in figure 3C, 3D and 3E. Please add indications about repeats in the figure legend or methods. (Reviewer #2)

      We have added the repeat information to each figure legend where it was missing. We apologize for the inconvenience.

      The model in figure 7 however seems to suggest that Rab29 associates to lysosomal membranes independently, and is then stabilised at the membranes by LRRK2 and PKCalpha - a point which is not directly supported by the data. (Reviewer #2)

      As noted earlier, we consider that phosphorylation of Rab29 on lysosomal membranes could retain Rab29 on the membranes for a longer time, given the present data and previous reports that phosphorylation of Rab29 is more likely to happen on the lysosomal membrane than in the cytosol. Also, as inhibition of either of the two phosphorylations results in dispersed Rab29 localization, we have drawn this figure as a model of what is currently plausible. This explanation has now been added to the discussion (page 10, lines 248-251).

      English proofreading should be improved: "CQ was treated to HEK293" (page 4), "As we assumed that this phosphorylation is independent of LRRK2" as an opening line (page 5) (Reviewer #2)

      Thank you for pointing out these incorrect wordings. They were corrected.

      4. Description of analyses that authors prefer not to carry out

      In addition, can CQ treatment drive 2xFKBP GFP-Rab29 from mitochondria to lysosomes (in the presence of rapalog and FRB-Fis1)? (Reviewer #1)

      We do not think that a comparison between the affinities of FKBP-rapalog-FRB and Rab29-[unknown factor that directs Rab29 to lysosomes] is necessary, as the former has a Kd in the single digit nM range (Banaszynski et al, JACS 2005), whereas the latter (based on estimations from related PPIs) is estimated to be in the μM range, which shows a much weaker affinity than the former (McGrath et al, Small GTPases 2019). Furthermore, even if Rab29 appears to have migrated from mitochondria to lysosomes as a result of this experiment, one cannot rule out the possibility that a small portion of the mitochondrial membrane was incorporated into the lysosomal membrane that was enlarged by CQ treatment.

      Molecular weight markers should be provided for all immunoblot experiments. (Reviewer #1)

      The immunoblot pictures without molecular weight markers in our paper are all Phos-tag SDS-PAGE blot analyses. Phos-tag SDS-PAGE results in band shifts of phosphorylated proteins, and writing in markers would be misleading. Moreover, previous representative studies heavily using Phos-tag (e.g., Kinoshita et al, Proteomics 2011, Ito et al, Biochemical Journal 2016) also did not show the molecular weight markers. Here we performed phos-tag SDS-PAGE analysis only to find differences in the phosphorylation state of Rab proteins.

      The use of the quantification ratio of cells with Rab29-positive lysosomes in figure 1G might be slightly misleading as it does not allow the reader to understand to what extent Rab29 localisation at lysosomes upon CQ treatment. We recommend using a simpler quantification, such as by measuring the average colocalisation of Rab29 and LAMP1 per cell. (Reviewer #2)

      For figure 5D and 5F, As with figure 1G, we recommend using a more straightforward and impartial method of quantification such as simply measuring the colocalisation of Rab29 with LAMP1. (Reviewer #2)

      Popular colocalization analyses using Pearson’s or Manders’ coefficients would be a good choice if the amounts of Rab29 varied greatly between lysosomes. However, this may not apply in this case; the amount of Rab29 or LRRK2 on each lysosome is considered to saturate quickly, and relatively low amounts may not be detected by immunofluorescence, whereas the probability of finding these structures has been shown to follow a moderate sigmoid curve (as seen in Figure 1E or 2H of Eguchi et al., PNAS 2018). Therefore, the colocalization of Rab29 or LRRK2 with lysosomes could be approximated by a Bernoulli distribution, and this is the reason why we chose to quantify “the ratio of cells with Rab29-positive lysosomes”.
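
      To make the Bernoulli-type quantification concrete, the following Python sketch computes the ratio of positive cells together with a Wilson score interval. The choice of the Wilson interval, the function name and the cell counts are assumptions introduced for illustration only; they are not taken from the manuscript.

```python
import math


def wilson_interval(positive, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("need 0 <= positive <= total and total > 0")
    p = positive / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return centre - half, centre + half


# Hypothetical counts: 25 of 100 scored cells show Rab29-positive lysosomes.
positive_cells, scored_cells = 25, 100
ratio = positive_cells / scored_cells
low, high = wilson_interval(positive_cells, scored_cells)
print(f"ratio = {ratio:.2f}, interval = ({low:.3f}, {high:.3f})")
```

      Each cell contributes a single yes/no outcome, so the uncertainty of the reported ratio depends only on the number of cells scored, which is why the number of cells per condition is worth stating alongside the ratio.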

      We recommend using a more transparent and simple quantification method, such as average size of lysosomes per cell. (Reviewer #2)

      As one can see in the inset of Figure 4B, unenlarged lysosomes are unfortunately too small for their size to be quantified, and two small lysosomes cannot be reliably told apart with our experimental settings and laboratory resources, so we decided to analyze the largest lysosome in each cell as a representative of the cell to minimize measurement errors. This measurement only includes GFP-Rab29 positive cells, and by comparing against CQ-untreated cells we intended to increase the validity of this analysis. This quantification method was also used in our previous report (Eguchi et al, PNAS 2018).

    1. Author Response

      1) Response to the Editor

      We thank the Editor and the Reviewers for the kind words, the helpful suggestions, and the points of critique, which have all helped us substantially strengthen the manuscript in this revised version. Regarding the 3 general critiques highlighted by the Editor:

      Essential Revisions:

      1) Some hypotheses, and in particular the one that all individuals have the same inter-burst interval distribution, should be tested/justified/discussed.

      (a) We have generalized the theory to directly address this point by relaxing the assumption of an identical inter-burst interval for all individuals. In short: the main insights continue to hold and we discuss the nuances in the text.

      (b) Experimentally, the hypothesis that all single fireflies isolated from the group exhibit the same interburst interval (IBI) distribution could not be rigorously tested. The main reason is practical: in order to compare IBI distributions across individuals, we would need to collect a large number of fireflies and track them for long durations, which was not realistic given our experimental setup and the short window of firefly emergence. In addition, external environmental factors might slightly alter behaviors as well, making comparisons even more complex. Thus, due to paucity of field data, we eventually use the assumption that all individual fireflies follow the same IBI distribution.

      2) Comparison between the models and the data must be improved, in particular through a quantification of the differences between distributions and sensitivity analysis of the numerical results.

      (a) Regarding the comparison of the agent-based simulations with experimental data, in Fig. 7, we compare the underlying distributions using the two-sided Kolmogorov-Smirnov statistical test for goodness-of-fit. This appears to us to be the most straightforward and informative approach, without over-fitting.

      (b) Regarding sensitivity analysis for the agent-based simulations, for each β value from 0 to 1 we statistically compared simulations to the experimental distributions to find the most well-fitted β.

      (c) Finally, owing to experimental constraints leading to sparsity of available data in characterizing the interburst distribution, we strive to strike a delicate balance when applying sophisticated statistical tools to compare theoretical and simulation distributions (with unrestricted access to large sample sizes) to the finite samples in the empirical distributions. As such, we think it is apposite to use the first two moments of the respective distributions in Fig. 3 to show the striking similarity of trends.
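
      The comparison in (a) and the parameter sweep in (b) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: `toy_simulation` is a hypothetical stand-in whose dependence on β is invented for the example (it is not the agent-based simulation of the paper), the 12 s offset and the β grid are illustrative, and only the Kolmogorov-Smirnov statistic is computed, not its p-value.

```python
import random
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    for x in a + b:  # the gap between step functions can only change at observed values
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def toy_simulation(beta, size, seed):
    """HYPOTHETICAL stand-in for a simulation run; the beta dependence is invented."""
    rng = random.Random(seed)
    mean_excess = 40.0 * (1.0 - beta) + 1.0  # arbitrary monotone link to beta
    return [12.0 + rng.expovariate(1.0 / mean_excess) for _ in range(size)]


def best_beta(betas, observed, simulate):
    """Return the beta whose simulated sample is closest to the observed one."""
    scores = {beta: ks_statistic(simulate(beta), observed) for beta in betas}
    return min(scores, key=scores.get), scores


# "Observed" data are generated from the toy simulator at beta = 0.5, with another seed.
observed = toy_simulation(0.5, size=3000, seed=1)
betas = [0.0, 0.25, 0.5, 0.75, 1.0]
fitted, scores = best_beta(betas, observed, lambda b: toy_simulation(b, 3000, seed=2))
print(fitted, scores)
```

      With real data, `observed` would hold the measured interburst intervals and `simulate` would wrap the actual simulation. If SciPy is available, `scipy.stats.ks_2samp` returns the same statistic together with a p-value.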

      3) More discussion of the modeling in connection to past theoretical results and existing literature is necessary to better contextualize the present work and assess its originality.

      We have done this closely following the specific suggestions from reviewers.

      2) Revised terminology: removing usage of “model”

      Since unintended ambiguity may be caused by use of the word “model”, which could refer to either (1) the theoretical framework, principle of emergent periodicity, and attendant analytic calculation, or (2) the agent-based simulation in the computational realization, we have removed all instances of the word “model” from the results presented in the paper, and replaced each with its specific meaning (theory or simulation) in context.

      Similarly, in responding to Reviewers’ comments, we clarify what we understand by their use of the word “model” in each case.

      3) Addressing an error in the agent-based simulation code

      We (OM and OP) have now addressed an inadvertent unit typo in the agent-based simulation code. Before the fix, the discharging time (Td) was set to 10000 ms; after the fix, Td is correctly set to 100 ms. The erroneous value caused very slow discharges, keeping the voltage high until any beta addition was received, and resulting in more frequent bursts than we’d actually expect from the model dynamics. This has been fixed, and in our responses to the reviewers, we address the results of this fix by referring to the “unit typo”. We corrected the panels corresponding to agent-based simulation in Figs. 3 and 5 to reflect the new numerical simulation results, as well as the corresponding sections in the text of the paper.
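
      The size of this effect can be illustrated with a short Python sketch. It assumes, purely for illustration, that the voltage relaxes exponentially with time constant Td; the actual discharge rule of the simulation is not reproduced here, and the function and constant names are hypothetical.

```python
import math

TD_INTENDED_MS = 100.0    # value after the fix
TD_TYPO_MS = 10000.0      # value before the fix


def remaining_fraction(elapsed_ms, td_ms):
    """Fraction of the initial voltage left after elapsed_ms, assuming exponential decay."""
    if td_ms <= 0:
        raise ValueError("td_ms must be positive")
    return math.exp(-elapsed_ms / td_ms)


# Compare both settings half a second after a discharge starts.
elapsed_ms = 500.0
fixed = remaining_fraction(elapsed_ms, TD_INTENDED_MS)
typo = remaining_fraction(elapsed_ms, TD_TYPO_MS)
print(f"Td = 100 ms: {fixed:.4f} of the voltage remains")
print(f"Td = 10000 ms: {typo:.4f} of the voltage remains")
```

      Under this assumption the voltage has almost fully decayed after 500 ms with the intended Td, but stays above 95% of its starting value with the erroneous one, consistent with the description that the voltage was kept high until the next beta addition arrived.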

      4) Addressing changes to experimental dataset

      We increased the size of our N=1 dataset (N is number of fireflies) to correctly match what was reported in the original text of 10 samples. Additionally, we have added characterization of the size of the datasets for N=5, 10, 15, and 20 fireflies.

      5) Response to Reviewer 1

      We thank the Reviewer for kind remarks, and the highlights of the strengths of the paper.

      Regarding concerns raised, point by point:

      Reviewer #1 (Public Review):

      Weaknesses:

      The work presented here is an excellent start at understanding the collective behavior of this particular species of firefly. However, the model does not apply to other species in which individual males are intrinsically rhythmic. So the model is less general than it may appear at first.

      We take the Reviewer’s point well. We have added text to the paper to clearly highlight this point.

      The modeling framework is also developed under the very stylized conditions of experiments conducted in a small tent. While that is a natural place to begin, future work should consider the conditions that fireflies encounter in the wild. Swarms that are spread out in space would require a model with a more complicated structure, perhaps with network connectivity and coupling strengths that both change in time as fireflies move around. This is not so much a weakness of the present work as a call to arms for future research.

      We agree with the Reviewer that this is an exciting call to arms for future research!

      Other comments:

      This assumption that all individuals have the same IBI distribution could be directly tested. Has this been done? If not, why not? e.g. Are there difficulties with letting one firefly flash long enough to collect sufficient data to fill out the distribution?

      1. We have generalized the theory to directly address this point by relaxing the assumption that all individuals exhibit the same inter-burst interval distribution. In short: the main insights continue to hold and we discuss the nuances in the text.

      2. Experimentally, the hypothesis that all single fireflies isolated from the group exhibit the same interburst interval (IBI) distribution could not be rigorously tested. The main reason is practical: in order to compare IBI distributions across individuals, we would need to collect a large number of fireflies and track them for long durations, which was not realistic given our experimental setup and the short window of firefly emergence. In addition, external environmental factors might slightly alter behaviors as well, making comparisons even more complex. Thus, due to paucity of field data, we eventually use the assumption that all individual fireflies follow the same IBI distribution.

      The derivation given in 6.2.1 is clearer than the approach taken here, which unnecessarily introduces Q, q, and c and then never uses them again.

      We agree with the Reviewer and have accordingly revised the manuscript.

      We have also implemented the suggested edits in the marked up manuscript. We are grateful for the detailed feedback, which helped us substantially extend results, and improve presentation and clarity.

      6) Response to Reviewer 2

      We thank the Reviewer for their thorough feedback. We provide point by point responses below.

      Reviewer #2 (Public Review):

      1) The biological relevance of certain hypotheses is insufficiently discussed. This is important because if the observed behaviour is a universal one, alternative models may explain it as well.

      We thank the reviewer for raising this point. The main hypotheses underlying our models are: 1) individual fireflies in isolation flash at random intervals; 2) these random intervals are drawn from the empirical distribution reported (implicitly: all fireflies follow the same distribution); 3) once a firefly flashes, it triggers all others. Hypothesis 1) is directly supported by the data presented. Hypothesis 2) is comprehensively addressed in the revised manuscript, as discussed previously. Hypothesis 3) is central to the proposed principle, and enables intrinsically non-oscillating individuals to oscillate periodically when in a group. The resulting phenomenon has been compared to experimental data and extensively discussed in the manuscript. Further, we have also simulated the effect of changing the strength of coupling between fireflies based on this hypothesis in the revised section on agent-based simulation.
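
      The consequence of hypotheses 1) to 3) can be sketched numerically. If every firefly draws its waiting time from the same distribution and the first flash triggers all others, the collective interval is the minimum of N independent draws. The Python sketch below uses a toy individual distribution (a 12 s minimum plus an exponential tail) as a stand-in for the empirical one, and additionally assumes that all fireflies redraw their waiting time after each collective burst; all names and parameter values are illustrative.

```python
import random
import statistics


def sample_individual_interval(rng, t_min=12.0, mean_excess=30.0):
    """Toy stand-in for the empirical single-firefly interval distribution."""
    return t_min + rng.expovariate(1.0 / mean_excess)


def sample_group_interval(rng, n):
    """First flash triggers everyone: the group interval is the minimum of n draws."""
    return min(sample_individual_interval(rng) for _ in range(n))


def summarize(n, trials=2000, seed=0):
    """Mean and standard deviation of the collective interval for a group of size n."""
    rng = random.Random(seed)
    intervals = [sample_group_interval(rng, n) for _ in range(trials)]
    return statistics.mean(intervals), statistics.stdev(intervals)


for n in (1, 5, 20):
    mean, std = summarize(n)
    print(f"N = {n:2d}: mean = {mean:.1f} s, std = {std:.1f} s")
```

      In this toy setting both the mean and the spread of the collective interval shrink as N grows, and the mean approaches the minimum of the individual distribution, which is the qualitative signature of periodicity emerging from non-periodic individuals.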

      2) Comparison between the models and the data could be improved, in particular through quantification of the differences between distributions and sensitivity analysis of the numerical results.

      1. Regarding the comparison of the agent-based simulations with experimental data, in Fig. 7, we compare the underlying distributions using the two-sided Kolmogorov-Smirnov statistical test for goodness-of-fit. This appears to us to be the most straightforward and informative approach, without over-fitting.

      2. Regarding sensitivity analysis for the agent-based simulations, for each β value from 0 to 1 we statistically compared simulations to the experimental distributions to find the most well-fitted β.

      3. Finally, owing to experimental constraints leading to sparsity of available data in characterizing the interburst distribution, we strive to strike a delicate balance when applying sophisticated statistical tools to compare theoretical and simulation distributions (with unrestricted access to large sample sizes) to the finite samples in the empirical distributions. As such, we think it is apposite to use the first two moments of the respective distributions in Fig. 3 to show the striking similarity of trends.

      Reviewer #2 (Recommendations for the authors):

      A. The assumption that single-firefly spikes obey the same distribution (there is no individual variation in the frequency, or even of the composing number of bursts, of the flash) does not seem to have been verified on the data, that are instead pulled together in one single distribution (Fig. 1D). Moreover, the main feature of such distribution is that it has a minimum at 12 secs (discarding the faster bursts that are not considered in the model) and that it is sufficiently skewed so that it takes a minimal coupling for collective synchrony to emerge. I think that the agreement between the distributions for different N would be more meaningfully discussed having previous work as a reference, whereas now this is relegated to the discussion, so that it is unclear how much of the theoretical results are novel and/or unexpected. Quantification of the distance between distributions would also be interesting: it looks like the two models (analytical and simulations) disagree more among themselves than with the data.

      Regarding the hypothesis that all individual fireflies exhibit the same interflash interval, please see our response to Main Point 1. Regarding comparing the analytical theory and numerical simulation analysis, Figs. 3 and 5 have been revised after a unit typo was found in the code (see Section 3). Following the update, the analytical and numerical models agree in that (1) the location of the peak in Fig. 3 matches for all N values, and (2) the peak approaches the minimum of the input distribution as N increases.

      B. If I understand correctly, simulations are introduced as a way to get a dependence on the intensity of the coupling (\beta). There are several issues here. First, I do not see how the coupling constant could change in the present experimental setup, where all fireflies presumably see each other (unlike when there is vegetation). Second, looking at Fig. 3, the critical coupling strength appears to depend very weakly on N, and it is not clear how the 'detailed comparison' that leads to the fit is realized (in fact, the fitted \betas look larger than those at which the transition occurs in Fig. 3A). I think a sensitivity analysis is needed in order to understand how the results change when \beta is changed, and also what the effect of the natural Tb distribution is (Fig. 2F). Results of the simulations might be clearer if, instead of using the envelope of the experimental results, the authors tried to fit it to a standard distribution (e.g., Poisson) so that it can be regularized. This should make it possible to trace with higher resolution the boundary between asynchronous and synchronous firing.

      We have included agent-based numerical simulations as a way to provide a concrete instantiation of the theoretical principle and analytical results in the preceding section. While the analytical theory is free of fitting parameters, the agent-based simulations introduce one additional fitting parameter, to see what happens when we relax one hypothesis of the analytical theory: the instantaneous triggering of all fireflies upon an initial flasher. Additionally, the agent-based simulations pave the way for future work, allowing for convenient exploration of the connectivity between individuals and analysis of the behavior of individual fireflies. In this context, please note that Fig. 5 has been corrected (see above), leading to a stronger co-dependence of β and N. In addition to the envelopes, we also report the trends in the first empirical moments (mean and STD) for comparison and tracking of the transition to synchrony.

      C. More care should be put in explaining what are the initial conditions hypothesized for the different models. For instance, the results of paragraph 3 are understandable if all fireflies are initialized just after firing, something that is only learnt at the end of the paragraph. I also wonder whether initial conditions may be involved with T_bs in the low-coupling region of Fig. 3A not being uniformly distributed, as I would have expected for a desynchronized population.

      We have clarified that, indeed, all fireflies are re-initialized after firing. The initial conditions then become a new random vector of interflash intervals. Importantly, we found after receiving the reviews that, due to inconsistent units in our numerical simulation code, Fig. 5 was incorrect. With proper units, the new results show a much more widespread distribution at low coupling, as expected by the Reviewer.
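
      The instantaneous-triggering limit described in these replies can be sketched as follows: every firefly draws a fresh interflash interval after each burst, and the next collective burst starts when the earliest firefly fires. This is a minimal sketch only. The input distribution (a 12 s minimum, as quoted by the Reviewer, plus an exponential tail with a hypothetical 10 s scale) and all names are assumptions, and the coupling β of the agent-based simulations is not modelled.

```python
import random
import statistics

T_MIN = 12.0       # s; minimum of the input distribution (value quoted by the Reviewer)
TAIL_SCALE = 10.0  # s; hypothetical tail scale, chosen only for illustration


def draw_interval(rng):
    """Draw one interflash interval from the placeholder input distribution."""
    return T_MIN + rng.expovariate(1.0 / TAIL_SCALE)


def collective_intervals(n_fireflies, n_bursts, rng):
    """Collective interburst intervals under instantaneous triggering.

    After each burst all fireflies are re-initialized with fresh draws,
    and the next burst starts when the earliest one fires.
    """
    return [
        min(draw_interval(rng) for _ in range(n_fireflies))
        for _ in range(n_bursts)
    ]


rng = random.Random(1)
mean_by_n = {
    n: statistics.mean(collective_intervals(n, 5000, rng)) for n in (1, 5, 20)
}
for n, mean_interval in mean_by_n.items():
    print(f"N={n:2d}: mean collective interval = {mean_interval:.2f} s")
```

      In this sketch the mean collective interval shrinks toward the 12 s minimum as N grows, which mirrors the stated trend that the peak approaches the minimum of the input distribution with increasing N.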

      D. I found the equations hard to understand, either because a variable was not precisely (or at all) defined or because some information was missing. Eq. 1: q is not defined. Eq. 2: explain what it means (the probability that the others have not flashed times the probability that one flashes); also, state explicitly what the 'corresponding PDF' is. Eq. 3: the equation for \epsilon(t) to which this is coupled is missing. Why introduce \beta_{i,j} and T_bi if they are then taken to be independent of the indexes? Definitions of the collective and group burst intervals should be provided. It would be clearer if t_b0 were defined in the first paragraph of the results, so as to clarify its relation with T_b as well. Define T^i_b in the caption of Fig. 3 (it is defined later than the figure is first discussed). The definition of 'the vertical axis label' (maybe find a word for that...) is pretty cumbersome. I could imagine that other definitions would allow the lines in Fig. 3E to converge to the same line for large betas, which would make more sense, considering that in the strong coupling limit I see no reason why the collective spiking should not be the same for different N (the analytical model could help here).

      Thank you for these comments; we have incorporated these and related changes.

      E. I think that the author's reading of the two 'dynamical quorum sensing' papers they cite is incorrect: De Monte et al. was not about the Kuramoto model, but the same limit cycle oscillators as in Strogatz; Taylor et al. considers excitable systems, potentially closer to noisy integrate-and-fire, at least in that they do not have self-sustained oscillations. Both papers show that oscillations appear above a certain density threshold, and that the frequency of oscillations increases with density, as found in this work. A more accurate link to previous publications in the field of synchronization theory, including the models by Kurths and colleagues for fireflies, would be useful both in the introduction and in the discussion, and would help the reader to position this work and appreciate its original contributions.

      1. Thank you for pointing out an inaccuracy in our literature citations regarding synchronization. We have now made corrections to address this point.

      2. While we take the Reviewer’s points well, our theory framework (“model”), building off of the principle of emergent periodicity we propose here, is fundamentally different in the nature of individuals from extant “models”. The reference in question has individuals as oscillators, and the fastest frequency is the frequency of the fastest individual oscillator. In contrast, in our work there is no fastest individual oscillator and the “fastest frequency” has a completely different meaning, since individuals do not have a particular frequency associated with them. In this sense, our work is not inspired by theirs. That said, we have included citations as suggested by the Reviewer.

      F. The authors say that part of the data is unpublished. I guess they mean that the whole data set will be published with this manuscript. I think the formulation is ambiguous.

      Thank you for this comment. We have now clarified that the data will indeed be published with the manuscript.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Basier and Nurse revisit the fundamental question of how the rates of RNA and protein synthesis scale with cell size. The strong null hypothesis is that synthesis scales linearly with cell size: cells that are twice as big should make stuff twice as fast. This hypothesis has been tested many times, in many systems, using many approaches over the past century and, in general, the null hypothesis has been sustained. However, there have been many examples of evidence for more complicated synthetic patterns. Whether these results indicate that biosynthesis rates vary across the cell cycle, or in response to other factors, in addition to increasing with cell size, or whether the observed deviations from the predictions of the null hypothesis have been due to artifacts of cell synchronization and labeling, is thus an open, interesting and, because biosynthesis rates have critical implications for cellular function and metabolic robustness, important question.

      The authors address the question in fission yeast using metabolic pulse labeling with a ribonucleoside or amino acid analog in asynchronous cells and single cell analysis to directly compare incorporation levels with cell size and cell cycle stage. The experiments are well designed, well executed and well controlled. Furthermore, the data are well presented and appropriately interpreted. In particular, the presentation of the size-v.-label data in Figures 2A and D, with the averages and variances in 2B and E and the normalized data in 2C and F, is easy to understand and interpret. It is thus notable that the size-v.-label data for the longer (cdc25-22) cells are omitted in favor of just the average (2H,J) and normalized (2I,K) data. These size-v.-label data should be added to Figure 2.

      We added two panels to the Figure supplementary 2 showing the requested data, the size-v.-global translation (S2E) and size-v.-global transcription (S2F).

      The authors should also explicitly state how they chose 15 µm as the inflection point in 2H; 16-17 µm seems like it would give a horizontal plateau, which would better fit their saturation explanation.

      This comment relates to the second comment of reviewer 4, see below for the detailed answer.

      The authors measure DNA content with a DNA-binding dye, the signal from which should scale linearly with DNA content. However, instead of reporting and analyzing total signal from the DNA-binding dye (or better yet, total signal in the nucleus, which they could do, having segmented the nucleus in their images), they report max signal. Using max signal is complicated because, as cells and thus nuclei increase in size, the concentration of DNA and thus the max (but not total) DNA-binding-dye signal in the nucleus decreases, requiring two-dimensional dye/size analysis (such as shown in Figure 3B) to distinguish G1 and G2 cells. The authors should use the more straightforward measure of total nuclear DNA-binding dye signal, or explicitly explain why they can't or prefer not to do so.

      The total fluorescence intensity signal of the DNA-binding dye is noisy because we had to use a low concentration of the dye. This was necessary because it allows a clearer distinction between cells with a 1C DNA content and cells with a 2C DNA content than higher concentrations do. The maximum signal per cell-v.-cell length produces distinct populations of cells in G1, or G2/M phase (see Figure 3H, and Figure 4B), and populations identified in this way have the distributions of total fluorescence intensity expected from cells in G1 and G2 or M phase (see Figure 3I and Figure S4D). We added one extra panel to Supplementary Figure 4 showing the distributions of the total fluorescence intensity signal of the DNA-binding dye for the G1, S, and G2 or M populations (S4D) for comparison.

      The authors should state in figure legends the strain numbers used for all experiments.

      We have modified all the figure legends to include the strain numbers.

      They should also cite the source of all the constituent parts (e.g. hENT1, hsvTK, EGFP-pcn1, and synCut3-mCherry) of their strains.

      The missing reference for the source of hENT1 and hsvTK (Sivakumar et al., 2004) has been added; the references for EGFP-pcn1 (Meister et al., 2003) and synCut3-mCherry (Patterson et al., 2021) were already present.

      CROSS-CONSULTATION COMMENTS My colleagues make constructive points. I agree with all of them, although I am less concerned about the use of cdc2-22 and CCP∆ to alter cell length and cell cycle distribution. Although these mutations alter CDK specific activity (and thus length and distribution) and could alter specific patterns of translation, the fact that they double at normal rates makes it seem unlikely that they could be significantly changing bulk synthesis rates.

      Reviewer #1 (Significance (Required)):

      As noted above, this work addresses an open, interesting and important question. Moreover, it provides useful data in a specific system and a useful example of a general experimental approach to the problem. However, it does not settle the question of how biosynthesis scales with size, even in the specific case of fission yeast. In particular, it shows that protein synthesis plateaus just above normal cell size, whereas RNA synthesis scales up to twice normal cell size. This observation is striking, because there is no obvious mechanism that would (and the authors offer no suggestion of how to) explain how protein synthesis could be limited if RNA synthesis is not. Therefore, the strength of the paper is that it identifies an intriguing phenomenon and its limitation is that it does not provide any testable hypotheses to explain that phenomenon.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: Basier and Nurse investigate how "cell size, the amount of DNA, and cell cycle events affect the global cellular production of proteins and RNA molecules". Both transcription and translation, driving the production of biomass, have been shown to increase as a function of cell size in various systems. However, whilst cell size generally correlates with cell cycle progression, the literature is inconsistent on whether global cellular translation and transcription are affected by cell cycle state. They argue that this might be due to perturbations induced by the different synchronisation methods used in the various studies.

      Therefore, in this study, to avoid potential perturbation from synchronisation methods, they developed a system that allows them to assay unperturbed exponentially growing populations of fission yeast cells. The assay is based on single-(fixed)cell measurements of cell size, cell cycle stage, and the levels of global cellular translation and transcription. This allows them to correlate cell cycle state, cell size and global cellular translation and transcription levels at the single cell level under unperturbed conditions.

      Their results show that translation and transcription steadily increase with cell size, but that the rate of translation, but not transcription, becomes rapidly restricted when cells become larger than wild type dividing cells. This suggests that it is unlikely that the synthesis of RNA is the limiting factor for translation rate in large cells. In addition, their data indicate that translation scales with size, but that the rate increases faster at late S-phase/early G2 and even faster early in mitosis before decreasing in mitosis and on return to interphase. Transcription, on the other hand, increases as a combination of size and the amount of DNA. Overall, this suggests that cell cycle control affects global cellular translation and transcription, which is in line with some studies, but not others. As far as I can tell the assays and data analysis are robust and the data support the general conclusions.

      Major comments I agree that inconsistent results published on this topic might be due to perturbations induced by the different synchronisation methods used in the various studies. However, although this is much less emphasised in the paper, they likely also depend on the model system used. For example, in budding yeast there is strong evidence for gene expression homeostasis, i.e. gene expression increases as a function of size, independent of gene copy number. Do the authors believe this is a budding yeast specific phenomenon or is this a consequence of specific synchronisation methods used in budding yeast?

      Gene expression homeostasis has been suggested for budding yeast, but in contrast recent work in budding yeast also suggests that gene expression increases with the genome copy number and therefore the gene copy number in addition to cell size (Swaffer et al., 2022 – currently on bioRxiv). The differences that have been reported might be due to perturbations such as synchronisation methods as well as differences between yeast species.

      Whether growth rate increases linearly or exponentially has been the topic of decades-long debates. Their data indicate that the translation rate increases faster at late S-phase/early G2 and even faster early in mitosis before decreasing in mitosis and on return to interphase, 'resetting' the growth rate. This suggests an exponential, rather than linear, increase in biomass (i.e. growth rate?), but this is not explicitly pointed out. It would be good to get the authors' opinion on this in the discussion.

      Assuming that protein degradation remains constant throughout growth, the increase of translation with cell size suggests that the growth rate increases as cells grow in size, possibly exponentially. In addition, our data showing that the translation rate increases from G1 to G2 for the same cell size, suggests that for cells of a given size the growth rate is faster in G2 than in G1. Thus, growth could be basically exponential but the speed of increase accelerates at the transition between S and G2, and early in mitosis, slowing down later in mitosis. We added the following sentence to the discussion section “Global transcription and translation increase with cell size possibly exponentially, but the changes in global translation during transitions through cell cycle stages suggest that the speed of growth is modulated by cell cycle progression, increasing between S and G2, and early in mitosis, and slowing down later in mitosis.”.

      The authors state that their approach has allowed them to determine how cellular changes are arising from progression through the cell cycle. However, they use fixed cells, rather than live cell imaging, so can't claim to have established changes during cell cycle progression, but only a correlation with cell cycle state/phase. Whilst this could be used as a proxy for progression it should be clearly stated in the abstract and elsewhere to prevent confusion. I for one, based on the abstract, thought they developed a live cell imaging strategy to look at this.

      We have modified the abstract to reflect the fact that the cells were fixed in our assays (line 36).

      In reference to the Stonyte et al. study, in addition to the different conditions (temperature shift and isoleucine medium), why do the authors think their findings are different? Is it the lack of correlation to cell size in the Stonyte paper or something else? For example, could different growth conditions (as in the Stonyte paper), where fission yeast cells spend more time in G1, be used instead of the CCP mutant? Can the authors exclude that the lack of G1-S/cyclin-CDKs is at the basis of a lower rate of translation in G1 and S phase cells? Either these experiments should be carried out or this should be discussed in more detail.

      In the study carried out by Stonyte et al., the relative translation rate per cell (a measurement related to our measurement of translation normalised per unit of length) of wild type fission yeast cells grown asynchronously in isoleucine minimal medium is constant between the G1 and the S phase cell populations, and is higher in the G2 population compared to the S phase population (Figure 2D of Stonyte et al., 2018). This is consistent with the lack of increase that we observe for a given cell size from G1 to S, and the increase we observe from S to G2 in Figure 3K. In the same figure, Stonyte et al. find no difference between the G2 and the M-G1 populations but are not able to distinguish cells at different stages of mitosis or in early G1. Our study suggests that translation increases early in mitosis before decreasing after anaphase A; thus, in the Stonyte et al. study, pooling all stages of mitosis and early G1 cells might mask the dynamics of what is happening during mitosis. The lack of G1-S/cyclin-CDK could be the basis for the lower rate of translation in G1 and S-phase. We discuss this further in a reply to the first question of the significance part of reviewer 2 and have added a section to the discussion of the paper (see below for details).

      If the signal-to-noise ratio is reduced by 20 minutes EU incubation (rather than 10 minutes), why wasn't it used in all experiments?

      To measure RNA production as close as possible to the instantaneous rate of RNA synthesis, we sought to use the shortest pulse possible. We did this because the half-lives of some RNA species are short; in particular, the half-life of the pool of mRNA has been reported to be around 13.1 minutes in budding yeast (Chan et al., 2017). In longer pulses, some RNA molecules that have been synthesised after addition of EU will therefore have been degraded before cells are fixed, producing a measurement that underestimates the rate of RNA synthesis. We chose to incubate cells for 10 minutes as we estimated it to be the shortest time generating a signal-to-noise ratio above 1 (Figure 1F). The one exception to this was the pulsing of the CCP∆ EGFP-pcn1 hENT1 hsvTK mutant cells, which incorporate less EU during the same time frame, so we incubated this strain for 20 minutes to generate enough signal to be quantifiable (see line 237, “we assayed CCP∆ EGFP-pcn1 hENT1 hsvTK cells for global transcription using a 20-minute EU incubation to compensate for their lower signal production”).
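
      The size of the underestimate mentioned in this reply can be illustrated with a simple calculation. The sketch below assumes a constant synthesis rate and a single first-order decay with the 13.1-minute half-life quoted above; both are simplifying assumptions, and stable RNA species would be far less affected. Under these assumptions the fraction of RNA synthesised during a pulse of length T that is still present at fixation is (1 - exp(-kT)) / (kT), with k = ln 2 / half-life.

```python
import math


def retained_fraction(pulse_min, half_life_min):
    """Fraction of RNA made during the pulse that survives until fixation.

    Assumes constant synthesis and first-order decay (an illustrative model only).
    """
    k = math.log(2) / half_life_min   # decay rate constant, per minute
    x = k * pulse_min
    return (1.0 - math.exp(-x)) / x


HALF_LIFE_MIN = 13.1  # mRNA pool half-life cited in the reply (budding yeast)

for pulse in (10, 20):
    fraction = retained_fraction(pulse, HALF_LIFE_MIN)
    print(f"{pulse}-minute pulse: {fraction:.2f} of labelled RNA retained")
```

      For a species with this half-life, roughly 78 % of the labelled RNA would remain after a 10-minute pulse and roughly 62 % after a 20-minute pulse, which is why the shorter pulse sits closer to the instantaneous synthesis rate.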

      And the conclusion that the increase in transcription is not showing any discontinuities, are they referring to the triplicates in the supplementary figure 2?

      We think there might be a misunderstanding. We conclude that the increase in transcription shows no discontinuity because the median transcription increases steadily with cell length in Figure 2E. We have added “since global transcription increases smoothly with cell length (Figure 2E)” to clarify the text.

      Minor comments Lines 168-169: should be Figure 2F, S2C, S2D rather than Figure 2C, S2A, S2B.

      The figure numbers have been corrected in the manuscript.

      Line 179: doubling time instead of growth rate?

      The mention of “growth rate” has been changed to “doubling time” in the manuscript.

      Lines 184-186: There is an overall trend of slight decrease in transcription per length in cdc25-22 cells but a slight increase in wild-type cells. How does this differ to wild-type cells? Are these non-significant changes and could these be attributed to the low signal to noise ratio?

      These changes may be due to the low signal to noise ratio in the cdc25-22 transcription assay. We have added “The decrease with cell length in transcription that we observe in the cdc25-22 hENT1 hsvTK (Figure 2K) cells but not in the hENT1 hsvTK cells (Figure 2F) may be due to the low signal to noise ratio”.

      There is no cell size that is specific to S phase; it falls within the range of G1 and G2 cells. Since this strain has a variable onset of S phase, the phase durations could differ. Therefore, could time spent in each phase affect the translation rate (live cell imaging, i.e. progression, could address this, but fixed-cell correlation could not)?

      It is possible that the phase duration of G1 and G2 could differ from one cell to another. There is no evidence that the length of S-phase varies in these cells. It would be interesting to measure how the phase length influences translation, but our techniques do not allow for the measurement of global translation in living cells.

      The data reflects translation/transcription in single cells at a specific cell cycle phase, not during the transition between cell cycle phases. Therefore, it would be more appropriate to only use G1, S, G2 and M rather than S/G2 transition or early G2.

      Our data represent cells at fixed cell cycle phases and we do not monitor the transitions themselves directly. However, the discontinuity in signal for cells of the same size in consecutive stages of the cell cycle (for instance the discontinuity in translation between S and G2 cells of the same size in Figure 3J) is indicative that the transition between the two cell cycle phases is a consequence of a rate change.

      In figure 4C, there is a decrease in global transcription after 13 µm (black line showing all cells), which they don't see in cdc25-22 mutants. Their conclusion that global transcription is constantly increasing with cell size is based on cdc25-22 cells, but the experiment in CCP mutant cells shows a decrease in the median of transcription. Are there replicates for these experiments as in figure 2 and supplementary figure S2? Maybe an average trend can be plotted too? Apart from the first set of experiments (figure 2 and supplementary figure 2), they don't show replicates for other strains. Maybe they can include another graph as in figure 3D and 3K of average replicate values?

      The apparent decrease in transcription in long cells in Figure 4C is seen in only one length bin (13.5 µm), which contains fewer cells than the bins directly before it (89 cells, compared to 216 cells for the 12.5 µm bin and 316 cells for the 11.5 µm bin). This might have resulted in a higher variability in the measurement of the population median. We do not see the same decrease at 13.5 µm in the wild type (Fig 5G), the cdc25-22 mutant (Fig 2J), or the CCP∆ strain (Fig S4B), so on balance we favour the interpretation that the decrease observed in the longer length bin of Figure 4C is due to variability caused by the lower number of cells in that bin.
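
      The effect of bin size on the variability of a median can be illustrated with a small simulation. The sketch below uses synthetic log-normal values as a stand-in for single-cell signal (an assumption, not the measured data) and the three bin sizes quoted in the reply; the function name and the number of replicates are hypothetical.

```python
import random
import statistics


def median_spread(bin_size, n_replicates, rng):
    """Standard deviation of the bin median across repeated synthetic samples."""
    medians = [
        statistics.median(
            [rng.lognormvariate(0.0, 0.5) for _ in range(bin_size)]
        )
        for _ in range(n_replicates)
    ]
    return statistics.stdev(medians)


rng = random.Random(2)
# Bin sizes taken from the reply: 13.5 µm, 12.5 µm and 11.5 µm bins.
spread = {n: median_spread(n, 1000, rng) for n in (89, 216, 316)}
for n, s in spread.items():
    print(f"{n:3d} cells per bin: spread of the median = {s:.3f}")
```

      In this synthetic setting the median of the 89-cell bin fluctuates noticeably more than that of the 216- and 316-cell bins, consistent with the expectation that the sampling error of a median falls roughly as one over the square root of the number of cells.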

      CROSS-CONSULTATION COMMENTS I believe that since the whole premise of this study is that by using unperturbed conditions their findings are different from previously published work they should either clearly point out that this difference might be due to using mutations affecting CDK activity or carry out an experiment in media that induces a G1 population. CDK has been strongly implicated in promoting translation. Using a strain that lacks the G1 and S cyclin CDKs or compromised M-CDK is therefore likely to have an effect on translation, which could be at the basis of the increase in translation during the G2 (and S) phase of the cell cycle.

      This is addressed in the next comment.

      Reviewer #2 (Significance (Required)):

      As far as I can tell the assays and data analysis are robust and the data support the general conclusions. However, whilst the cells are assessed in unperturbed conditions, they do use CDK mutants and the cdc25ts mutant to establish gene expression during the different phases of the cell cycle, which could affect translation/transcription rates. This should either be clearly pointed out or complemented with an experiment where WT cells are grown in conditions that induce distinct G1-S-G2 populations of cells.

      The cell cycle stage and CDK activity are intrinsically linked. CDK activity defines the cell cycle stage so that an increase in CDK activity through the cell cycle is responsible for cells progressing through G1, S, G2, and mitosis (Coudreuse and Nurse, 2010, Swaffer et al., 2016). Nutritional conditions that induce a G1 also rely on repression of CDK activity through increased production of the Rum1 inhibitor (Rubio et al., 2018) to generate a G1 population. Therefore, uncoupling CDK activity from the cell cycle would not be possible in an unperturbed cell population. We have added the following paragraph to the discussion to address the comment “The cell cycle stage of a cell and the activity of its CDK molecules are intrinsically linked since CDK activity defines the cell cycle stage of a cell. CDK activity increases through the cell cycle and is responsible for cells progressing through G1, S, G2, and mitosis [44,53] so that an unperturbed asynchronous population of cells in G1 is achieved by a low CDK activity. Thus our results reflect changes happening through the cell cycle as the CDK regulation network undergoes modifications, and an unperturbed cell cycle therefore cannot be uncoupled from CDK activity.”.

      Overall, the work presented suggests that cell cycle control affects global cellular translation and transcription, which is in line with some studies, but not others. Whilst the study falls short of testing/establishing the (potential) mechanisms involved, these are important findings, which can be used to guide new studies into how the production of biomass is controlled as cells proceed through the cell cycle.

      The cell size field, which is considerable and growing, will be interested in this work.

      I have expertise in cell cycle control and genome stability, with a focus on the G1-to-S transition and cell cycle checkpoints during interphase.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary Basier and Nurse use fission yeast as a model system to investigate how transcription and translation are coupled to cell-cycle progression. They use metabolic labeling in exponentially growing cells and analyze single cells by microscopy. They find that translation scales with size and increases at S/G2 and early mitosis while transcription increases with both size and the amount of DNA. They suggest that changes in CDK activity regulate changes in global translation rates.

      Major comments: 1) The paper addresses a much-disputed question in the field. The approach makes the most of the fission-yeast model system and the experiments are beautifully performed. The conclusions are well supported by the data. The experiments are replicated adequately and the statistical analyses are appropriate.

      2) The use of cdc25 and in particular the cig1Δ cig2Δ puc1Δ mutants to manipulate cell size is not without challenges when monitoring translation rates. A number of reports in different model organisms suggest that CDK activity can regulate translation. Work from the Nurse lab identified translation factors as CDK substrates (Swaffer et al, 2016), RNApolIII activity and thus tRNA levels are regulated in the cell cycle by CDK in budding yeast (Herrera et al, 2018), phosphorylation of the ribosomal protein RPL12 by CDK1 is required for translation of at least some proteins in mitosis in human cells (Imami et al, 2018), as is phosphorylation of DENR (Clemm von Hohenberg et al, 2022). The authors also suggest that changes in CDK activity might be responsible for the observed changes in global translation rates. It is important to consider whether using mutants impinging on CDK activity might lead to under- or overestimating cell-cycle dependent translation. The authors should either discuss this issue and tune down the hypothesis that CDK activity regulates changes in global translation rates, or use another approach to address the issue. One could use a replication mutant such as cdc17 or cdc20 to alter cell size without interfering with CDK activity. These experiments would strengthen the conclusions and might support the idea that CDK activity regulates changes in global translation rates.

      References

      Clemm von Hohenberg K, Müller S, Schleich S, Meister M, Bohlen J, Hofmann TG, Teleman AA (2022) Cyclin B/CDK1 and Cyclin A/CDK2 phosphorylate DENR to promote mitotic protein translation and faithful cell division. Nat Commun 13: 668

      Herrera MC, Chymkowitch P, Robertson JM, Eriksson J, Bøe SO, Alseth I, Enserink JM (2018) Cdk1 gates cell cycle-dependent tRNA synthesis by regulating RNA polymerase III activity. Nucleic Acids Res 46: 11698-11711

      Imami K, Milek M, Bogdanow B, Yasuda T, Kastelic N, Zauber H, Ishihama Y, Landthaler M, Selbach M (2018) Phosphorylation of the Ribosomal Protein RPL12/uL11 Affects Translation during Mitosis. Mol Cell 72: 84-98 e89

      Swaffer MP, Jones AW, Flynn HR, Snijders AP, Nurse P (2016) CDK Substrate Phosphorylation and Ordering the Cell Cycle. Cell 167: 1750-1761 e1716

      As discussed above in the reply to reviewer 2, the cell cycle stage and CDK activity are intrinsically linked: CDK activity defines the cell cycle stage, so that an increase in CDK activity through the cell cycle is responsible for cells progressing through G1, S, G2, and mitosis (Coudreuse and Nurse, 2010, Swaffer et al., 2016). Therefore, uncoupling CDK activity from the cell cycle is not possible in an unperturbed population. Temperature sensitive mutants of cdc20 (Ramirez et al., 2015, Win et al., 2002) and cdc17 (Jimenez et al., 1992) cause loss of viability when cells are shifted to the restrictive temperature, so it cannot be assumed that they are in unperturbed conditions, which makes results hard to interpret. It should be noted that, as far as possible, we have tried to avoid perturbations in these experiments. In addition, the fraction of cells permeabilised in our assay decreases significantly when cells are grown above 30 °C, making it difficult to assay such temperature shifts.

      Minor comments: 1) The figures are beautifully presented, easy to understand and the cartoons present the experimental strategies very clearly.

      2) A major feature of the approach is that translation and transcription are monitored in exponentially growing cells, which are not exposed to any stress such as cell-cycle synchronization. However, one could argue that the analogues used for labeling impose some kind of stress, even if this is not very likely at the labeling times employed. A simple control experiment where the growth rates of labeled and unlabeled cells are compared would strengthen the claim that these are indeed happily growing cells.

      It is possible that incubating cells with the analogues could impose some kind of stress on the cell, although that could be said about almost any experimental procedure. We have added two supplementary figures with the suggested experiments, showing that incubating cells with EU has little or no impact on their doubling time (we see at most a 2.4 % increase in doubling time in hENT1 hsvTK cells incubated with 20 µM EU, Figure S1I) and that incubating cells with HPG has little impact on their doubling time (we see an 8.6 % increase in doubling time in wild type cells incubated with 10 µM HPG, Figure S1H). Considering the small impact of analogue incubation on the doubling time of the population, and the fact that cells are only exposed to the analogue for a short time in our assays (compared to continuous growth in the presence of the analogue in the growth curves presented in Figure S1H and I), we conclude that the stress imposed is low.
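
      As an illustration of the comparison described above, the following minimal sketch estimates a doubling time from an exponential-phase growth curve and expresses the effect of an analogue as a percentage increase. It assumes a simple log-linear least-squares fit; the function names and the synthetic curves are hypothetical and are not taken from the manuscript or its analysis code.

```python
import math


def doubling_time(times_min, densities):
    """Estimate doubling time (minutes) from an exponential-phase growth curve.

    Fits ln(density) = a + r * t by ordinary least squares and returns ln(2) / r.
    """
    logs = [math.log(d) for d in densities]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    rate = sxy / sxx  # exponential growth rate per minute
    return math.log(2) / rate


def percent_increase(reference, treated):
    """Percentage change of `treated` relative to `reference`."""
    return 100.0 * (treated - reference) / reference


# Synthetic, illustrative curves (assumed values, not data from the manuscript).
times = [0, 60, 120, 180, 240]
untreated = [0.1 * 2 ** (t / 150.0) for t in times]
with_analogue = [0.1 * 2 ** (t / 160.0) for t in times]

change = percent_increase(doubling_time(times, untreated),
                          doubling_time(times, with_analogue))
print(f"Doubling time increase: {change:.1f} %")
```

      Fitting on the log scale keeps the estimate independent of the starting density, so cultures inoculated at slightly different densities can still be compared.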

      3) Please comment why the length of the EU labeling differs from figure to figure. In fig 2C, S2C and S2D the labeling on the y axes states 10 min, in Fig 4C it says 20 min.

      Please refer to the reply to reviewer 2 on the same topic.

      4) Lines 118-119 "The pulse signal was five times the background signal." Figure S2A,B show large variation in signal intensity after 5 min labelling. It is not clear how the pulse signal was estimated to be five times the background signal.

      We have added two panels for the supplementary figure 2 showing how the signal to noise ratio was computed for the HPG assay after 5 minutes of incubation (Figure S2G) and for the EU assay after 10 minutes of incubation (Figure S2H).
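
      One simple way to express such a ratio is sketched below. It assumes the ratio is taken between the median per-cell signal after the pulse and the median per-cell signal of background controls; this definition, the function name and the values are illustrative assumptions, and the computation actually used is the one shown in Figure S2G and S2H.

```python
from statistics import median


def signal_to_background(pulse_intensities, background_intensities):
    """Ratio of the median per-cell pulse signal to the median background signal."""
    background = median(background_intensities)
    if background <= 0:
        raise ValueError("background median must be positive")
    return median(pulse_intensities) / background


# Illustrative per-cell fluorescence values in arbitrary units (not real data).
background_cells = [18.0, 20.0, 22.0, 19.0, 21.0]
pulse_cells = [80.0, 95.0, 100.0, 110.0, 130.0]

ratio = signal_to_background(pulse_cells, background_cells)
print(f"Signal is {ratio:.1f} times the background")
```

      Using medians rather than means makes the ratio less sensitive to the cell-to-cell variation the reviewer points out.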

      5) In Fig S4C transcription is up by ca 60 % from G1 to G2, while in Fig 4D transcription is up by ca 25-30%, also from G1 to G2. The only difference I can see is the use of PCNA-GFP. Please comment what the reason might be.

      In Figure 4D, transcription is up 33 % from G1 to G2, and in Figure S4C, transcription is up 62 % from the 1C to the 2C population. The EGFP-pcn1 strain might have a small growth defect, which could explain its lower signal production: a slower growth rate might mean that the concentration of RNA polymerase is lower in this strain, and the dynamic equilibrium model predicts that this would result in a smaller increase from G1 to G2 compared to cells with a higher concentration of RNA polymerase. This explanation is, however, speculative.

      6) Fig 1 B images of unlabeled control cells should also be shown.

      We have added 2 panels to the supplementary figure 1 showing the background controls in which cells are fixed immediately after addition of the analogue for the HPG assay (Figure S1F) and for the EU assay (Figure S1G).

      7) Lines 156 "to investigate how global cellular translation and transcription are affected by cell size, and by progression through the cell cycle" should be amended. Throughout the description of data in figure 2 binucleated and septated cells were excluded from the analyses, meaning that the data only represent cells in G2. The text should make this clear.

      "to investigate how global cellular translation and transcription are affected by cell size, and by progression through the cell cycle" has been changed to "to investigate how global cellular translation and transcription are affected by cell size and by progression through G2" to reflect the fact that binucleated and septated cells are excluded from the analysis on this figure.

      8) Lines 241-243 "the S-phase subpopulation was found to have an intermediary global transcription value between the G1 and G2/M subpopulations of around 20-25 %." And Lines 310-313 "the rate of transcription is increased in cells undergoing S-phase by 20 % and is 35 % higher in G2 cells which have completed S-phase, indicating that DNA content is limiting the global rate of transcription." It is unclear what the percentage values refer to and which populations exactly are being compared.

      "the S-phase subpopulation was found to have an intermediary global transcription value between the G1 and G2/M subpopulations of around 20-25 %" has been changed to “the S-phase subpopulation was found to have an intermediary global transcription value between the G1 and G2/M subpopulations with an increase of around 20-25 % compared to the G1 subpopulation” and “the rate of transcription is increased in cells undergoing S-phase by 20 % and is 35 % higher in G2 cells which have completed S-phase, indicating that DNA content is limiting the global rate of transcription” has been changed to “the rate of transcription is increased in cells undergoing S-phase by 20 % compared to G1 cells and is 35 % higher in G2 cells which have completed S-phase compared to G1 cells, indicating that DNA content is limiting the global rate of transcription”. These changes hopefully will clarify what populations comparisons the percentage values are referring to.

      9) Line 85 "Asynchronous cultures ... have not detected" rephrase; change detected to displayed or similar.

      “detected” has been changed to “displayed”

      10) Line 243 Figure 4J, K should read Figure 4C, D.

      “Figure 4J, K” has been changed to “Figure 4D, C”

      CROSS-CONSULTATION COMMENTS

      I also agree with the comments made by the colleagues. As for the use of the cyclin and cdc25 mutants: I agree with Reviewer #1 that it is unlikely that bulk synthesis rates are considerably different, since these strains are growing at more or less normal rates. However, I also agree with reviewer #2 that these mutants cannot be considered as unperturbed conditions. I suspect subtle regulation and in particular cell-cycle dependent regulation might well be lost. At the very least the focus of the interpretation should be on translation/transcription as a function of size, rather than in terms of cell-cycle regulation.

      Reviewer #3 (Significance (Required)):

      Basier and Nurse address a long-standing question in the cell-cycle field, namely how/whether transcription and translation are coupled to cell-cycle progression. This is technically challenging to address, and many previous studies were hampered by the necessity to synchronize the cells in the cell cycle. The approach of this study of using metabolic labeling in non-synchronized cells is not novel in itself. However, the analysis by microscopy is superior to previous flow-cytometry based strategies in that it allows the use of cell-cycle markers and thereby precise identification of cells in each cell-cycle phase. In addition, it allows accurate measurements of cell size and thus addressing questions of correlations between cell size and transcription / translation rates. A further strength of the study design is that they investigate both transcription and translation in parallel. The authors very nicely review the existing literature and point out the likely reasons for conflicting conclusions (synchronization methods, choice of model system). The advantages of their approach, such as single-cell analyses in non-synchronized cells and the use of cell-cycle markers make their conclusions less likely to be flawed and thus represent an important advance in the field. These findings are of interest for researchers working on the cell-cycle field and on the translation field. There have been significant technical advances in the translation field in recent years, allowing studying not only global translation but also translation of specific mRNAs. I expect that the old questions of coupling cell cycle and cell growth will be revisited also by others, exploiting these new approaches. My field of expertise extends to the cell-cycle field and the regulation of translation and the use of fission yeast.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary

      Single cell measurements (flow cytometry and imaging) from unperturbed cells are obtained to investigate scaling of transcription and translation in fission yeast. A key finding is that translation and transcription are somewhat differentially responding to changes in cell size and cell cycle. Perhaps the most central finding of this manuscript is that transcription is not a limiting factor to translation and suggests that transcription is not limiting growth (increase in biomass).

      Major comments: What I like in this manuscript is that the translation and transcription measurements have been carefully checked to reflect the initial rates before the HPG and EU signals lose their linearity. More generally, experiments have been conducted with appropriate controls, and the analysis of unperturbed cells in each cell cycle phase is likely to be highly relevant for resolving some of the controversies in the field. Most claims and the conclusions are well supported by the data. Although it is encouraging that the results for translation match the single cell mass measurements in mammalian cells (e.g., ref 18), I would have liked to see some more discussion about the potential caveats of the performed analyses such as the low signal to noise ratio in EU incorporation and other potential technical issues, which might have confounded the results. As an example, looking at Figs 1B and E, most of the protein and RNA synthesis signal is nuclear localized. Is this due to nucleolar staining and incorporation of the labels into nascent ribosomes? Yet the manuscript mentions that roughly half of RNA is for rRNA and for ribosomal proteins the fraction of HPG incorporation might be even lower. This statement does not sound entirely consistent with the experimental images shown in Fig 1. Please clarify.

      We had initially performed modelling to estimate the proportion of rRNA in transcription, but after reconsideration we agree that it is difficult to assess whether the spatial pattern we observe is consistent with the statement that roughly half of the nascent RNA is rRNA. There is signal in the cytoplasm, indicating that within the pulse time some RNAs are exported from the nucleus; thus the localisation of the RNA signal is not necessarily an accurate indication of the fractions of the different RNA types in global transcription. We have removed the statement “Although the precise fractions of the different types of RNA in global transcription have not been fully characterised, recent work indicated that only half of the newly synthesised RNA consists of ribosomal RNA molecules, suggesting that a significant portion of transcription is dedicated to the production of messenger and other RNA molecules [27].” It cannot be concluded from Figure 1B that most of the protein synthesis is located in the nucleus. As mentioned in the text, we cannot differentiate between proteins being synthesised in the nucleus and proteins being rapidly imported; we also cannot say what fraction of the proteins synthesised are related to ribosome biogenesis.

      A curious thing that has been glossed over is that the transcription and translation seem not to be completely linear but to display opposite patterns (translation slightly reducing, transcription slightly overshooting with cell size compared to a linear model). It remains possible that this could be experimental noise and a visual pattern that is not real, but it could also be relevant for growth control. For example, my interpretation from Fig. 2B is that the signal is not linear and starts to saturate around 10.5 µm cell length as seen from the upper IQR. Related to this, I think it is an oversimplification to force the data to appear as a discontinuous linear trend by splitting the data in 2H into two segments. Such a treatment will obviously match the data better than a single linear regression, but perhaps some nonlinear model would actually be much more accurate, unless you can point out some kind of regulatory event at the intersection of these two linear segments. In my opinion the current data looks more like a typical (logarithmic) growth curve of the cell population reaching saturation. Please comment.

      We agree that fitting two linear regressions for cells shorter and longer than 15 µm in Figure 2H and 2I was an oversimplification which could result in a false discontinuity in the data. This echoes a comment from reviewer 1 pointing out that 15 µm might not be the length at which the transition occurs. We have removed the linear regressions and added a locally estimated scatterplot smoothing (LOESS) function, which captures the nonlinear transition between the increase of translation with size and the saturation, and we have changed the cell length at the estimated saturation from 18 to 19 µm in the text to better reflect the trend.
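
      For readers unfamiliar with LOESS, the sketch below shows the idea in its simplest form: at each point, a straight line is fitted to the nearest neighbours with tricube distance weights, and the fitted value at that point is kept. This is a minimal, dependency-free illustration under assumed settings (degree 1, a hypothetical `frac` of 0.5, no robustness iterations); it is not the implementation or the parameters used for Figure 2H and 2I.

```python
import math


def loess(xs, ys, frac=0.5):
    """Locally weighted linear regression (degree 1, tricube weights).

    Returns one smoothed value per input point.
    """
    n = len(xs)
    k = min(n, max(3, math.ceil(frac * n)))  # neighbours used in each local fit
    fitted = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [(1.0 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0  # fall back to a local weighted mean
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Illustrative saturating relationship between cell length and signal (not real data).
lengths = [8.0 + i for i in range(16)]
signal = [100.0 * (1.0 - math.exp(-(x - 6.0) / 5.0)) for x in lengths]
smoothed = loess(lengths, signal, frac=0.5)
```

      Because each fit is local, the curve can follow a gradual transition from increase to saturation without imposing a breakpoint at a fixed cell length.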

      The main conclusion presented in the abstract is that scaling in transcription may result from dynamic equilibrium between RNA polymerases and available DNA template. This is a bit of speculative part, which I was not too fond of. The dynamic equilibrium idea has been suggested also elsewhere (refs 47) and is not well developed in this manuscript. There is a lack of mechanistic understanding and no formal (mathematical) model to support this idea. For example, global transcription increases much less (1.3-1.4x) than expected based on the increase in DNA content from G1 to G2 (2x). Is this expected based on the dynamic equilibrium model?

      The dynamic equilibrium model has been proposed and developed by Swaffer et al. (2022 – currently on bioRxiv) based on mass action kinetics describing the interaction between RNA polymerases and DNA. The model predicts that transcription increases with cell size and with the amount of DNA. With this model, the increase in transcription with DNA for a given cell size is also a function of cell size. Smaller cells are predicted to have a smaller relative increase in transcription from 1C to 2C DNA content than larger cells. This implies that, depending on the cell size to DNA ratio of a cell, the increase in transcription produced by a doubling in the amount of DNA ranges from a small increase (at a small cell size to DNA ratio) to a doubling (at a large cell size to DNA ratio). Thus, in our view the 1.3-1.4x increase in transcription we observe from G1 to G2 is consistent with the dynamic equilibrium model.
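
      The qualitative behaviour described above can be reproduced with a toy mass-action calculation. The sketch below is not the model of Swaffer et al.; it is a simplified illustration that assumes the number of RNA polymerases scales with cell size, the number of binding sites scales with DNA content, free polymerase and free sites are in a binding equilibrium with dissociation constant `kd`, and transcription is proportional to the amount of DNA-bound polymerase. All parameter names and values are hypothetical.

```python
import math


def bound_polymerase(cell_size, dna_copies, pol_per_size=1.0,
                     sites_per_copy=10.0, kd=0.1):
    """Equilibrium amount of DNA-bound polymerase in a toy mass-action model.

    With total polymerase P, total sites D, volume V and bound complex B,
    equilibrium requires kd * V * B = (P - B) * (D - B); the physical
    solution is the smaller root of that quadratic.
    """
    total_pol = pol_per_size * cell_size        # P scales with cell size
    total_sites = sites_per_copy * dna_copies   # D scales with DNA content
    s = total_pol + total_sites + kd * cell_size
    return (s - math.sqrt(s * s - 4.0 * total_pol * total_sites)) / 2.0


def fold_increase_on_replication(cell_size, **params):
    """Relative change in bound polymerase when DNA content goes from 1C to 2C."""
    return (bound_polymerase(cell_size, 2, **params)
            / bound_polymerase(cell_size, 1, **params))


for size in (1.0, 5.0, 10.0, 50.0, 100.0):
    print(f"size {size:6.1f}: fold increase {fold_increase_on_replication(size):.2f}")
```

      In this toy version, small cells have almost all polymerase already bound, so doubling the DNA adds little, whereas in large cells polymerase is in excess and doubling the DNA nearly doubles the bound amount. Intermediate size to DNA ratios give intermediate fold changes between 1 and 2.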

      I am somewhat concerned about the interpretation of the S phase data in the global transcription measurement. The quantification in Fig. 4D shows S phase being intermediate between G1 and G2. Yet, when you look at the data in Fig. 4C, the S phase median is clearly discontinuous, with higher transcription in smaller S cells. I believe this could affect the normalized data in Fig. 4D and result in the apparent increase in transcription in S phase cells. Having said that, I am not sure if this small S phase transcription is noise (low cell counts?) or a real S phase specific regulation of transcription which is not DNA content dependent. This result is one of the most central ones in this paper to differentiate between transcriptional and translational scaling. Therefore, additional data or insights would be highly appreciated.

      It is possible that the discontinuity in the medians of the S phase population in Figure 4C is the result of noise due to the low cell count in the short size bins (115 cells at 6.5 µm, 404 at 7.5 µm). In addition, because we cannot measure DNA with a degree of accuracy high enough to identify how advanced in S-phase each cell is, we do not know the distribution of the advancement into S-phase of cells for each length bin. This is complicated by the fact that some cells of the CCP∆ mutant start S-phase whilst still septated and might be in a late S-phase stage by the time the cell splits, so the median global transcription of the shorter length bin does not necessarily reflect the median of early S-phase cells. Hence the discontinuity observed with cell length does not necessarily suggest that there is a discontinuity happening through S-phase. We suggest that since the mean global transcription per cell length of cells in S-phase is in between the mean global transcription per cell length of cells in G1 and in G2, the increase happens through S-phase. To reflect this possibility we have added “It is also possible that the increase happens at a certain stage of S-phase independent of the amount of DNA since we do not know the extent of S-phase of each cell.”

      Minor comments: Line 61: "patterns of protein RNA". I guess this refers to patterns of protein/RNA synthesis?

      “patterns of protein and RNA” has been changed to “patterns of protein and RNA synthesis”.

      Line 248: typo "Tanslation"

      “Tanslation” has been changed to “Translation”.

      Line 410 and 416: Move interquartile ranges from line 416 to line 410 as this is the first occurrence of the IQR abbreviation.

      “Interquartile ranges” has been moved from line 416 to line 410.

      Line 473: "Almost linear". This is a subjective expression, please provide some measure such as the R2 value to quantitatively evaluate linearity in this strain.

      We have added a measure of the deviation from linearity in the text “, 15 % deviation from the OLS linear regression shown in Figure 1F”.

      Line 547: Is there a reason to stress in this experiment that the AREA of the fluorescence signal was measured as the area indicates the total fluorescence intensity?

      “area of the” has been replaced by “total” so the sentence refers to the total fluorescence intensity signal of Sytox Green.

      Fig1A: The schematic mentions peptides, shouldn't it be more accurate to use "polypeptides" or "proteins" when discussing protein synthesis?

      “Peptides” has been changed to “Polypeptides”

      Fig 5G: Y axis scale has a typo in the word transcription.

      “Trancription” has been changed to “Transcription”

      CROSS-CONSULTATION COMMENTS

      I also agree with the points raised by the colleagues. There will always be some technical or interpretation issues related to every experimental technique, every model system and every mutant strain used. I believe after addressing these limitations as pointed out in the reviews, most of those issues have been clarified for the readers.

      Reviewer #4 (Significance (Required)):

      Basier and Nurse revisit the classic question regarding growth and cell size control by examining scaling of global translation and transcription in fission yeast. Knowing how cells alter their transcription and translation has important consequences in cellular functions during proliferation and cellular aging and is of broad general interest. The main driver for this current work is that previous experiments both in fission yeast and other model organisms have yielded conflicting results, possibly due to different cell cycle synchronization methods. The strength of the paper is indeed in the single cell analyses of well defined yeast strains which allow accurate assessment of the cell cycle dependent changes and accurate measurements of cell size using the cell length.

      Reassuringly, the single cell analyses from unperturbed yeast cells resemble those recently obtained from unperturbed growth of individual mammalian cells. The main conclusion that transcription is not limiting translation, and consequently not limiting growth of the cells, is interesting as it is not consistent with some of the prevailing ideas in the cell size field. These ideas include ploidy dependent gene expression where DNA content is thought to be limiting growth or the model for minimal gene expression which assumes RNA polymerases are limiting gene expression and growth. In this regard, this manuscript provides important insights for future thinking of how growth is controlled.

      keywords: cell cycle, cell size control

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements

      We thank the reviewers for their critical analysis of our manuscript. We have addressed all reviewer concerns and questions in our revised version. Along with other improvements requested by the reviewers, we added an MTT assay to validate our flow cytometry assays, normalized binding to surface area to better compare toxin binding between Leishmania and HeLa cells, and revised the discussion. We believe the revised contribution provides important novel insights into membrane integrity in a non-standard organism that will appeal to a broad audience.

      Reviewer comments below are in italics.

      Point-by-point description of the revisions

      Reviewer 1

      *Major Comments. The experimental work has been carried out carefully, including multiple biological replicates, convincing statistical analysis. Data presentation is extensive, including 6 supplementary figures. It is likely that the experiments could be reproduced by others, as the approaches do not seem to be especially difficult, and the methods are well documented. *

      We thank the reviewer for this assessment.

      *My major comment regarding revision is that this paper is quite long and extensive given the relatively restricted body of experiments and discrete conclusions. The principal discovery is that sphingolipids protect Leishmania parasites against somewhat artificial treatment with bacterial sterol-binding pore forming toxins, but they do not do so by obstructing toxin binding to sterols. A similar effect is seen for the antileishmanial drug amphotericin B, the most important agent studied. No further mechanistic insights are provided regarding the process whereby sphingolipids blunt toxicity of either the CDCs or amphotericin B. In addition, the experimental approach relies largely upon one methodology, dose-response curves. A report with such highly focused scope should be presentable with considerably more economy. In particular, the Discussion is long and diffuse, obscuring the presentation of the major conclusions. It could probably be cut in half and would in the process present the major deliverables of the paper with higher impact. *

      We have condensed the discussion as requested, and to address Reviewer 2’s concerns, we provided a summary articulating the significance.

      Significance

      *The most notable advance is the observation that sphingolipids protect Leishmania parasites from the cytotoxic activity of the first line antileishmanial drug amphotericin B that binds to the major sterol in the parasite plasma membrane, ergosterol, and induces pore formation. This discovery suggests that parallel treatments with agents that selectively reduce sphingolipid levels in the parasite might act synergistically with amphotericin B, potentially allowing treatment with lower doses of this inherently toxic drug. This work will likely be of most interest to those with a focus on pharmacology and drug development for this and related parasites, but it will also be of some interest to those working on the basic biochemistry of these organisms. The senior authors are major workers in sphingolipid biochemistry in Leishmania parasites and thus are well positioned to address the relevant background in the field, much of which has come out of their laboratories.

      The major limitation of this study is its relatively circumscribed scope, resulting in one principal conclusion: Leishmania sphingolipids blunt the potency of toxins or drugs that target sterols for pore formation, but they do not do so by impairing binding of these agents to sterols, as they do in mammalian cells. The work would be of higher impact if it addressed mechanistically how sphingolipids do decrease toxicity, e.g., do they prevent these agents from oligomerizing or from intercalating into the membrane to form pores. Such studies would require the application of an expanded repertoire of experimental methodologies going beyond the measurement of dose-response curves with various mutants and drugs.*

      We agree with the reviewer that next steps include determining if Leishmania sphingolipids interfere with oligomerization or pore-insertion. One challenge is that these tools need to first be validated in Leishmania.

      To address the reviewer concern about the limited range of experimental methodologies, we added an MTT assay (Supplementary Fig S2E) as validation of our flow cytometry assays. We have better summarized the significance and broad impact of our work in lines 466-476.

      Reviewer 2

      *In the abstract the authors describe that the pore-forming toxins engage with ceramide and other lipids and while it's clear that the levels of sphingolipids are important for the effect of these toxins there is limited evidence to show they physically interact as the word engage suggests. *

      We agree with the reviewer that we do not show physical interaction. In the abstract, we are careful to only use the word “engage” in association with our proposed model. Our proposed model both explains our data, and uses those data to open new horizons by making falsifiable predictions that can be tested in the future. Direct engagement of toxins with lipids is one such prediction. For these reasons, we prefer to retain the word “engage” in the abstract.

      *The authors conclude that the ergosterol on the Leishmania cell membrane is less accessible to the CDCs as it does not bind as much CDCs as a HeLa cell. What is the relative abundance of sterols in the HeLa membrane in comparison to a Leishmania cell. A HeLa cell is much bigger than a Leishmania cell and will therefore be able to bind a lot more CDC, was the MFI normalised for cell size? This would be important to know as the difference in intensity may be purely related to the difference in cell size. *

      We thank the reviewer for this insight. We had not normalized MFI by cell surface area. We added MFI normalized to cell size (described on lines 573-577) and found that when area was accounted for, the promastigotes bound more toxin than HeLa cells. These data are now included as Supplementary Fig S1A, and discussed on lines 187-189.
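
      The effect of this normalization can be illustrated with a short sketch. It assumes, purely for illustration, that a HeLa cell is approximated as a sphere and a promastigote cell body as a prolate spheroid; the dimensions and MFI values below are hypothetical and are not the estimates described in the manuscript.

```python
import math


def sphere_area(radius_um):
    """Surface area of a sphere in square micrometres."""
    return 4.0 * math.pi * radius_um ** 2


def prolate_spheroid_area(a_um, c_um):
    """Surface area of a prolate spheroid with equatorial radius a and polar semi-axis c > a."""
    e = math.sqrt(1.0 - (a_um / c_um) ** 2)  # eccentricity
    return 2.0 * math.pi * a_um ** 2 * (1.0 + (c_um / (a_um * e)) * math.asin(e))


def mfi_per_area(mfi, area_um2):
    """Median fluorescence intensity normalized to estimated surface area."""
    return mfi / area_um2


# Hypothetical dimensions and MFI values, chosen only to illustrate the calculation.
hela_area = sphere_area(10.0)
promastigote_area = prolate_spheroid_area(1.0, 5.0)
hela_mfi, promastigote_mfi = 5000.0, 400.0

hela_norm = mfi_per_area(hela_mfi, hela_area)
promastigote_norm = mfi_per_area(promastigote_mfi, promastigote_area)
print(f"HeLa: {hela_norm:.2f} per µm², promastigote: {promastigote_norm:.2f} per µm²")
```

      With these assumed numbers the raw MFI is lower for the smaller cell, but the signal per unit of membrane area is higher, which is the kind of reversal normalization can reveal.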

      *The authors are keen to argue that ceramide is important for differences between PFO and SLO action, as the inhibitor has a much greater effect on the PFO treatment of ipcs- cells than SLO, as ceramide will accumulate in these cells. But for the SLO analysis they stated that the treatment of spt2- with myriocin caused no change in the LC50, as the target of myriocin was spt2, while they noted there was a drop in the LC50 with PFO. Based on this I think the importance of ceramide is being overstated here, as spt2- cells have little ceramide in them. Moreover the authors also suggest that changes to the lipid environment rather than a single species might be important. Are there alternative targets the myriocin might inhibit when there is no spt2-, it is intriguing that there is a decrease in LC50 for PFO on spt2- myriocin treated cells. *

      Clearly, IPC is very important for determining the cytotoxicity for the CDCs in Leishmania but I think the evidence for the role of ceramide and the sensing of it is less clear cut and the strength of the conclusions about this should be modified. In the results the authors conclude that the L3 loop is sensing ceramide and the data shows that the L3 loop is important but in the discussion they are more circumspect about the moieties L3 can detect. The authors should qualify these conclusions in the results a bit more.

      As requested by the reviewer, we have qualified our statements in the results, lines 282, 297, 315.

      *Minor comments *

      *It would be helpful for the review process to include line and page numbers to highlight areas that I have concerns about. *

      We agree with the reviewer and have added line numbers.

      *In the first paragraph of the results is there a reference for the spt2- cell line that was used here. *

      We have added the Zhang 2003 reference to the first paragraph of the results, line 161.

      *In the second paragraph there is a disconnect between the statements about the phenotype of the ipcs- cells and the reference/evidence for it. *

      We have added the reference to the earlier mention of the ipcs cells, and in the introduction, lines 118-120 and 167-169.

      *On many of the graphs the letters a, b, c are alongside many of the symbols but it was unclear what they represented. *

      The letters represent statistically distinct groups. These are used instead of stars and bars to reduce clutter on the figure. We have now explained the difference in the first figure legend in which they are used, lines 818-823.

      *The colour scheme for figure 4 was confusing - yellow diamonds in A/B are spt2-/+spt2 but in C/D are iscl-, this makes it hard to compare between them. *

      We have changed the color and symbols for the iscl- mutant in Fig 4 and Fig S6.

      *The methodology states that various tests were used to define whether differences were significant but it was not clear from the figures when these were being applied only a few graphs had '*' associated with them. *

      We have clarified this in the figure legends.

      *There is no overall conclusion to the study at the end of the discussion just a series of limitations of the study, which is good to acknowledge but feels an odd way to finish the manuscript. *

      We have revised the discussion in response to Reviewer 1, and included a summary to tie everything together, lines 466-476.

      *Significance: *

      Overall this is a strong manuscript with a set of experiments that have a clear strategy and purpose that was well written. This paper outlines the importance of the lipid composition for the cytotoxicity of both sterol specific toxins and amphotericin B in Leishmania, which will have significant implications for their study for other pathogens but also for the development of combination therapies to enhance the potency of amphotericin B, as such I think this will be of interest to both researchers interested in drug discovery and those interested in lipid metabolism.

      We thank the reviewer for this assessment.

      Reviewer 3

      Major comments: 1) The idea that sphingolipids do not block toxin access relies on the work of CDC-based probes binding the accessible pool of cholesterol in mammalian membranes. The authors make the observation that ergosterol is not shielded by sphingolipids because the presence of them does not prevent CDC binding. Is it possible to show that Leishmania sphingolipids are able to actually sequester ergosterol or would it all be considered free and available to toxin binding?

      Our interpretation of the binding data is that the Leishmania sphingolipids fail to sequester ergosterol from toxins, so ergosterol accessibility is independent of sphingolipids. Similar to mammalian cells, there could be an “essential” pool of ergosterol bound to other proteins/lipids that is inaccessible to toxins. However, detecting that pool is technically challenging.

      We have revised the manuscript to clarify this, lines 454-456.

      *2) The statistical analyses applied to each experiment, while defined in the figure legends, are presented mostly using uncommon methods of presentation, making it difficult to determine if the correct analysis was applied.*

      We have clarified the statistics and use of letters. The letters represent statistically distinct groups. These are used instead of stars and bars to reduce clutter on the figure. We have now explained the difference in the first figure legend in which they are used, lines 818-823.

      *3) The binding of these toxins to Leishmania cells appears to be independent of their lipid composition, but Figure 1A-D suggests that these toxins do not bind very well to Leishmania; a ~65-fold increase in toxin added only results in a maximal 3-fold change in amount of toxin bound. Therefore, the authors need to demonstrate that this increase in binding is not simply the result of adding more µg of each CDC.*

      Leishmania are smaller than HeLa cells, which accounts for the apparent reduced binding. We added Supplementary Fig S1A, which normalized MFI to estimated surface area. When normalized to surface area, Leishmania bound to toxin better than HeLa cells. We further note that the dose-dependent increase in cytotoxicity argues against non-specificity of increased toxin.

      *4) The authors use HeLa cells to compare the ability of these toxins to bind to sterol-containing membranes, but it is unclear how a mammalian cell line, which lacks ergosterol, can inform upon the differences in binding to Leishmania membranes when their data shows almost no cholesterol is found in the Leishmania membrane. The use of HeLa cells to compare the toxicity of these CDCs is simply a control experiment for the lytic activity of these proteins, and should not be used as a direct comparison of their LC50s, as a mammalian plasma membrane lipid composition is significantly different from that of Leishmania. If the authors want to use HeLa cells as a direct comparison to show that sphingolipids in mammalian cells also protect them from CDC pore formation, they must demonstrate that HeLa cells which have genetic defects in sphingolipid biology, or which have been treated with sphingomyelinases, are more sensitive to these CDCs.*

      We agree with the reviewer that to argue sphingolipids in mammalian cells are protective would require additional data beyond the scope of this manuscript. We are not making any statements about the role of sphingolipids in mammalian cells, which have a controversial role in CDC damage and membrane repair (see e.g. Schoenauer et al 2019. PMID: 29979630). Since the head group of sphingomyelin interacts with cholesterol (Endapally et al 2019), but the IPC head group is not expected to interact similarly with ergosterol, we choose to remain focused on Leishmania sphingolipids.

      Given our focus on Leishmania, why include HeLa cells at all? We think including HeLa cells provides an important and relevant point of reference because there are situations where both human cells and Leishmania promastigotes could encounter pore-forming toxins. This comparison provides insight into the following question: “In a mix of promastigotes and human cells (for example during a blood meal), which cells would die first from the bacterial PFT?” Comparing cytotoxicity to HeLa cells provides a point of reference in judging how cytotoxic CDCs are to Leishmania promastigotes, and how sensitive the spt- promastigotes become.

      We have rephrased the manuscript (lines 208-209) to better clarify that HeLa cells are a reference point so readers can evaluate the relative sensitivity of sphingolipid-deficient promastigotes.

      *5) The authors need to demonstrate that the mutant cholesterol recognition motif (CRM) and the glycan binding mutant proteins can still bind to both Leishmania and HeLa cell membranes to serve as controls for their lack of lytic activities. Without this, they cannot conclude that "Leishmania membranes engage the same binding determinants used by CDCs to target mammalian cells".*

      The glycan binding and ΔCRM mutants are unable to bind to HeLa cells. These toxin mutations were previously characterized (Mozola & Caparon, 2015 and Farrand et al 2010), showing that their defect lies in binding to cells, but not oligomerization or pore-formation. Since their defect lies solely in binding, if these toxins were able to bind to spt2- cells, they would kill the spt2- cells. This enables us to use these toxin mutants to ask if the CRM or glycan-binding is essential for toxin binding to Leishmania. Since the only defect in these mutant toxins is binding (either to glycans or cholesterol), the failure of these mutants to kill allows us to conclude that both of these binding surfaces on the toxin are essential for cytotoxicity in L. major.

      We have clarified the manuscript, lines 236-240.

      *Minor comments: 6) Multiple figures lack adequately defined axes. Examples include, but are not limited to: Figure 1A-D, where the X-axis is plotted on a base-2 logarithmic scale but this is not defined, and Figure 2, where the Y-axis is plotted on a base-10 logarithmic scale but this is not defined.*

      We have updated the figure legends to indicate where log axes are used.

      7) The authors state that "Promastigotes with inactivated de novo sphingomyelin synthesis has a significant increase in total sterols" in reference to Figure 1E. Not only is there no significance indicated for the spt2-/-, the authors only indicate a significance point for the Myr (not yet defined) + WT sample in "Other sterols".

      We have rephrased this to indicate a trend, line 181.

      8) The authors use increases in membrane permeability as a read out for specific lysis using PI uptake, however, they then refer to this read out as killing of Leishmania, without measuring the viability of these cells. Therefore, the authors should provide additional experiments that demonstrate the death of the different Leishmania strains treated with the cytolysins.

      As requested, we have now provided an additional experiment to validate Leishmania death. We added an MTT assay as Fig S2E and discuss it in the Results, lines 202-205.

      9) It is not clear how the authors calculated their LC50 values in Figure 2. According to the figure legends, the authors used HU/ml ranges that would be sublethal, or would not completely lyse, most of the Leishmania strains tested. It is not clear from the data presented in Figure 2 that the correct LC50 calculations were used, as none of the specific lysis curves reach saturation at the concentrations presented, and one does not even reach 50% lysis.

      We thank the reviewer for catching this discrepancy. The legend in Fig 2 did not include the correct ranges of toxin dose used for PFO. We have corrected the legend to indicate the toxin range used. To calculate LC50, we used linear regression on the linear portion of the death curve to determine the concentration at 50% lysis. This gives us a way to determine LC50 even without the use of very large (and costly) amounts of toxin to get extensive saturation on the kill curve.
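      One way to read this calculation, as a minimal sketch rather than the actual analysis code: fit an ordinary least-squares line to the points on the roughly linear part of the dose-response curve and solve for the dose at 50% lysis. The 20-80% window used to pick the linear portion, the function names, and the example values are all assumptions for illustration; the response does not state how the linear portion was chosen or whether dose was log-transformed before fitting.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("need at least two distinct doses")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lc50_linear(doses, lysis_pct, low=20.0, high=80.0):
    """Estimate the dose giving 50% lysis from the linear part of the curve.

    `low` and `high` bound the window treated as linear (hypothetical
    defaults, not taken from the manuscript).
    """
    window = [(d, l) for d, l in zip(doses, lysis_pct) if low <= l <= high]
    if len(window) < 2:
        raise ValueError("fewer than two points fall in the linear window")
    slope, intercept = fit_line([d for d, _ in window], [l for _, l in window])
    if slope <= 0:
        raise ValueError("lysis does not increase with dose in the window")
    # Solve 50 = slope * dose + intercept for dose.
    return (50.0 - intercept) / slope


# Illustrative placeholder values only, not data from the study.
example_doses = [100, 200, 300, 400, 500]   # toxin dose, arbitrary units
example_lysis = [5, 25, 45, 65, 85]         # % specific lysis
example_lc50 = lc50_linear(example_doses, example_lysis)
```

      Because the estimate uses only the points inside the window, it does not require the curve to reach saturation. In this sketch, if the highest dose in the window gives less than 50% lysis, the returned value is an extrapolation beyond the measured range and should be treated with corresponding caution.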

      *10) Figure 4 and Figure S6 are very difficult to interpret. Figure S6 would benefit by breaking up each graph into multiple graphs that would allow the reader to see more of the curves individually. Additionally, there are multiple conditions where it appears that a different number of experiments (2-4 total) were performed but statistical analysis was applied to these data.*

      We updated the labels on Fig 4 for improved readability. We broke Fig S6 up into multiple graphs. We have removed unpaired data (e.g., the n of 4 noted by the reviewer), and re-checked our statistics. This change did not alter our conclusions. The apparent n of 2 resulted from overlapping data points due to insufficient jitter. We have increased the jitter on the data points to make all three replicates more distinct.

      *11) The authors state "In contrast to myriocin-treated ipcs- L. major, which contain low levels of ceramide, myriocin treated iscl- L. major contain low levels of IPC" but do not provide a reference or point to data to support this claim.*

      We have qualified these statements to say ‘are expected to’ on lines 306-307.

      *12) Figure 5E would benefit in presentation by being broken up into 4 separate graphs based on the toxin used, as it is difficult to determine which data points are being compared.*

      We compare by toxin used in Fig 5A-D. The purpose of Fig 5E is to compare between toxins. We included all of the data points (including resistant control strains) for completeness. The main focus is the spt2- and ipcs- parts of Fig 5E.

      *13) The authors state that "myriocin did not inhibit growth more than 25% promastigotes at 10 μM" but this data is not presented.*

      We have now added these data as Fig 6A.

      14) Multiple graphs lack legends or have axes that are not defined.

      In order to improve readability and avoid cluttering the figures, where the legends and axes are the same across multiple graphs, they are included only once for a given row and/or column.

      *Significance:*

      *Overall, the experiments presented were conducted to analyze each question, but many of the results are observational, without considering the impact of altered lipid species on the findings. The data suggest the existence of a mechanism protecting the parasite from CDCs, but it is unclear how these findings inform the CDC or Leishmania fields. CDCs are known to target sterols within membranes, and altered local membrane environments can have substantial impacts on CDC binding. This work suggests that the altered lipid species of Leishmania membranes, compared to a mammalian membrane, could dramatically affect the sequestering power of sphingolipids or other lipids, and thus change how CDCs bind to them. This work is likely to have a specialized audience of Leishmania researchers looking at the dynamics of their membranes.*

      We believe this work will be valuable to a broad audience because it will be of interest to researchers studying membranes in general, pathogenic eukaryotes and pore-forming toxins. Most membrane biology work is done either in opisthokonts or in model liposomes, so there are few studies on biomembranes in other taxonomic groups, including many different human pathogens. We provide a blueprint for examining the membranes of non-standard organisms, establish L. major as a pathogenically relevant model system, and report on key differences in sterol sequestration compared to mammalian cells. These findings provide important perspectives for the generalization of biomembranes, especially when compared to prior work in opisthokonts.

      We have clarified our significance in lines 466-476.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper tests whether people vary their reliance on episodic memory vs. incremental learning as a function of the uncertainty of the environment. The authors posit that higher uncertainty environments should lead to more reliance on episodic memory, and they find evidence for this effect across several kinds of analyses and across two independent samples.

      The paper is beautifully written and motivated, and the results and figures are clear and compelling. The replication in an independent sample is especially useful. I think this will be an important paper of interest to a broad group of learning, memory, and decision-making researchers. I have only two points of concern about the interpretation of the results:

      1) My main concern regards the indirect indicator of participants' use of episodic memory on a given trial. The authors assume that episodic memory is used if the value of the chosen object (as determined by its value the last time it was presented) does not match the current value of the deck it is presented in. They find that these mismatch choices happen more often in the high-volatility environment. But if participants simply choose in a more noisy/exploratory way in the high volatility environment, I believe that would also result in more mismatched judgments. What proportion of the trials labeled as episodic should we expect to be a result of noise or exploration? It seems conceivable that a judgment to explore could take longer, and result in the observed RT effects. Perhaps it could be useful to match up putative episodic trials with later recognition memory for those particular items. The across-subjects correlations are an indirect version of this, but could potentially be subject to a related concern if participants who explore more (and are then judged as more episodic) also simply have a better memory.

      Thank you for this important suggestion. We agree that noisy/exploratory choices could potentially masquerade as episodic on the episodic-based choice index used as one of our behavioral measures. As pointed out, this is because participants may be more likely to make noisier incremental value-based decisions in the high volatility compared to the low volatility environment. In our revision, we provided a new analysis that shows that, as the reviewer predicted, choices are indeed more noisy in the high volatility environment. We answer this concern in two ways. First, we took this noise into account in our analysis of the episodic/incremental tradeoff and show that it does not account for the main findings. And second, we provided a new analysis of subsequent memory that shows that choices that are defined as episodic during the decision-task are also associated with better recognition memory later on. These new analyses are described below as well.

      We used a mixed-effects logistic regression model to test for an interaction effect of environment and model-estimated deck value on whether the orange deck was chosen. We fit this model only to trials without the presence of a previously seen object in order to achieve a more accurate measure of noise specific to incremental learning. In both the main and replication samples, participants did indeed make noisier incremental decisions in the high compared to the low volatility environment (Main: 𝛽 = −1.589, 95% 𝐶𝐼 = [−2.091, −1.096], Replication: 𝛽 = −1.255, 95% 𝐶𝐼 = [−1.824, −0.675]). To account for the possibility that the measured difference between environments in our episodic-based choice index may be related to this difference in incremental noise between the environments, we included each participant’s random effect of the environment by deck value interaction from this model as a covariate in our analysis of the effect of environment on the episodic-based choice index. While each participant’s propensity to choose with greater noise in the high volatility environment did have an effect on the episodic-based choice index (Main: 𝛽 = 0.042, 95% 𝐶𝐼 = [0.012, 0.072], Replication: 𝛽 = 0.055, 95% 𝐶𝐼 = [0.027, 0.082]), the effect of environment was similar to that originally reported in the manuscript for both samples following this adjustment. The reported effects (line 178 and Appendix 1) and methods (lines 643-655) have been updated to reflect these changes.
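      For orientation, a model of the kind described in this paragraph could be written roughly as below. This is a sketch under assumptions: the symbols, the coding of environment ($E = 0$ for low and $E = 1$ for high volatility), and the full random-effects structure are not specified in this response and are chosen here only for illustration.

```latex
\operatorname{logit} P(\text{choose orange}_{s,t})
  = (\beta_0 + b_{0,s})
  + (\beta_V + b_{V,s})\, V_{s,t}
  + (\beta_E + b_{E,s})\, E_{s,t}
  + (\beta_{V \times E} + b_{V \times E,s})\, V_{s,t}\, E_{s,t}
```

      Here $V_{s,t}$ is the model-estimated value of the orange deck for participant $s$ on trial $t$, and the $b_{\cdot,s}$ terms are participant-level random effects. Under this coding, a negative $\beta_{V \times E}$ means that choices track deck value less steeply in the high volatility environment, which is one way to read "noisier" and presumably corresponds to the negative estimates reported above. The participant-specific term $b_{V \times E,s}$ would then be the quantity carried forward as a covariate.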

      We applied a similar logic to the reaction time analysis, to address the possibility that decisions based on exploration may take longer compared to decisions based on exploitation of learned deck value. We included a covariate in the analysis of the effect of episodic-based choices on reaction time that captured possible slowing due to switching from choosing one deck to the other (lines 656-662) and found that the slower reaction times on episodic choices are not fully explained by exploration. Because in this task a decision to explore is captured by switching from one deck to another, the effect of episodic-based choices on reaction time reported in the manuscript should account for this behavior. We have clarified this reasoning in the methods (lines 661-662).

      Finally, thank you for the idea to sort objects in the recognition memory test by whether they were from episodic- or incremental-based choice trials to provide a further test of whether our approach for sorting episodic decisions withstands an independent test. We performed this analysis and found that, in both samples, participants had better memory for objects from episodic-based choice trials. This result provides further support for the putative episodic nature of these trials and is now reported in the Results (lines 300-304 and Appendix 1), Methods (lines 737-742) and appears as a new panel in Figure 5 (Figure 5A).
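      The trial sorting that these analyses depend on can be sketched as follows. This is a simplified illustration of the mismatch logic described in the reviewer's comment, not the analysis code used in the study: the field names, the 0.5 value threshold, and the example trials are hypothetical.

```python
def is_incongruent(trial, threshold=0.5):
    """True when the old object's remembered value and its deck disagree."""
    object_says_choose = trial["old_object_value"] >= threshold
    deck_says_choose = trial["old_object_deck_is_lucky"]
    return object_says_choose != deck_says_choose


def is_episodic_choice(trial, threshold=0.5):
    """True when the choice follows the old object's value, not its deck."""
    object_says_choose = trial["old_object_value"] >= threshold
    return trial["chose_old_object"] == object_says_choose


def episodic_choice_index(trials, threshold=0.5):
    """Fraction of incongruent trials on which the choice was episodic."""
    incongruent = [t for t in trials if is_incongruent(t, threshold)]
    if not incongruent:
        return None  # index is undefined without incongruent trials
    episodic = sum(is_episodic_choice(t, threshold) for t in incongruent)
    return episodic / len(incongruent)


# Hypothetical trials, each featuring one previously seen ("old") object.
example_trials = [
    # high-value object on the currently unlucky deck, chosen: episodic
    {"old_object_value": 1.0, "old_object_deck_is_lucky": False,
     "chose_old_object": True},
    # low-value object on the lucky deck, chosen: follows the deck
    {"old_object_value": 0.0, "old_object_deck_is_lucky": True,
     "chose_old_object": True},
    # low-value object on the lucky deck, avoided: episodic
    {"old_object_value": 0.2, "old_object_deck_is_lucky": True,
     "chose_old_object": False},
    # congruent trial (high value, lucky deck): excluded from the index
    {"old_object_value": 0.8, "old_object_deck_is_lucky": True,
     "chose_old_object": True},
]
example_index = episodic_choice_index(example_trials)
```

      In this sketch, a participant who chose at random on incongruent trials would score near 0.5 on the index, which illustrates why choice noise has to be accounted for separately when comparing environments.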

      2) The paper is framed as tapping into a trade-off between the use of episodic memory vs. incremental learning, but it is not clear why participants would not use episodic memory in this particular task setup whenever it is available to them. The authors mention that there is "computational expense" to episodic memory, but retrieval of an already-established strong episodic memory could be quite effortless and even automatic. Why not always use it, since it is guaranteed in this task to be a better source of information for the decision? If it is true that RT is higher when using episodic memory, that is helpful toward establishing the trade-off, so this links to the concern above about how confident we can be about the use of episodic memory in particular trials.

      Thank you for raising this important point and for giving us the opportunity to clarify. We now address this point in two ways: first, we provide a new analysis of episodic memory and choice behavior, and second, we address this point explicitly in the discussion.

      As now emphasized in the paper (lines 118-122 and lines 384-388), in this task, it is true that an observer with perfect episodic memory should always make use of it whenever available (i.e. on trials featuring previously seen objects). However, human memory is fallible and resource-limited, and we find that participants with less reliable episodic memory overall actually relied less on this strategy and more on incremental learning throughout the task (Figure 5C and 5D). In other words, there is also noise and uncertainty in the episodic memory trace. While it is not the main focus of our study, the noise in episodic memory is indeed another reason why trading off between episodic memory and incremental learning is advantageous for behavior. We further agree that while the RT effects show that, relative to using incremental value, episodic memory retrieval takes longer, we cannot make strong statements about effort or “computational expense” per se from our data. Accordingly, we have removed the “computational expense” phrase (line 491), as well as our suggestion that episodic retrieval is “perhaps more effortful overall” (line 181), from the paper.

      Reviewer #2 (Public Review):

      This manuscript addresses the broad question of when humans use different learning and memory systems in the service of decision-making. Previous studies have shown that, even in tasks that can be performed well using incremental trial-and-error learning, choices can sometimes be based on memories of individual past episodes. This manuscript asks what determines the balance between incremental learning and episodic memory, and specifically tests the idea that the uncertainty associated with each alters the balance between them in a rational way. Using a task that can separate the influence of incremental learning and episodic memory on choice in two large online samples, several lines of evidence supporting this hypothesis are reported. People are more likely to rely on episodic memory in more volatile environments when incremental learning is more uncertain and during periods of increased uncertainty within a given environment. Individuals with more accurate episodic memories are also more likely to rely on episodic memory and less likely to rely on incremental learning. These data are compelling, even more so because all of the main findings are directly replicated in a second sample. These data extend the notion of uncertainty-based arbitration between different forms of learning/memory, which has been proposed and evaluated in other contexts, to the case of episodic memory versus incremental learning.

      The weaknesses in the paper are mostly minor. One potential weakness is the nature of the online sample. Many participants apparently did not respond to the volatility manipulation, making it impossible to test whether this altered their choices. It is unclear whether this is a feature of online samples (where people can be distracted, unmotivated, etc.) or of human performance more generally.

      Thank you for your comments. Indeed, we also found it interesting that many participants were insensitive to the manipulation of volatility in our study, as assessed and filtered based on the initial deck learning task. As you note, our study is not positioned to determine the cause and whether this is due to the online population or human performance more generally, and we added a discussion of this point to the paper (lines 477-485). Also, fractions of apparently inattentive participants exceeding 1/3 are very much the norm in our experience with other online studies across many tasks. While there is much to say about the implications of this (see e.g. Zorowitz, Niv & Bennett PsyArXiv 2021), our basic philosophy (which we follow here) is that it is best practice, and conservative, to exclude aggressively so as to focus analyses on those participants for whom the experimental questions can meaningfully be asked.

      Reviewer #3 (Public Review):

      The purpose of this work is to test the hypothesis that uncertainty modulates the relative contributions of episodic and incremental learning to decisions. The authors test this using a "deck learning and card memory task" featuring a 2-alternative forced choice between two cards, each showing a color and an object. The cards are drawn from different colored decks with different average values that stochastically reverse with fixed volatility, and also feature objects that can be unfamiliar or familiar. Objects are not shown more than twice, and familiar objects have the same value as they did when shown previously. This allows the authors to construct an index of episodic contributions to decision-making: in cases where the previous value of the object is incongruous with the incrementally observed value, the subject's choice reveals which strategy they are relying on.

      The key manipulation is to introduce high- and low-volatility conditions, as high volatility has been shown to induce uncertainty in incremental learning by causing subjects to adopt an optimal low learning rate. The authors find that the subjects show a higher episodic choice index in the high-volatility condition, and in particular immediately after reversals when the model predicts uncertainty is at a maximum. The authors also construct a trial-wise index of uncertainty and show that episodic index correlates with this measure. The authors also find that at the subject level, the overall episodic choice index correlates with the ability to accurately identify familiar objects, and reason that this indicates that higher certainty in episodic memory predicts the usage of episodic strategies. The authors replicate all of their findings in a second subject population.

      This is a very interesting study with compelling results on an important topic. The task design was a clever way to disentangle and measure different learning strategies, which could be adopted by others seeking to further understand the contributions of different strategies to decision-making and its neural underpinnings. The article is also very clearly written and the results clearly communicated.

      A number of questions remain regarding the interpretation of the results that I think would be addressed with further analysis and modeling.

      At a conceptual level, I was unsure about the equivalence drawn between volatility and uncertainty: the main experiments and analyses all regard reversals and comparisons of volatility conditions, but the conclusions are more broadly about uncertainty. Volatility, as the authors note, is only one way to induce uncertainty. It also doesn't seem like the most obvious way to intervene on uncertainty (eg manipulated trial-wise variance seems more obvious). The trial-wise relative uncertainty measurements in Fig 4 speak a bit more to the question of uncertainty more generally, but these were not the main focus and also do not disambiguate between trial-wise uncertainty derived from reversals versus within block variation.

      Thank you for your comments. We agree that this distinction was unclear and appreciate the opportunity to clarify. We hope the manuscript is now clear about the conceptual distinction between uncertainty as the construct of theoretical interest vs. volatility as the operational manipulation being used to access it. We have adjusted the presentation and added discussion to clarify this, and also enhanced the trial-wise analyses to strengthen the interpretation of results in terms of uncertainty more generally. Regarding obviousness, we think perhaps there is a difference between areas of study on this point. While trial-wise outcome variance (which we call stochasticity) has been widely used to manipulate uncertainty in perceptual and sensorimotor studies, it has been more rarely manipulated in reward learning studies, where instead the volatility manipulation we use has predominated. We have a recent paper reviewing examples of both and arguing that the field has underemphasized the importance of stochasticity, so we are sympathetic here (Piray and Daw, Nature Communications 2021).

      In any case, to address these points on revision, we have reframed the first section of the results, where we look at effects of environment on episodic-based choice, to focus primarily on volatility. Specifically, we have expanded on our explanation of how volatility induces uncertainty, changed the subtitle of the section from ‘uncertainty’ to ‘volatility’, and have specified that the prediction in this section is primarily about volatility (lines 97 and 116-123). We also reframed the second section of the results to be primarily about the uncertainty induced by volatility: while differences between the environments capture coarse effects of volatility, trial-wise uncertainty should be present following reversals across both environments. We have now focused our explanation in this section on trial-wise uncertainty within the environments rather than volatility between the environments (lines 184-192). Further, we agree that there are other sources of uncertainty besides volatility that we did not manipulate in the paper, and that it remains for future work whether their manipulation would produce similar results. To address this, we have added a new paragraph to the discussion covering these alternative sources and further qualifying the scope of our conclusions (lines 434-446).

      We also agree that our analyses in Figure 4 did not yet speak to differences in episodic-based choice that may arise due to blockwise volatility (as captured by the categorical effect of environment) vs. trial-to-trial fluctuations in uncertainty (as captured by relative uncertainty, over and above the blockwise effect). We have addressed this by adding an additional, separate effect of the interaction between environment and episodic value to our combined choice models which is explained in more detail in the recommendations for the authors portion of our response. These changes and results are described in the Methods (lines 686-694) and Results (lines 276-277; Figure 4C).

      Another key question I had about design choice was the decision to use binary rather than drifting values. Because of this, the subjects could be inferring context rather than continuously incrementing value estimates (eg Gershman et al 2012, Akam et al 2015): the subjects could be inferring which context they are in rather than tracking the instantaneous value + uncertainty. I am not sure this would qualitatively affect the results, as volatility would also affect context confidence, but it is a rather different interpretation and could invoke different quantitative predictions. And it might also have some qualitative bearing on results: the subjects have expectations about how long they will stay in a particular environment, and they might start anticipating a context change after a certain amount of time which would lead to an increase in uncertainty not just immediately after switches, but also after having stayed in the environment for a long period of time. Moreover, depending on the variance within context, there may be little uncertainty following context shifts.

      Thank you for raising this important point. To address the possibility that the task structure could have encouraged participants to infer context rather than engage in incremental learning, we added an alternative contextual inference (CI) model, based on a hidden Markov model with two hidden states (e.g. that either the red deck is lucky and the blue deck unlucky or vice versa). This model is now described in the Results of the main text (lines 226-228), listed in the Methods (line 674), and explained in detail in Appendix 3 alongside the computational models of incremental learning. Following model comparison, we found that this model provided a worse fit than the incremental learning models we previously presented in both samples, suggesting that incremental learning is a better descriptor of participants’ choices in this task than contextual inference. The results of this comparison are reflected in an updated Figure 3A.
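      A minimal sketch of what a two-state contextual inference model of this kind might look like is given below. It assumes binary good/bad outcomes, a fixed reversal probability, and symmetric outcome likelihoods. These simplifications, along with all names and parameter values, are assumptions made here for illustration; the model actually fitted is the one described in the manuscript's Appendix 3.

```python
def update_belief(p_red_lucky, chose_red, good_outcome,
                  p_good_if_lucky=0.8, hazard=0.1):
    """One forward-filtering step of a two-state hidden Markov model.

    p_red_lucky: current belief that the state is "red lucky, blue unlucky".
    Returns the belief after seeing the outcome and allowing for a reversal.
    `p_good_if_lucky` and `hazard` are hypothetical parameter values.
    """
    # Probability of a good outcome from the chosen deck under each state.
    p_good_state_red = p_good_if_lucky if chose_red else 1.0 - p_good_if_lucky
    p_good_state_blue = 1.0 - p_good_state_red
    like_red = p_good_state_red if good_outcome else 1.0 - p_good_state_red
    like_blue = p_good_state_blue if good_outcome else 1.0 - p_good_state_blue

    # Bayes' rule: condition the belief on the observed outcome.
    numerator = p_red_lucky * like_red
    posterior = numerator / (numerator + (1.0 - p_red_lucky) * like_blue)

    # Transition: the hidden state reverses with probability `hazard`.
    return posterior * (1.0 - hazard) + (1.0 - posterior) * hazard


belief = 0.5  # start with no preference between the two states
for chose_red, good in [(True, True), (True, True), (False, True)]:
    belief = update_belief(belief, chose_red, good)
```

      The contrast with incremental learning is that this observer tracks a probability over two discrete states rather than a continuously updated value estimate, so its beliefs can shift sharply after a few surprising outcomes.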

    1. Author Response

      Reviewer #1 (Public Review):

      Slusarczyk et al present a very well written manuscript focused on understanding the mechanisms underlying aging of erythrophagocytic macrophages in the spleen (RPM) and its relationship to iron loading with age. The manuscript is diffuse with a broad swath of data elements. Importantly, the manuscript demonstrates that RPM erythrophagocytic capacity is diminished with age and restored in aged mice fed an iron-restricted diet. In addition, the mechanism for declining RPM erythrophagocytic capacity appears to be ferroptosis-mediated, insensitive to heme as it is to iron, and to occur independently of ROS generation. These are compelling findings. However, some of the conclusions rely on conjecture, and a causal association is not clear. The main conclusion of the manuscript points to the accumulation of unavailable insoluble forms of iron as both causing and resulting from decreased RPM erythrophagocytic capacity.

      We are proposing that intracellular iron accumulation progresses first and leads to global proteotoxic damage and increased lipid peroxidation. This eventually triggers the death of a fraction of aging RPMs, thus promoting the formation of extracellular iron-rich protein aggregates. More explanation can be found below. Besides, iron loading suppresses the erythrophagocytic activity of RPMs, hence further contributing to their functional impairment during aging.

      In addition, the finding that IR diet leads to increased TF saturation in aged mice is surprising.

      We believe that this observation implies better mobilization of splenic iron stores, and corroborates our conclusion that mice that age on an iron-reduced diet benefit from higher iron bioavailability, although these differences are relatively mild. More explanation can be found in our replies to Reviewer #2.

      Furthermore, whether the finding in RPMs is intrinsic or related to RBC-related changes with aging is not addressed.

      We have now addressed this issue and characterized in more detail both iron and ROS levels in RBCs.

      Finally, these findings in a single strain and only in female mice are intriguing but warrant tempered conclusions.

      We tempered the conclusions and provided a basic characterization of the RPM aging phenotype in Balb/c female mice.

      Major points:

      1) The main concern is that there is no clear explanation of why iron increases during aging although the authors appear to be saying that iron accumulation is both the cause of and a consequence of decreased RPM erythrophagocytic capacity. This requires more clarification of the main hypothesis on Page 4, line 17-18.

      We thank the reviewer for this comment. It was previously reported that iron accumulates substantially in the spleen during aging, especially in female mice (Altamura et al., 2014). Since RPMs are the cells that process most of the iron in the spleen, we aimed to explore the relationship between iron accumulation and RPM functions during aging. This investigation led us to uncover that iron accumulation is indeed both the cause and the consequence of RPM dysfunction. Specifically, we propose that intracellular iron loading of RPMs precedes extracellular deposition of iron in the form of protein-rich aggregates, driven by RPM damage. To support this, we now show that the proteome of RPMs overlaps with the proteins that are present in the age-triggered aggregates (Fig. 3F). Furthermore, corroborating our model, we now demonstrate that transient iron loading of RPMs via iron-dextran injection (new Fig. 3G) leads to the formation of protein-rich aggregates, closely resembling those present in aged spleens (new Fig. 3H). This implies that high iron content in RPMs is indeed a major driving factor that leads to aggregation of their proteome and cell damage. Importantly, we now supported this model with studies using iRPMs. We demonstrated that iron loading and blockage of ferroportin by synthetic mini-hepcidin (PR73) (Stefanova et al., 2018) cause protein aggregation in iRPMs and lead to their decreased viability only in cells that were exposed to heat shock, a well-established trigger of proteotoxicity (new Fig. 5K and L). We propose that these two factors, namely the age-triggered decrease in protein homeostasis and exposure to excessive iron levels, act in concert and render RPMs particularly sensitive to damage during aging (see also Discussion, p. 16).

      In parallel, our data imply that the increased iron content in aged RPMs drives their decreased erythrophagocytic activity, as we have now better documented by more extensive in vitro experiments in iRPMs (new Fig. 6E-H). We cannot exclude that some of the senescent splenic RBCs that are retained in the red pulp and evade erythrophagocytosis due to RPM defects in aging may also contribute to the formation of the aggregates. This is supported by the fact that mice that lack RPMs also exhibit iron loading in the spleen (Kohyama et al., 2009; Okreglicka et al., 2021), and that the proteome of aggregates overlaps to some extent with the proteome of erythrocytes (new Fig. 3F).

      We believe that during aging intracellular iron accumulation is chiefly driven by ferroportin downregulation, as also suggested by Reviewer #3. We now show that ferroportin drops significantly already in mice aged 4 and 5 months (new Fig. 4H), preceding most of the other impairments. This drop coincides with the increase in hepcidin expression, but determining whether this is the sole reason for ferroportin suppression during early aging would require further investigation outside the scope of the present manuscript.

      In sum, to address this comment, we have modified the fragment of the introduction that refers to our hypothesis and major findings to make it clearer (p. 4), improved our manuscript by providing the new data mentioned above, and added more explanation in the corresponding sections of the Results and Discussion.

      2) It is unclear if RPMs are in limited supply. Based on the introduction (page 4, line 13-15), they have limited self-renewal capacity and blood monocytes only partially replenished. Fig 4D suggests that there is a decrease in RPMs from aged mice. The %RPM from CD45+ compartment suggests that there may just be relatively more neutrophils or fewer monocytes recruited. There is not enough clarity on the meaning of this data point.

      Thank you for this comment. We fully agree that %RPMs of CD45+ splenocytes, although well-accepted in literature (Kohyama et al., 2009; Okreglicka et al., 2021), is only a relative number. Hence, we now included additional data and explanations regarding the loss of RPMs during aging.

      It was reported that the proportion of RPMs derived from bone marrow monocytes increases mildly but progressively during aging (Liu et al., 2019). This implies that, given the loss of the total RPM population illustrated by our data, the cells of embryonic origin are likely even more affected. We could confirm this assumption by re-analysis of the data from Liu et al., which we have now included in the manuscript as Fig. 5E. These data clearly show that the representation of embryonically-derived RPMs drops more drastically than the percent of total RPMs, whereas the replenishment rate from monocytes is not affected significantly during aging. Consistent with this, we have not observed any robust change in the population of monocytes (F4/80-low, CD11b-high) or pre-RPMs (F4/80-high, CD11b-high) in the spleen at the age of 10 months (Figure 5-figure supplement 2A and B). We also detected a mild decrease, not an increase, in the number of granulocytes (new Figure 5-figure supplement 2C). Furthermore, we measured an in situ apoptosis marker and found a clear sign of apoptosis in the aged spleen (especially in the red pulp area), a phenotype that is less pronounced in mice on an IR diet (new Fig. 5O). This is consistent with the observation that apoptosis markers can be elevated in tissues upon ferroptosis induction (Friedmann Angeli et al., 2014) and that the proteotoxic stress in aged RPMs, which we now emphasized better in our manuscript, may also lead to apoptosis (Brancolini & Iuliano, 2020). Taken together, we strongly believe that the functional defect of embryonically-derived RPMs chiefly contributes to their shortage during aging.

      3) Anemia of aging is complex and poorly understood mechanistically. In general, it is considered similar to anemia of chronic inflammation with increased Epo, mild drop in Hb, and erythroid expansion, similar to ineffective erythropoiesis / low Epo responsiveness. It is not surprising that IR diet did not impact this mild anemia. However, was the MCV or MCH altered in aged and IR aged mice?

      We now included the data for hematocrit, RBC counts, MCV, and MCH in Figure 1-figure supplement 5. Hematocrit shows a similar tendency as hemoglobin levels, but the values for RBC counts, MCV, and MCH seem not to be altered. We also show now that the erythropoietic activity in the bone marrow is not affected in aged versus young mice. Taken together, the anemic phenotype in female C57BL/6J mice at this age is very mild, which we emphasized in the main text, and is likely affected by other factors than serum iron levels (p. 6).

      4) Page 6, line 23 onward: the conclusion is that KC compensate for the decreased function of RPM in the spleen, based on the expansion of KC fraction in the liver. Is there evidence that KCs are engaged in more erythrophagocytosis in aged mice? Furthermore, iron accumulation in the liver with age does not demonstrate specifically enhanced erythrophagocytosis of KC. Please clarify why liver iron accumulation would not be simply a consequence of increased parenchymal iron similar to increased splenic iron with age, independent of erythrophagocytic activity in resident macrophages in either organ.

      Thanks for these questions. For the quantification of the erythrophagocytosis rate in KCs, we show, as for the RPMs (Fig. 1K), the % of PKH67-positive macrophages following transfusion of PKH67-stained stressed RBCs (Fig. 1M). The data imply a mild (not statistically significant) drop of approx. 30% in EP activity. We believe that it is overridden by a more pronounced (on average, 2-fold) increase in the representation of KCs (Fig. 1N). The mechanisms of iron accumulation in the spleen and the liver are very different. In the liver, we observed iron deposition in the parenchymal cells (not non-parenchymal, new Fig. 1P), which we are currently characterizing in more detail in a parallel manuscript. Our data demonstrate a drop in transferrin saturation in aged mice. Hence, it is highly unlikely that aging would be hallmarked by the presence of circulating non-transferrin-bound iron that would be sequestered by hepatocytes, as shown previously (Jenkitkasemwong et al., 2015). Thus, the iron released locally by KCs is the most likely contributor to progressive hepatocytic iron loading during aging. The mechanism of iron delivery to hepatocytes from erythrophagocytosing KCs was demonstrated by Theurl et al. (2016), and we propose that it may be operational, although on a much more prolonged time scale, during aging. We now discussed this part better in our Results sections (p. 7).

      5) Unclear whether the effect on RPMs is intrinsic or extrinsic. Would be helpful to evaluate aged iRPMs using young RBC vs. young iRPMs using old RBCs.

      We are skeptical that the generation of iRPMs from aged mice would be helpful. These cells are a specific type of primary macrophage culture, derived from bone marrow monocytes with MCSF1, and exposed additionally to heme and IL-33 for 4 days. We do not expect bone marrow monocytes to be heavily affected by aging, and thus do not expect them to recapitulate aspects of aged RPMs from the spleen, especially after 8 days of in vitro culture. However, to address the concerns of the reviewer, we now provide additional data regarding RBC fitness. Consistent with the life-span experiment (Fig. 2A), we show that oxidative stress is increased only in splenic, but not in circulating, RBCs (new Fig. 2C, replacing the old Fig. 2B and C). In addition, we show no signs of age-triggered iron loading in RBCs, either in the spleen (new Fig. 2F) or in the circulation (new Fig. 2B). Hence, we do not envision a possibility that RPMs become iron-loaded during aging as a result of erythrophagocytosis of iron-loaded RBCs. In support of this, we have also observed that during aging the FPN levels of RPMs drop first, the erythrophagocytosis rate decreases afterward, and lastly, RBCs start to exhibit significantly increased oxidative stress (presented now in new Fig. 4H, J and K).

      6) Discussion of aggregates in the spleen of aged mice (Fig 2G-2K and Fig 3) is very descriptive and non-specific. For example, if the iron-rich aggregates are hemosiderin, a hemosiderin-specific stain would be helpful. This data specifically is correlatory and difficult to extract value from.

      Thanks for these comments. To the best of our knowledge, Perls' Prussian blue staining (Fig. 2J) is considered a hemosiderin staining. Our investigations aimed to better understand the nature and the origin of splenic iron deposits that to some extent are referred to as hemosiderin. Most importantly, as mentioned in our reply to point 1 of Reviewer #1, to assign causality to our data, we now demonstrated that iron accumulation in RPMs in response to iron-dextran (Fig. 3G) increases lipid peroxidation (Fig. 5F), tends to provoke RPM depletion (Fig. 5G) and triggers the formation of protein-rich aggregates (new Fig. 3H). Of note, we assume that the loss of embryonically-derived RPMs in this model may be masked by simultaneous replenishment of the niche from monocytes, a phenomenon that may be addressed by future studies using Ms4a3-driven reporter mice (as shown for aged mice in our new Fig. 5E).

      7) The aging phenotype in RPMs appears to be initiated sometime after 2 months of age. However, there is some reversal of the phenotype with increasing age, e.g. Fig 4B with decreased lipid peroxidation in 9 month old relative to 6 month old RPMs. What does this mean? Why is there a partial spontaneous normalization?

      Thanks for this comment and questions. Indeed, the degree of lipid peroxidation exhibits some kinetics, suggestive of partial normalization. Of note, such a tendency is not evident for other aging phenotypes of RPMs, hence, we did not emphasize this in the original manuscript. However, in a revised version of the manuscript, we now present the re-analysis of the published data which implies that the number of embryonically-derived RPMs drops substantially between mice at 20 weeks and 36 weeks (new Fig. 5E). We think that the higher proportion of monocyte-derived RPMs in total RPM population later in aging (9 months) might be responsible for the partial alleviation of lipid peroxidation. We now discussed this possibility in the Results sections (p. 12).

      8) Does the aging phenotype in RPMs respond to ferristatin? It appears that NAC, which is a glutathione generator and can reverse ferroptosis, does not reverse the decreased RPM erythrophagocytic capacity observed with age yet the authors still propose that ferroptosis is involved. A response to ferristatin is a standard and acceptable approach to evaluating ferroptosis.

      We fully agree with the Reviewer that using ferristatin or Liproxstatin-1 would be very helpful to fully characterize the mechanism of RPM depletion in mice. However, previous in vivo studies involving Liproxstatin-1 administration required daily injections of this ferroptosis inhibitor (Friedmann Angeli et al., 2014). This would be hardly feasible during aging. Regarding the experiments involving iron-dextran injection, using Liproxstatin-1 would require additional permission from the ethical committee, which takes time to be processed and received. However, to address this question we now provide data from iRPM cell cultures (new Fig. 5K-L). In essence, our results imply that proteotoxic stress and iron overload act in concert to trigger cytotoxicity in the RPM in vitro model. Interestingly, this phenomenon does not depend solely on the increased lipid peroxidation, but when we neutralize the latter with Liproxstatin-1, the cytotoxic effect is diminished (please see also Results on p. 13 and Discussion on p. 15/16).

      9) The possible central role for HO-1 in the pathophysiology of decreased RPM erythrophagocytic capacity with age is interesting. However, it is not clear how the authors arrived at this hypothesis, and it would be useful to evaluate at the least whether RBCs in young vs. aged mice have more hemoglobin, as these changes may be primary drivers of how much HO-1 is needed during erythrophagocytosis.

      Thanks for this comment. We became interested in HO-1 levels based on the RNA sequencing data, which detected lower Hmox-1 expression in aged RPMs (Figure 3-figure supplement 1). We now show that the content of hemoglobin is not significantly altered in aged RBCs (MCH parameter, Figure 1-figure supplement 5E), hence we do not think that this is the major driver for Hmox-1 downregulation. Likewise, the levels of the Bach1 message, a gene encoding the Hmox-1 transcriptional repressor, are not significantly altered according to RNAseq data. Hence, the reason for the transcriptional downregulation of Hmox-1 is not clear. Of note, HO-1 protein levels in the total spleen are higher in aged versus young mice, and we also detected a clear appearance of its nuclear truncated and enzymatically-inactive form (see the figure below; we opt not to include this in the manuscript for better clarity). The appearance of truncated HO-1 seems to be partially rescued by the IR diet. It is well established that the nuclear form of HO-1 emerges via proteolytic cleavage and migrates to the nucleus under conditions of oxidative stress (Mascaro et al., 2021). This additionally confirms that the aging spleen is hallmarked by an increased burden of ROS. Moreover, we also detected HO-1 as one of the components of the protein iron-rich aggregates. Thus, we propose that the low levels of the cytoplasmic enzymatically active form of HO-1 in RPMs (that we preferentially detect with our intracellular staining and flow cytometry) may be underlain by its nuclear translocation and sequestration in protein aggregates that evade antibody binding. This is also supported by our observation that the protein aggregates, despite their high ferritin content (as indicated by MS analysis), are negative for L-ferritin staining. Of note, we also cannot exclude that other cell types in the aging spleen (e.g. lymphocytes) express higher levels of HO-1 in response to splenic oxidative stress.

      Fig. Total splenic levels of HO-1 in young, aged IR and aged mice.

      Reviewer #2 (Public Review):

      Slusarczyk et al. investigate the functional impairment of red pulp macrophages (RPMs) during aging. When red blood cells (RBCs) become senescent, they are recycled by RPMs via erythrophagocytosis (EP). This leads to an increase in intracellular heme and iron both of which are cytotoxic. The authors hypothesize that the continuous processing of iron by RPMs could alter their functions in an age-dependent manner. The authors used a wide variety of models: in vivo model using female mice with standard (200ppm) and restricted (25ppm) iron diet, ex vivo model using EP with splenocytes, and in vitro model with EP using iRPMs. The authors found iron accumulation in organs but markers for serum iron deficiency. They show that during aging, RPMs have a higher labile iron pool (LIP), decreased lysosomal activity with a concomitant reduction in EP. Furthermore, aging RPMs undergo ferroptosis resulting in a non-bioavailable iron deposition as intra and extracellular aggregates. Aged mice fed with an iron restricted diet restore most of the iron-recycling capacity of RPMs even though the mild-anemia remains unchanged.

      Overall, I find the manuscript to be of significant potential interest. But there are important discrepancies that need to be first resolved. The proposed model is that during aging both EP and HO-1 expression decreases in RPMs but iron and ferroportin levels are elevated. In their model, the authors show intracellular iron-rich proteinaceous aggregates. But if HO-1 levels decrease, intracellular heme levels should increase. If Fpn levels increase, intracellular iron levels should decrease. How does LIP stay high in RPMs under these conditions? I find these to be major conflicting questions in the model.

      We thank the Reviewer for her/his valuable feedback. As we mentioned in our replies, we can only assume that a small misunderstanding in the interpretation of the presented data underlies this comment. We show that ferroportin levels in RPMs (Fig. 1F) are modulated in a manner that fully reflects the iron status of these cells (both labile and total iron levels, Figs. 1H and I). FPN levels drop in aged RPMs and are rescued when mice are maintained on a reduced iron diet. As pointed out by Reviewer #3, and explained in our replies, we believe that ferroportin levels are critical for the observed phenotypes in aging. We have now described our data more clearly to avoid any potential misinterpretation (p. 6).

      Reviewer #3 (Public Review):

      This is a comprehensive study of the effects of aging on the function of red pulp macrophages (RPM) involved in iron recycling from erythrocytes. The authors document that insoluble iron accumulates in the spleen, that RPM become functionally impaired, and that these effects can be ameliorated by an iron-restricted diet. The study is well written, carefully done, extensively documented, and its conclusions are well supported. It is a useful and important addition for at least three distinct fields: aging, iron and macrophage biology.

      The authors do not explain why an iron-restricted diet has such a strong beneficial effect on RPM aging. This is not at all obvious. I assume that the number of erythrocytes that are recycled in the spleen, and are by far the largest source of splenic iron, is not changed much by iron restriction. Is the iron retention time in macrophages changed by the diet, i.e. the recycled iron is retained for a short time when diet is iron-restricted (making hepcidin low and ferroportin high), and long time when iron is sufficient (making hepcidin high and ferroportin low)? Longer iron retention could increase damage and account for the effect. Possibly, macrophages may not empty completely of iron before having to ingest another senescent erythrocyte, and so gradually accumulate iron.

      We are very grateful to this Reviewer for emphasizing the importance of the iron export capacity of RPMs as a possible driver of the observed phenotypes. Indeed, as mentioned above, we now show in the revised version of the manuscript that ferroportin drops early during aging (revised Fig. 4). Importantly, we now also observed that iron loading and limitation of iron export from iRPMs via ferroportin aggravate the impact of heat shock (a well-accepted trigger of proteotoxicity) on both protein aggregation and cell viability (new Fig. 5K and L). Physiologically, recent findings show that aging promotes a global decrease in protein solubility [BioRxiv manuscript (Sui X. et al., 2022)], and it is very likely that the constant exposure of RPMs to high iron fluxes renders these specialized cells particularly sensitive to proteome instability. This could be further aggravated by a build-up of iron due to the drop of ferroportin early during aging, ultimately leading to the appearance of the protein aggregates as early as 5 months of age in C57BL/6J females. Based on the new data, we emphasized this model in the revised version of the manuscript (please see Discussion on p. 16).

    1. Author Response

      Reviewer #1 (Public Review):

      The study by Xie et al. investigates whether the entorhinal-DG/CA3 pathway is involved in working memory maintenance. The main findings include a correlation between stimulus and neural similarities that was specific to the cued stimulus and to entorhinal-DG/CA3 locations. The authors observed similar results (cuing and region specificity) using an inverted encoding modeling approach. Finally, they also showed that trials in which participants made a smaller error showed better reconstruction fidelity on the cued side (compared to un-cued). This effect was absent for larger-error trials.

      The study challenges a widely held traditional view that working memory and episodic memory have largely independent neural implementations with the MTL being critical for episodic memory but not for working memory. The study adds to a large body of evidence showing involvement of the hippocampus across a range of different working memory tasks and stimuli. Nevertheless, it still remains unclear what functions the hippocampus may play in working memory.

      We thank the reviewer for the positive appraisal of the current research, which adds to the growing research interest in the MTL’s contribution to WM.

      Reviewer #2 (Public Review):

      Xie et al. investigated the medial temporal lobe (MTL) circuitry contributions to pattern separation, a neurocomputational operation to distinguish neural representations of similar information. This presumably engages both long-term memory (LTM) and working memory (WM), bridging the gap between the two. Specifically, the authors combined an established retro-cue orientation WM task with high-resolution fMRI to test the hypothesis that the entorhinal-DG/CA3 pathway retains visual WM for a simple surface feature. They found that the anterior-lateral entorhinal cortex (aLEC) and the hippocampal DG/CA3 subfield both retained item-specific WM information that is associated with fidelity of subsequent recall. These findings highlight the contribution of MTL circuitry to item-specific WM representation, against the classic memory models.

      I am a long-term memory researcher with expertise in representational similarity analysis, but not in inverted encoding modeling (IEM). Therefore, I cannot verify the correctness of these models and I will leave it to the other reviewers and editors. However, after an in-depth reading of the manuscript, I could evaluate the significance of the present findings and the strength of evidence supporting these findings. The conclusions of this paper are mostly well supported by data, but some aspects of image acquisition and data analysis need to be clarified.

      We thank the reviewer for the positive appraisal of the current study.

      I would like to list several strengths and weaknesses of this manuscript:

      Strengths:

      • Methodologically, the authors addressed uncertainty in previous research resulting from several challenges. Namely, they used a high-resolution fMRI protocol to infer signals from the MTL substructures and an established retro-cue orientation WM task to minimize the task load.

      • The authors selected a control ROI - amygdala - irrelevant for the experimental task, and at the same time adjacent to the other MTL ROIs, thus possibly having a similar signal-to-noise ratio. The reported effects were observed in the aLEC and DG/CA3, but not in the amygdala.

      • Memory performance, quantified as recall errors, was at ceiling - an average recall error of 12 degrees was only marginally away from the correct grating towards the closest incorrect grating (predefined with min. 20 degrees increments). However, the authors controlled for the effects of recall fidelity on MTL representations by comparing the IEM reconstructions between precise recall trials and imprecise recall trials (resampled to an equal number of trials). The authors found that precise recall trials yielded better IEM reconstruction quality.

      • The authors performed a control analysis of time-varying IEM to exclude the possibility that the mid-delay period activity in the aLEC-DG/CA3 contains item-specific information that could be attributed to perceptual processing. This analysis showed that the earlier TR in the delay period contains information for both cued and uncued items, whereas the mid-delay period activity contains the most information related to the cued, compared to uncued, item.

      We thank the reviewer for highlighting the multiple strengths of the current study.

      Weaknesses:

      • The authors formulate their main hypothesis building on an assumption related to the experimental task. This task requires correctly selecting the cued grating orientation while resisting the interference from internal representations of the other orientation gratings. The authors hypothesize that if this post-encoding information selection function is supported by the MTL's entorhinal-DG/CA3 pathway, the recorded delay-period activity should contain more information about the cued item than the uncued item (even if both are similarly remembered). Thus, the assumption here is that resolving the interference would be reflected by a more distinct representation in MTL for the cued item. Could it be the opposite, namely that the MTL better represents the unresolved interference, for example by the mechanism of hippocampal repulsion (Chanales et al., 2017)? It could strengthen the findings if the authors commented on the contrary hypothesis as well.

      We thank the reviewer for pointing out this interesting alternative hypothesis. Because of the different task design (e.g., over the course of learning vs. WM) and stimuli (e.g., spatial memory vs. orientation grating), it is hard to directly compare Chanales et al.’s findings with the current results. That said, we think the idea that the representation of similar information would lead to greater task demand on the MTL is consistent with our intuition regarding the role of the MTL in supporting the qualitative aspect of WM representation. We have now further discussed this issue in our revised manuscript to invite further consideration of the suggested alternative hypothesis:

      “Our data suggest that this process would result in more similar and stable representations for the same remembered item across trials, as detected by multivariate correlational and decoding analyses in the current study. However, under certain task conditions (e.g., learning spatial routes in a naturalistic task over many repetitions), the MTL may maximally orthogonalize overlapping information to opposite representational patterns (hence “repulsion”) to minimize mnemonic interference (Chanales et al., 2017). It remains to be determined how these learning-related mechanisms in a more complex setting are related to MTL’s contributions to WM of simple stimulus features.”

      • It is not clear to me why the authors chose the inverted encoding modelling approach and what its advantage is over the other multivoxel pattern analysis approaches, for example the representational similarity analysis also used in this study. How are these two complementary? Since the IEM is still a relatively new approach, maybe a short comment in the manuscript could help emphasize the strength of the paper? This is especially relevant because the paper is of interest to researchers in the fields of both working memory and long-term memory, the latter possibly being unfamiliar with the IEM.

      We thank the reviewer for this suggestion. In principle, the IEM is a multivariate pattern classification analysis based on an encoding model. There is no fundamental difference between this approach and other machine-learning or classification approaches, except that the IEM is a more model-based approach and therefore can be more computationally efficient (see Xie et al., 2023 for a conceptual overview of multivariate analysis of high-dimensional neural data). The relationship between IEM and representational similarity is grounded in item-specific information that could lead to shared neural variance. How these two analyses complement each other is well characterized by a recent theoretical review (Kriegeskorte & Wei, 2021). The rationale is that trial-wise RSA reveals shared neural variance between items, implying the presence of item-specific information in the recorded neural data. The IEM approach or other classification algorithms can then test this item-specific information more directly under a prediction-based framework (e.g., train on part of the data and test on a hold-out set). As a result, the findings of these two methods are correlated at the subject level (Figure S4), which is important to note for the purpose of analytical reliability. Furthermore, using the IEM also allows us to compare our current findings with those from previous research (Figure S3), addressing some replicability questions in the field (e.g., Ester et al., 2015).

      We have clarified this issue further in the paragraph where we first introduce the IEM:

      “To directly reveal the item-specific WM content, we next modeled the multivoxel patterns in subject-specific ROIs using an established inverted encoding modeling (IEM) method (Ester et al., 2015). This method assumes that the multivoxel pattern in each ROI can be considered as a weighted summation of a set of orientation information channels (Figure 3A). By using partial data to train the weights of the orientation information channels and applying these weights to an independent hold-out test set, we reconstruct the assumed orientation information channels to infer item-specific information for the remembered item, operationalized as the resultant vector length of the reconstructed orientation information channel normalized at 0° reconstruction error (Figure S2). As this approach verifies the assumed information content based on observed neural data, its results can be efficiently computed and interpreted within the assumed model even when the underlying neuronal tuning properties are unknown (Ester et al., 2015; Sprague et al., 2018). This approach, therefore, complements the model-free similarity-based analysis by linking representational geometry embedded in the neural data with item-specific information under a prediction-based framework (Kriegeskorte and Wei, 2021; Xie et al., 2023). Based on this method, previous research has revealed item-specific WM information in distributed neocortical areas, including the parietal, frontal, and occipital-temporal areas (Bettencourt and Xu, 2015; Ester et al., 2015; Rademaker et al., 2019; Sprague et al., 2016), which are similar to those revealed by other multivariate classification methods (e.g., support vector machine, SVM, Ester et al., 2015). We have also replicated these IEM effects in the current dataset (Figure S3).”
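      For readers unfamiliar with the IEM, the train-then-invert logic described in the quoted paragraph can be sketched on simulated data as follows. This is only an illustrative sketch under stated assumptions, not the authors' pipeline: the channel basis (nine rectified-cosine channels raised to the eighth power), the simulated voxel data, the fidelity index (mean projection of each reconstruction onto a cosine centred on the true orientation), the label-permutation control, and all function and variable names are choices made here for illustration.

```python
import numpy as np

def make_basis(orients_deg, n_chan=9, power=8):
    """Idealised channel responses (trials x channels) for orientations in degrees.

    Assumed basis: |cos|^power tuning curves with centres evenly spaced over 180 deg.
    """
    centers = np.arange(n_chan) * 180.0 / n_chan
    d = np.deg2rad(np.asarray(orients_deg, dtype=float)[:, None] - centers[None, :])
    return np.abs(np.cos(d)) ** power

def iem_fidelity(B_train, ori_train, B_test, ori_test, n_chan=9):
    """Train channel weights, invert them on held-out data, return a fidelity index."""
    C_train = make_basis(ori_train, n_chan)
    # Encoding step: B = C @ W, solved for W (channels x voxels) by least squares.
    W, *_ = np.linalg.lstsq(C_train, B_train, rcond=None)
    # Inversion step: estimate channel responses for the held-out trials.
    C_hat = B_test @ np.linalg.pinv(W)
    centers = np.arange(n_chan) * 180.0 / n_chan
    # Angular offset of each channel from the labelled orientation,
    # doubled because orientation is periodic over 180 degrees.
    err = np.deg2rad(2.0 * (centers[None, :] - np.asarray(ori_test, dtype=float)[:, None]))
    # Positive when reconstructions peak at the labelled orientation, near 0 when flat.
    return float(np.mean(C_hat * np.cos(err)))

# Simulated data (hypothetical): 40 voxels, 180 trials, balanced orientations.
rng = np.random.default_rng(0)
n_chan, n_vox, n_rep = 9, 40, 20
centers = np.arange(n_chan) * 180.0 / n_chan
ori = rng.permutation(np.tile(centers, n_rep))
W_true = rng.normal(size=(n_chan, n_vox))
B = make_basis(ori, n_chan) @ W_true + rng.normal(size=(ori.size, n_vox))

half = ori.size // 2
fid_true = iem_fidelity(B[:half], ori[:half], B[half:], ori[half:])
# Control: permute the test labels so reconstructions no longer align with them.
fid_null = iem_fidelity(B[:half], ori[:half], B[half:], rng.permutation(ori[half:]))
print(f"fidelity, true labels: {fid_true:.3f}; permuted test labels: {fid_null:.3f}")
```

      In this simulation the fidelity for correctly labelled held-out trials should be clearly positive, whereas permuting the test labels should bring it close to zero; with real data the analogous comparison would be between cued and uncued items, as in the analyses discussed above.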

      Overall, this work can have a substantial impact on the field due to its theoretical and conceptual novelty. Namely, the authors leveraged an established retro-cue task to demonstrate that a neurocomputational operation of pattern separation engages both working memory and long-term memory, both mediated by the MTL circuitry, beyond the distinction in classic memory models. Moreover, on the methodological side, using multivariate pattern analyses (especially the IEM) to study neural computations engaged in WM and LTM seems to be a novel and promising direction for the field.

      We thank the reviewer for this positive appraisal of the current study.

      Reviewer #3 (Public Review):

      This work addresses a long-standing gap in the literature, showing that the medial temporal lobe (MTL) is involved in representing simple feature information during a low-load working memory (WM) delay period. Previously, this area was suggested to be relevant for episodic long-term memory, and only implicated in working memory under conditions of high memory load or conjunction features. Using well-rounded analyses of task-dependent fMRI data in connection with a straightforward behavioural experiment, this paper suggests a more general role of the medial temporal lobe in working memory delay activity. It also provides a replication of previous findings on item-specific information during working memory delay in neocortical areas.

      We thank the reviewer for highlighting the contribution of the current study to fill a gap in the literature.

      Strengths:

      The study has strengths in its methods and analyses. Firstly, choosing a well-established cueing paradigm allows for straightforward comparison with past and future studies using similar paradigms. The authors themselves show this by replicating previous findings on delay-period activity in parietal, frontal, and occipito-temporal areas, strengthening their own and previous findings. Secondly, they use a template with relatively fine-grained MTL subregions and choose the amygdala as a control area within the MTL. This increases confidence in the finding that the hippocampus in particular is involved in WM delay-period activity. Thirdly, their combined use of stimulus-based representational similarity analysis and Inverted Encoding Modeling, and the convergence of the two on the same result, is encouraging. Finally, despite focusing on the delay period in their main findings, extensive supplementary materials give insight into the time-course of processing (encoding), which will be helpful for future studies.

      We thank the reviewer for highlighting multiple strengths of this current study.

      Weaknesses:

      While the evidence generally supports the conclusions, there are some weaknesses in behavioural data analysis. The authors demonstrated fine stimulus discrimination in the neural data using Inverted Encoding Modeling (IEM), however the same standard is not applied in the behavioural data analysis. In this analysis, trials below 20 degrees and trials above 20 degrees of memory error are collapsed to compare IEM decoding error between them. As a result, the "small recall error" group encompasses a total range of 40 degrees and includes neighbouring stimuli. While this is enough to demonstrate that there was information about the remembered stimulus, it does not clarify whether aLEC/CA3 activity is associated with target selection only or also with reproduction fidelity. It leaves open whether fine-grained neural information in MTL is related to memory fidelity.

      We thank the reviewer for this cautious note. As the current task is optimized to reveal the neural representation during visual WM and as our participants are cognitively normal college students, participants’ behavioral performance in the current experiment tends to be very good (Figure 1). This leaves us relatively little variation with which to further probe the behavioral outcomes of the task. We have recently generalized our findings using intracranial EEG and confirmed that trial-by-trial mnemonic discrimination during a short delay is indeed associated with the fidelity of item-specific WM representation (Xie, Chapeton, et al., in press).

      We have further discussed this issue in the revised Discussion,

      “… These two approaches are therefore complementary to each other. Nevertheless, these analyses are correlational in nature. Hence, although fine-grained neural representations revealed by these analyses are associated with participants’ behavioral outcomes (Figure 4), it remains to be determined whether the entorhinal-DG/CA3 pathway contributes to the fidelity of the selected WM representation or also to the selection of task-relevant information. Strategies for resolving this issue can involve generalizing the current findings to other WM tasks without an explicit requirement of information selection (e.g., intracranial stimulation of the MTL in a regular WM task without a retro-cue manipulation, Xie et al., in press) and/or further exploring how the frontal-parietal mechanisms related to visual selection and attention interact with the MTL system (Panichello and Buschman, 2021).”

      Moreover, the authors could be more precise about the limitations of the study and their conclusions. In particular, the paper at times suggests that the results contribute to elucidating common roles of the MTL in long-term memory and WM, potentially implementing a process called pattern separation. However, while the paper convincingly shows MTL-involvement in WM, there is no comparison to an episodic memory condition. It therefore remains an open question whether it fulfils the same role in both scenarios. Moreover, the paradigm might not place adequate pattern separation demands on the system since information about the un-cued item may be discarded after the cue.

      We thank the reviewer for this cautious note. We have now included a more detailed discussion on this issue.

      In the Discussion,

      “To more precisely reveal the MTL mechanisms that are shared across WM and long-term memory, future research should examine the extent to which MTL voxels evoked by a long-term memory task (e.g., mnemonic similarity task, Bakker et al., 2008) can be used to directly decode mnemonic content in visual WM tasks using different simple stimulus features.”

    1. Author Response

      Reviewer #2 (Public Review):

      In the current manuscript, Feng et al. investigate the mechanisms used by acute leukemia to get an advantage for the access to the hematopoietic niches at the expense of normal hematopoietic cells. They propose that B-ALLs hijack the niche by inducing the downmodulation of IL7 and CXCL12 by stimulating LepR+ MSCs through LTab/LTbR signaling. In order to prove the importance of LTab expression in B-ALL growth, they block LTab/LTbR signaling either through ligand/receptor inactivation or by using a LTbR-Ig decoy. They also show that CXCL12 and the DNA damage response induce LTab expression by B-ALL. They finally propose that similar mechanisms also favor the growth of acute myeloid leukemia.

      Although the proposed mechanism is of particular interest, further experiments and controls are needed to strongly support the conclusions.

      1/ Globally, statistics have to be revised. The authors have to include a "statistical analysis" section in the Material and Methods to explain how they proceeded and specify for each panel in the figure legend which tests they used according to the general rules of statistics.

      We apologize for the lack of details. This has been corrected in the revised manuscript.

      2/ The setup of each experiment is confusing and needs to be detailed. Cell numbers are not coherent from one experiment to the other. As an example, there are discrepancies between Fig1 and Fig2. Based on the setup of the experiment in Fig.2 (Injection of B-ALL to mice followed by 2 injections of treatment every 5 days), mice have probably been sacrificed 12-14 days post leukemic cell injection. However, according to Fig.1, B cells and erythroid cells at this time point should be decreased >10 times while they are only decreased 2-4 times in Fig.2. This is also the case in Fig.4B-J or Fig.5D with even a lower decrease in B cells and erythroid cells despite a high number of leukemic cells. Please explain and give the end point for each experiment in each figure (main and supplemental).

      We understand the reviewer's concern, but we would like to point out the following: kinetic experiments such as these were reproduced multiple times in the laboratory. However, when comparing side-by-side experiments performed over the course of several months, discrepancies in the exact days when leukemia shuts down hematopoiesis are bound to happen. This is because there are numerous variables at play that we can minimize to the extent possible but cannot completely eliminate. For example, we took all possible steps to work with stable batches of preB-ALL cells. However, it is impossible to be absolutely certain that the batch in one experiment is identical to that in another experiment. Cells have to be expanded for adoptive transfer, which inevitably carries some variability (all biological systems undergo random mutations, including purchased C57Bl6/J from reputable vendors); slight differences in ALL engraftment (i.e., injection variability) can occur such that kinetics may change by a couple of days, etc. The findings we reported here are highly reproducible: that ALL shuts down lymphopoiesis and erythropoiesis acutely, and myelopoiesis less so; that LTbR signaling is the major mechanism shutting down lymphopoiesis but not erythropoiesis; that ALLs up-regulate LTbR ligands when compared to non-leukemic cells of the same lineage and at a similar developmental stage; and that the CXCR4 and DSB pathways both promote lymphotoxin a1b2 expression. The exact kinetics of these experiments will vary, or at least carry a margin of error that is, to the best of our capability, impossible to eliminate.

      3/ To formally prove that the observed effect is really due to LTab/LTbR signaling, the authors must perform further control experiments. LTbR signaling is better known for its positive role on lymphocyte migration. They cannot rule out by blocking LTbR signaling, that they inhibit homing of leukemic cells into the bone marrow through a systemic/peripheral effect, more than through an impaired crosstalk with BM LepR+ cells. They must confirm for inhibited/deficient LTbR signaling conditions, as compared to control, that similar B-ALL numbers home to the BM parenchyma at an early time point after injection. Furthermore, they cannot exclude that the effect on the expression of IL7 (and other genes), and consequently the effect on B cell numbers, is not simply due to the tumor burden. Indeed, B-ALL numbers/frequencies are different between control and inhibited/deficient signaling conditions at the time of analysis. The analyses should thus be performed at similar low and high tumor burden in the BM for both control and inhibited/deficient LTbR signaling conditions.

      We performed ALL homing experiments into control and LTbR∆ and found no significant differences in ALL frequency or number in BM 24h after transplantation. These data have been included in Figure 4A.

      We also performed experiments to control for the number of ALL cells in the bone marrow. Briefly, we compared the impact of 3 million WT ALLs with that of 3 and 9 million Ltb-deficient ALLs on Il7-GFP expression in BM MSCs. The number of Ltb-deficient ALLs in the BM of mice recipient of 9 million ALLs was equivalent to that of mice that received 3 million WT ALLs 7 days after transplantation. Importantly, Il7 was only downregulated in mice transplanted with WT ALLs. These data have been included in Figure 4R and 4S.

      4/ LT/LTbR signaling is particularly known for its capacity to stimulate Cxcl12 expression. How do the authors explain that they see the opposite?

      The reviewer is alluding to a well-known role of LTbR signaling as an organizer of immune cells in secondary lymphoid organs such as spleen and lymph nodes, and particularly its role in promoting CXCL13, CCL19, and CCL21 production by fibroblastic reticular cells of these organs. Neither the B cell follicle nor the T-zone expresses CXCL12 abundantly. Furthermore, in the B cell follicle niche, LTbR signaling is critical for the maturation of Follicular Dendritic Cells, yet FDCs hardly produce CXCL12 either. So, while LTbR is a well-known regulator of cell organization through the production of homeostatic chemokines and lipid chemoattractants, CXCL12 itself is not one of the major chemokines controlled by this pathway. In summary, we do not think our data are in any way incompatible with prior studies on the LTbR pathway, and even if they were, to our knowledge this is the first study on cell-intrinsic effects of LTbR signaling in BM MSCs.

      5/ The authors show that CXCL12 stimulates LTa expression in their cell line. They then propose that CXCR4 signaling in leukemic cells potentiates ALL lethality by showing that a CXCR4 antagonist reverses the decrease in IL7 and improves survival of the mice. This experiment is difficult to interpret. CXCL12 has been shown to be important for migration/retention of B-ALL in the BM and the decreased tumor burden is probably linked to a decreased migration more than an impaired crosstalk with LepR+ cells (see also point 3). If CXCL12 increases LTab expression, CXCR4 blockade should do the opposite. This result should be presented. The contradiction is that if B-ALLs induce a decrease in CXCL12 in the BM (in addition to IL7) and that CXCL12 regulates LTab levels, leukemic cells should be exhausted. Similarly, IL7 has been previously shown to stimulate LTab expression and B-ALL cells express the IL7R. Again, a decrease in IL7 should be unfavorable to B-ALL. How do they explain these discrepancies?

      We thank the reviewer for the suggestion to test the impact of CXCR4 blocking in vivo on LTa1b2 expression. We performed these experiments, which have now been included in the revised manuscript (Fig. 5C and 5D). In summary, we observed reduced LTa1b2 on ALLs transplanted into mice treated with AMD3100, a well-known CXCR4 antagonist. These data also show that CXCR4 signaling is not the only mechanism driving LTa1b2. These results further strengthen the main conclusions of the manuscript. Finally, to our knowledge no study has reported Lymphotoxin a1b2 upregulation in B-ALLs by IL-7.

      6/ In Supp 4A, RAG-/- mice are blocked at the pro-B cell stage and do not have pre-B cells. Please compare LTa and LTb expression by Artemis deficient pre-B cell to wt pre-B cells. In this experiment, the authors show that similarly to B-ALL artemis-/- pre-leukemic pre-B cells express high levels of LTab and induce IL7 downmodulation. Using mice deficient for LTbR in LepR+ cells, they show that IL7 expression is increased. However, in opposition to leukemic cells (see Figure 4F), pre-leukemic cells are increased in absence of LTab/LTbR signaling. Please explain this discrepancy. The authors use only one B-ALL model cell line for their demonstration (BCR-ABL expressing B-ALL). Another model should be used to confirm whether LTab/LTbR signaling does favor leukemic/pre-leukemic B cell growth.

      We apologize for the confusion. The mice that were used in this study were initially described by Barry Sleckman and colleagues (Bredemeyer et al. Nature 2008). Briefly, they crossed Artemis-deficient mice with VH147 IgH transgenic and EμBcl-2 transgenic mice to generate mice in which B cell development is arrested at the preB cell stage. The VH147 heavy chain allows their development to the pre-BCR+ preB cell stage, but Artemis deficiency prevents Rag protein re-expression and hence B cells cannot recombine light chain genes. The EμBcl-2 transgene allows preB cells to survive despite carrying unrepaired double-strand DNA breaks (DSB).

      Regarding the discrepancy noted by the reviewer, we argue that this is not a discrepancy. While ALLs can grow in vitro and in vivo in the absence of IL7, non-leukemic developing B cells are strictly IL7 dependent. PreB cells carrying unrepaired DSBs still express the IL7 receptor, and although no data are currently available on whether these cells are also IL7-dependent, we speculate that they are. Because up-regulation of Lymphotoxin a1b2 in preB cells carrying unrepaired DSBs promotes IL7 downregulation, we speculate that this mechanism may contribute to the efficient elimination of pre-leukemic preB cells in vivo. We revised the manuscript to include this explanation of the mouse model and a discussion of how we think the LTbR pathway may play a role in pre-leukemic states.

      Finally, the data presented in this study includes two distinct leukemia mouse models. It also includes data from human B-ALL and AML samples that is in agreement with the mouse data presented here. We respectfully disagree with the reviewer that a third model is needed to confirm a role for the LTa1b2/LTbR pathway in leukemia.

      7/ Pre-B cells are composed of large pre-B cells (pre-BCR+) and small pre-B cells (pre-BCR-). BCR-ABL B-ALL cells express the pre-BCR. What is the level of expression of LTa and LTb by each of these 2 subsets as compared to BCR-ABL B-ALL?

      This is a misconception. The difference between large and small preB cells is simply that large preB cells are in S/G2 phase of the cell cycle. Their increased size is a mere consequence of doubling DNA, protein, membrane content, etc.

    1. Author Response

      Reviewer #1 (Public Review):

      It is a strength of the current manuscript that it provides a near-complete picture of how the metamorphosis of a higher brain centre comes about at the cellular level. The visualization of the data and analyses is a weakness.

      I do not see any point where the conclusions of the authors need to be doubted, in particular as speculations are expressly defined as such whenever they are presented.

      The fact that molecular or genetic analyses of how the described metamorphic processes are organized are not presented should, I think, not compromise enthusiasm about what is provided at the cellular level.

      We appreciate the comments and guidance that Reviewer #1 has given us on data presentation. We have tried to simplify figures and make the images larger. For the developmental figures, a couple of illustrative examples are provided in the main figure, with the remainder given in “figure supplements”.

      Reviewer #2 (Public Review):

      This very nice piece of work describes and discusses the developmental progression of larval neurons of the mushroom body into those in the adult Drosophila brain. There are many surprising findings that reveal a number of strategies for how brain development has evolved to serve both the early functions specific to the larval brain and then their eventual roles in the adult brain. I think it is fascinating biology and I was educated while reviewing the paper.

      Line 115-116. 'Output from PPL1 compartments direct avoidance behavior, while that from PAM compartments results in attraction'. This is not correct and is actually reversed. The learning rule is depression so that aversive learning reduces the drive to approach pathways whereas appetitive learning reduces the drive to avoidance pathways. This should be corrected and reference made to studies demonstrating learning-directed depression.

      Line 222. It provides feed-forward inhibition from y4>2>1. I could be wrong but I'm not aware that there is functional evidence for this glutamatergic neuron being inhibitory. It's currently speculation.

      We have noted that this function was proposed by Aso et al.

      Line 242. I think it would be nice if the authors focused on extreme changes and showed larger and nicer images. The rest can be summarized but why not pick a few of the best examples to illustrate the strategies they consider in the discussion?

      We have reduced the number of neurons shown in the new Figs 5 and 6. Hopefully, the images are now large enough to appreciate. Data for the remaining neurons are now in Figure Supplements for Figs 5 and 6.

      Line 249 'became sexually dimorphic'. I may have missed it somewhere but this immediately made me think about the sex of all the images that are shown. Is this explicitly stated somewhere? Was it tracked in all larvae, pupae, and adults?

      We now begin the Methods addressing this point. We did an initial screen and found sex-specific differences only in MBIN-b1 and -b2. After this time, we kept no records as to the sex of the fly that was used except for the latter cells.

      Reviewer #3 (Public Review):

      Truman et al. investigated the contribution and remodeling of individual larval neurons that provide input and output to the Drosophila mushroom body through metamorphosis. Hereto, they used a collection of split-GAL4 lines targeting specific larval mushroom body input and output neurons, in combination with a conditional flip-switch and imaging, to follow the fates of these cells.

      Interestingly, most of these larval neurons survive metamorphosis and persist in the adult brain and only a small percentage of neurons die. The authors also elegantly show that a substantial number of neurons actually trans-differentiate and exert a different role in the larval brain, compared to their final adult functionality (similar to their role in hemimetabolous insects). This process is relatively understudied in neuroscience and of great interest.

      Using the ventral nerve cord as a proxy, the authors claim that the larval state of the neuron would be their derived state, while their adult identity is ancestral. While the authors did not show this directly for the mushroom body neurons under study, it is a very compelling hypothesis. However, writing the manuscript from this perspective and not from the perspective of the neuron (which first goes through a larval state, metamorphosis, and finally adult state), results in confusing language and I would suggest the authors adjust the manuscript to the 'lifeline' of the neuron.

      We have tried to be more “linear” in our presentation. This should make the text less confusing.

      In general, this manuscript does not explain how the larval brain has evolved as the title suggests but instead describes how the larval brain is remodeled during metamorphosis. It thus generates perspectives on the evolution of metamorphosis, rather than the larval state. Additionally, this manuscript would benefit from major rearrangements in both text and figures for the story to be better comprehended.

      We think that the end of the Discussion does relate to how a larval brain evolves. The evolution of the larval brain is faced with constraints related to the shortened period of embryonic development and the highly conserved temporal and spatial mechanisms that insects use to generate their neuronal phenotypes. These constraints result in a potential mismatch between the neurons that are needed and those that are actually made (revealed by the adult phenotypes of these neurons). The larva then turns to trans-differentiation to temporarily transform unneeded (or dead) neurons into the missing cell types to build its larval circuits.

      We think that these ideas provide some new insights into how a larval brain may have evolved and that our title is appropriate.

      The introduction is very focused on the temporal patterning of the insect nervous system, while none of the data collected incorporate this temporal code. Temporal patterning comes back in the discussion but is purely speculative.

      The speculation about the importance of temporal patterning is now brought in late in the Discussion, in reference to Figure 12.

      Furthermore, the second part of the introduction describes one strategy for remodeling and why that strategy is not likely but does not present an alternative hypothesis. The first section of the results might serve as a better introduction to the paper instead, as it places the results of the paper better and concludes with the main findings. The accompanying Figure 1 would also benefit from a schematic overview of the larval and adult mushroom bodies as presented in Fig. 2A (left).

      This has been revised in the spirit of these comments.

      In the second results section, the authors show the post-metamorphic fates of mushroom body input and output neurons and introduce the concept of trans-differentiation. Readers might benefit from a short explanation of this process. I also encourage the authors to revisit this part of the text since it gives the impression that the neurons themselves undergo active migration (instead of axon remodeling).

      We have tried to make it clear that there is no cell migration. Rather, there is retraction/fragmentation of larval arbors followed by outgrowth to new, adult targets.

      The discussion starts with a very comprehensive overview of the different strategies that neurons could use during metamorphosis (here too, re-writing the text from the neurons' perspective would increase the reflection of what actually happens to them).

      The Discussion now begins by dealing with gross changes in the MB, with reference to the compartments, and eventually moves to changes in individual cells. We have reduced our discussion of the metamorphic strategies of cells and no longer have Fig 8A.

      The discussion covers multiple topics concerning trans-differentiation, metamorphosis, memory, and evolution and is often disconnected from the results. It could be significantly shortened to discuss the results of the paper and place them in current literature. Generally, the figures supporting the discussion are hard to comprehend and often do not reflect what the text is saying they are showing.

      The Discussion is still long, but, hopefully, our organization now makes it much easier to read and comprehend.

    1. Author Response

      Reviewer #1 (Public Review):

      Junctophilin is mostly known as a structural anchor that keeps excitation-contraction (E-C) proteins in place for healthy contractile function of skeletal muscle. Here the authors provide a new and interesting role in skeletal muscle for Junctophilin (44 kDa segment, JPh44), where it translocates to the nuclei and influences gene transcription. Also, the authors have shown that Calpain 1 can digest junctophilin to generate the 44 kDa segment. The field of skeletal muscle generally knows little about how E-C coupling proteins have a dual role and influence gene regulation that subsequently may alter muscle function and metabolism. This part of the manuscript is solid, informative, and novel. The authors use advanced imaging and genetic manipulations of junctophilin etc. to support their hypothesis. The authors then also aim to link this mechanism to hyperglycemia in individuals susceptible to malignant hyperthermia, as they have elevated levels of the 44 kDa segment. However, the power of the analyses is low, and the included data comparisons complicate interpretation of the results and their relevance. Nevertheless, the data supporting the novel dual role of junctophilin would likely be appreciated and gain attention in the muscle field.

      Thanks for your constructive reading. We agreed (in our answer to Item 1) with your concern regarding the power of the tests. To improve it we would need many more individual patients (who, after the pandemic peaks, are starting to be recruited again, although at a pace of no more than 2 per month). We are committed to updating the present report as soon as we obtain, say, 20 more MHS and MHN patients, a minimum needed to affect the power of the tests. In any case, we claim that power is not an acute concern, as this communication deals mainly with positive results, where significance is of the essence.

      We have established significance in most of the observations communicated here; in the few cases where p is marginal, significance is inferred by correlations.

      Reviewer #2 (Public Review):

      Skeletal muscle is the main regulator of glycemia in mammals, and a major puzzle in the field of diabetes is the mechanism by which skeletal muscle (as well as other tissues) becomes insensitive to insulin or decreases glucose intake. The authors had proposed in a previous publication that high intracellular calcium, by means of calpain activation, could cleave and decrease the availability of GLUT4 glucose transporters. In this manuscript, the authors identify two additional targets of calpain activation. One of them is GSK3β, a specialized kinase that, when cleaved, inhibits glycogen synthase and impairs glucose utilization. The second target is junctophilin 1, a protein involved in the structure of the complex responsible for E-C coupling in skeletal muscle. The authors succeeded in showing that a fragment of junctophilin 1 (JPh44) moves from the triad to other cytosolic regions including the nuclei, and they show changes in gene expression under these conditions, some of them linked to glucose metabolism.

      Overall, the manuscript shows a novel and audacious approach with a careful treatment of the data (that was not always easy nor obvious) that allow sensible conclusions and definitively constitutes a step forward in this field.

      Thanks for the generous report.

      Reviewer #3 (Public Review):

      First, we express utmost gratitude for your critical work on our manuscript. Your concerns made us perform additional experiments and validations, eventually forcing us to abandon a couple of erroneous notions and therefore improving our understanding and interpretations. Because your concerns were already in the “Essentials” list assembled by the Editor, our responses here will mostly refer to our earlier answers to the items in that list.

      1) Figure 1 A and B show a western blot of proteins isolated from muscles of MHN and MHS individuals decorated with two different antibodies directed against JPH1. According to the manufacturer, antibody A is directed against the JPH1 protein sequence encompassing amino acids 387 to 512 while antibody B is directed against a no better specified C-terminal region of JPH1. Surprisingly, antibody B appears not to detect the full-length protein in lysates from human muscles, but recognizes only the 44 kDa fragment of JPH1. However, to the best of the reviewer's knowledge, antibody B has been reported by other laboratories to recognize the full-length JPH1 protein.

      The reason for the failure of ab B to recognize the full human protein may be that it was raised against a murine immunogen (this interpretation was communicated to us by G.D. Lamb, who co-authored the 2013 paper by Murphy et al. where the failure was noted). It recognizes both JPh1 and JPh44 of murine muscle in our hands.

      Thus, it is not obvious why here this antibody should recognize only the shorter fragment.

      We agree entirely. In spite of the difficulties in interpretation, the recognition of human JPh44 by the ab is, however, a fact, repeatedly demonstrated in the present study, which can be used to advantage.

      In addition, in MHS individuals there is no direct correlation between reduction in the content of the full-length JPH1 protein and appearance of the 44 kDa JPH1 fragment, since, as also reported by the authors, no significant difference between MHN and MHS can be observed concerning the amount of the 44 kDa JPH1.

      Tentative interpretations of the lack of correlation have been presented in the response to Item 14, above.

      Based on the data presented, it is very difficult to accept that antibody A and B have specific selectivity for JPH1 and the 44 kDa fragment of JPH1.

      Indeed, we now acknowledge that Ab A reacts equally with JPh1 and the 44 kDa fragment (and provide quantitative evidence for it in Supplement 1 to Fig. 8). We also provide conclusive evidence of the specificity of ab B (e.g., Supplement 2 to Fig. 1).

      2) In Figure 2B staining of a nucleus is shown only with antibody B against the 44 kDa JPH1 fragment, while no nucleus stained with antibody A is shown in Fig 2A. Images should all be at the same level of magnification, and staining of nuclei with antibody A should be reported. In Figure 2Db labeling of JPH1 covers both the nucleus and the cytoplasm; does this mean that JPH1 also goes to the nucleus? One would rather think that background immunofluorescence may provide a confounding staining, and the authors should be more cautious in interpreting these data.

      These items are fully covered in our response to Item 16.

      Images in 2D and 2E refer to primary myotubes derived from patients. The authors show that the RyR1 signal co-localizes with full-length JPH1, but not with the 44 kDa fragment, recognized by antibody B. How do the authors establish myotube differentiation?

      Myotubes are studied 5-10 days after switching cells to differentiation medium, which is DMEM-F12 supplemented with 2.5% horse serum, as explained in Figueroa et al 2019. Cells with more than 3 nuclei were considered myotubes. Myotubes with a similar degree of maturation (number of nuclei) were selected for experimental comparisons.

      3) Figure 3 A-C. The authors show images of a full-length JPH1 tagged with GFP at the N-terminus and FLAG at the C- terminus. In Figure 3Ad and Cd the Flag signal is all over the cytoplasm and the nuclei: since these are normal mouse cells and fibers, it is surprising that the FLAG signal is in the nuclei with an intensity of signal higher than in patient's muscle.

      Can the authors supply images of entire myotubes, possibly captured in different Z planes? How can they distinguish between the cleaved and uncleaved JPH1 signals, especially in mouse myofibers, where calpain is supposed not to be so active as in MHS muscle fibers?

      Answer fully provided to Items 16b and 17 in Essentials list.

      4) If the 44 kDa JPH1 fragment contains a transmembrane domain, it is difficult to understand the dual sarcoplasmic reticulum and nuclear localization. To justify this the authors, in the Discussion section, mention a hypothetical vesicular transport of the 44 kDa JPH1 fragment. Traffic of proteins to the nucleus usually occurs through the nuclear pores and does not require vesicles. Even if diffusion from the SR membrane to the nuclear envelope occurs, the protein should remain in the compartment of the membrane envelope. There is no established evidence to support such an unusual movement inside the cells.

      In agreement with the criticism, we have removed the speculation from the Discussion.

      5) In Figure 5, the authors show the effect of Calpain1 on the full-length and 44 kDa JPH1 fragment in muscles from MHS patients. Can the authors repeat the same analysis on recombinant JPH1 tagged with GFP and FLAG?

      We agree that confirmatory evidence of the calpain effect on dual-tagged recombinant JPh1 would be desirable. However, we think an in-depth study is required to follow up on the number of JPh1 fragments generated by calpain (or by different calpain isoforms) and their positions, similar to the detailed study of JPh2 fragmentation by Wang et al. in 2021 (5).

      Can the authors provide images from MHN muscle fibers stained with JPH1 and Calpain1?

      We complied with the request.

      6) In Figure 6, the authors show images of MHS derived myotubes transfected with FLAG Calpain1 and compare the distribution of endogenous JPH1 and RYR1 in two cells, one expressing FLAG Calpain1 (cell1) and one not expressing the recombinant protein. They state that cell1 shows a strong signal of JPH1 in the nucleus, while this is not observed in cell2. Nevertheless, it is not clear where the nucleus is located within cell2 since the distribution of JPH1 is homogeneous across the cell. Can the authors show a different cell?

      In agreement, we now show a comparison between cultures with and without transfection in Supplement 1 to Fig. 6.

      7) In Figure 7, panels Bb and Db: nuclei appear to stain positive for JPH1. It is not clear why in panels Ac, Bc they show a RYR1 staining while in panels Cc and Dc they show N-myc staining. The differential localization to nuclei appears rather poor also in these panels.

      We have entirely removed from the manuscript the description of experiments of exposure to extracellular calpain, including Fig. 7 and three associated tables.

      8) The strong nuclear staining in Figure 8, panels C and D is very different from the staining observed in Fig. 2 and Fig. 3. Transfection should not change the ratio between nuclear and cytoplasmic distribution.

      Transfection is an intrusive procedure, which requires production and trafficking of an exogenous protein. This protein, furthermore, is an artificial construct (in this case, a “stand-in”, which adds to the native protein and therefore is akin to overexpression). For these reasons, we believe that differences in intensity of nuclear staining may have multiple causes and should not be especially concerning.

    1. Author Response

      Reviewer #1 (Public Review):

      1) This study performs an interesting analysis of evolutionary variation and integration in forelimb/hand bone shapes in relation to functional and developmental variation along the proximo-distal axis. They found expected patterns of evolutionary shape variation along the proximo-distal axis but less expected patterns of shape integration. This study provides a strong follow-up to previous studies on mammal forelimb variation, adding and testing interesting hypotheses with an impressive dataset. However, this study could better highlight the relevance of this work beyond mammalian forelimbs. The study primarily cites and discusses mammalian limb studies, despite the relevance of the suggested findings beyond mammals and forelimbs. Furthermore, relevant work exists in other tetrapod clades and structures related to later-developing traits and proximo-distal variation. Finally, variations in bone size and shape along the proximo-distal axis could be affecting evolutionary patterns found here and it would be great to make sure they are not influencing the analysis/results.

      We appreciate the reviewer’s comments, and we acknowledge the importance of including examples of non-mammalian lineages in our study. We followed the recommendation and included more examples of other tetrapod taxa in our text and in our references, providing a more inclusive discussion of limb bone diversity beyond mammals. We also explain below why the results obtained are not inflated by size differences between larger and smaller bones.

      Reviewer #2 (Public Review):

      10) Congratulations on producing a very nice study. Your study aims to examine the morphological diversity of different mammalian limb elements, with the ultimate goal seemingly to test expectations based on the different timing of development of the limb bones. There's a lot to like: the sample size is impressive, the methods seem appropriate and sound, the results are interesting, the figures are clear, and the paper is very well written. You find greater diversity and integration in distal limb segments compared to proximal elements, and this may be due to the developmental timing and/or functional specialization of the limb segments. These are interesting results and conclusions that will be of interest to a broad readership. And the large dataset will likely be valuable to future researchers who are interested in mammalian limb morphology and evolution. I have one major concern with how you frame your discussion and conclusions, which I explain below. But I think you can address this issue with some text edits.

      We sincerely thank the reviewer for his constructive recommendations and for his appreciation of our work. We addressed the issue raised as detailed below.

      11) Major concern - is developmental timing the best hypothesis?

      You discuss two potential drivers for the relatively greater diversity in distal elements: 1) later development and 2) greater functional specialization. Your data doesn't allow you to fully test these two hypotheses (e.g. you don't have detailed evo-devo data to infer developmental constraints), and I think you realize this - you use phrases like "consistent with the hypothesis that ...". You seem to compromise and conclude that both factors (development + function) are likely driving greater autopod diversity (e.g. Lines 302-306). Being unable to fully test these hypotheses weakens the impact of your conclusions, making them a bit more speculative, but otherwise, it isn't a critical issue.

      But my concern is that you seem to favor developmental factors over functional factors as the primary drivers of your results, and that seems backwards to me. For instance, early in the Abstract (Line 32) and early in the Discussion (Line 201) you mention that your results are consistent with the developmental timing hypothesis, but it's not until later in the Abstract or Discussion that you mention the role of functional diversity/specialization/selection. The problem with favoring the development hypothesis is that your integration results seem to contradict that hypothesis, at least based on your prediction in the Introduction (Line 126; although you spend some of the Discussion trying to make them compatible). Later in the paper, you acknowledge that functional specialization (rather than developmental factors) might be a better explanation for the integration results (Lines 282-284, 345-347), but, again, this is only after discussions about developmental factors.

      When you first start discussing functional diversity, you say, "high integration in the phalanx and metacarpus, possibly favoured the evolution of functionally specialized autopod structures, contributing to the high variation observed in mammalian hand bones." (Line 282). This implies that integration led to functional diversity in the autopod. But I'd flip that: I think the functional specialization of the hand led to greater integration. Integration does not result solely from genetic/developmental factors. It can also result from traits evolving together because they are linked to the same function. From Zelditch & Goswami (2021, Evol. & Dev.): "Within individuals, integration is customarily ascribed to developmental and/or functional interdependencies among traits (Bissell & Diggle, 2010; Cheverud, 1982; Wagner, 1996) and modularity is thus due to their developmental and/or functional independence."

      In sum, I think your results capture evidence of greater functional specialization in hands relative to other segments. You're seeing greater 1) disparity and 2) integration in hands, and both of those are expected outcomes of greater functional specialization. In contrast, I think it's harder to fit your results to the developmental timing hypothesis. Thus, I recommend that throughout the paper (Abstract, Intro, Discussion) you flip your discussion of the two hypotheses and start with a discussion on how functional specialization is likely driving your results, and then you can also note that some results are consistent with the development hypothesis. You could maintain most of your current text, but I'd simply rearrange it, and maybe add more discussion on functional diversity to the Intro.

      Or, if you disagree and think that there's more support for the development hypothesis, then you need to make a better case for it in the paper. Right now, it feels like you're trying to force a conclusion about development without much evidence to back it up.

      We thank the reviewer for his thoughtful and thorough comment. We agree that the results provided, particularly those of integration, support the hypothesis that functional specialization contributes to the uneven diversity of limb bones. We addressed the concerns by substantially changing our discussion, particularly moderating (and removing) sections on the developmental constraints and adding new arguments for other possible drivers for the diversity of limb bones, such as function. However, the goal of the paper was to test whether the data corroborate - or not - the predictions derived from the developmental hypothesis, and they largely do. Therefore, we decided to keep the developmental hypothesis presented first in the introduction and in the discussion section, as we believe this sequence provides more coherence considering the hypothesis tested (we believe that detailing the role of functional specialization particularly in the introduction would mislead the reader to think that we directly tested for these parameters). Following the discussion of the integration results, we then go on to discuss the possible role of functional specialization on the results obtained (lines 262-285, see also lines 216-234). Yet, these are not tested in this paper and remain to be tested in a future analysis focusing specifically on the role of ecology and function in driving variation in the mammalian limb.

      12) Limitations of the dataset

      Using linear measurements is fine, but they mainly just capture simple aspects of the elements (lengths and widths). You should acknowledge in your paper the limitations of that type of data. For example, the deltoid tuberosity of the humerus can vary considerably in size and shape among mammals, but you don’t measure that structure. The autopod elements don’t have a comparable process, meaning that if you were to measure the deltoid tuberosity then you’d likely see a relative increase in humerus disparity (although my guess is that it’d still be well below that of the autopod). And you omit the ulna from your study, and its olecranon process varies considerably among taxa and its length is a very strong correlate of locomotor mode. In other words, your finding of the greatest disparity in the hand might be due in part to your choice of measurements and the omission of measurements of specific processes/elements. I recommend that you add to your paper a brief discussion of the limitations of using linear measurements and how you might expect the results to change if you were to include more detailed measurements and/or more elements.

      We followed the recommendation and included a discussion about the dataset limitations, acknowledging the possible impact of the measurements and the bones chosen on the results obtained (Lines 235-260).

      Reviewer #3 (Public Review):

      32) This paper uses a large (638 species representing 598 genera in 138 families) extant sample of osteologically adult mammals to address the question of proximodistal patterns of cross-taxonomic diversity in forelimb bony elements. The paper concludes, based on a solid phylogenetically controlled multivariate analysis of linear measurements, that proximal forelimb elements are less morphologically diverse and evolutionarily flexible than distal forelimb elements, which the paper concludes is consistent with a developmental constraint axis tied to limb bud growth and development. This paper is of interest to researchers working on macroevolutionary patterns and sources of morphological diversity.

      Methodological review Strengths:

      The taxonomic dataset is very comprehensive for this sort of study and the authors have given consideration to how to identify bony elements present in all mammalian taxa (no small task with this level of taxonomic breadth). Multivariate approaches as used in this study are the gold standard for addressing questions of morphological variations.

      The authors give consideration to two significant confounders of analyses operating at this scale: phylogeny and body size. The methods they use to address these are appropriate, although as I note below body size itself may merit more consideration.

      We sincerely thank the reviewer for his appreciation of our study. We addressed the main concerns pointed out below.

      Weaknesses:

      33) The authors assume a lot of knowledge on the part of the reader regarding their methods. Given that one of their key metrics (stationary variance) is largely a property as I understand it of OU models, more explanation on the authors' biological interpretation of stationary variance would help assess the strength of their conclusions, especially as OU models are not as straightforward as they first appear in their biological interpretation (Cooper et al., 2016).

      We acknowledge that this may not be straightforward and now include a more extensive explanation of the approach and the metrics used. We detailed the explanation about the stationary variances in the methods, contextualizing the biological meaning (lines 456-469).
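
      For readers less familiar with the term, the textbook univariate case may help. This is only a sketch under standard assumptions, not the multivariate, phylogeny-aware formulation used in the manuscript; the symbols \alpha (strength of pull toward the optimum), \theta (optimum) and \sigma (intensity of random change) are notation introduced here for illustration.

```latex
% Univariate Ornstein-Uhlenbeck process for a trait X(t)
dX(t) = -\alpha\,\bigl(X(t) - \theta\bigr)\,dt + \sigma\,dW(t), \qquad \alpha > 0

% Variance after time t, starting from a fixed ancestral value
\operatorname{Var}[X(t)] = \frac{\sigma^{2}}{2\alpha}\,\bigl(1 - e^{-2\alpha t}\bigr)

% Stationary variance: the limit reached when pull and random change balance
\lim_{t \to \infty} \operatorname{Var}[X(t)] = \frac{\sigma^{2}}{2\alpha}
```

      Under this reading, a larger stationary variance means that the trait is expected to spread more widely around its optimum in the long run, which is why it can serve as a measure of attainable diversity.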

      34) It is unclear what the authors mean when they say they "simulated the trait evolution under OU processes on 100 datasets". Are the 100 datasets 100 different tree topologies (as seems to be the case later: "we replicated the body mass linear regressions with 100 trees from Upham et al (2019)")? If that is so, what is the rationale for choosing 100 topologies and what criteria were used to select the 100 topologies?

      We understand the explanation may have been confusing. Overall, we used a parametric bootstrap approach to assess the uncertainty around point estimates for morphological diversity and integration. That is, we first simulated 100 datasets on the maximum clade credibility tree (MCC tree, which summarizes 10,000 trees from Upham et al. 2019), using the best-fit model on our original data (i.e., an OU process) with parameter estimates from this model fit. The model (an OU process) was then fit to these 100 simulated datasets, and the distribution of parameter estimates obtained was used to assess the variability around the point estimates (for the determinant, the trace, and the measure of integration) obtained on empirical data. We did not use the simulated datasets to estimate the significance of the stationary variances. Separately, we fitted the empirical datasets with 100 trees randomly sampled from the credible set of 10,000 trees of Upham et al. (2019), instead of using the MCC tree, to further assess the variability due to uncertainty in tree topology and branching times. We included this expanded explanation in the methods in lines 421-428 and 471.
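
      The logic of a parametric bootstrap (fit, simulate from the fit, refit, summarise) can be sketched as follows. This is a deliberately simplified illustration and not the analysis reported in the manuscript: it assumes species are independent draws from a multivariate normal distribution, so it ignores the phylogeny and the OU process entirely. The function names are hypothetical, and the integration index (relative eigenvalue variance of the correlation matrix) is a stand-in that may differ from the measure the authors used.

```python
import numpy as np

def summary_stats(traits):
    """Determinant and trace of the trait covariance, plus an eigenvalue-based integration index."""
    cov = np.cov(traits, rowvar=False)
    corr = np.corrcoef(traits, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)
    p = corr.shape[0]
    # Relative eigenvalue variance (assumed stand-in index):
    # 0 = uncorrelated traits, 1 = fully integrated traits.
    integration = np.mean((eigvals - 1.0) ** 2) / (p - 1)
    return np.array([np.linalg.det(cov), np.trace(cov), integration])

def parametric_bootstrap(traits, n_boot=100, seed=0):
    """Simulate n_boot datasets from the fitted model and refit each one."""
    rng = np.random.default_rng(seed)
    n = traits.shape[0]
    # Step 1: "fit" the model to the empirical data (here: mean vector and covariance).
    mean_hat = traits.mean(axis=0)
    cov_hat = np.cov(traits, rowvar=False)
    point = summary_stats(traits)
    # Steps 2 and 3: simulate from the fitted model, then recompute the same statistics.
    boot = np.empty((n_boot, point.size))
    for b in range(n_boot):
        sim = rng.multivariate_normal(mean_hat, cov_hat, size=n)
        boot[b] = summary_stats(sim)
    # Step 4: summarise the spread of the refitted estimates.
    lower, upper = np.percentile(boot, [2.5, 97.5], axis=0)
    return point, boot, lower, upper

if __name__ == "__main__":
    # Made-up data: 200 "species" and 3 correlated traits.
    rng = np.random.default_rng(1)
    true_cov = np.array([[1.0, 0.6, 0.3], [0.6, 1.0, 0.5], [0.3, 0.5, 1.0]])
    data = rng.multivariate_normal(np.zeros(3), true_cov, size=200)
    point, boot, lower, upper = parametric_bootstrap(data)
    for name, est, lo, hi in zip(("determinant", "trace", "integration"), point, lower, upper):
        print(f"{name}: {est:.3f} (95% interval {lo:.3f} to {hi:.3f})")
```

      The spread of the 100 refitted values gives an interval around each point estimate. Repeating the fit on different trees, as the response describes, addresses a separate source of uncertainty that this sketch does not model.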

      35) The way the authors approach body mass and allometry, while mathematically correct, ignores the potential contribution of body mass to the questions the authors are interested in. Jenkins (1974) for example argued that small mammals would converge on similar body posture and functional morphology because, at small sizes, all mammals are scansorial if they are not volant. Similarly, Biewener (1989) argued that many traits we view as cursorial adaptations are actually necessary for stability at large body sizes. Thus size may actually be important in determining patterns of variation in limb bone morphology.

      We agree with the observation. We believe that categorizing the groups according to size would provide a meaningful overview of the effect of size on the diversity and evolution of limb bones. Although this is insightful and worthy of investigation, we were particularly interested in understanding whether developmental timing corresponds to bone diversification more broadly across Mammalia and thus considered only the size residual values. This issue will be addressed in our future work. In lines 329-341, we discuss the potential contribution of body size to limb segment diversification and the importance of considering this aspect in future studies.

      36) Review of interpretation.

      The authors conclude that their result, in showing a proximo-distal gradient of increasing disparity and stationary variance in forelimb bone morphology, supports the idea that proximo-distal patterning of limb bone development constrains the range of morphological diversity of the proximal limb elements. However, this correlation ignores two important considerations. The first is that the stylopod connects to the pectoral girdle and the axial skeleton, and so is feasibly more constrained functionally, not developmentally in its morphological evolution. The second, related, issue arises from the authors' study itself, which shows that the lowest morphological integration is found in the stylopod and zeugopod, whereas the autopod elements are highly integrated. This suggests a greater tendency towards modularity in the stylopod and zeugopod, which is itself a measure of evolutionary lability (Klingenberg, 2008). And indeed the mammalian stylopod is developmentally comprised of multiple elements (the epiphyses and diaphysis) that are responding to very different developmental and biomechanical signals. Thus, for example, the functional signal in stylopod (Gould, 2016) and zeugopod (MacLeod and Rose, 1993) articular surface specifically is very high. What is missing to fully resolve the question posed by the authors is developmental data indicating whether or not the degree of morphological disparity in the hard tissues of the forelimb change over the course of ontogeny throughout the mammalian tree, and whether changing functional constraints over ontogeny (as is the case in marsupials) affect these patterns.

      We thank the reviewer for sharing such an interesting reinterpretation of the results. In combination with the recommendations from the other two reviewers, this led us to substantially change our discussion, especially the interpretation of results concerning trait integration. We discuss the possible role of functional variation at the articulations in element integration in lines 263-285.

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, Goering et al. investigate subcellular RNA localization across different cell types focusing on epithelial cells (mouse C2bbe1 and human HCA-7 enterocyte monolayers, canine MDCK epithelial cells) as well as neuronal cultures (mouse CAD cells). They use their recently established Halo-seq method to investigate transcriptome-wide RNA localization biases in C2bbe1 enterocyte monolayers and find that 5'TOP-motif containing mRNAs, which encode ribosomal proteins (RPs), are enriched on the basal side of these cells. These results are supported by smFISH against endogenous RP-encoding mRNAs (RPL7 and RPS28) as well as Firefly luciferase reporter transcripts with and without mutated 5'TOP sequences. Furthermore, they find that 5'TOP-motifs are not only driving localization to the basal side of epithelial cells but also to neuronal processes. To investigate the molecular mechanism behind the observed RNA localization biases, they reduce expression of several Larp proteins and find that RNA localization is consistently Larp1-dependent. Additionally, the localization depends on the placement of the TOP sequence in the 5'UTR and not the 3'UTR. To confirm that similar RNA localization biases can be conserved across cell types for other classes of transcripts, they perform similar experiments with a GA-rich element containing Net1 3'UTR transcript, which has previously been shown to exhibit a strong localization bias in several cell types. In order to determine if motor proteins contribute to these RNA distributions, they use motor protein inhibitors to confirm that the localization of individual members of both classes of transcripts, 5'TOP and GA-rich, is kinesin-dependent and that RNA localization to specific subcellular regions is likely to coincide with RNA localization to microtubule plus ends that concentrate in the basal side of epithelial cells as well as in neuronal processes.

      In summary, Goering et al. present an interesting study that contributes to our understanding of RNA localization. While RNA localization has predominantly been studied in a single cell type or experimental system, this work looks for commonalities to explain general principles. I believe that this is an important advance, but there are several points that should be addressed.

      Comments:

      1) The Mili lab has previously characterized the localization of ribosomal proteins and NET1 to protrusions (Wang et al, 2017, Moissoglu et al 2019, Crisafis et al., 2020) and the role of kinesins in this localization (Pichon et al, 2021). These papers should be cited and their work discussed. I do not believe this reduces the novelty of this study; rather, it supports the generality of the RNA localization patterns to additional cellular locations in other cell types.

      This was an unintentional oversight on our part, and we apologize. We have added citations for the mentioned publications and discussed our work in the context of theirs.

      2) The 5'TOP motif begins with an invariant C nucleotide and mutation of this first nucleotide next to the cap has been shown to reduce translation regulation during mTOR inhibition (Avni et al, 1994 and Biberman et al 1997) and also Larp1 binding (Lahr et al, 2017). Consequently, it is not clear to me if RPS28 initiates transcription with an A as indicated in Figure 3B. There also seem to be some differences in published CAGE datasets, but this point needs to be clarified. Additionally, it is not clear to me how the 5'TOP Firefly luciferase reporters were generated and if the transcription start site and exact 5'-ends of these constructs were determined. This is again essential to determine if it is a pyrimidine sequence in the 5'UTR that is important for localization or the 5'TOP motif and if Larp1 is directly regulating the localization by binding to the 5'TOP motif or if the effect they observe is indirect (e.g. is Larp1 also basally localized?). It should also be noted that Larp1 has been suggested to bind pyrimidine-rich sequences in the 5'UTR that are not next to the cap, but the details of this interaction are less clear (Al-Ashtal et al, 2021).

      We did not fully appreciate the subtleties related to TOP motif location when we submitted this manuscript, so we thank the reviewer for pointing them out.

      We also analyzed public CAGE datasets (Andersson et al, 2014 Nat Comm) and found that the start sites for both RPL7 and RPS28 were quite variable within a window of several nucleotides (as is the case for the vast majority of genes), suggesting that a substantial fraction of both do not begin with pyrimidines (Reviewer Figure 1). Yet, by smFISH, endogenous RPL7 and RPS28 are clearly basally/neurite localized (see new figure 3C).

      Reviewer Figure 1. Analysis of transcription start sites for RPL7 (A) and RPS28 (B) using CAGE data (Andersson et al, 2014 Nat Comm). Both genes show a window of transcription start sites upstream of current gene models (blue bars at bottom).

      A more detailed analysis of our PRRE-containing reporter transcripts led us to find that in these reporters, the pyrimidine-rich element was approximately 90 nucleotides into the body of the 5’ UTR. Yet these reporters are also basally/neurite localized. The organization of the PRRE-containing reporters is now more clearly shown in an updated figure 3D.

      From these results, it would seem that the pyrimidine-rich element need not be next to the 5’ cap in order to regulate RNA localization. To generalize this result, we first used previously identified 5’ UTR pyrimidine-rich elements that had been found to regulate translation in an mTOR-dependent manner (Hsieh et al 2012). We found that, as a class, RNAs containing these motifs were similarly basally/neurite localized as RP mRNAs. These results are presented in figures 3A and 3I.

      We then asked if the position of the pyrimidine-rich element within the 5’ UTR of these RNAs was related to their localization. We found no relationship between element position and transcript localization as elements within the bodies of 5’ UTRs were seemingly just as able to promote basal/neurite localization as elements immediately next to the 5’ cap. These results are presented in figures 3B and 3J.

      To further confirm that pyrimidine-rich elements need not be immediately next to the 5’ cap, we redesigned our RPL7-derived reporter transcripts such that the pyrimidine-rich motif was immediately adjacent to the 5’ cap. This was possible because the reporter uses a CMV promoter that reliably starts transcription at a known nucleotide. We then compared the localization of this reporter (called “RPL7 True TOP”) to our previous reporter in which the pyrimidine-rich element was ~90 nt into the 5’ UTR (called “RPL7 PRRE”) (Reviewer Figure 2). As with the PRRE reporter, the True TOP reporter drove RNA localization in both epithelial and neuronal cells while purine-containing mutant versions of the True TOP reporter did not (Reviewer Figure 2A-D). In the epithelial cells, the True TOP was modestly but significantly better at driving basal RNA localization than the PRRE (Reviewer Figure 2E), while in neuronal cells the True TOP was modestly, but not significantly, better. Again, this suggests that pyrimidine-rich motifs need not be immediately cap-adjacent in order to regulate RNA localization.

      Reviewer Figure 2. Experimental confirmation that pyrimidine-rich motif location within 5’ UTRs is not critical for RNA localization. (A) RPL7 True TOP smFISH in epithelial cells. (B) RPL7 True TOP smFISH in neuronal cells. (C) Quantification of epithelial cell smFISH in A. (D) Quantification of neuronal cell smFISH in B. (E) Comparison of the location in epithelial cells of endogenous RPL7 transcripts, RPL7 PRRE reporter transcripts, and RPL7 True TOP reporter transcripts. (F) Comparison of the neurite-enrichment of RPL7 PRRE reporters and RPL7 True TOP reporters. In C-F, the number of cells included in each analysis is shown.

      In response to the point about whether the localization results are direct effects of LARP1, we did not assay the binding of LARP1 to our PRRE-containing reporters, so we cannot say for sure. However, given that PRRE-dependent localization required LARP1 and there is much evidence about LARP1 binding pyrimidine-rich elements (including those that are not cap-proximal as the reviewer notes), we believe this to be the most likely explanation.

      It should also be noted here that while pyrimidine-rich motif position within the 5’ UTR may not matter, its location within the transcript does. PRREs located within 3’ UTRs were unable to direct RNA localization (Figure 5).

      3) In figure 1A, they indicate that mRNA stability can contribute to RNA localization, but this point is never discussed. This may be important to their work since Larp1 has also been found to impact mRNA half-lives (Aoki et al, 2013 and Mattijssen et al 2020, Al-Ashtal et al 2021). Is it possible the effect they see when Larp1 is depleted comes from decreased stability?

      We found that PRRE-containing reporter transcripts were generally less abundant than their mutant counterparts in C2bbe1, HCA7, and MDCK cells (figure 3 – figure supplements 5, 6, and 8) although the effect was not consistent in mouse neuronal cells (figure 3 – figure supplement 13).

      However, we don’t think it is likely that the changes in localization are due to stability changes. This abundance effect did not seem to be LARP1-dependent as both PRRE-containing and PRRE-mutant reporters were generally more expressed in LARP1-rescue epithelial cells than in LARP1 KO cells (figure 4 – figure supplement 9).

      It should be noted here that we are not ever actually measuring transcript stability but rather steady state abundances. It cannot therefore be ruled out that LARP1 is regulating the stability of our PRRE reporters. Given, though, that their localization was dependent on kinesin activity (figures 7F, 7G), we believe the most likely explanation for the localization effects is active transport.

      4) Also Moor et al, 2017 saw that feeding cycles changed the localization of 5'TOP mRNAs. Similarly, does mTOR inhibition or activation or simply active translation alter the localization patterns they observe? Further evidence for dynamic regulation of RNA localization would strengthen this paper

      We are very interested in this and have begun exploring it. We have data suggesting that PRREs also mediate the feeding cycle-dependent relocalization of RP mRNAs. As the reviewer says, we think this leads to a very attractive model involving mTOR, and we are currently working to test this model. However, we don’t have the room to include those results in this manuscript and would instead prefer to include them in a later manuscript that focuses on nutrient-induced dynamic relocalization.

      5) For smFISH quantification, is every mRNA treated as an independent measurement so that the statistics are calculated on hundreds of mRNAs? Large sample sizes can give significant p-values but have very small differences, as observed for Firefly vs. OSBPL3 localization. Since determining the biological interpretation of effect size is not always clear, I would suggest plotting RNA position per cell or only treating biological replicates as independent measurements to determine statistical significance. This should also be done for other smFISH comparisons.

      This is a good suggestion, and we agree that using individual puncta as independent observations will artificially inflate the statistical power in the experiment. To remedy this in the epithelial cell images, we first reanalyzed the smFISH images using each of the following as a unique observation: the mean location of all smFISH puncta in one cell, the mean location of all puncta in a field of view, and the mean location of all puncta in one coverslip. With each metric, the results we observed were very similar (Reviewer Figure 3) while the statistical power of course decreased. We therefore chose to go with the reviewer-suggested metric of mean transcript position per cell.

      Reviewer Figure 3. C2bbe1 monolayer smFISH spot position analysis. RNA localization across the apicobasal axis is measured by smFISH spot position in the Z axis. This can be plotted for each spot, where thousands of spots over-power the statistics. Spot position can be averaged per cell as outlined manually within the FISH-quant software. This reduces sample size and allows for more accurate statistical analysis. When spot position is averaged per field of view, sample size further decreases, statistics are less powered but the localization trends are still robust. Finally, we can average spot position per coverslip, which represents biological replicates. We lose almost all statistical power as sample size is limited to 3 coverslips. Despite this, the localization trends are still recognizable.

      When we use this metric, all results remain the same with the exception of the smFISH validation of endogenous OSBPL3 localization. That result loses its statistical significance and has now been omitted from the manuscript. All epithelial smFISH panels have been updated to use this new metric, and the number of cells associated with each observation is indicated for each sample.
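
      The aggregation levels discussed above (per spot, per cell, per field of view, per coverslip) can be illustrated with a minimal sketch. This is not the FISH-quant workflow used for the manuscript; the record layout `(coverslip, field, cell, z)`, the function name `aggregate_positions`, and the example values are all hypothetical and serve only to show how averaging reduces the number of independent observations.

```python
from collections import defaultdict
from statistics import mean

def aggregate_positions(spots, level):
    """Average spot Z positions per 'cell', 'field', or 'coverslip'.

    spots: iterable of (coverslip, field, cell, z) tuples (hypothetical layout).
    Returns a dict mapping each group key to the mean Z of all its spots.
    """
    depth = {"coverslip": 1, "field": 2, "cell": 3}[level]
    groups = defaultdict(list)
    for coverslip, field, cell, z in spots:
        # Truncate the identifier so that spots pool at the requested level.
        key = (coverslip, field, cell)[:depth]
        groups[key].append(z)
    return {key: mean(zs) for key, zs in groups.items()}

# Hypothetical example data: normalized Z position of each smFISH spot.
spots = [
    ("cs1", "f1", "c1", 0.2),
    ("cs1", "f1", "c1", 0.4),
    ("cs1", "f1", "c2", 0.6),
    ("cs1", "f2", "c1", 0.8),
    ("cs2", "f1", "c1", 0.5),
    ("cs2", "f1", "c1", 0.7),
]

per_cell = aggregate_positions(spots, "cell")            # one value per cell
per_field = aggregate_positions(spots, "field")          # one value per field of view
per_coverslip = aggregate_positions(spots, "coverslip")  # one value per coverslip
```

      In this toy data set, six spots collapse to four per-cell values, three per-field values, and two per-coverslip values, which mirrors the loss of statistical power described for Reviewer Figure 3.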

      For the neuronal images, these were already quantified at the per-cell level as we compare soma and neurite transcript counts from the same cell. In lieu of more imaging of these samples, we chose to perform subcellular fractionation into soma and neurite samples followed by RT-qPCR as an orthogonal technique (figure 3K, figure 3 supplement 14). This technique profiles the population average of approximately 3 million cells.

      6) F: How was the segmentation of soma vs. neurites performed? It would be good to have a larger image as a supplemental figure so that it is clear whether proximal or distal neurite segments are being compared.

      All neurite vs. soma segmentations were done manually. An example of this segmentation is included as Reviewer Figure 4. This means that often only proximal neurite segments are included in the analysis, as it is often difficult to find an entire soma and an entire neurite in one field of view. However, in our experience, inclusion of more distal neurite segments would likely only strengthen the smFISH results, as we often observe many molecules of localized transcripts in the distal tips of these neurites.

      Reviewer Figure 4. Manual segmentation of differentiated CAD soma and neurite in FISH-quant software. Neurites that do not overlap adjacent neurites are selected for imaging. Often neurites extend beyond the field of view, limiting this assay to RNA localization in proximal neurites.

      Also, it should be noted that the neuronal smFISH results are now supplemented by experiments involving subcellular fractionation and RT-qPCR (figure 3 supplement 14). These subcellular fractionation experiments collect the whole neurite, both the proximal and distal portions.

      Text has been added to the methods under the header “smFISH computational analysis” to clarify how the segmentation was done.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      [Reviewer's comments]

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary In this article Roure et al address the role of BMP during formation of the ascidian palps, using Ciona intestinalis. Overexpression of BMP (specifically ADMP) from early stages of development results in complete suppression of palp formation, and early loss of the palp forming region (also called anterior neural border ANB). Using p-Smad1/5/8 antibody staining they show a marker of the ANB (FoxC) is expressed in a region negative for BMP signals. Inhibition of BMP signals is not sufficient to produce ectopic ANB. However, treatment with FGF protein from very early stages (8-cell stage) plus inhibition of BMP signaling (from 8-cell stage) increased FoxC expression. Looking at later stages of development the authors show that in a U-shaped expression domain of Foxg, Smad1/5/8 is active in the ventral-most part, which is expected to form the ventral-most palp. BMP2 treatment from gastrula stages results in loss of the ventral most palp expression of Isl and repression of ventral Foxg expression. Inhibition of BMP signaling from gastrula or neurula stages results in failure of a U-shaped pattern of Isl expression to resolve into the three palp expression domains, and by late tailbud stages, Sp6/7/8/9 (proposed as a repressor of Foxg in the inter-palp territory) expression is reduced and the numbers of specific cell-types making up the palps is increased. These cells are present in a single large palp of dorsal identity. Thus, inhibition of BMP from early gastrula stages results in a single palp made of more cells than the three palps of control larvae, presumably due to recruitment of cells usually present between the palps. The authors then show a similar phenotype in another ascidian species Phallusia mammillata. Using their previous RNA-Seq data of embryos treated with BMP4, they looked for potential novel palp markers and identify a further eight novel markers of the palps. 
      Looking further into this data and at a list of 68 genes expressed in palps (but not exclusively) they find that in whole embryo RNA-Seq data 70% were regulated by BMP signaling, mostly repressed, but some activated by BMP. 30 of these genes were regulated by Notch. Apart from the confusion I explained in my comments below, the data seems to be carefully presented and interpreted. Overall, this manuscript presents a more detailed analysis of the role of BMP signaling during ascidian palp formation, but it remains to be precisely understood.

      [Response]

      We thank the reviewer for the evaluation of our work.

      Major comments

      1) I am a little confused about the timing of the protein treatments. In Figure 2, the authors show nicely that at the neurula stages, P-Smad1/5/8 staining abuts the FoxC ANB territory. Then at late neurula P-Smad1/5/8 is detected in the ventral-most part of the Foxg U-shaped part of the palp forming region, presumably the ventral most palp. However, the protein treatments with BMP (and FGF) are carried out from the 8-cell stage, which seems a bit drastic and embryos look difficult to orientate (e.g. Fig. 3D).

      [Response]

      We first would like to clarify the issue raised from Figure 3. Actually, Figure 3D was the only case where the embryo was shown from the side (the description as a lateral view was inadvertently omitted in the legend). We have now modified Figure 3 by properly showing only dorsal (neural plate) views and lateral views in insets when necessary. In addition, we have added schemes of embryos depicting the main tissues we have examined (palps, CNS and epidermis) and their localization depending on the treatments.

      Regarding the timing of treatments, we performed them at the 8-cell stage to keep them manageable. At the latest, bFGF treatment should be performed at the 16-cell stage (before neural induction at the 32-cell stage), while BMP2 treatment should be performed at the 64-cell stage (before the onset of Foxc/partial effect at early gastrula (St. 10)). In principle, sequential treatment (first bFGF, then BMP2) could thus be performed. Since earlier treatments produce the same effects, we reasoned that combined treatments from the 8-cell stage should be equivalent and would avoid tedious repeated manipulation of the embryos that could negatively impact their development. We are convinced that the way we performed the treatment has no impact on our results (except for the treatment by bFGF alone on Foxc, as already discussed in the text) or conclusions.

      While BMP-treatment from early stages inhibits all palp gene expression and any sign of palp formation (Figure 1), treatment with BMP from the early gastrula stage, when Smad1/5/8 is detected only in mesendoderm cells and before it is detected in any ectoderm, is sufficient only to block ventral palp formation and cause a partial down-regulation of FoxC expression in the ANB. Thus, there seems to be a discrepancy between the roles proposed for BMP during ANB and palp formation as judged by P-Smad1/5/8 staining and the temporal evidence from BMP- and BMP-inhibitor treatment. Do the authors have some explanation for why they need to treat at least one hour before the BMP-mediated patterning mechanism (as indicated from the P-Smad1/5/8 staining) is taking place? For example, could the authors check how long it takes DMH1 to inhibit P-Smad1/5/8 positive staining? Or BMP to strongly induce P-Smad1/5/8? This seems to be a simple experiment and might go some way to explaining why they need to treat embryos much earlier than I would have thought necessary.

      [Response]

      We understand the reviewer's concerns, but we do not think that there are major discrepancies in the timing of events. The main rationale is to consider the onset of expression for the main genes of interest. We have examined their dynamics of expression in detail, but we do not show them since our conclusions are in agreement with a previous report (Figure 1 from Liu and Satou, 2019). We have summarized the data in the modified Figure 2. Foxc can be detected from early gastrula stages (St. 10) when the palp precursors consist of a single row of 4 cells. This is the exact developmental time when the treatment with BMP2 has partial effects (Figure 4). Once the cells divide to make 2 rows of 4 cells robustly expressing Foxc (St. 12), BMP2 treatment has no effect on Foxc. Similarly, DMH1 treatment has no effect from late neurula stage (St. 16) (Figure 4), which corresponds to the onset of Sp6/7/8/9 expression. We thus consider that modulating the BMP pathway has no effect once key regulatory genes have acquired a robust expression in their normal domains. We have emphasized these points in the main text (lines 205-208, lines 228-229).

      We think the above discussion should address the points raised by the reviewer. If it does not, we are willing to perform the suggested experiments.

      2) It does not make sense to me that BMP treatment from gastrula stage blocks only ventral palp formation (Figure 4) and ventral Foxg expression (Fig. 5G). In particular, it is the ventral palp region which is positive for P-Smad1/5/8 (Fig.2I,J) so I would not expect the ventral palp to be the most sensitive to BMP-treatment.

      [Response]

      We were, like the reviewer, surprised by the phenotype. The time window to obtain this phenotype is quite narrow, and most likely relates to the full acquisition of the palp fate ('consolidation' of Foxc expression, onset of Foxg). This is actually a phenotype that we have not characterized in detail. Such a characterization may help clarify the role of BMP: does BMP regulate papilla/inter-papilla fates only for the ventral palp or for all three palps? Does BMP 'only' regulate the dorso-ventral identities of the palps?

      To better understand the role of BMP in palp formation, we propose to describe this specific phenotype: loss of ventral palp induced by BMP2 treatment at St. 10. We propose to test the following hypotheses. What is the fate of the ventral palp? Conversion into epidermis (more ventral fate)? Conversion into inter-papillar fate? What is the identity of the 2 remaining presumptive palps? Do they still have a dorsal identity? Are they converted into ventral palps? This is part of the proposed experiments for a revision.

      Minor comments

      Line 185: I see what the authors are trying to say, but I don't agree that BMP limits the domain of FoxC expression, as inhibition of BMP has no effect on FoxC. Rather, BMP has to be kept out of the ANB in order to allow ANB formation.

      [Response]

      We have modified the sentence (lines 195-196).

      The relationship between Foxg and Sp6/7/8/9 expression is not really clear and it would be better to do this with double ISH if the authors want to show mutually exclusive expression domains, or at least provide a summary figure.

      [Response]

      We have modified Figure 5 by adding schematic representations of our understanding of the expression patterns in relation to the different precursors of the palp lineage.

      In case the reviewer does not find this clarification sufficient, we propose to perform the double fluorescent in situ hybridizations as part of the revision plan.

      Line 218, I do not see the data showing that Isl is expressed at a U-shape at st. 23, it seems to be expressed in three dots, unless embryos are treated with DMH1.

      [Response]

      We apologize for the misunderstanding since the sentence was not clear. We referred to the U-shaped Isl expression under BMP inhibition. Indeed, Isl starts to be expressed in 3 separate domains in the palp forming region, and not following a U-shape as its upstream regulator Foxg (Liu and Satou, 2019). We amended the sentence (lines 234-235).

      Figure 6B, G. It could be nice to show a close up of the palps to see elongated cells.

      [Response]

      The close up pictures have now been added in the modified Figure 6.

      Figure 6K. It is better to use a statistical test to support the authors conclusions.

      [Response]

      As suggested, we have performed a statistical evaluation (Mann-Whitney U test) of the cell counts. The p-values are presented in Figure 6Q. The slight increase of Celf3/4/5/6 is not statistically significant, but it does not impact our conclusion that the number of papilla cells increases following BMP inhibition.
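
      For readers unfamiliar with the test, the following is a minimal sketch of a two-sided Mann-Whitney U test using only the Python standard library. It is not the analysis code used for Figure 6Q: the function name `mann_whitney_u` and the cell counts are hypothetical, and the p-value relies on a tie-corrected normal approximation without continuity correction, which is only a rough guide for small samples (an exact test would be preferable there).

```python
import math

def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    # Rank sum of the first sample gives its U statistic.
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    n = n1 + n2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u1, p

# Hypothetical papilla cell counts per embryo (illustrative values only).
control = [10, 12, 11, 13, 9]
dmh1_treated = [15, 17, 14, 18, 16]
u_stat, p_value = mann_whitney_u(control, dmh1_treated)
```

      Because the test compares ranks rather than means, it makes no assumption that the cell counts are normally distributed, which suits small per-embryo count data.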

      It could be nice to provide a timeline for Smad1/5/8 signaling and the role for BMP signals that are proposed in this manuscript as a summary diagram.

      [Response]

      Following the suggestion, we have added summary diagrams in Figure 2 for BMP signaling in relation to lineages and gene expression.

      Lines 66-74 are lacking references.

      [Response]

      This is now corrected (lines 70-80).

      Reviewer #1 (Significance (Required)):

      Significance While it is still not clear how BMP signals are established (which ligands for example) and their precise role in palp formation, this manuscript adds more information to our current understanding of the role of BMP signaling during palp formation. In particular it shows that BMP signals need to be kept out of the ANB for its formation and that it is required to resolve the later forming palp territory into three discrete palp regions. However, there is some way to go before this is fully understood. This article will certainly be of interest to ascidian developmental biologists trying to understand the formation and patterning of the larval PNS. It may also be of some interest to evolutionary biologists trying to understand the relationship between the telencephalon territory of vertebrates and the palp forming territory of ascidians as some links have been proposed between these two developmental territories (e.g. line 78).

      [Reviewer's comments]

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary. The manuscript presents a detailed examination of how BMP signaling changes dynamically during the development of the ascidian larval palps. Early in development BMP inhibition is responsible for the formation of a large field within the neuroectoderm that includes, among other fates, the presumptive palps. As development progresses, the territories of BMP activity/inhibition appear to be spatially refined within the palp-forming territory to specify palp versus interpalp fate. The experiments are presented with sufficient replication and statistical rigor.

      [Response]

      We thank the reviewer for the evaluation of our work.

      Major Comments.

      1. The researchers should look at otx expression in pFOG>Admp overexpressing embryos. It is difficult to assess from Figure 1, but it appears possible that the entire anterior sensory vesicle (not just the palps) is absent in the pFOG>Admp embryos (can the authors say briefly whether other ectodermal structures such as the atrial primordia or the oral siphon are still present?). Thus, is it possible that the entire a-lineage is disrupted? This would be an important distinction to make: are the defects attributed to experimental BMP activation specific to the palps, or are they more widespread in the anterior neuroectoderm? If the entire a-lineage is mis-fated, might this change the interpretation of the role of BMP inhibition? For example, might the formation of the palps depend on the proper development of the neighboring anterior neural plate? To address this concern, the authors should use a different driver to restrict Admp overexpression only to the palp forming region.

      [Response]

      In Figure 1, we show that Celf3/4/5/6, a general neural marker, was still expressed in pFog>Admp embryos. We explain, in the Figure 1 legend, that this most likely corresponds to the CNS. It does not demonstrate that the anterior sensory vesicle (a-line induced CNS lineage) is still present. Unfortunately, Otx cannot be used as a suitable marker since it is also expressed in the posterior sensory vesicle (A-line lineage) (Hudson et al., 2003). Other a-line markers do exist. However, determining their expression at tailbud stages may not be conclusive since it is most likely that the patterning of the sensory vesicle (hence the expression of these markers) is modified after BMP activation. We have presented, in former Figure 3 and Figure S1, strong evidence that the a-line neural lineage is intact at the neural plate stage. To better communicate these data, we have combined them in a modified Figure 3 that includes all markers examined and interpretative embryonic schemes. We show that, following BMP2 treatment, Otx and Celf3/4/5/6 were downregulated in the palp lineage but otherwise normal. Consequently, the a-line CNS lineage is most likely not affected by BMP pathway activation. This does not mean that its later derivatives form normally, but this is an issue that we have not addressed. A previous report indicates that BMP activation leads to Six1/2 repression and, possibly, the absence of oral siphon primordium (based on the images, no description in this paper) (Figure 1 from Abitua et al., 2015).

      We think that we have addressed the concern of the reviewer, but would like to comment on the suggested experiment. It is very difficult to find a driver that would allow BMP activation only in the palp lineage (by overexpressing a constitutively active BMP receptor, for example). The a-line neural lineage and the palp lineage are intimately linked and separate at gastrula stages (St. 10). The regulatory sequences of Foxc, the first palp-specific gene that we know of, would thus be interesting. But this is most likely too late according to our whole embryo protein treatments (Figure 4). In agreement with this assumption, overexpressing Bmp2/4 (another BMP ligand) using the regulatory sequences of Dmrt (a master regulator of the palp+a-line CNS lineage expressed just before Foxc) does not apparently abolish palp formation (Extended Data Figure 5 from Abitua et al., 2015).

      2. The authors hypothesize that papilla versus inter-papilla fate is controlled by differential BMP signaling. Is it possible to show differential P-Smad staining in papilla versus inter-papilla territories, as in Figure 2 for earlier gastrula-stage embryos? This data would make the authors' hypothesis much more compelling. It appears that the authors have the necessary reagents.

      [Response]

      The actual lineage and fate segregation of the papilla and inter-papilla lineages have not been determined, as far as we know. Our current understanding comes from indirect evidence from gene expression and gene function, in particular from the study of Foxg and Sp6/7/8/9 by Liu and Satou (2019). Papillae originate from the 3 Foxg/Isl positive spots that are visible at very early tailbud stages. At earlier stages, Isl is not expressed and Foxg is expressed in a U-shape (Figure 5). Within this U, it is most likely that the segregation of papilla and inter-papilla fates takes place when Sp6/7/8/9 starts being expressed at late neurula stages. It is thought that Sp6/7/8/9+/Foxg+ cells will become inter-papilla cells while Sp6/7/8/9-/Foxg+ cells will become papilla cells. Our data indicate that BMP signaling is active in the future ventral papilla. We have mapped these data on schematics in the modified Figure 2.

      Minor Comments.

      1. There is no mention of panels Figure 1 U and V in the text. In the figure legend they are misidentified as panels S and T.

      [Response]

      This has been corrected.

      Very small issue with English usage that occurs throughout the manuscript. The authors should check the use of "palps" versus "palp", particularly when expressions such as the following are used: "palps formation", "palps network", "palps lineage", "palps differentiation", "palps molecular markers", "palps neuronal markers", "palps phenotypes", etc . For example, the sentence, "Here, we show that BMP signaling regulates two phases of palps formation in Ciona intestinalis", should read instead "Here, we show that BMP signaling regulates two phases of palp formation in Ciona intestinalis".

      [Response]

      Thank you, we have corrected these mistakes.

      It would be worth mentioning possible relationships between the tunicate palps and the adhesive glands for larval fish and amphibians. Are there common mechanisms? All of these are anterior ectoderm derivatives.

      [Response]

      Thank you for the suggestion. We have added a section on that topic in the discussion (line 358).

      Please consider providing references in the Introduction for the sentences which end on the following lines of text: 36 ( . . . is the sister group of vertebrates), 46 ( . . . and sensory properties), 48 ( . . . the secretion of adhesive materials), 57 ( . . . on the nervous system in chordates), 68 ( . . . also known as Ap2-like), 74 ( . . . anterior neural territories)

      [Response]

      References have now been added.

      To provide extra emphasis and to help the figures to stand alone with their respective legends, can you mention in the legend for Fig. 2 that D and E are controls? Also, can a brief legend be provided for S2 to give overall indication of staging, scale, orientation, etc.?

      [Response]

      Actually, the original Fig 2D and 2E correspond to treated embryos as explained in the legend. For clarity, these embryos have been separated from control embryos in the modified Figure 2.

      Figure S2 has been modified and a legend has been added.

      Reviewer #2 (Significance (Required)):

      Significance.

      This study presents an advance in our understanding of the fine-structure regulation of BMP signaling in sculpting neuroectoderm derivatives. While this study is potentially of broad interest, the authors fail to fully discuss the comparative aspects of this study in the context of conserved chordate developmental mechanisms. This could be remedied without too much difficulty in the Discussion section.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: This paper explores the role of BMP signaling for palp formation in ascidians using gain and loss of function approaches. The paper shows that while BMP at early (gastrula) stages prevents formation of the Foxc-positive palp ectoderm in Ciona, at later stages it appears to be essential for separation of the palps (possibly by promoting differentiation of interpapillary cells). The paper further shows that BMP plays similar roles in a different ascidian, Phallusia mammillata. Using previously published RNA-Seq results for the latter species after BMP up-regulation, the authors were able to identify additional BMP-responsive genes expressed in the palp region of ascidians.

      [Response]

      We thank the reviewer for the evaluation of our work.

      Major comments: However, while the effect of BMP overexpression at early stages has been confirmed by two independent strategies (electroporation of the BMP agonist ADMP and BMP2 treatment), the effects of late BMP activation as well as the effects of BMP inhibition at both early and late stages have been studied exclusively by pharmacological treatments with a single BMP signaling agonist (BMP2) and antagonist (DMH1). To substantiate these findings and rule out unspecific side effects, it would have been desirable to verify them with alternative strategies.

      [Response]

      The reviewer may have missed some of our data. We have shown that BMP inhibition through overexpression of the secreted antagonist Noggin via electroporation using the early ectodermal driver pFog gives the same phenotypes as DMH1 treatment. The effects on Foxc were presented in Figure S1, and are now presented in the modified Figure 3 (line 170). We also showed that the morphological Cyrano phenotype was observed with Noggin overexpression (modified Figure 6H). We now present a novel Figure S1 with expression of Isl and Celf3/4/5/6 following Noggin overexpression, and stress the use of this independent way of inhibiting BMP (lines 260-264). Given that early or late BMP inhibition leads to the same phenotype, we do not consider that overexpressing Noggin at gastrula stages is necessary.

      Regarding BMP activation from gastrula stages, we have only used BMP2 treatment. It may be possible to overexpress Admp using promoters active in the palp lineage such as the ones of Dmrt, Foxc or Foxg. However, it may be difficult to phenocopy the phenotype obtained using BMP2 protein (loss of ventral palp), for two reasons. First, the precise timing to reach high BMP activation is not tightly controlled using such a method. Hence, all drivers should be tested. Second, the different promoters are active progressively later in development and in more and more restricted regions. Consequently, we consider that this requires a huge effort to validate a method (BMP protein treatment) that we already validated for the early effects and that has been used in several publications.

      Therefore, while this study provides some new insights into the role of BMP in the specification of the palp forming region and subsequent palp development in ascidians, the evidence provided is relatively weak. Moreover, the scope of the study is quite limited. While identifying some BMP-responsive genes expressed in the palp region and describing the effects of BMP dysregulation on palp morphology, the study does not provide further insights into the underlying mechanisms how BMP patterns this region or affects subsequent palp formation.

      [Response]

      We are surprised by the reviewer's assessment of our work as providing only 'some new insights'. To our knowledge, this is the first report addressing the role of BMP signaling in palp formation at the molecular level. The only previous report, by Darras and Nishida (2001), describes solely the morphology of the palps following overexpression of Bmp2/4 and Chordin by mRNA injection. We have brought significant novel findings: 1) two important steps in palp formation with a precise description of the cellular and molecular actors, and a proposed function for BMP at each step, 2) evidence for conservation of this process in different ascidian species and 3) significant enrichment in the molecular description of this process. Moreover, since the reviewer does not ask for specific items, we are unable to address this concern further.

      Minor comments:

      • 63: ...as the anterior...

      [Response]

      Corrected.

      • 68, 71, 74: references missing

      [Response]

      References have now been added.

      • 73: better: anterior neural territories and placodes

      [Response]

      Corrected.

      • 76: palp territories also share molecular signature with anterior (eg. olfactory) placodes, not only telencephalon

      [Response]

      Corrected.

      • 106: awkward sentence

      [Response]

      Corrected.

      • 114: at what stage was ADMP electroporated?

      [Response]

      Electroporation of plasmid DNA is performed in the fertilized egg. Transcription of the transgene is controlled by the driver. In this case, with pFog, it occurs from the 16-cell stage. This precision has been added in line 121.

      • 134: to facilitate comparison between stages it would be useful to label cells in Fig. 2(eg. which are a-line and b-line cells? Where is the border between them?)

      [Response]

      As suggested by the reviewer, we have modified Figure 2 with embryo outlines and schemes to better appreciate where BMP signaling is active.

      • 152: since Foxc and Foxg overlap with pSMAD1/5/8 at neurula but not gastrula stages, do you know whether this is due to a dorsal expansion of BMP activity or a ventral expansion of Foxc/Foxg expression? Again, labeling of the nuclei would help

      [Response]

      The change corresponds to a dorsal expansion of P-Smad1/5/8. Our conclusion comes from combining nuclear staining (not shown for simplicity) and available fate maps. The results are presented in schematic diagrams of embryos in frontal views in the modified Figure 2.

      • 174: the description is not clear here; what proportion of embryos showed reduction versus expansion of expression? Why is the reduction shown in Fig. 3D asymmetrical?

      [Response]

      The proportions are now indicated in line 184.

      We apologize for the confusion caused by Fig. 3D. Actually, it was the only case where the embryo was shown from the side (the description as a lateral view was inadvertently omitted in the legend). It did not show an asymmetric repression but an ectopic expression. We have now modified Figure 3 by properly showing only dorsal (neural plate) views and lateral views in insets when necessary. In addition, we have added schemes of embryos depicting the main tissues we have examined (palps, CNS and epidermis) and their localization depending on the treatments. We hope that the results are now clearly presented.

      • 198: ... of endogenous...

      [Response]

      Corrected (line 213).

      • 208: I suggest to highlight the regions of changes in Fig. with asterisks/arrows etc.

      [Response]

      We have added schematic embryos to highlight expression changes in the modified Figure 5.

      • 218: contrary to what is stated here, there is no depiction of u-shaped Isl1 expression in control embryos of Fig. 4

      [Response]

      As also pointed out by reviewer 1, we apologize for the misunderstanding since the sentence was not clear. We referred to the U-shaped Isl expression under BMP inhibition. Indeed, Isl starts to be expressed in 3 separate domains in the palp forming region, and not following a U-shape as its upstream regulator Foxg (Liu and Satou, 2019). We amended the sentence (lines 234-235).

      • 220: the cell shapes referred to here cannot be seen in Fig. 4 (too small)

      [Response]

      We have modified Figure 6 to include close-ups of the palps.

      • 271: the description here is confusing: first you talk about 53 genes and then mention palp expression of 12/26. Where does number 26 come from? And why was in situ done then for 27 additional genes? Also, while the comparison with previously published RNA-Seq data was valuable in uncovering additional BMP-sensitive palp markers, it does not provide any substantial new insights into how BMP patterns this territory.

      [Response]

      We have amended the sentence to make it clearer (lines 291-295).

      • line 624: where

      [Response]

      Thank you. Corrected line 731.

      • Fig. 2: to facilitate comparison between stages it would be useful to label cells (eg. which are a-line and b-line cells? Where is the border between them?)

      [Response]

      Already responded above.

      -Fig. 3: Why is the expression in D asymmetrical? In the main text you write that expression is expanded in some embryos but reduced in others - Please show examples also of the expanded phenotype and give numbers

      [Response]

      Already responded above.

      • Fig. 6: small panels in I, L, N need to be explained (single channels), white signal needs to be explained (overlap ?)

      [Response]

      We used white for better display of separate single channels. Given the confusion and the good quality of the 2 color fluorescent in situ images, we removed these panels in the modified Figure 6.

      White in K and L corresponds to overlap (explained in the legend).

      • Fig. S2: legend is missing

      [Response]

      This has been amended.

      Reviewer #3 (Significance (Required)):

      Since the study does not provide substantial new insights into the mechanisms how BMP patterns the palp forming region or affects subsequent palp formation in ascidians, it will be of interest mostly for a specialized audience in the field of developmental biology.

      [Response]

      We do not agree with the reviewer as discussed above. The description of the role of BMP signaling in the specification of the ANB and its subsequent patterning in ascidians has interesting evolutionary implications and should be of interest for a broader audience.

    1. Author Response:

      Reviewer #1 (Public Review):

      This paper reports an analysis of the inhibition of the serotonin transporter, SERT, by a novel compound, ECSI#6. The authors perform a comprehensive analysis of SERT transport inhibition for the new agent and compare its properties to those of other well-characterized agents: cocaine and noribogaine, with the data pointing to an unusual noncompetitive mechanism of inhibition, a model also supported by electrophysiological recordings of transport currents. Based on the results of these experiments the authors conclude that ECSI#6 binds essentially exclusively to the inward-facing state of the transporter. The authors further present experiments suggesting that ECSI#6 can stabilize the folded form of an ER-arrested SERT mutant and recover its trafficking to the plasma membrane, with some in vivo Drosophila experiments perhaps also supporting this conclusion. Finally, kinetic simulations using a transport model with rate constants from previous experiments support the basic conclusions of the first sections of the paper.

      Strengths: The transport experiments and simulations here are thorough, carefully performed, and reasonably interpreted. The authors' arguments for noncompetitive inhibition seem well-thought-out and reasonable, as is the conclusion that ECSI#6 binds to the inward-facing state of the transporter. The simulations are also thorough and support the conclusions. In the discussion, the comparison of enzyme noncompetitive inhibition to the process studied here was thoughtful and interesting. Also, the care and analysis of the uptake data are a strength of the paper, with well-presented evidence of reproducibility and statistics. The electrophysiology data is more limited but does communicate the essential conclusion.

      Weaknesses: The most important concern about the work is the interpretation of the in vivo Drosophila data. Though the SERT fluorescence with WT protein is strong, I cannot see any fluorescence in either drug-treated image from the PG mutant. In this context, shouldn't there be additional intracellular staining for ER-resident SERT? If the cell bodies of these cells are elsewhere this should be clearly pointed out.

      We have modified Fig. 6 to include, in all instances, images of the posterior brain, where the neurons (FB6K) reside, from which the serotonergic projections originate. These images visualize expression of membrane-anchored GFP (mCD8GFP; in panel B), immunoreactivity of serotonin (panel B’), wild type SERT (panels C’,D’,E’) and mutant SERT-PG601,602AA (panels F’,G’,H’) in the soma. The description of these panels has been added to the pertinent sentences starting on p. 20, line 6 from bottom to the end of the first paragraph on p. 21, which read:

      “These projections (Fig. 6A-A’’) and the FB6K-type neurons, from which they originate in the posterior brain (Fig. 6B-B’’) can be visualized by expressing membrane-anchored GFP (i.e. GFP fused to the C-terminus of murine CD8; [36]) under the control of TRH-T2A-Gal4. Similarly, when placed under the control of TRH-T2A-Gal4, YFP-tagged wild-type human SERT was expressed in the FB6K-type neurons (Fig. 6C’) and delivered to the fan-shaped body (Fig. 6C). In contrast, in flies harboring human SERT-PG601,602AA, the transporter was visualized in the soma of FB6K-type neurons (Fig. 6F’), but the fan-shaped body was devoid of any specific fluorescence (Fig. 6F). However, if three-day old male flies expressing human SERT-PG601,602AA were fed with food pellets containing 100 μM ECSI#6 or 100 μM noribogaine for 48 h, fluorescence accumulated to a level, which allowed for delineating the fan-shaped body (Fig. 6G and H, respectively). This shows that ECSI#6 and noribogaine exerted a pharmacochaperoning action in vivo, which partially restored the delivery of the mutant transporter to the presynaptic territory. As expected, in flies harboring wild-type human SERT, feeding of ECSI#6 and noribogaine did not have any appreciable effect on the level of fluorescence in the fan-shaped body (Fig. 6D and E, respectively).”

      Similarly, the single Western blot demonstrating enhanced glycosylation in the presence of Noribogaine or ECSI#6 could be strengthened. I can see the increased band at a high molecular weight that the authors attribute to the fully glycosylated form, but this smear, and the band below, look quite different from those in the blot shown in the El-Kasaby et al reference, raising concerns that the band could be aggregated or dimerized protein rather than a glycosylated form. This concern could easily be addressed by control experiments with appropriate glycosidases, as shown in the reference.

      We understand that the appearance of the mature glycosylated species is being criticized, at least in part, because it differs from the sharper bands found in our previously published papers. We stress that the resolution very much depends on the electrophoretic conditions. We addressed the reviewers’ criticism by carrying out the recommended deglycosylation experiments: a representative experiment is shown in (the new) panel F of Fig. 5, with lysates prepared from HEK293 cells expressing wild type SERT, from untransfected HEK293 cells and from HEK293 cells, which had been preincubated with 30 μM cocaine, 100 μM ECSI#6 and 30 μM noribogaine. The experiment confirms the band assignment, with the upper band(s) M representing the mature glycosylated species (which are resistant to deglycosylation by endoglycosidase H) and the lower band C corresponding to the core-glycosylated species (which are, as expected, susceptible to cleavage by endoglycosidase H). We also think that the immunoblot in panel F ought to satisfy the aesthetic criticism: the bands are sharper/less smeared.

      The description of panel F can be found on p. 18, starting in line 7 from bottom to end of page, and reads: “We confirmed the band assignment by enzymatic deglycosylation (Fig. 5F): the upper bands (labeled M), which appeared in cells incubated in the presence of ECSI#6 and of noribogaine, were resistant to deglycosylation by endoglycosidase H (which cannot cleave mature glycans). In contrast, the core-glycosylated species (labeled C) was susceptible to cleavage by endoglycosidase H, resulting in the appearance of the deglycosylated band D.”

      The overall interest in the work is reduced given the quite low affinity of ECSI#6 for the transporter.

      We agree that it would be preferable to have a compound that works in the submicromolar/nanomolar range. However, it is worth pointing out that the EC50 is low enough to allow in vivo rescue of the folding-deficient SERT-PG: feeding flies restores its trafficking to the cell surface and to the presynaptic specialization. Obviously, there is room for improvement, but ECSI#6 provides a starting point.

      Reviewer #3 (Public Review):

      This is interesting research that uncovers a novel inhibition mechanism for serotonin (SERT) transporters, which is akin to traditional un-competitive inhibitors in enzyme kinetics. These inhibitors are known to preferentially bind to the enzyme-substrate complex, thus stabilizing it, resulting in a decrease of the IC50 with increasing substrate concentrations. In contrast to this classic enzyme inhibition mechanism, the authors show for SERT, through detailed kinetic analysis as well as kinetic modeling, that the inhibitor, ECSI#6, binds preferentially to the inward-facing state of the transporter, which is stabilized by K+. Therefore, inhibition becomes "use-dependent", i.e. increasing substrate concentrations push the transporter to the inward-facing configuration, which then leads to the increased apparent affinity of ECSI#6 binding. Interestingly, this mechanism of action predicts that the inhibitor should be able to rescue SERT misfolding variants. The authors tested this possibility and found that surface expression and function of a misfolding mutant SERT is increased, an important experimental finding. Another strength of the manuscript is the quantitative analysis of the kinetic data, including kinetic modeling, the results of which can reconcile the experimental data very well. Overall, this is important and, in my view, novel work, which may lead to new future approaches in SERT pharmacology.

      With that said, some weaknesses of the manuscript should be mentioned. 1) The authors suggest that serotonin and ECSI#6 cannot bind simultaneously to the transporter, however, no direct evidence for this conclusion is provided.

      We assessed this point by plotting the data in Fig. 2A,B,C as Dixon plots in (the new) panels D,E,F of Fig. 2. We refer the reader to Segel’s textbook on enzyme kinetics (new ref. 18) on using Dixon plots in the presence of two inhibitors. The pertinent description is on p. 9, lines 12-22 and reads as follows: “We transformed the data summarized in Figs. 2A-C by plotting the reciprocal of bound radioligand as a function of inhibitor concentration to yield Dixon plots (Fig. 2D-F): the x-intercept corresponds to -IC50 of the inhibitor [18]. Thus, Dixon plots allow for differentiating mutually exclusive from mutually non-exclusive binding, if one inhibitor (i.e., cocaine, noribogaine or ECSI#6) is examined at a fixed concentration of the second inhibitor (i.e., serotonin) [18]: if binding of the two inhibitors is mutually non-exclusive, a family of lines of progressively increasing slope, which intersect at -IC50, is to be seen. In contrast, if the two inhibitors bind to the same site, the slope of the inhibition curves is not affected and the x-intercept (i.e., -IC50 of the variable inhibitor) is shifted to more negative values. It is evident from Fig. 2D-F that the presence of 1 and 10 μM serotonin progressively shifted the (expected) x-intercept for cocaine (Fig. 2D), noribogaine (Fig. 2E) and ECSI#6 (Fig. 2F). Thus, binding to SERT of serotonin and of these three ligands was mutually exclusive.” Based on the Dixon plots, we feel that our conclusion is justified, i.e., binding of serotonin and ECSI#6 (and of the other ligands) is mutually exclusive.
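
The Dixon-plot argument quoted above can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes a simple equilibrium binding model, a tracer radioligand concentration well below its Kd, and arbitrary placeholder constants (`ki`, `ks`, `l_over_kd`, `bmax`), all of which are hypothetical. Under those assumptions, the reciprocal of bound radioligand is linear in the variable inhibitor, so slope and x-intercept can be read from two points.

```python
def bound(i, s, *, exclusive, l_over_kd=0.01, ki=1.0, ks=1.0, bmax=1.0):
    """Bound radioligand at equilibrium (hypothetical constants, arbitrary units).

    i: concentration of the variable inhibitor
    s: fixed concentration of the second ligand (serotonin in the text)
    exclusive: True if i and s cannot occupy the transporter simultaneously
    """
    # A ternary complex (both ligands bound) exists only for non-exclusive binding.
    ternary = 0.0 if exclusive else (i / ki) * (s / ks)
    return bmax * l_over_kd / (1.0 + l_over_kd + i / ki + s / ks + ternary)


def dixon_line(s, *, exclusive, i_lo=0.0, i_hi=10.0):
    """Slope and x-intercept of 1/bound plotted against i (linear in this model)."""
    y_lo = 1.0 / bound(i_lo, s, exclusive=exclusive)
    y_hi = 1.0 / bound(i_hi, s, exclusive=exclusive)
    slope = (y_hi - y_lo) / (i_hi - i_lo)
    x_intercept = i_lo - y_lo / slope
    return slope, x_intercept


for mode in (True, False):
    label = "mutually exclusive" if mode else "non-exclusive"
    for s in (0.0, 1.0, 10.0):
        slope, x0 = dixon_line(s, exclusive=mode)
        print(f"{label:>18}  s={s:>4}  slope={slope:8.1f}  x-intercept={x0:8.3f}")
```

In this toy model the mutually exclusive case gives parallel lines whose x-intercept moves to more negative values as `s` rises, whereas the non-exclusive case gives lines of increasing slope that share nearly the same x-intercept, which is the distinction the quoted passage relies on.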

      2) How does ECSI#6 access the inward-facing binding site? Does it permeate the membrane and bind from the inward-facing conformation, or is it just a very slowly transported low-affinity substrate that stabilizes the inward-facing state with much higher affinity? Including ECSI#6 in the recording electrode may provide further information on this point.

      We did the suggested experiments: the data are summarized in panel I of Fig. 4 and described in the first paragraph on p. 15, which also explains why this experiment is possibly inconclusive due to the high diffusivity of ECSI#6:

      “Fig. 4I shows representative traces of 5-HT induced currents recorded from SERT expressing cells in the absence (in blue) and presence of 100 μM ECSI#6 (in red) in the electrode solution: when thus applied from the intracellular side, ECSI#6 did not cause an appreciable current block. The right-hand panel summarizes the current amplitude obtained from cells measured in the absence (blue open circles) and presence of intracellular ECSI#6 (open circles in red). These data seem to indicate that ECSI#6 binds to SERT from the extracellular side. Yet this conclusion can be challenged based on the following consideration: in earlier experiments, ibogaine, the parent compound of noribogaine, was found to block HERG channels when applied from the bath solution but failed to do so when added to the electrode solution [27]. However, at a lower intracellular pH (i.e., pH 5.5), ibogaine gained the ability to inhibit HERG from the intracellular side (i.e., via application through the electrode). Conversely, ibogaine was less effective when applied to an acidified bath solution. These observations led to the conclusion that ibogaine blocked HERG from the cytosolic side: because the molecule in its neutral form was so diffusive, a low intracellular pH was required to force its protonation and thus preclude diffusion from the interior of the cell. ECSI#6 is presumed to also be very diffusible given its estimated logP value and polar surface area of 2.48 and 66 Å², respectively. However, ECSI#6 harbors an amide nitrogen (see Fig. 1A) and thus remains neutral in the experimentally accessible pH range. Hence, it is not possible to verify to which side of SERT it binds.”

      Additionally, it is not clear why displacement experiments were not carried out with cocaine. Since cocaine is a competitive inhibitor but does not induce transport (i.e. doesn't induce the formation of the inward-facing conformation), it should act in a competitive mechanism with ECSI#6.

      We did not quite understand this comment, because displacement experiments were done with cocaine (Fig. 2A, new Fig. 2G/previous Fig. 2D). However, if the reviewer is asking why we did not use cocaine rather than 5-HT in the three-way competition experiment, it is precisely because we wanted to compare the action/binding mode of ECSI#6 to that of cocaine.

      3) Why are dose-response relationships not shown for electrophysiological experiments? These would be a good double-check for the radiotracer flux data.

      These experiments were done and are shown in (the new) panels G and H of Fig. 4; the pertinent description is in the second paragraph of p. 14 and reads:

      “The protocol depicted in Fig. 4B can also be used to gauge the apparent affinity of ECSI#6 for SERT in the presence of 5-HT. Plotted in Fig. 4G is the block of the serotonin-induced current as a function of the co-applied ECSI#6 concentration. The current was evoked by a saturating concentration of 5-HT (30 μM) and inhibited by 3, 10, 30 and 100 μM co-applied ECSI#6, respectively (the inset in Fig. 4G shows representative current traces). A fit of an inhibition curve to the data points yielded an IC50 value of approx. 5 μM. This value was lower but still in reasonable agreement with the IC50 obtained in the radioligand uptake assay for the condition where the 5-HT concentration had been saturating (cf. dashed line in Fig. 1C; 10 μM 5-HT). In the uptake assay the IC50 value of ECSI#6 dropped to about 0.5 mM, in the presence of a low 5-HT concentration (i.e., 0.1 μM). In contrast to uptake experiments, electrophysiological recordings also allow for assessing the apparent affinity of ECSI#6 for SERT in the absence of the substrate. This can be achieved by employing the protocol depicted in Fig. 4H (see representative current traces on the left-hand side): we first applied 30 μM 5-HT to a cell expressing SERT for 0.5 s to elicit a peak current (i.e., a control pulse). We then reapplied 30 μM 5-HT after superfusing the cell with 100 μM ECSI#6 for 1 s (second upper trace in panel H). We chose this time period because it had been sufficient to allow for full current block in the other protocol (see Fig. 4G): the amplitude of the peak current following pre-application of 100 μM ECSI#6 was essentially identical to the prior control pulse. When we pre-applied 100 μM ECSI#6 for a longer period (i.e., 3 s) the amplitude of the two peak currents also remained the same (cf. lower traces in panel H). The right-hand panel shows the summary of several experiments. Plotted in the graph is the ratio of the second and first pulse, which was always close to one. 
We previously used this protocol to assess the binding kinetics of cocaine, methylphenidate and desipramine on SERT and DAT. Pre-application of these inhibitors consistently led to a concentration-dependent reduction in the peak current amplitude of the second pulse in comparison to the first [23]. The lack of inhibition, thus, indicates that the affinity of ECSI#6 in the absence of 5-HT is low. To obtain estimates for the affinity of ECSI#6 for SERT in the absence of 5-HT we would need to apply this compound at much higher concentrations. This, however, is not possible, because ECSI#6 is poorly soluble in aqueous solutions (i.e., max. 0.03 mg/ml).”
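
As a rough illustration of how an IC50 can be estimated from current block at the four concentrations named above, the sketch below fits a one-site inhibition curve. It makes several assumptions that are not taken from the manuscript: a Hill slope of 1, a linearised least-squares fit through the origin (the authors' fitting procedure is not described here), and synthetic, noise-free data generated from the model with an IC50 of 5 μM. No measured values are reproduced.

```python
def fraction_remaining(conc_um, ic50_um):
    """Fraction of the 5-HT-induced current left at a given inhibitor concentration.

    Assumes a one-site inhibition curve with a Hill slope of 1.
    """
    return 1.0 / (1.0 + conc_um / ic50_um)


def fit_ic50(concs_um, fractions):
    """Estimate IC50 from the linearised relation 1/f - 1 = conc / IC50.

    Uses least squares for a line through the origin; the reciprocal of the
    slope is the IC50 estimate.
    """
    ys = [1.0 / f - 1.0 for f in fractions]
    slope = sum(c * y for c, y in zip(concs_um, ys)) / sum(c * c for c in concs_um)
    return 1.0 / slope


# Concentrations as in the text (3, 10, 30 and 100 uM co-applied inhibitor).
concs = [3.0, 10.0, 30.0, 100.0]

# Synthetic placeholder data generated from the model itself, NOT measurements.
synthetic = [fraction_remaining(c, 5.0) for c in concs]

estimate = fit_ic50(concs, synthetic)
print(f"estimated IC50: {estimate:.2f} uM")
```

With real recordings, a nonlinear fit that also estimates the Hill slope and weights the points by their error would usually be preferable to this linearisation.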

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer 1

      This paper identifies a role for the hereditary spastic paraplegia protein spatacsin in lysosome morphology, positioning and dynamics, and undertakes detailed mechanistic studies to try to identify the mechanism for this effect. In doing so the paper elucidates further mechanistic information about the properties of two other hereditary spastic paraplegia proteins, spastizin and AP5Z1. The work is done in mammalian cells and uses a combination of over-expression, depletion and biochemical studies. The main findings are:

      1. The authors present evidence that spatacsin is an ER-localised protein.
      2. Murine embryonic fibroblasts lacking spatacsin have a reduced number of tubular lysosomes and the remaining lysosomes are less motile. In general, a relationship between tubular lysosome morphology and lysosome motility, often in association with the endoplasmic reticulum (ER), is demonstrated. These tubular lysosomes are catalytically active and acidic.
      3. In terms of mechanism of this effect, by combining a yeast-two hybrid and siRNA phenotypic screen, the authors identify a number of spatacsin-interacting proteins that also regulate lysosomal tubulation. The most important of these for the purposes of this paper is UBR4, an E3 ubiquitin ligase.
      4. The authors show that spatacsin and UBR4 promote degradation of AP5Z1, and that this property required the ability of spatacsin to interact with UBR4. Somewhat surprisingly, as AP5Z1 is a coat protein, this degradation appeared to occur within the lumen of the lysosome - the authors speculate how this could be in the discussion.
      5. The authors then demonstrate that AP5Z1 and spastizin, both hereditary spastic paraplegia proteins, compete for binding with spatacsin.
      6. The relationship between spatacsin, spastizin, AP5Z1 and motor proteins is then examined. There is a known interaction between spastizin and KIF13A and expression of a dominant negative KIF13A protein reduced lysosomal tubulation. The authors then demonstrate an interaction between AP5Z1 and the p150Glued dynein/dynactin complex member, then showed that expression of a dominant negative p150Glued protein reduced lysosomal tubulation.
      7. Finally, the authors demonstrate the relevance of these findings to neurons, the target cells of hereditary spastic paraplegia, by showing that lysosomal tubulation and axonal transport are reduced in mouse neurons lacking spatacsin, and that depletion of UBR4 or AP5Z1 affected these as expected from the experiments above.

      Major comments:

      Overall I believe that the key conclusions of this paper are generally convincing and that the work is of high quality. However, I do have some reservations:

      1. The localisation of spatacsin on the ER. It is always difficult to be convinced about colocalization of a diffuse punctate marker and the ER. From the STED experiments in figure 1, while it definitely seems that there is some spatacsin on the ER, there also appear to be some spatacsin puncta that are not. I'd like to know if these puncta represent lysosome-associated spatacsin. This is important for interpretation of the subsequent experiments (see point 3 below). I also think quantification of this co-localisation will increase confidence in the results. In addition, a caveat of the immunofluorescence studies is that they use over-expressed spatacsin. I appreciate that there are no good antibodies to endogenous spatacsin, but I don't think this limitation is sufficiently acknowledged. As the claim of ER-localisation is critical for the proposed mechanistic model, and in the absence of experiments with endogenously tagged spatacsin, this makes the biochemical fractionation studies of figure 1C very important. To make these more convincing I would prefer to see additional control markers to verify the separation of lysosomal and ER compartments - e.g. lamp1, lamp2, an ER tubular marker such as a REEP5 or a reticulon.

      Authors response : We agree with the reviewer that the localization of spatacsin is critical, and we appreciate the reviewer's awareness of the lack of good antibodies to endogenous spatacsin. We now acknowledge this limitation more clearly in our revised manuscript (p. 5 and p. 15). We performed extra experiments to show convincingly that spatacsin is indeed localized at the ER. First, we performed 3-color STED experiments to visualize spatacsin, the ER and lysosomes in the same cell. The preliminary data seem to indicate that some spatacsin is associated with lysosomes at ER-lysosome contact sites. We plan to add quantifications of colocalization between spatacsin and ER staining at STED resolution to better support the conclusion that spatacsin is an ER protein.

      Moreover, as requested, we have performed a western blot with Lamp2 and REEP5 antibodies on the ER- and lysosome-enriched fractions (New Figure 1B). This western blot shows that a significant proportion of Lamp2 is present in the ER-enriched fraction, which may be explained by the strong association of ER with late endosomes and lysosomes. Yet the lysosome-enriched fractions that contained no ER markers do not present spatacsin staining, suggesting that spatacsin is either in the ER or in lysosomes associated with the ER that are not positive for cathepsin D. We reformulated the text of Figure 1 according to the new included data (p. 5-6).

      The authors generally do a good job of quantifying their results. However, this is lacking for the biochemical experiments (immunoblotting and IP) in figures 4 and 5, and I would prefer to see these quantified (the quantification should include data from repeat experiments so that we can judge the reproducibility of the results).

      Authors response : We agree that our presentation did not indicate that the western blots were repeated several times. We have added quantifications for the western blots present in Figures 4 and 5.

      On page 10, referring to the proximity ligation results, the authors comment: "This suggests that the spatacsin-spastizin interaction occurs at contact sites between the ER and lysosomes to allow spastizin recruitment to lysosomes". I'm not sure this statement is fully supported, as mentioned at point 1 above it is possible that some steady state spatacsin is at lysosomes. To fully support this, we'd need to see the PLA signal also convincingly co-localise with an ER marker.

      Authors response : We will perform an additional PLA experiment to show that the spots where spatacsin and spastizin interact colocalize with an ER marker. These data will be added to Figure 5.

      In figure 6C and D the effect of spastizin on lysosomal tubulation and dynamics is investigated. Wortmannin treatment is used to do this, as it is known to remove spastizin from lysosomes. However, this is a very indirect manipulation that could have many other consequences and it would be better to demonstrate this directly by showing the effect of depletion of spastizin on lysosomal morphology/dynamics. I also think the role of AP5Z1 in tubulation/dynamics would be better supported with additional experiments to deplete the protein - at present only over-expression is examined.

      Authors response: We added new data to answer this comment. Downregulation of spastizin using siRNA led to a lower number of tubular lysosomes and decreased the proportion of dynamic lysosomes, showing that spastizin is required to regulate lysosome motility (Figure 6B-6C, Supplementary Figure 7B). We have also added new data regarding downregulation of AP5Z1 (Figure 6A-6C, Supplementary Figure 7A). Both overexpression and downregulation of AP5Z1 using siRNA decreased the number of tubular lysosomes and decreased the proportion of dynamic lysosomes (Figure 6A-6C, Supplementary Figure 6C-D).

      This observation suggests that the levels of AP5Z1 must be tightly regulated to control lysosome motility. We added discussion about this point as well (p.12-13).

      While the experiments showing that over-expression of dominant negative forms of KIF13A and p150Glued affect lysosomal tubulation/dynamics provide good circumstantial evidence that spatacsin influences these lysosomal properties via its interactions with spastizin and AP5Z1 (which bind to these motor proteins), the authors have not shown that the interaction of the motor proteins with spastizin and AP5Z1 is required for this ability to regulate lysosome tubulation/dynamics. This means that the model presented in figure 7 is not fully supported by the data. If the authors have been able to map the binding regions for these interactions then perhaps this could be investigated with rescue experiments, although I appreciate that this is potentially a major piece of work and perhaps outside the scope of this paper. An alternative would be that the authors acknowledged this part of the model as somewhat speculative.

      Authors response : We agree with the reviewer that our data do not show that KIF13A and p150Glued interact directly with spastizin and AP5Z1 to regulate lysosome dynamics. It has previously been shown that the adaptor complex AP2 interacts with p150Glued via the ear domain of AP2 b subunit (Kononenko et al, 2017). It is therefore likely that the interaction of adaptor complex 5 with p150Glued also occurs via the AP5B1 subunit, and thus the interaction of AP5Z1 with p150Glued would be indirect. We discussed this point carefully (p.16).

      Regarding the interaction of spastizin with KIF13A, it was identified by a yeast two-hybrid screen and validated by GST-pulldown (Sagona et al, 2010). This showed that KIF13A interacts with the C-terminal domain of spastizin, and we discussed this point. To confirm that the interaction of KIF13A with spastizin is required for its role in tubular lysosome formation and dynamics, we can perform an experiment in which we downregulate endogenous mouse spastizin using siRNA and express either full-length human spastizin to rescue the effect of the siRNA, or a human spastizin lacking the C-terminal domain required for the interaction with KIF13A (where we would expect no rescue). This would strengthen our conclusion on the role of KIF13A, together with spastizin, in regulating the formation and dynamics of tubular lysosomes. We could add these data to Figure 6 (or Supplementary Figure 7).

      • Are the experiments adequately replicated and statistical analysis adequate?

      In general I am not convinced that the statistical tests are applied rigorously in this paper. Most experiments are done three times, but the "n" used for statistical testing is typically chosen as, e.g. the number of cells, number of lysosomes, rather than number of biological repeat experiments. This means that inter-experimental variability is not rigorously taken into account. A more rigorous practice would be to use the mean measures for each of three biological repeats and apply the statistical tests to the three means, so n=3 if three repeats were done. Superplots would be a nice way to graphically display these data.

      Authors response : We agree with the comments of the reviewer regarding data presentation. We have therefore changed the presentation of all graphs of the manuscript using superplots that allow us to show all the points that were analyzed as well as the mean value for each biological replicate, and performed statistical analyses by comparing the biological replicates as proposed in Lord et al, JCB 2020 (10.1083/jcb.202001064).
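
The replicate-level analysis discussed above can be sketched as follows. This is only an illustration of the general idea (per-cell values are averaged within each biological replicate and the test is applied to the replicate means, so n equals the number of replicates). The per-cell numbers are made-up placeholders, and the choice of Welch's t statistic is an assumption, not the test the authors report using.

```python
from math import sqrt
from statistics import mean, stdev


def replicate_means(cells_by_replicate):
    """Collapse per-cell measurements to one mean per biological replicate."""
    return [mean(cells) for cells in cells_by_replicate]


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = stdev(a) ** 2 / len(a)
    vb = stdev(b) ** 2 / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof


# Hypothetical tubular-lysosome counts per cell, three biological replicates each.
control = [[12, 14, 13, 15], [11, 12, 13], [14, 15, 16, 15, 14]]
knockout = [[7, 8, 6], [9, 8, 7, 8], [6, 7, 7]]

control_means = replicate_means(control)
knockout_means = replicate_means(knockout)

# The test sees n = 3 per group, regardless of how many cells were imaged.
t_stat, dof = welch_t(control_means, knockout_means)
print(f"n per group = {len(control_means)}, t = {t_stat:.2f}, df = {dof:.2f}")
```

A superplot would then show every per-cell point, coloured by replicate, with the three replicate means overlaid on top.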

      Minor comments:

      1. In supplementary figure 3D I cannot honestly say that I see the smaller band.

      Authors response : We agree that this western blot is not clear. We will provide a new western blot.

      When first called out, I expected supplementary tables 1 and 2 to show the list of interactors with wild-type spatacsin and spatacsind32-34 respectively, but this is not what they show.

      Authors response : We have added two supplementary data tables (Now Supplementary Tables 1 and 2) to give the list of interactors of wild-type C-terminal domain of spatacsin and spatacsinD32-34, respectively.

      Supplementary Tables 3 and 4 now refer to the analysis of the downregulation experiments by respectively the neural network method and the tubular lysosome detection method.

      The experiments in Figure 4A are a little problematic in the way that they are called out. The first call refers to just a small subset of the data in the figure, and the figure is then called out at various points later in the paper. This is quite confusing. Is there any way this could be simplified?

      Authors response : We agree with the reviewer that Figure 4A was called out at various points of the manuscript. This was to avoid duplicating data in two separate figures. However, we have modified the presentation of Figure 4 and Figure 5. We have included a new Figure 4C to show that downregulation of UBR4 prevents the degradation of AP5Z1 not only upon overexpression of Spatacsin-GFP, but also in basal conditions in wild-type fibroblasts. The co-IP that was originally presented in Figure 4A has now been moved to Supplementary Figure 6A.

      The section on page 10: "Spatacsin also interacts with spastizin, and is required to recruit spastizin to lysosomes (Hirst et al., 2021). ........ We hypothesized that spatacsin interaction with spastizin was required for spastizin localization to lysosomes." Is odd, as the authors seem to be hypothesising an observation that they have just said has already been demonstrated.

      Authors response : We agree that these sentences were odd. We have rephrased the paragraph (p. 11).

      Can the authors explain why there is so little interaction between wild-type KIF13A and spastizin?

      Authors response : The interaction domain of spastizin with KIF13A is close to the motor domain according to the two-hybrid data published by Sagona et al (2010). The dominant negative construct of KIF13A that is devoid of the motor domain (KIF13A-ST) may thus facilitate access of spastizin to the binding domain. We have commented on this point in the text (p.13).

      In figure 6G p150Glued signal is also present in the control IP lane, which casts doubt on the specificity of the interaction. Could the authors generate a cleaner result?

      Authors response : We have repeated the experiment 3 times, always with some p150Glued signal present in the control IP. Of note, as stated in the Methods section, we have increased the concentration of NaCl in the washing of this co-IP to decrease non-specific binding of p150Glued to control beads, but we could not get cleaner results so far. We will try to obtain a cleaner western blot to illustrate Figure 6G.

      I would be interested to see how AP5Z1 expression differs between neurons with and without spatacsin; we would expect similar results to those shown in the MEFs.

      Authors response : We have not checked the levels of AP5Z1 in neurons with and without spatacsin yet. However, the complete knockout of spatacsin strongly modifies the levels of its partners. We previously showed that spastizin levels are decreased by >90% in Spg11 knockout brain (Branchu et al, 2017). Furthermore, the levels of AP5Z1 have been shown to be decreased by ~50% in fibroblasts of SPG11 and SPG15 patients (Hirst et al, 2015).

      Our work shows that spatacsin promotes the degradation of AP5Z1 by lysosomes. It is possible that other degradation mechanism(s) may exist and could explain the lower levels of AP5Z1 in knockout cells. We discussed this point (p.15).

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this study Pierga et al. report that SPG11 (spatacsin) is an ER-resident protein involved in the regulation of ER-lysosome contact sites (in particular tubular lysosomes) and subsequent faster motility of tubular lysosomes, as well as in the degradation of AP5Z1 (SPG48), which forms a heterotrimeric complex with SPG15 (spastizin) and SPG11. This complex has been localized by several groups on the cytoplasmic side of LAMP-1-positive lysosomes. In addition, patients with mutations in SPG11, SPG15, and SPG48 share various clinical features, an observation supported by biochemical/cell biological data from Spg11 and Spg15 KO mouse models and from cultured cells of both patients and mice, demonstrating e.g. accumulation of autolysosome storage material, defects in the autophagic lysosome reformation process, and the loss of cortical motoneurons and Purkinje cells.

      Major concerns:

      i) Fig. 1, 2, 3: A major disadvantage of this study is the analysis of overexpressed proteins (SPG11-V5, GFP-Sec61, and Lamp1-mCherry), which might contribute to the observed strong expression of SPG11-V5 in the ER/ER-enriched fraction. The results should be compared with the endogenously expressed proteins.

      Authors response : As stated by reviewer 1, there are no good antibodies to endogenous spatacsin, and therefore we have to rely on expression of tagged spatacsin to study its localization by immunohistochemistry. For the colocalization with the ER, we stained the latter by GFP-Sec61 that is a widely used marker for this compartment. To confirm our results, we plan to try to perform new STED imaging with REEP5 antibody to stain the ER, and Lamp1 antibody to label lysosomes, avoiding overexpression of proteins to label the subcellular compartments. Furthermore, as it is not possible to localize endogenous spatacsin by immunostaining, we addressed its localization by biochemical fractionation and western blots comparing wild-type and Spg11 knockout samples.

      For Figure 2, the data presented were indeed obtained using transfection of Lamp1-mCherry. However, we confirmed our observation of Figure 2A using alternative staining of lysosomes (Lysotracker or loading of lysosomes with Texas-Red Dextran). We therefore think that our data presented in figure 2 are valid, and that the effect we observed on tubular lysosomes was not affected by expression of Lamp1-mCherry.

      In Figure 3, the lysosomes were labelled with Texas-Red Dextran, and thus none of the data presented in Figure 3 rely on overexpression.

      In Fig. 1C the lack of the mature Cathepsin D form which is proteolytically generated only in lysosomes from the higher molecular mass precursor is misleading and should be related to presence of lysosomal membrane proteins.

      Authors response: As requested, we have performed a western blot to show the lysosomal membrane protein Lamp2 on the ER- and lysosome-enriched fractions (Figure 1B). This western blot shows that a significant proportion of Lamp2 is actually present in the ER-enriched fraction, which may be explained by the strong association of ER with late endosomes and lysosomes previously described (Friedman et al, 2013). Yet the lysosome-enriched fractions that contained no ER markers do not present spatacsin staining, suggesting that spatacsin is either in the ER or in lysosomes associated with the ER. We reformulated the text of Figure 1 according to the newly included data (p.5-6). The 3-colour STED experiment that we plan to perform to answer reviewer 1's comments will help discriminate between these possibilities.

      Fig. 1D: the TEM image shows only a single lysosome and proposed ER contact zones in wt-MEFs without comparison with Spg11 KO MEFs (only in the quantification). Without double immunogold labeling of SPG11 (and their lack on SPG11 KO cell lysosomes) and known ER contact-site proteins this image and the conclusion are insufficient.

      Authors response : We have added an image of a lysosome taken from a knockout fibroblast (Figure 1E). As stated above there are no good antibodies to spatacsin for immunostaining, so it will not be possible to perform double immunogold labelling. This prevents us from claiming that spatacsin is a protein enriched at contact sites. We therefore modified our Results section and Discussion accordingly (p.5-6 and p.16).

      ii) The rationale for the selection of the deleted Spg11 region D32-34 is not clear. What are the symptoms of this Spg11 knock-in mouse? A more detailed description of the phenotype is required. Is the phenotype, including the accumulation of LC3-positive material, similar to the phenotype of the SPG11 KO mouse which has been published by Varga et al. (2015) and Branchu et al. (2017)? If not, is the new mechanism reported here not so important?

      Author response : We have added new data (Supplementary Figure 3E-F) showing motor and cognitive impairment in mice expressing truncated spatacsin, although the motor dysfunction is slightly less marked than in Spg11 knockout animals. We also checked for accumulation of autophagy markers. We did not use LC3, but p62, which labels substrates to be degraded by autophagy. We observed accumulation of p62 in Spg11 knockout and in Spg11D32-34/D32-34 mouse neurons (Supplementary Figure 3G). These data support the functional importance of the domain encoded by exons 32 to 34 of Spg11. We commented on this in the text (p.9).

      iii) p8/Fig. 3F/Suppl.Fig.3F- the most important part of the manuscript: what are the parameters of lysosomal staining in images that were used to identify genes important for lysosome tubulation by the neural network?

      Authors response : For screening in Figure 3, lysosomes were stained by loading fibroblasts with Texas-Red Dextran overnight, followed by a wash of at least 4 hours. The neural network was first trained to discriminate between control and Spg11-/- fibroblasts, using any parameters of the lysosomal staining, not necessarily lysosome tubulation. This is a completely unsupervised and unbiased method, but one of its drawbacks is that we do not know which parameters were used by the network to discriminate between control and Spg11-/- fibroblasts. Therefore, we validated the classification performed by the neural network on a data set independent from the training set before using it for the screening. We rephrased the paragraph to make it clearer (p.9).
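
The validation step described above, checking the trained classifier on images it never saw during training, can be sketched as follows. This is only an illustration under stated assumptions: the probabilities and labels are invented, the function name is hypothetical, and the 0.5 decision threshold is an assumption that the response does not specify.

```python
def held_out_accuracy(predicted_ko_probability, true_labels, threshold=0.5):
    """Fraction of held-out cells whose predicted class matches the genotype."""
    calls = [probability >= threshold for probability in predicted_ko_probability]
    correct = sum(call == label for call, label in zip(calls, true_labels))
    return correct / len(true_labels)

# Invented network outputs: probability that each held-out cell is Spg11 KO.
probabilities = [0.92, 0.15, 0.70, 0.40, 0.08, 0.85]
# True genotype of the same cells (True = Spg11 KO, False = wild-type).
is_knockout = [True, False, True, True, False, True]

accuracy = held_out_accuracy(probabilities, is_knockout)
```

Because the held-out cells play no part in training, this accuracy estimates how well the classifier generalises, even though the features it relies on remain unknown.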

      I cannot understand how the authors predict the probability of the cell to be considered as an Spg11 KO fibroblast (why not as an Spg11 D32-34 knock-in fibroblast?) as the basis for the selection of interaction candidates.

      Author response : The neural network was trained on sets of images obtained from wild-type and Spg11 KO fibroblasts, which were expected to represent extreme lysosomal phenotypes linked to spatacsin function. We could therefore predict the probability of cells to be considered as Spg11 KO, not as Spg11 D32-34 fibroblasts. We clarified this in the text (p.9).

      A simple statement that the neural network approach identified those genes is too weak and requires more convincing experimental data. It has to be shown, at least for the 8 genes positive in both approaches, how the siRNA treatments of these genes phenocopied the lysosomal changes and, of course, the effect of the downregulation on the protein level of their products in both wild-type control and Spg11 D32-34 knock-in MEFs. Suppl. Fig. 3F is completely unclear. How were the Y2H interaction partners validated? Did the authors use the identified 8 interaction candidates as full-length bait to demonstrate the interaction with the Spg11 exons 32-34?

      Author response : The purpose of the siRNA screen was to quickly identify putative candidates important for the regulation of lysosome dynamics. We identified 8 candidates possibly implicated in lysosome dynamics based on the two analysis methods. We have added in Supplementary Figure 4 C-D the effect of both siRNAs on lysosomal function by the two methods of analysis compared to the effect of siSPG11. However, here we aimed to identify candidates and we do not claim that every one of these eight proteins is indeed implicated in the regulation of lysosome dynamics. We corrected the text accordingly, stating that the products of the 8 identified genes are good candidates to regulate lysosomal function (p.10). We validated the role of one of the identified candidates, UBR4, and we showed that the UBR4 siRNA indeed downregulates the protein level (Figure 4C). We only validated the interaction of spatacsin Cter with UBR4 by co-immunoprecipitation (Figure 4B).

      For the 7 remaining candidates, full characterization would indeed be required to validate their role and elucidate their mechanisms of action, but this is out of the scope of this manuscript.

      p8/Fig.3F: the genes identified in both approaches have to be listed in the Fig. 3F-Table.

      Authors response : We have added in new Figure 3F the list of the 8 candidate genes that could contribute to regulate lysosome function.

      The GO process- ubiquitin-dependent protein catabolic process is neither positive for the neural network nor for the directed analysis but positive for both analyses? Please explain. Similarly, the GO process proteolysis involved in cellular protein catabolic process -is not positive for the neural network analysis but again positive for both analyses.

      Authors response : We agree with the reviewer that Table 3F in its older version could be a bit confusing. GO analysis is based on “enrichment” of biological processes within a list of proteins. As we did not have the same number of proteins in the 3 analyses provided in original Table 3F, we got variability in the identified biological processes. To simplify, we have therefore chosen to present only the GO analysis for the 8 candidates that were most likely implicated in lysosomal dynamics according to our two analyses of the siRNA screen which is the most relevant for our study (new Figure 3G).
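
Why lists of different sizes return different enriched GO terms can be illustrated with the hypergeometric test that enrichment analyses commonly rely on. This is a generic sketch, not the tool used in the manuscript: the background size, annotation counts, and function name are all invented for illustration.

```python
from math import comb

def enrichment_p_value(genome_size, annotated_in_genome, list_size, annotated_in_list):
    """P(X >= annotated_in_list) when list_size genes are drawn at random."""
    total = comb(genome_size, list_size)
    upper = min(annotated_in_genome, list_size)
    tail = sum(
        comb(annotated_in_genome, k)
        * comb(genome_size - annotated_in_genome, list_size - k)
        for k in range(annotated_in_list, upper + 1)
    )
    return tail / total

# Invented background: 20,000 genes, 400 of which carry a given GO term.
# Both lists contain the same proportion (25%) of annotated genes.
p_small_list = enrichment_p_value(20000, 400, 8, 2)     # 2 of 8 genes
p_large_list = enrichment_p_value(20000, 400, 40, 10)   # 10 of 40 genes
```

The same proportion of annotated genes gives a much smaller p-value in the longer list, so a term can pass a significance cutoff in one analysis and fail it in another purely because the lists differ in length.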

      For Fig. 3G the mutant ubiquitin-K0 staining in wild-type MEF cells has to be shown as well as for the Spg11 ki/KO MEFs (+ quantification of the respective data)

      Authors response : As stated by Reviewer 4, the expression of lysine-null ubiquitin may impact many different cellular pathways. We therefore removed this part of the data in order to simplify the manuscript (p.10).

      iv) The interpretation of the Y2H-interactome analysis by the authors is hard to follow. They searched with the exon 32-34 cDNA for binding partner, selected 3 degradative GO processes and showed by overexpression of a mutant Ub-K0 plasmid in wild-type MEFs a decreased number of tubular lysosomes, as well as their dynamics (without showing the control data in Spg11 KO or ki-MEFs). Thus, poly-ub of proteins should be in some way responsible for a lysosomal phenotype of Spg11ki MEFs.

      Now they went to AP5Z1, the second binding partner of SPG11, which is reduced in its abundance upon overexpression of Spg11-GFP. I would expect to do the respective control experiment to show that in the absence of SPG11 or in the knock-in cells the amount of AP5Z1 has to increase. However, in the studies by the Huebner group by deletion of Spg11 or the other binding partner Spg15, no increase of AP5Z1 protein levels has been observed. The authors have to comment on this discrepancy.

      Authors response : We agree that this is an important point to discuss, and we failed to do it in our first version.

      The complete knockout of spatacsin strongly modifies the levels of its partners. We previously showed that spastizin levels are decreased by >90% in Spg11 knockout (Branchu et al, 2017). Furthermore, the levels of AP5Z1 have been shown to be decreased by ~50% in fibroblasts of SPG11 and SPG15 patients (Hirst et al, 2015).

      Our work shows that spatacsin promotes the degradation of AP5Z1 by lysosomes. It is possible that other degradation mechanisms exist and could explain the lower levels of AP5Z1 in knockout cells. Furthermore, it was proposed that AP5Z1 stability may depend on the presence of spatacsin and spastizin (Hirst et al., 2013). Therefore spatacsin may contribute to tightly regulate AP5Z1 levels by contributing both to its stability and to its degradation. We have carefully discussed this point (p.16). Furthermore, the experiments requested by reviewer 2 in point (vi) that we are planning to perform will help clarify the mechanisms of AP5Z1 degradation both in presence and absence of spatacsin.

      Then the authors found that the selected interaction partner of the exon 32-34 sequence, UBR4, does not bind to the Spg11-GFP construct lacking the domain encoded by exons 32-34 but to the C-terminal domain of Spg11-GFP. Unfortunately, all these IP-experiments were shown as cut and paste figures, preventing the direct comparison between the input and the IP protein amounts (since the information is missing what percentage of the input and the IP has been loaded per lane, the evaluation and significance of these Co-IPs are unclear).

      Authors response : We have added to the figure legend the fact that the input represents 5% of the lysate added to the immunoprecipitation assays.
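
Knowing the loaded fractions is what makes input and IP lanes comparable. The arithmetic can be sketched as follows, assuming both bands are quantified on the same blot and exposure. The function name, the band intensities, and the `ip_fraction_loaded` parameter are hypothetical: the response states only the 5% input figure, not how much of each IP eluate was loaded.

```python
def fraction_recovered(ip_band, input_band,
                       input_fraction_loaded=0.05, ip_fraction_loaded=1.0):
    """Share of the protein present in the lysate that was pulled down.

    ip_fraction_loaded is an assumption (whole eluate loaded by default);
    only the 5% input figure comes from the authors' response.
    """
    total_in_lysate = input_band / input_fraction_loaded
    total_in_ip = ip_band / ip_fraction_loaded
    return total_in_ip / total_in_lysate

# Invented band intensities in arbitrary densitometry units.
recovered = fraction_recovered(ip_band=200.0, input_band=100.0)
```

With these invented intensities the IP band looks twice as strong as the input band, yet it corresponds to roughly one tenth of the protein present in the lysate.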

      v) p9: AP5 (Z1) is a cytoplasmic protein and can be localized on the cytoplasmic surface of lysosomes. How should the GFP-mcherry-AP5Z1 protein enter the lumen of lysosomes justifying the quenching of the GFP signal? A positive control has to be included in the experiment shown in Fig. 4E demonstrating the effect of MG132 under identical conditions of a protein substrate for proteasomal degradation.

      Authors response : We agree this is an important control. We plan to add a control showing accumulation of ubiquitin in lysates upon MG132 treatment to show it was indeed effective.

      vi) Fig. 5A: In contrast to GFP-mcherry-AP5Z1, spastizin-GFP is localized at the cytoplasmic surface of lysosomes (co-staining with LAMP1-mcherry) in wild-type MEFs. In regard to the incomplete data commented under "minor points Fig.4/Suppl.Fig.4", I suggest to perform a simple control experiment with overexpressed GFP-spastizin and mCherry-AP5Z1 in wild-type MEFs (at the best also in Spg11 KO MEF) with and without bafA treatment, which will clearly demonstrate whether single components of the trimeric Spg11, spastizin-AP5Z1 complex are degraded independently of each other in lysosomes.

      Authors response : As stated above, we will perform this control experiment, and will add the data in Figure 5 in future revision. This will help clarify the mechanism of degradation of AP5Z1 and spastizin both in presence and absence of spatacsin. Discussion of this point will also help to clarify the point iv raised by reviewer #2.

      vii) why did the authors neither mention nor discuss the described role of SPG11 in autophago-lysosome reformation (ALR)?

      Authors response : We did not discuss ALR in our first version as we did not investigate autophagic conditions. However, due to the well-described role of spatacsin in ALR, we agree that we should discuss ALR in our manuscript, and we added a paragraph (p.15).

      Minor points

      • Figure 1 A, B, D, and G: ER-lysosome contact sites. The quantification of the co-localization of spatacsin-V5 with the ER marker protein GFP-Sec61b has to be given.

      Authors response : We plan to add quantification data performed on STED images showing localization of Spatacsin-GFP together with ER and lysosomal markers. This data will be added in Figure 1.

      Moreover, the authors analyzed overexpressed tagged-proteins only. The results should be compared with the endogenous proteins.

      Authors response : As stated above, there are no good antibodies to endogenous spatacsin for immunostaining. We will add new STED images with antibodies against endogenous Reep5 and Lamp1 to label the ER and lysosomes together with overexpressed spatacsin. Regarding endogenous spatacsin, we could only investigate its localization by subcellular fractionation and western blots comparing wild-type and Spg11 knockout samples. We added biochemical data suggesting that spatacsin is enriched either in the ER or in lysosome membrane associated with the ER. These data have been added in Figure 1 and in text (p.5) and we added a paragraph in discussion regarding spatacsin subcellular localization (p.15).

      p8/Figure 3: what does the 'analysis of trained neural networks' mean?

      Authors response : We did not analyze the trained neural network; rather, we used this trained neural network to perform image analysis. We clarified the text (p.10).

      Figure 4: what happens with the other AP5 subunits?

      Authors response : This is a very interesting question. We will test whether overexpression of spatacsin-GFP induces degradation of other AP5 subunits, provided we obtain specific antibodies. We will add the data in Figure 4A.

      Fig.4F/Suppl.Fig4: live images of GFP-mcherry-AP5Z1 + lysotracker staining have to be shown both for wild-type MEFs with and without bafilomycin A treatment (as in Fig.4F), and in Spg11 KO and Ki MEFs +/- bafA.

      Authors response : We will add these data in Figure 4 (WT MEFs +/- Baf A) and in Supplementary Figure 5 (Spg11 KO and Spg11 D32-34 MEFs +/- Baf A).

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript highlights an interesting localization of spatacsin in the endoplasmic reticulum (ER)-lysosomes contact sites. In addition, it implicates spatacsin in regulating tubular dynamic lysosomes. Mechanistically, the authors propose that spatacsin interacts with UBR4 to promote the autophagic degradation of its binding partner AP5Z1 at the lysosomes. In turn, this would also regulate the amount of spastizin at the lysosomes, which is known to interact with anterograde motors. The authors further show that AP5Z1 interacts with p150Glued. Thus, the balance between AP5Z1 and spastizin at the lysosomes would determine lysosomal trafficking directionality.

      Major Comments

      1. Several crucial results of the manuscript are based on quantifications performed on immunofluorescence stainings. Data points in graphs show individual cells or individual lysosomes and the authors apply statistical tests on replicates that cannot be considered biologically independent, since they come from the same experiment or even the same cell. It is recommended to show superplots where both the individual data and the average of each independent experiment is indicated as recommended by Lord et al. (J Cell Biol 2020 219 (6): e202001064.). Statistics should be performed only on independent biological replicates.

      Authors response : We agree with the comments of the reviewer regarding data presentation. We have therefore changed the presentation of all graphs of the manuscript using superplots that allow us to show all the points that were analyzed as well as the mean value for each biological replicate, and performed statistical analyses by comparing the biological replicates as proposed in Lord et al, JCB 2020 (10.1083/jcb.202001064).

      The authors have used yeast two-hybrid to search for spatacsin interactors. Although in the manuscript they refer to supplementary tables that should show these interactors, the available Tables are confusing and refer to the following downregulation experiments.

      Author response : We have added two supplementary data tables (Now Supplementary Tables 1 and 2) to give the list of interactors of wild-type C-terminal domain of spatacsin and spatacsinD32-34, respectively.

      Supplementary Tables 3 and 4 now refer to the analysis of the downregulation experiments by the neural network method and the tubular lysosome detection method, respectively.

      An experiment to demonstrate that endogenous UBR4 and spatacsin interact by co-immunoprecipitation would be crucial.

      Authors response : We agree with the reviewer that it would be important to test whether endogenous spatacsin and UBR4 interact by co-immunoprecipitation. So far we have not managed to immunoprecipitate either endogenous spatacsin or endogenous UBR4 with the antibodies we tested, which prevents us from testing the interactions of endogenous proteins by co-immunoprecipitation. We are not sure we can provide this result.

      Several important experiments to unravel the mechanistic role of spatacsin (Figure 4 and 5) are performed upon overexpression. This is a major limitation of the study and the authors should address it as much as possible. Western blots and immunoprecipitations are shown that appear to have been performed only once and have no quantification. As an example, in Fig 4A the difference in levels of AP5Z1 upon spatacsin overexpression or UBR4 downregulation are very minor. I would be very careful in drawing big conclusions, without additional repetitions and additional experiments in an endogenous setting.

      Authors response : We agree that a lot of our experiments used overexpression. We have now added to the manuscript new data obtained in MEFs where we downregulated spastizin or AP5Z1 (Figure 6). They confirm the role of spastizin in the regulation of lysosome dynamics. Furthermore, our new data show that levels of AP5Z1 must be tightly regulated as both overexpression and downregulation of AP5Z1 affect lysosome dynamics (p.12). We also discussed these data carefully (p.16).

      Furthermore, we agree that our presentation did not indicate that the western blots were repeated several times. We have now added quantifications for the western blots presented in Figures 4 and 5. Furthermore, we have also added the data showing that downregulation of UBR4 led to higher levels of AP5Z1 in control fibroblasts (Figure 4C).

      The authors suggest a model by which UBR4 recruited by spatacsin is involved in autophagic degradation of AP5Z1. The data shown do not support this conclusion. First, in Figure 4A downregulation of UBR4 does not increase levels of AP5Z1 above the control in lane 1, but only when spatacsin is overexpressed. The effect of downregulation of UBR4 in wild-type cells on AP5Z1 should be investigated. Secondly, there is no experiment directly proving that the stability of AP5Z1 depends on UBR4.

      Authors response : We have added new western blots (and quantification) in Figure 4C showing that downregulation of UBR4 increased levels of AP5Z1 in control conditions, which suggests that UBR4 contributes to regulating the levels of AP5Z1. However, we do not show whether UBR4 directly promotes the degradation of AP5Z1, a limitation that has been added to the discussion (p.15). To test whether UBR4 affects the stability of AP5Z1, we will monitor whether downregulation of UBR4 by siRNA increases the half-life of AP5Z1. These data will be added to Figure 4.
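
A half-life comparison of the kind planned here is usually read from a time course of band intensities. The sketch below assumes first-order (exponential) decay and fits a straight line to the log intensities. The time points and intensities are invented so that the expected answer is known in advance, and the function name is hypothetical; the response does not describe how the time course will be acquired or analysed.

```python
from math import log

def half_life(times_h, intensities):
    """Half-life from a least-squares fit of ln(intensity) against time."""
    logs = [log(value) for value in intensities]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
             / sum((t - mean_t) ** 2 for t in times_h))
    return -log(2) / slope

# Invented normalised band intensities, built to halve every 2 h and every 4 h.
times_h = [0, 2, 4, 8]
fast_decay = [1.00, 0.50, 0.25, 0.0625]
slow_decay = [1.00, 0.7071, 0.50, 0.25]

fast_half_life = half_life(times_h, fast_decay)
slow_half_life = half_life(times_h, slow_decay)
```

A longer fitted half-life in one condition than in the other would indicate slower turnover, provided the decay is close to exponential over the sampled window.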

      The authors suggest that the interaction of spatacsin with spastizin or AP5Z1 are in competition. This is an interesting hypothesis, however to conclusively demonstrate this, pull-down experiments in KO cells and not upon extreme overexpression should be performed.

      Authors response : We agree that testing the interaction of spatacsin with its partners in SPG15 KO or AP5Z1 KO fibroblasts would be a very good control of our hypothesis. However, we previously showed that the levels of AP5Z1 are lower in SPG15 KO than in control fibroblasts (Hirst et al, 2015), which introduces a bias in the analysis. We therefore plan to concentrate on AP5Z1 KO fibroblasts and investigate whether interaction of spatacsin with spastizin is modified in these cells. An alternative would be to monitor the effect of siRNA downregulating AP5Z1 on the interaction between spatacsin and spastizin. We will add these data in Figure 5.

      Minor comments

      1. In figure 1G and 1H the overlapping area between lysosomes and ER is quantified. Considering that the ER occupies a large portion of the field, a 90° flipped control for both WT and KO would be important to sort out random colocalization. In this direction, it would also be essential to show that the total amount of lysosomes is not different in WT and KO, especially because in figure 1A the lysosomes in WT and KO seem to be different not just in shape but also in number and size. A different number or size of lysosomes affects this analysis.

      Authors response : We added quantifications in Supplementary Figure 1F showing that 90° flipped controls are indeed not capturing the same proportion of contacts between the ER and lysosomes. We also added quantifications in Supplementary Figure 1D-E showing that the average size of lysosomes and the number of lysosomes per unit area are similar in control and Spg11 KO fibroblasts and mentioned it in the text (p.6). If the lysosomal staining appears different in Spg11 KO fibroblasts it is because lysosomes are clustered around the nucleus, an observation that we reported previously (Boutry et al, 2019).
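
The logic of the 90° flipped control can be sketched on toy binary masks: rotate one channel relative to the other and recompute the overlap, which estimates how much colocalization arises by chance from the density of the ER alone. The masks and function names below are invented for illustration and are not the authors' image-analysis code.

```python
def rotate_90(mask):
    """Rotate a square binary mask by 90 degrees clockwise."""
    return [list(row) for row in zip(*mask[::-1])]

def overlap_fraction(lysosome_mask, er_mask):
    """Fraction of lysosome pixels that also fall on ER pixels."""
    lysosome_pixels = sum(sum(row) for row in lysosome_mask)
    shared = sum(lyso and er
                 for lyso_row, er_row in zip(lysosome_mask, er_mask)
                 for lyso, er in zip(lyso_row, er_row))
    return shared / lysosome_pixels

# Toy 4x4 masks (1 = signal, 0 = background).
er = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
lysosomes = [
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

observed = overlap_fraction(lysosomes, er)            # real alignment
flipped = overlap_fraction(rotate_90(lysosomes), er)  # chance-level estimate
```

In this toy case every lysosome pixel lies on the ER in the real alignment, while only half do after rotation. An observed overlap clearly above the flipped value argues against purely random colocalization.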

      In the second chapter of the Results, the authors state: "we observed by live imaging a higher number of lysosomes with tubular shape in Spg11+/+ compared to Spg11-/- cells", however the number of elongated lysosomes is quantified per area. Why is the number of elongated lysosomes not quantified relative to the total number of lysosomes?

      Authors response : The point raised by the reviewer is a fair point. The purpose of our analysis was to compare the number of lysosomes with tubular shape in control and Spg11 KO cells. As the number of lysosomes per unit area is invariant between control and Spg11 KO cells as shown in new data included in Supplementary Figure 1D, normalization to total number of lysosomes or to cell surface reflects the same difference in phenotype.

      In the fourth chapter of the Results, the authors state: "In wild-type MEFs, mCherry was colocalized with lysosomes. In contrast, GFP that is sensitive to pH was poorly colocalized with lysosomes, suggesting that AP5Z1 was mainly inside the acidic subcellular compartment (Figure 4F)." If the aim of the authors is to show that AP5Z1 is mainly inside the lysosome, the amounts of AP5Z1-mCherry inside and outside the lysosome need to be compared, with a proper statistical analysis. There is also a lot of GFP signal in the cytosol. Why is that?

      Authors response : We agree with the reviewer, we will add quantification of the proportion of AP5Z1-mCherry inside lysosomes on Supplementary Figure 5.

      Regarding the GFP-AP5Z1 signal in the cytosol, AP5Z1 has no transmembrane domain and may thus exist as a cytosolic protein. Since GFP is quenched in the acidic environment of lysosomes, the GFP fluorescence of the mCherry-GFP-AP5Z1 protein is outside lysosomes, and it appears partly cytosolic. Of note, there is also some cytosolic mCherry signal that is less visible due to the high level of mCherry fluorescence in lysosomes. We will clarify this point with the quantification of the proportion of mCherry signal compared to GFP inside the lysosomes and add it in Figure 4.
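
The planned comparison of mCherry and GFP signal inside lysosomes amounts to computing, for each channel, the share of total intensity that falls within a lysosome mask. The sketch below uses invented 2x2 images and a hypothetical function name; it illustrates the calculation only and is not the authors' quantification.

```python
def fraction_inside(intensity, lysosome_mask):
    """Share of a channel's total intensity located inside the lysosome mask."""
    total = sum(sum(row) for row in intensity)
    inside = sum(value
                 for value_row, mask_row in zip(intensity, lysosome_mask)
                 for value, in_lysosome in zip(value_row, mask_row)
                 if in_lysosome)
    return inside / total

# Invented images: one lysosomal pixel (top left), the rest is cytosol.
lysosome_mask = [[1, 0], [0, 0]]
mcherry = [[10, 0], [2, 0]]   # acid-stable signal, bright in the lysosome
gfp = [[1, 0], [5, 0]]        # quenched in the lysosome, visible in cytosol

mcherry_inside = fraction_inside(mcherry, lysosome_mask)
gfp_inside = fraction_inside(gfp, lysosome_mask)
```

A markedly higher lysosomal share for mCherry than for GFP is the pattern expected if the tandem-tagged protein reaches the acidic lumen, where GFP fluorescence is quenched.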

      The construct used in the paper is a C-terminally tagged version of spatacsin. The authors should consider testing an N-terminally tagged construct, at least for the localization experiments.

      Authors response : We added an immunostaining image of spatacsin with an N-terminal tag (Supplementary Figure 1B) and mentioned it in the text (p.6). Like spatacsin with a C-terminal tag, it presents a diffuse distribution that poorly co-localizes with lysosomes.

      Figure 5C: a negative control and the quantification are missing.

      Authors response : A non-transfected cell is present in Figure 5C, visible thanks to the Lamp1 immunostaining, and we considered it as a negative control. In this non-transfected cell, we detected no PLA signal. We added an asterisk to indicate the non-transfected cell in Figure 5C. Quantification will also be added in the revised version after we have performed the PLA experiment required by Reviewer 1.

      Reviewer #3 (Significance (Required)):

      Since spatacsin, AP5Z1 and spastizin are all implicated in hereditary spastic paraplegia, the data are of potential interest not only for basic cell biology, but also to understand the pathogenesis of the disease. In addition, the manuscript proposes a novel model regulating trafficking of dynamic lysosomes.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Pierga et al. reveal subtle differences in lysosome morphology, ER-contact, and trafficking in the absence of Spatascin. These data are replicated with a truncated Spatascin, presumably a loss of function. Two-hybrid screening of the deleted sequence from this truncation for interactors and then asked whether these hits could phenocopy the lysosome morphology changes. This led to an assertion for a role for ubiquitination in these effects. Rather than these hits the group then investigates previously known Spatascin interactors and reports similar complex but subtle abnormalities via overexpression or knockdown of these. While data show overlapping phenotypes by modulation Spatascin, AP5z1, and Spastizin, the manuscript is confusing, leaps from experiment to experiment, and does not provide novel rigorous mechanisitic insight. It conflates all the discrete lysosomes aspects into a collective to link them. The title is over-stated and not appropriate for the experiments.

      The localization of endogenous spatacsin is lacking. Over-expression is prone to artifact, and the punctate V5 data suggest that much more work is needed to understand where in the cell the protein is.

      Authors response : As stated by reviewer 1, there is no good antibody to endogenous spatacsin, and therefore we have to rely on expression of tagged spatacsin to study its localization by immunofluorescence. When acquiring the images, we avoided the cells with the highest overexpression of tagged spatacsin. Yet, we agree that this is still overexpression. That’s why we included subcellular fractionation data where we can detect endogenous spatacsin (Figure 1A-1B). These data confirmed that spatacsin is enriched in the ER or in a lysosome fraction tightly associated with the ER.

      Furthermore, the EM data (1E) would suggest the vast majority of lysosomes are in contact with ER - this seems uncharacteristic.

      Authors response : The EM data in figure 1E indeed shows that the majority of lysosomes are in contact with the ER, as previously shown by other groups (Friedman et al, 2013, Höglinger et al, 2019).

      The phenotypes analyzed are very subtle, and while statistically significant the biological impact is unclear - in many cases individual lysosomes (or lysosome-ER contacts) are considered as an 'n'. While these results are probed across multiple independent experiments the batch effects and how uniform per cell the events are is unclear.

      Authors response : We agree with the comments of the reviewer regarding data presentation. ‘n’ represented individual cells, but did not actually take into account the variability across experiments. We have therefore changed the presentation of all graphs of the manuscript using superplots that allow us to show all the points that were analyzed as well as the mean value for each biological replicate, and performed statistical analyses by comparing the biological replicates as proposed in Lord et al, JCB 2020 (10.1083/jcb.202001064).

      In fig 2H critical data are missing - the effect of spatacsin KO on the transition between these morphologies should be considered as in G. Otherwise the relevance is unclear.

      Authors response : We have added this quantification in Figure 2I. It shows that the transition of lysosome morphology from round to tubular in Spg11 KO cells is still associated with a change of speed, although the average speed attained is halved compared to conditions where spatacsin is present. This shows that loss of spatacsin does not abolish the morphological transition of lysosomes but limits their speed in the tubular shape. We commented on these new data in the text (p.8).

      The impact of over-expressing a lysine-null Ub ( Fig 3) is far too crude and non-specific to have meaning here. It is assumed that the only proteins affected are those of interest. This is consistent with much of the paper where "true-true-and unrelated" is more likely than the presumption of causality.

      Authors response : It is true that the expression of lysine-null ubiquitin is really crude and may impact many different cellular pathways. Furthermore, the results obtained with the lysine-null ubiquitin do not contribute to the rest of the paper. We therefore removed the original Fig3G, H, I and Fig 4B and updated the text accordingly (p.10).

      The blots in Fig4 are of relatively poor quality and not quantified over repetition.

      Authors response : Spatacsin and spastizin are large proteins, and there is not much choice for antibodies able to detect these proteins. Yet we have validated their specificity by western blot using knockout cells (spatacsin) (Supplementary Figure 4 A-B) or siRNA (spastizin) (Supplementary Figure 7B). We agree that our presentation did not indicate that the western blots were repeated several times. We have added quantifications for the western blots present in Figures 4 and 5. We also changed some illustrative western blots to improve quality.

      Controls are missing and Fig5 suffers from a reliance on over-expression - there is a massive over-expression of AP5Z1 which may be affecting the stoichiometry of these overall interactions, but with an n=1 it's hard to know and it's not clear what these data add. Again, while statistically significant (5E and F) due to the nature of data analysis (every lysosome=n of 1) it is not clear how biologically significant UBR4 siRNA or AP5Z1 over-expression is - as the accumulation of AP5Z1 in these two conditions is orders of magnitude apart - again likely unrelated.

      Authors response : We added quantification for this western blot (Supplementary Figure 6A).

      As stated above we have changed the representation of the graphs. Each point represents one cell, and we included the mean value for each biological replicate.

      Preventing degradation of AP5Z1 by UBR4 siRNA and overexpressing AP5Z1 indeed do not have the same effect on total AP5Z1, but they do have a similar effect on the interaction of spatacsin with its partners evaluated by co-immunoprecipitation, as illustrated by the quantifications that we have added. We clarified this in the text (p.12). As requested by reviewer 3, we will also investigate the effect of AP5Z1 knockout or downregulation on the interaction between spatacsin and spastizin assessed by co-immunoprecipitation. These data will be added in Figure 5 and will strengthen our conclusions.

      Fig 6 begins to conflate the fact that different lysosome morphologies appear to have different trafficking properties even in WT cells and that many of these targets affect morphology - therefore to conclude a direct effect on trafficking seems inappropriate.

      Authors response : In original Figure 6, we showed that Kif13A-ST and p150CC1 changed the proportion of tubular lysosomes (previous Figure 6 and H), and the data showing that these constructs changed the trafficking of lysosomes were presented in Supplementary Figure 5 B-C. We have now moved the data showing the effect of Kif13A-ST and p150CC1 on lysosome trafficking into the main figure (Figure 6F and 6I) to facilitate the interpretation of the data. Expression of Kif13A-ST and p150CC1 therefore not only affects the morphology of lysosomes, but also impairs their trafficking. We thus do not extrapolate lysosome dynamics from their morphology; we actually quantify lysosome dynamics.

      Fig 7 extends this into polar cells (neurons) but still it is not clear whether form (morphology) dictates function (likelihood of trafficking or directionality).

      Authors response : We analyzed neurons not only because they are polarized cells, but also because neurons are the main cells affected by the neurodegeneration observed in the absence of spatacsin (Branchu et al, 2017). We added new data to Figure 7 showing that tubular lysosomes in axons are actually more dynamic than round lysosomes, as observed in fibroblasts, and described these data in the text (p.13).

      Investigation of lysosome trafficking in axons also allowed us to investigate the directionality of movement, which is difficult in MEFs. We clarified this point in the text (p.13).

      In sum, there is a lot of data that collectively points to a partial localization of spatacsin at ER-lysosome contacts and an influence on morphology and trafficking of lysosomes in the cell, but at the end of the day very little new mechanism is brought to light.

      Authors response : The mechanisms regulating trafficking of lysosomes are far from being fully resolved. Our manuscript shows that spatacsin contributes to this regulation by modulating the degradation of AP5Z1. This in turn regulates the lysosomal association of AP5Z1 and spastizin, which interact with motor proteins to control lysosomal dynamics.

      Reviewer #4 (Significance (Required)):

      This manuscript is directed to the basic cell biology community - involving ER, lysosome, and microtubule dependent trafficking. There are some new analytical tools employed and many co-factors and binding partners of spatacsin considered but frankly too many to adequately and rigorously control for. Because of this the manuscript is very unfocused, hard to follow and makes too many assumptions about shared dynamics necessarily arising from shared morphology - lysosomes are highly dynamic and can be affected by virtually any change in intracellular trafficking or protein/membrane transport. This is not appropriately considered.

      Authors response : We have clarified our manuscript to show that dynamics does not necessarily arise from a tubular morphology. It turns out that lysosomes with a tubular morphology are indeed more dynamic than lysosomes with a round morphology. Importantly, in all our experiments dealing with lysosomal dynamics, we have actually included a quantification of lysosome dynamics using time lapse imaging as detailed in methods (p.21).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      This paper identifies a role for the hereditary spastic paraplegia protein spatacsin in lysosome morphology, positioning and dynamics, and undertakes detailed mechanistic studies to try to identify the mechanism for this effect. In doing so the paper elucidates further mechanistic information about the properties of two other hereditary spastic paraplegia proteins, spastizin and AP5Z1. The work is done in mammalian cells and uses a combination of over-expression, depletion and biochemical studies. The main findings are:

      1. The authors present evidence that spatacsin is an ER-localised protein.
      2. Murine embryonic fibroblasts lacking spatacsin have a reduced number of tubular lysosomes and the remaining lysosomes are less motile. In general, a relationship between tubular lysosome morphology and lysosome motility, often in association with the endoplasmic reticulum (ER), is demonstrated. These tubular lysosomes are catalytically active and acidic.
      3. In terms of mechanism of this effect, by combining a yeast-two hybrid and siRNA phenotypic screen, the authors identify a number of spatacsin-interacting proteins that also regulate lysosomal tubulation. The most important of these for the purposes of this paper is UBR4, an E3 ubiquitin ligase.
      4. The authors show that spatacsin and UBR4 promote degradation of AP5Z1, and that this property required the ability of spatacsin to interact with UBR4. Somewhat surprisingly, as AP5Z1 is a coat protein, this degradation appeared to occur within the lumen of the lysosome - the authors speculate how this could be in the discussion.
      5. The authors then demonstrate that AP5Z1 and spastizin, both hereditary spastic paraplegia proteins, compete for binding with spatacsin.
      6. The relationship between spatacsin, spastizin, AP5Z1 and motor proteins is then examined. There is a known interaction between spastizin and KIF13A and expression of a dominant negative KIF13A protein reduced lysosomal tubulation. The authors then demonstrate an interaction between AP5Z1 and the p150Glued dynein/dynactin complex member, then showed that expression of a dominant negative p150Glued protein reduced lysosomal tubulation.
      7. Finally, the authors demonstrate the relevance of these findings to neurons, the target cells of hereditary spastic paraplegia, by showing that lysosomal tubulation and axonal transport are reduced in mouse neurons lacking spatacsin, and that depletion of UBR4 or AP5Z1 affected these as expected from the experiments above.

      Major comments:

      Overall I believe that the key conclusions of this paper are generally convincing and that the work is of high quality. However, I do have some reservations:

      1. The localisation of spatacsin on the ER. It is always difficult to be convinced about colocalization of a diffuse punctate marker and the ER. From the STED experiments in figure 1, while it definitely seems that there is some spatacsin on the ER, there also appear to be some spatacsin puncta that are not. I'd like to know if these puncta represent lysosome-associated spatacsin. This is important for interpretation of the subsequent experiments (see point 3 below). I also think quantification of this co-localisation will increase confidence in the results. In addition, a caveat of the immunofluorescence studies is that they use over-expressed spatacsin. I appreciate that there are no good antibodies to endogenous spatacsin, but I don't think this limitation is sufficiently acknowledged. As the claim of ER-localisation is critical for the proposed mechanistic model, and in the absence of experiments with endogenously tagged spatacsin, this makes the biochemical fractionation studies of figure 1C very important. To make these more convincing I would prefer to see additional control markers to verify the separation of lysosomal and ER compartments - e.g. lamp1, lamp2, an ER tubular marker such as a REEP5 or a reticulon.
      2. The authors generally do a good job of quantifying their results. However, this is lacking for the biochemical experiments (immunoblotting and IP) in figures 4 and 5, and I would prefer to see these quantified (the quantification should include data from repeat experiments so that we can judge the reproducibility of the results).
      3. On page 10, referring to the proximity ligation results, the authors comment: "This suggests that the spatacsin-spastizin interaction occurs at contact sites between the ER and lysosomes to allow spastizin recruitment to lysosomes". I'm not sure this statement is fully supported, as mentioned at point 1 above it is possible that some steady state spatacsin is at lysosomes. To fully support this, we'd need to see the PLA signal also convincingly co-localise with an ER marker.
      4. In figure 6C and D the effect of spastizin on lysosomal tubulation and dynamics is investigated. Wortmannin treatment is used to do this, as it is known to remove spastizin from lysosomes. However, this is a very indirect manipulation that could have many other consequences and it would be better to demonstrate this directly by showing the effect of depletion of spastizin on lysosomal morphology/dynamics. I also think the role of AP5Z1 in tubulation/dynamics would be better supported with additional experiments to deplete the protein - at present only over-expression is examined.
      5. While the experiments showing that over-expression of dominant negative forms of KIF13A and p150Glued affect lysosomal tubulation/dynamics provide good circumstantial evidence that spatacsin influences these lysosomal properties via its interactions with spastizin and AP5Z1 (which bind to these motor proteins), the authors have not shown that the interaction of the motor proteins with spastizin and AP5Z1 is required for this ability to regulate lysosome tubulation/dynamics. This means that the model presented in figure 7 is not fully supported by the data. If the authors have been able to map the binding regions for these interactions then perhaps this could be investigated with rescue experiments, although I appreciate that this is potentially a major piece of work and perhaps outside the scope of this paper. An alternative would be that the authors acknowledged this part of the model as somewhat speculative.

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes

      • Are the experiments adequately replicated and statistical analysis adequate?

      In general I am not convinced that the statistical tests are applied rigorously in this paper. Most experiments are done three times, but the "n" used for statistical testing is typically chosen as, e.g. the number of cells, number of lysosomes, rather than number of biological repeat experiments. This means that inter-experimental variability is not rigorously taken into account. A more rigorous practice would be to use the mean measures for each of three biological repeats and apply the statistical tests to the three means, so n=3 if three repeats were done. Superplots would be a nice way to graphically display these data.
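
      The replicate-level analysis suggested here can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis: the data values and the function names `replicate_means` and `paired_t_statistic` are hypothetical, and a paired test is used on the assumption that control and knockout cells were measured side by side in each biological repeat.

```python
from math import sqrt
from statistics import mean, stdev


def replicate_means(measurements):
    """Collapse per-cell values to one mean per biological replicate."""
    return [mean(cells) for cells in measurements]


def paired_t_statistic(control_means, treated_means):
    """Paired t statistic computed across replicates, so n = number of repeats."""
    diffs = [t - c for c, t in zip(control_means, treated_means)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1  # t statistic and degrees of freedom


# Hypothetical per-cell measurements: one inner list per biological repeat.
control = [[10.0, 12.0, 11.0], [9.0, 10.0, 11.0], [12.0, 13.0, 11.0]]
knockout = [[7.0, 8.0, 9.0], [6.0, 7.0, 8.0], [8.0, 9.0, 7.0]]

t_stat, dof = paired_t_statistic(replicate_means(control), replicate_means(knockout))
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

      With three repeats the test has only two degrees of freedom, however many cells were imaged, which is exactly the point: the individual cells still appear in a superplot, but they no longer inflate the sample size.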

      Minor comments:

      1. In supplementary figure 3D I cannot honestly say that I see the smaller band.
      2. When first called out, I expected supplementary tables 1 and 2 to show the list of interactors with wild-type spatacsin and spatacsind32-34 respectively, but this is not what they show.
      3. The experiments in Figure 4A are a little problematic in the way that they are called out. The first call refers to just a small subset of the data in the figure, and the figure is then called out at various points later in the paper. This is quite confusing. Is there any way this could be simplified?
      4. The section on page 10: "Spatacsin also interacts with spastizin, and is required to recruit spastizin to lysosomes (Hirst et al., 2021). ........ We hypothesized that spatacsin interaction with spastizin was required for spastizin localization to lysosomes." Is odd, as the authors seem to be hypothesising an observation that they have just said has already been demonstrated.
      5. Can the authors explain why there is so little interaction between wild-type KIF13A and spastizin?
      6. In figure 6G p150Glued signal is also present in the control IP lane, which casts doubt on the specificity of the interaction. Could the authors generate a cleaner result?
      7. I would be interested to see how AP5Z1 expression differs between neurons with and without spatacsin - we would expect similar results to those shown in the MEFs.

      • Are prior studies referenced appropriately?

      Yes.

      • Are the text and figures clear and accurate?

      Yes

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Overall I thought the presentation was good. However, this is a complex paper and anything that the authors can do to simplify the textual descriptions of the experiments would be helpful. There are quite a few long multiphrase/multiclause sentences that could perhaps be broken up or simplified, e.g. I had to read the following three or four times to understand it: "Downregulation of UBR4 that prevented degradation of AP5Z1mediated by spatacsin (Figure 4A) led to higher interaction of spatacsin with AP5Z1 and decreased the interaction of spatacsin with spastizin (Figure 4A)."

      Referees cross-commenting

      Thanks for the opportunity to comment on the other reviews. It does seem that there is a consistent theme that reviewers are concerned about the over-reliance on over-expression experiments and the need for additional experiments using endogenous antibodies or protein depletion methodologies to strengthen the data. In addition, I and at least one other reviewer feel that it is not adequate to use number of cells as the "n" for statistical testing, and that true biological repeats are needed.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      I think this paper represents a significant conceptual advance in our understanding of the mechanisms by which lysosomal dynamics are controlled in non-polarised cells and neurons. In addition, it elucidates mechanisms that may underlie multiple forms of hereditary spastic paraplegia, a hereditary form of motor neuron disease.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      This is a significant conceptual advance on the current literature on spatacsin and on the molecular mechanisms controlling lysosomal morphology/dynamics. The paper elucidates important mechanistic details of the relationship between three key proteins involved in hereditary spastic paraplegia, while also shedding light on the basic biology of lysosomal morphology and dynamics.

      • State what audience might be interested in and influenced by the reported findings.

      Basic cell biologists interested in the ER, in lysosomes, in ER-organelle contacts. Scientists interested in the causation of hereditary spastic paraplegias.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Membrane traffic, lysosome function, ER-endosome contacts, hereditary spastic paraplegia.

    1. One reason may be the emotional investment we all have in language. Language is more than a neutral medium for transmitting a message. It has washed over us like a river continually since birth. We use it constantly. It shapes who we are. Think back to your earliest memories. Can you ever remember a time when you were without language? Identity and language twine about each other so tightly that they are impossible to separate. Children of immigrant families, for example, often associate the language of their home with warmth and strong personal connections, with the deepest, private sense of who they are, in contrast to the formal public language of school and the outside world.

      What I have gathered from this is that language makes us the people we are today. In this context that is correct, because language is the one thing that we as people or social beings can’t be without. It connects us not just to people but to the places and beings around us, and it extends across the globe. It’s very interesting to think about language like this.

    1. Author Response

      Reviewer #1 (Public Review):

      Ciliary length control is a basic and fascinating question in cell biology. Regulation of IFT via calcium is a simple model that can explain length control. In this model, ciliary elongation is associated with an increase in intraciliary calcium level that leads to a calcium increase at the ciliary base. The calcium increase acts to reduce IFT injection and thus the ciliary assembly rate. The longer the cilia, the greater the increase in calcium level and the greater the reduction in IFT injection and thus in the ciliary assembly rate. When the cilia approach the genetically defined length, the gradually decreasing assembly rate eventually balances the constitutive disassembly activity. Cilia then stop elongating and a final length is achieved. This work tested this model by manipulating the calcium level in cilia using an ion channel mutant and treatment of the cells with EGTA. In addition, IFT injection was measured before and after ciliary calcium influx. Based on the outcome of these and other experiments, it was concluded that there is no correlation between changes in calcium level and IFT injection, thus challenging the previous model. This work is well written and the experiments appear to be properly executed. It nicely showed an increase of intraciliary calcium during cilia elongation, and beautifully showed that ciliary calcium influx depends on extracellular calcium. However, I felt the current data are inadequate to support the authors' conclusion.
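
      The balance-point logic of the model summarized above can be made concrete with a toy simulation. This is only a sketch under loud assumptions: the functional form (assembly falling as 1/(1 + c·L) because calcium is taken to scale with length), the parameter values, and the function names are all hypothetical and are not taken from the manuscript's mathematical model.

```python
def simulate_length(a_i0=2.0, c=0.5, d=0.4, dt=0.01, steps=20000, length0=0.0):
    """Euler integration of dL/dt = a_i0 / (1 + c * L) - d.

    a_i0: assembly rate at zero length (assumed units: um per time unit)
    c:    assumed strength of calcium-mediated inhibition per um of length
    d:    constitutive, length-independent disassembly rate
    """
    length = length0
    for _ in range(steps):
        assembly = a_i0 / (1.0 + c * length)  # injection falls as calcium rises
        length += dt * (assembly - d)
    return length


def steady_state_length(a_i0, c, d):
    """Length at which assembly equals disassembly in the toy model."""
    return (a_i0 / d - 1.0) / c


final_length = simulate_length()
predicted = steady_state_length(2.0, 0.5, 0.4)
print(f"simulated: {final_length:.2f}, balance point: {predicted:.2f}")
```

      In this toy version, lowering `c` (weaker calcium feedback) raises the balance point, which is the kind of prediction the channel-mutant and EGTA experiments are designed to test.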

      We thank the reviewer for the positive assessment of the interest in our work, and we have performed additional experiments to address the reviewer's concerns as discussed below.

      The authors showed that ciliary calcium increases along with ciliary elongation, which correlates with reduction of IFT injection. Thus, this result would support that calcium increase reduces IFT injection. To test whether reducing calcium influx would alter the IFT injection, the authors used an ion channel mutant cav2. Indeed, ciliary calcium level in the mutant cilia appears to be lower on average compared to the control. After measuring ciliary calcium level and IFT injection during ciliary elongation with mathematical analysis, it was concluded that reducing ciliary calcium level did not lead to increased IFT injection, which is distinct from the control cells. Thus, the authors concluded that calcium does not act as a negative regulator of IFT injection. However, if one examines the calcium flux in Figure 3B and IFT injection in Figure 4B of cilia less than 6 micron, one may draw a different conclusion. For the mutant cilia, the calcium influx is higher than that in control cilia and IFT injection is reduced compared to the control. Thus, this analysis points to the opposite of the authors' conclusion, and supports the previous model. There is a rapid change in ciliary assembly rate at the early stages of ciliary assembly (see Figure 1C); thus, the changes in calcium influx and IFT injection in the earlier assembly stage would be more appropriate to assess the relationship between intraciliary calcium level and IFT injection.

      We thank the reviewer for raising this issue, which led us to examine the data more carefully. In looking at the numbers of cells with flagella in each length range, we became concerned that the apparently low calcium influx in shorter flagella in control cells compared to ppr2 or EGTA treatment might actually be due to bias from technical issues: it is relatively difficult to image shorter flagella in our TIRF imaging setup, because shorter flagella have less flagellar surface area to attach to the coverslip. The more motile the flagella are, the more likely the cells are to detach when their flagella are short, because the bending force of the flagella is strong enough to pull them away from their small area of adhesion. This effect is much stronger in control cells than in either the ppr2 mutants or EGTA treated cells, whose flagella are less motile. This led to a reduced number of cells examined with flagella shorter than 6 um (17 versus 34 for control and ppr2 cells, respectively). To overcome these difficulties and the resulting bias, we observed more flagella in control cells. The new data have now been integrated with our previous data and are shown in Figure 3. The new result shows that calcium influx in control cells is in fact higher than in the ppr2 mutant cells. So, our result remains consistent with our conclusion, and we believe that it is not useful to analyze the shorter flagella separately.

      The authors used EGTA treatment to support their conclusion. However, EGTA treatment may induce a global calcium change of the cell, the outcome may not reflect actual regulation of IFT injection by ciliary calcium influx. For example, as reported elsewhere, the change of cAMP level in the cell body and cilia has a different impact on ciliary length and hedgehog regulation. The slower assembly of cilia in EGTA treated cells may be caused by many other factors instead of sole regulation by IFT.

      It is certainly possible that EGTA is affecting some process inside the cell that then indirectly affects IFT. Our experiments cannot rule this out. The fact that similar effects are seen with the ppr2 mutant argues against this idea, but again cannot rule it out. We have added the following caveat to the discussion:

      "Other calcium dependent processes in the cytoplasm might also potentially address IFT, and our results cannot rule out this possibility. However, we note that the ppr2 mutant also fails to show the effect on IFT or regeneration predicted by the ion current model."

      The authors only examined the impact of reducing ciliary calcium influx. To further support the authors' conclusion, it is recommended that the authors should examine IFT injection in a condition where ciliary calcium level is increased. Using calcium ionophore may not be a good choice as it may change the global calcium level. One approach to consider is using mutants of a calcium pump present in cilia.

      We thank the reviewers for this suggestion. The calcium current model would predict that if a calcium pump mutant failed to export calcium, the increased calcium building up inside the flagellum should lead to decreased IFT entry and a shorter flagellar length. We found at least two calcium pumps in the published Chlamydomonas flagella proteome (Pazour et al., 2005) and ordered several mutant strains from Chlamydomonas Library Project (CLiP) which are annotated as affecting these pumps. We measured the flagellar length of these potential calcium pump mutant strains, but none showed a statistically significant difference in length relative to control cells. We have now included this data as Figure S4. Because no length change was observed, we did not perform the extremely time consuming process of constructing strains that contain these mutations along with DRC4-GCaMP and KAP-GFP.

      As an alternative strategy to get at this reviewer's suggestion, we measured DRC4-GCaMP and KAP-GFP intensity in 1 mM CaCl2 treated flagella and found that CaCl2 treatment increases both the flagellar calcium level (Figure 3, see below) and IFT injection (Figure 4). This increase in IFT injection is the opposite of what the calcium current model predicts.

      Based on these results, we think the calcium pump experiment is not necessary for the following reasons. 1. These calcium pump mutants might not increase the flagellar calcium level. 2. Even if the flagellar calcium was increased in these mutants, it does not affect the flagellar length and thus our conclusions would still hold. 3. These mutant strains might still have functional calcium pumps since the existing data on calcium pumps in flagella is likely to be incomplete. 4. The CaCl2 experiment clearly increased the calcium level inside flagella, directly addressing the reviewer's point.

      The conclusion on lines 272-273 may need more evidence. The authors showed that addition of 1 mM CaCl2 does not change ciliary assembly, and used this as one piece of evidence to argue against the ion-current model. The addition of calcium extracellularly may not alter intracellular/intraciliary calcium level given that cells have robust systems to control calcium homeostasis. To support their conclusion, the authors should measure the changes of calcium level in the cell/cilia or revise the conclusion.

      We have now performed these measurements and have included the data in Figure 3D.

      The authors showed nicely the changes in IFT properties before, during and after ciliary calcium influx and found that the intensity and frequency of IFT do not correlate with calcium influx, though calcium influx restarts paused IFT trains for retrograde transport as previously reported (Collingridge 2013). The authors again concluded that this supports their conclusion that there is no correlation between IFT injection and calcium influx. However, I am not sure whether the short pulses of calcium influx at one time point would change the calcium level in the whole cilium in a significant way that would alter IFT injection at the ciliary base.

      We agree that individual pulses might not have an effect on the average level of IFT injection. We were specifically trying to see if, having previously ruled out the predicted correlation at the level of average rates, there might still be a trace of the correlation for individual events.

      Reviewer #2 (Public Review):

      The authors use a genetically encoded calcium indicator to measure Ca in flagella to establish that Ca influx correlates with flagellar length. (Despite this correlation, there is so much noise that it is dubious that Ca level can regulate the flagella's length.) Then, they show that reduced Ca decreases the rate of IFT trains entering flagella, which ruins the ion-current model of regulating flagella's length. (Ca can still be one of the factors that sets the target length.) Ca does not seem to change the disassembly rate either. There are also no correlations between Ca influx spikes and IFT injection events. Curiously, these spikes broke pauses of retrograde IFT trains, but that still did not affect IFT entry dynamics.

      Some other possibilities like Ca regulating unloading rates are discussed and convincingly rejected.

      The study ends with an interesting Discussion, which talks about other possible models, and concludes that the only model not easily rejected so far is the mechanism relying on diffusion time for kinesins from flagella to the cell body being greater in longer flagella.

      The paper is well written, very thorough, and contains significant results.

      We thank the reviewer for this strong positive assessment.

      Reviewer #3 (Public Review):

      This work by Ishikawa et al. is focused on testing the hypothesis first proposed by Rosenbaum that Ca2+ levels in the primary cilia act as an internal regulator of cilia length by negatively regulating intraflagellar transport (IFT) injection and/or microtubule assembly. The authors first built a mathematical model for Ca2+-based regulation of cilia length through the activity of a Ca2+-dependent kinase. They then tested this model in the growing cilia of Chlamydomonas cells expressing an axonemal-localized GCaMP. Ca2+ levels were manipulated genetically with a calcium-channel-deficient mutant line and with the addition of EGTA. While increases in Ca2+ levels do correlate with cilia length as expected by the model, they found that Ca2+ was positively correlated with IFT injection and with increased axonemal stability, which contradicts its potential as a mechanism for the cell to internally regulate cilia length.

      Overall the conclusions of the paper are supported by their data. They greatly benefit from first establishing their model in a clear form and then experimentally interrogating the model from multiple angles in order to test its viability. The importance of cilia length to our understanding of human health has only become greater in recent history and the authors are making a significant contribution to our understanding of ciliary length regulation.

      We thank the reviewer for this positive assessment, including of the relevance of the model. We have attempted to address all suggestions.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their thorough reading and for providing useful suggestions to improve our manuscript. The reviewers raised two major issues.

      1. Lack of pathophysiological relevance to attract a broader readership: to address this, we have stained brain slices from a PD patient with p129-Syn and Lamin B1 antibodies. Microscopy images show extensive lamina damage in the patient brain slices that contain p129-Syn positive inclusions. These images are now included in the current revision of the manuscript as Fig. 6C-D. We think that these results in a pathologically relevant system establish a connection between lamina defects and neurodegeneration in PD and will be attractive to a broader audience.

      2. Experimental issues raised in the major and minor points: the majority of the points have been addressed in the current revision attached herewith. Given the opportunity to submit a full revision, we shall incorporate more experiments to address all the points in the final revised manuscript.

      Point by point response to reviewer’s concerns:

      Reviewer 1

      R1: The work by Mansuri and collaborators reports that LB-like filamentous inclusions of α-Synuclein are able to associate with and perturb the nuclear lamina due to an unbalanced mechanical tension between cytoskeleton and nucleoskeleton. Consequently, lamina-injuries are proposed as a major driver of proteostasis sensitivity in cells with LB-like Syn-IBs.

      It is a complex work, in which a range of different cellular, biochemical and molecular techniques have been used. Readers of the paper (including the undersigned) will be wondering if a similar behaviour occurs in pathological systems, such as iPSC derived dopaminergic neurons arising from patients carrying the synuclein pathological mutations reported in this work.

      Response: We thank the reviewer for pointing out the lack of pathophysiological relevance in our manuscript. To address this, we imaged post-mortem thalamus sections of a Parkinson’s Disease (PD) patient (BioChain Institute Inc., USA Cat# T2236079Par) and a control (BioChain Institute Inc., USA Cat#T2234079). Our experiments clearly show extensive lamina deformities in the patient brain (Fig. 6C-D) and connect them with neurodegeneration in a pathological system.

      Major points

      R1: Authors should explain why there is such a high amount of p129-Syn in unseeded neurons (Fig. 1Ai, Fig. S1Bi): "p129-Syn was distributed throughout the neuron cell body and projections including light staining in the nucleus", as its accumulation is typical of PD-like α-syn aggregates. Similarly, unseeded neurons labeled with p129-Syn in Fig. 1Ai, Fig. 1Bi, Fig. S1Bi and Fig. S1Ci are very different from each other. Why? As neurons are unseeded, the pathological signature of PD-like α-syn aggregates should be very low or absent in all cases.

      Response: We agree with the reviewer that only a very low amount of p129-Syn should be present in unseeded neurons. We standardized microscopy parameters using fields that contained neurons with both large LB-like perinuclear IBs and smaller peripheral Syn-filaments. We used a Leica SP8 confocal microscope. Argon laser power was kept constant at 30% of full potential while Smart Gain was titrated to visualize the smaller filaments. For example, the smaller filaments were not clearly visible in Annexure Figure 1Ai when Smart Gain was 690V. Smaller filaments were prominent when the Smart Gain was increased to 848V (Annexure Figure 1Aii, included with the revision plan attached herewith). We also observed light intra-nuclear staining of p129-Syn at 848V Smart Gain when we zoomed in on the nucleus indicated by the arrow in Fig. 1Aii (shown below as Annexure Figure 1Aiii). Accordingly, we used Smart Gain of 650-850V in all the images presented in the manuscript. Brightness and contrast are now adjusted for all the images prepared for the revised manuscript for the optimum view of the immunostaining. All the raw image files will be submitted to https://www.ebi.ac.uk/biostudies in due course.

      In order to rule out imaging artefacts at the higher Smart Gain (650V-850V), we performed a control experiment without adding primary antibody against p129-Syn during immunostaining. Secondary antibodies were added and the Smart Gain was ~950-1000V during imaging. The light staining of p129-Syn visible in Fig. 1Ai and 1Bi in the revised manuscript was not visible in this experiment (Annexure Figure 1B).

      A table indicating the Smart Gain for all the images is included in the revised manuscript as Methods Table S5 - Laser Intensity.

      Reviewer 1 has also pointed out the difference in staining of p129-Syn in Fig. 1Ai and Fig. 1Bi. For Fig. 1Ai, a rabbit monoclonal antibody (p129-Syn, MJF-R13 (8-8), epitope: phosphoserine 129, cat# ab168381) was used, and for Fig. 1Bi a mouse monoclonal antibody (P-syn/81A, epitope: phosphoserine 129, cat# ab184674) was used. This information is now included in the figure legends. The difference in the staining pattern is due to the use of different primary and secondary antibodies.

      Lastly, we want to emphasize that the staining pattern seen in unseeded neurons is not that of typical PD-like Syn-aggregates but of soluble p129-Syn that is yet to be incorporated into the amyloid filaments. The p129-Syn (antibody MJF-R13 (8-8)) staining pattern in 1Ai is continuous in the projections and lightly dotted in the periphery and inside the nucleus. These dots also accumulate on the Microtubule Organizing Centre (MTOC), indicating the presence of aggresome-like inclusion bodies in the neurons. The staining pattern in 1Bi (antibody P-syn/81A) is dotted throughout. In both cases, the continuous or dotted staining was not observed after seeding. The continuous staining at the projections seen in 1Ai is broken into smaller filaments in 1Aii (indicated by arrowheads). The broken filaments are greatly increased in number and length in Fig. 1Bii, and the staining intensity is prominently increased. Accumulation of multiple larger filaments into perinuclear LBs is typically PD-like (Fig. 1Bii, yellow arrowhead).

      The continuous staining and the broken staining patterns at the projections are also visible in the zoomed out MIP images presented in S1Bi and ii, respectively. The increase in fluorescence intensity of p129-Syn staining is prominent between S1Ci and ii indicating accumulation of p129-Syn in the form of large amyloid filaments in seeded neurons.

      We now discuss the staining patterns in the revised manuscript. Please see pages 4-7.

      R1: Authors should try to perform a more accurate quantification of the various colocalizations reported throughout the manuscript, e.g. by reporting the Pearson correlation coefficient or the Manders' overlap coefficient.

      Response: As suggested by the reviewers, Pearson’s co-localization coefficient values have been added separately for all figures showing co-localization in Supplementary note: Colocalization figures and table.
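
      For orientation, the two coefficients named by the reviewer can be sketched on paired pixel intensities from two channels. This is only a minimal illustration under stated assumptions, not the authors' analysis pipeline: the function names and the intensity values are hypothetical, and real images would first need background subtraction and a region of interest.

```python
from math import sqrt


def pearson_coefficient(ch1, ch2):
    """Pearson correlation of two equal-length lists of pixel intensities."""
    n = len(ch1)
    mean1 = sum(ch1) / n
    mean2 = sum(ch2) / n
    # Covariance and variances are left unnormalised; the 1/n factors cancel.
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(ch1, ch2))
    var1 = sum((a - mean1) ** 2 for a in ch1)
    var2 = sum((b - mean2) ** 2 for b in ch2)
    return cov / sqrt(var1 * var2)


def manders_overlap(ch1, ch2):
    """Manders' overlap coefficient: sum(a*b) / sqrt(sum(a^2) * sum(b^2))."""
    numerator = sum(a * b for a, b in zip(ch1, ch2))
    denominator = sqrt(sum(a * a for a in ch1) * sum(b * b for b in ch2))
    return numerator / denominator


# Hypothetical intensities for the same five pixels in two channels.
red_channel = [10, 20, 30, 40, 50]
green_channel = [12, 18, 33, 41, 48]

print("Pearson:", pearson_coefficient(red_channel, green_channel))
print("Manders overlap:", manders_overlap(red_channel, green_channel))
```

      Unlike the Pearson coefficient, the overlap coefficient does not subtract the channel means, so it ranges from 0 to 1 for non-negative intensities and is more sensitive to background signal.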

      Minor points

      R1: In Fig. S1B the red fluorescent signal arising from γ-tubulin staining is not visible in the merged picture.

      Response: Fig. S1B shows the zoomed-out MIP images of Fig. 1A. γ-Tubulin stains the centrosome as tiny dots at the perinucleus in one of the z-sections of the MIP. To visualize these tiny dots in the MIP images, we have 1) optimized the brightness and contrast of the MIP images and 2) provided a separate channel for γ-tubulin (arrowheads). These corrections are included in the revised version.

      R1: Page 6: results of Fig. S1D-E should be explained properly (CALNEXIN and CMX-Ros staining).

      Response: As suggested, we revised this part on page 7.

      R1: Fig. 2A: the indication of SNCA in western blotting is not proper, as in this experiment you evaluated the protein level, so it is better to report "α-syn";

      Response: We agree with the reviewer. SNCA in western blots has been changed to α-Syn in all the figures and figure legends.

      R1: Fig. S2B: there is great variability in the number of SNCA(A53T)-EGFP and SNCA(DM)-EGFP cells with IBs during the course of PFF-incubation, so that the authors did not find any significant difference. I think it is not completely correct to emphasize this data at page 9, lines 12-13;

      Response: We agree with the reviewer that the difference in the number of SNCA(A53T)-EGFP and SNCA(DM)-EGFP cells with IBs was not statistically significant. Yet, we always observed aggressive biogenesis of LB-like IBs in SNCA(DM)-EGFP cells. The statement in the manuscript is now corrected as per the reviewer’s suggestion (Page 9).

      R1: Did the authors observe any cytotoxicity upon Congo Red treatment at the indicated concentrations (Fig. S2G)?

      Response: Previously, Congo Red incubation was found to be non-toxic for neuronal cells even at 350 µM (PMID: 7991613). We have now performed MTT assay after Congo red treatment in our cells. The graph is now included as S2H. We did not observe any difference in cell viability even after treating the cells with the highest dose (100 µM) used in the experiment.

      R1: I have concerns about the percentages reported in Fig. S2G: the percentage of cells with filaments in the absence of Congo Red is apparently too low as compared to the previously reported percentages.

      Response: The reviewer is right. The number of Syn-filament containing cells varies between experiments because of the ‘age’ of the recombinant amyloid seeds, different batches of seed preparation etc. We are repeating this experiment to increase the biological N. Results will be included and discussed in the final revised version.

      R1: Fig. S2G: I also believe that authors should report representative images of cells treated with Congo Red, in which Syn-filament biogenesis is prevented;

      Response: As instructed by the reviewer, the images are included in Fig. S2G.

      R1: Fig. 2Eiii: The stick arrowhead seems to indicate a separate blob that is not so red: authors should consider to show separated channels and not only the merged picture (as in Fig. S3).

      Response: We agree with the reviewer that the blob is not so red. We could not accommodate the separate channels in the main figure because of space constraints. Therefore, we presented the separate channels in Fig. S3A. We are now including the stick arrowhead in Fig. S3A as well.

      R1: Page 10: authors should explain why they performed the LC3 staining;

      Response: Previous reports indicated association of LC3B with α-Synuclein inclusions in neurons (PMID: 21412173, 31375560). Therefore, we also stained our cells with LC3 antibody. The references are now incorporated on page 10.

      R1: Why is the ubiquitin signal pink and not red in Fig. 2i, SNCA(DM)?

      Response: The blue of the DAPI slightly overlaps with the ubiquitin staining at the aggresomes, as these bodies are perinuclear, making it appear pink. Separate channels are provided in Fig. S3E.

      R1: Fig. 3, western blotting: as I previously reported, I think it would be better to write "total α-syn" instead of SNCA. Fig. 3D: it would be useful to explain properly the content of the soluble and insoluble fractions.

      Response: We agree with the reviewer. SNCA in western blots has been changed to α-Syn in all the figures and figure legends.

      R1: Explain in the legend of Fig. 4 what is h2b tdTOMATO

      Response: We thank the reviewer for pointing out the lack of information. This is now included with a reference in the revised manuscript.

      Significance

      R1: Overall this is interesting to read; a lot of data are presented, demonstrating a potentially new phenomenon that would be important to a specialized audience in the field of synuclein misfolding, aggregation and cellular toxicity.

      Response: We have now included immunofluorescence images of post-mortem thalamus sections of a Parkinson’s Disease (PD) patient (BioChain Institute Inc., USA Cat# T2236079Par) and a control (BioChain Institute Inc., USA Cat#T2234079). Our experiments clearly show lamina deformities in patient brain (Fig. 6D). We think that these experiments will highlight the pathophysiological relevance of the manuscript to make it appropriate for a wider audience.

      Reviewer 2

      R2: The present paper, titled "Nuclear-injuries by aberrant dynein-forces defeat proteostatic purposes of Lewy Body-like Inclusions", provides a detailed and compelling study of the formation of SNCA aggregates in the presence of PFFs, of which other proteins play a role in the formation of these inclusions, and of which pathways are the major players. The study provides many well-done experiments that highlight the composition and the formation process of these aggregates. Unfortunately, I think the study is lacking in connecting these events with neurodegeneration. How do all the pathways studied impact the viability and functionality of neurons and other disease-relevant cells like astrocytes and microglia? It is thus a work which mainly focuses on the pathways leading to the formation of inclusions, leaving untouched the question of how this might impact the disease. This does not take away from the value of the findings, but it should be taken into consideration when deciding which journal to submit to.

      Response: We thank the reviewer for the encouraging words and also for pointing out the lack of pathophysiological relevance in our manuscript. To address this, we have performed immunofluorescence experiments with post-mortem thalamus sections of a Parkinson’s Disease (PD) patient (BioChain Institute Inc., USA Cat# T2236079Par) and a control (BioChain Institute Inc., USA Cat#T2234079). Our results show extensive lamina deformities in patient brain (Fig. 6C-D), connecting neurodegeneration in PD with lamina injuries.

      Further, although we found that LB-containing primary neurons and Hek293T cells do not show any loss in cell viability as estimated by LDH and MTT assays respectively (Fig 4A-B), they show sensitivity to additional stresses. LB-like IB containing Hek293T cells were unable to trigger stress response pathways and were vulnerable to heat stress. These results were already included in the earlier version of the manuscript (Fig. 4H-I). We now estimated sensitivity of neurons in presence of additional stress. We have subjected LB-containing neurons and control neurons to heat stress and estimated induction of Hsp chaperones by western blot and quantitative mass spectrometry. Preliminary results (included herewith) indicate that Hsp-upregulation is defective in neurons with LB-like IBs. These results are now included as Figure 4J-M in the attached revised manuscript. Repeat experiments with quantitative mass spectrometry will be included in the final revision.

      R2: I have a few suggestions for each figure which will not take much time, energy or expense but that would overall make the paper easier to read and digest.

      R2: Fig 1: quantification of aggregate dimension, number and colocalization score with p62 (Pearson)

      Response: The co-localization score with p62 is included in the current revision (Supplementary note: Colocalization figures and table). Quantification of aggregate dimension, number etc. in neurons has already been documented by Mahul-Mellier et al. (PMID: 32075919). We are following the same protocol and therefore did not repeat the counting for neurons. However, if the reviewer thinks that it is mandatory, we shall do that and include it with the full revision.

      R2: Fig 2: aesthetic comment: the way to read the figure should be consistent throughout the figure. The panels should be assembled either all vertically or all horizontally.

      Response: We tried. We find that the current organization is the best fit to accommodate all panels.

      R2: Fig 3: 3E better to put an image without nocodazole to visualize the difference

      Response: The control image is now added in Fig. 3E.

      R2: 3D probe WB also for SNCA

      Response: Sorry for the confusion. The western blots in 3D are probed for both total Synuclein and p129-Syn. As suggested by the first reviewer, we have also changed SNCA to α-Syn, which indicates the total Synuclein protein level.

      R2: 3K this WB needs quantification to backup the statement made

      Response: We are repeating this experiment. Results will be included and discussed in the final revised version.

      R2: 3I check the - and + for PFF and doxy. I believe they are wrong

      Response: We have rearranged the figure. The scheme in Fig. 3I (now Fig. 3H) is correct but we have made it simpler to avoid confusion.

      R2: Fig 4: missing IF of peri nuclear IBs with HS

      Response: The images are now included as Fig. S4E and discussed on page 19.

      R2: Fig 5: quantification of H2BTdTom exit from the nucleus

      Response: We performed this experiment as supporting evidence of the nuclear damage in the presence of LB-like IBs. We have quantified the damage in Fig. 5A and D. We have also performed quantitative mass spectrometry to show nuclear entry of associated organelle proteins (Fig. S5G). We think that quantifying the H2BTdTom exit will not add significant value to the manuscript.

      R2: Fig 6: some neurons with large PFF seem very unhealthy. Is it possible to quantify neuronal viability, maybe not with MTT, which is not suited for single-cell analysis?

      Response: The reviewer correctly pointed out that neurons with large LB-like IBs seemed unhealthy, which was confirmed by γH2AX staining indicative of extensive DNA damage in Fig. 6B.

      R2: maybe it would be nice to have a WB with soluble and insoluble SNCA and p129 with ciliobrevin D with and without PFF. Ciliobrevin D might also impact degradative systems as demonstrated by the EHNA compound (PMCID: PMC5584856).

      Response: We performed the dynein experiments to determine the role of cytoskeleton-nucleoskeleton tension in the lamina injuries in LB-like inclusion containing cells. However, we think that the reviewer has correctly pointed out that dynein may have a direct role in degrading Synuclein by either autophagy or the proteasome. Given that the results of the suggested experiments are not going to change the final conclusion of the manuscript, we propose to limit ourselves to discussing this possibility and citing the paper in the current revised version of the manuscript (page 29).

      Significance

      R2: As already stated above, the experiments are correctly performed and the evidence is well presented and demonstrated. However, the realm that this paper falls into is not neuroscience. The aim of this paper is to study the formation of inclusions regardless of their impact on disease-relevant cell type functions. The presented experiments are numerous, and even though the message is pretty clear, some figures might be too crowded to correctly convey the message (see fig 3). Some of these findings, even though with much less detail, were already suggested by other papers (PMCID: PMC5584856) in which the importance of dynein was studied in the context of the communication between autophagy and proteasome. I think adding this angle with a few experiments might add a little bit more relevance, but it is also true that this paper already has a lot of data.

      Response: Thank you very much for the encouraging comments.

      R2: The type of audience for this paper is, I think, a very specialized audience which is interested in molecular mechanisms of inclusion formation and protein-protein interaction. As a final statement, the paper is beautifully done and is relevant but it lacks the translational angle.

      Response: We again thank the reviewer for reminding us of the lack of pathophysiological relevance. We have now included microscopic images of brain slices of PD patients with extensive lamina defects (Fig. 6D) and think this will attract a broader audience.

      R2: My field of expertise is neuroscience. I have expertise in biomolecular techniques as well as cellular techniques to study neurodegenerative diseases.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In this manuscript, the authors investigated the role of AF10, a subunit of DOT1L histone-methyl transferase complex for writing H3K79me1-2-3 marks, in cellular reprogramming. Using siRNA-mediated knockdown and chemical inhibitors, the authors show that AF10, and DOT1L as a whole, are inhibitory to reprogramming of mouse embryonic fibroblast cells (MEF) to induced pluripotent cells (iPSC), suggesting that AF10 plays an important role in determination and changes in cell lineages. The authors also show that this effect of AF10 is not transcription mediated. Based on their ChIP experiments of H3K79me1,2,3 and RNA Pol II, the authors claim that the effect of AF10 is mediated by "changes in epigenome circuitry".

      Major comments

      1. The claim that AF10 and DOT1L inhibit reprogramming of MEF to iPSC is largely supported by the authors' experiments. Mostly, the authors used expression levels of NANOG as a mark for pluripotency. While it is a well-documented mark, an orthogonal mark (such as colony morphology, embryoid bodies, etc.) will increase the rigor and confidence. This is especially important in the context of testing something like the DOT1L complex, which plays an important role in transcription.
      2. The data presented here largely support the claim that the AF10-mediated effect is not through transcription.
      3. The authors' final model, "negative feedback by RNA-PolII recruited DOT1L leading to ESC-like state", is not supported by the data presented here.
        • For example, at line 295, the authors say that H3K79me1 pattern in ΔAF10 "resembles the H3K79me1 found in ESCs which are much more TSS-enriched for this modification compared to MEFs." However, the data in 5H show that the pattern in ESC matches more with AF10 fl than ΔAF10.
        • At line 299, "given that AF10 deleted cells retain H3K79 methylation..". This statement strongly contradicts the data in 4B, 4C, 5G and 5H, where it is shown that deletion of AF10 leads to substantial loss of H3K79me1,2.
        • While the authors showed there are changes in H3K79 methylation pattern upon AF10 deletion, its link to changes in iPSC reprogramming is not shown. The Pol II occupancy data, shown for WT MEFs and ESC, do not support any part of this claim. Even further, there is no evidence for changes in Pol II occupancy levels upon AF10 deletion.
      4. How do authors reconcile that there is increased expression of AF10 in pluripotent cells (Fig. 1A and 1B) although it inhibits pluripotency?
      5. Line 341, "We do not find any evidence that H3K79me2 opposes spreading of H3K27me3 in reprogramming to iPSCs" seems to be an over-interpretation. The experiment just shows that inhibition of PRC does not change global H3K79me2 levels. A direct role of H3K79me2 on H3K27me3 is not tested here.
      6. Fig S1D shows that deletion of AF10 can have additional effect to inhibition of DOT1L. This is in contrast to most of the main figures, especially, fig 1E. Some comment about this discrepancy is warranted.

      Minor comments

      1. It might help the reader if authors put a schematic of reprogramming regimen for Fig. 1A.
      2. At line 146, the authors' inference " ΔAF10 is estimated to contribute about 40% of the DOT1Li phenotype in reprogramming" is not clear. It may help the reader if more information is provided for their analyses and interpretation.
      3. Line 324, a typo: it should be "AF10"
      4. Line 456, It might be better for readers if the authors report whether and how RT-qPCR was normalized to housekeeping genes etc.
      5. Line 582, It is not clear at what step human cells were spiked in. The type of human cells should also be reported.
      6. At many places (e.g. Fig 1E, Fig S3D) authors seem to have used multiple t-tests. Please consider using something like ANOVA to avoid multiple t-test error.
      7. Fig 1E. It is commendable that authors show factor independent reprogramming. It will be helpful for readers if authors show number of days for OSKM-dependent and OSKM-independent growth in the schematic.
      8. Fig S1C is not clear as such. Please add more information in the figure or legends.
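
      Minor comment 6 can be made concrete with a small sketch. Assuming m independent comparisons each tested at level alpha, the chance of at least one false positive is 1 - (1 - alpha)^m, whereas a one-way ANOVA tests all groups at once with a single F statistic. The function names and the group values below are hypothetical and are not taken from the manuscript; obtaining a p-value would additionally require the F distribution from a statistics package.

```python
def familywise_error(alpha, m):
    """Probability of at least one false positive in m independent tests."""
    return 1 - (1 - alpha) ** m


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical measurements for three conditions, three replicates each.
conditions = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]

print("Family-wise error for 3 t-tests at 0.05:", familywise_error(0.05, 3))
print("F, df_between, df_within:", one_way_anova_f(conditions))
```

      If the omnibus F test is significant, a post-hoc procedure with multiplicity correction would normally follow to identify which groups differ.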

      Referees cross-commenting

      With regards to reviewer1's comments: I particularly agree with major points 1 and 2 that authors' current model regarding feedback regulation needs more evidence. The technical concerns regarding ChIP normalization, esp. point 5, are also well-warranted.

      With regards to rev3's comments: The major concern about another similar study is well-warranted. The authors may want to explicitly compare and contrast their key inferences with the other study.

      Significance

      The present work provides good evidence that AF10-mediated H3K79me can contribute to cellular reprogramming independent of steady-state mRNA levels. However, I think that the manuscript falls short of providing the basis for it. The claim that it is through subtle changes in H3K79me patterns seems nebulous and unsupported by the data presented here. If the manuscript finds the mechanistic basis for AF10's role in cellular reprogramming, it will be of interest to readers in general epigenetics as well as clinical fields that use histone methyl transferase inhibitors for treating leukemia.

      I am not an expert in the field of cellular reprogramming; so, I may not be able to judge the merits or caveats of authors' reprogramming methods and analyses.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2022-01682

      Corresponding author(s): Peter Keyel

      1. General Statements

      We thank the reviewers for their thorough and critical analysis of our manuscript. We have addressed most of the concerns and questions with our revised version. To address the remaining concerns, we plan to perform two lines of experiments: aerolysin sensitivity of dysferlin null C2C12 muscle cells and aerolysin sensitivity of ESCRT-impaired cells. When these experiments are complete, we believe the revised contribution will provide important novel insights into membrane repair that will appeal to a broad audience.

      Reviewer comments below are in italics.

      Description of the planned revisions

      Reviewer 1

      Major

      In order to show that patch repair is indeed protecting cells against aerolysin, the authors should disrupt patch repair of the cells under study and observe an increased toxicity.

      Reviewer 2

      Major

      *1. The effect of dysferlin overexpression does not indicate that patch repair is a protective mechanism or that dysferlin plays a significant role in aerolysin resistance. The authors should knock out dysferlin and assess cell resistance to lysis. *

      Reviewer 3

      Significance

      The work presents a foundation to further investigate into the mechanism of aerolysin function, following the discovery of the role of extracellular Ca2+ in its activity. As aforementioned, the role of dysferlin in resisting aerolysin also has potential, but the limitations of this work were discussed including the absence of performing a dysferlin knockout, although performing this experiment may help to strengthen the current finding.

      We agree with all 3 reviewers that a dysferlin knockout will complement our gain-of-function studies and this will strengthen the manuscript. We plan to challenge C2C12 myocytes that express control shRNA or dysferlin shRNA with toxin and determine their sensitivity.

      We chose this system instead of targeting a patch repair protein in HeLa cells for 3 reasons. First, it will provide the corresponding loss-of-function experiment to match the gain-of-function experiments we have already done. Second, other patch repair proteins work redundantly with other proteins, complicating their knockdown and/or their disruption may interfere with lipid/protein transport. Finally, dysferlin null C2C12 cells are commercially available, so other groups will have an easier time replicating our results.

      Reviewer 1

      Significance

      *and in the statement that a cellular process that has been artificially introduced in the experimental system is the cellular protection mechanism against aerolysin attack. In order to prove that this process is a bona fide protection mechanism, the authors should show that it is present without the need of overexpressing a protein that is not expressed at all either in the used cell line (HeLa), or in the natural cellular target of aerolysin (epithelial cells). The significance of the proposed protection mechanism is therefore questionable. *

      We plan to address this concern by using C2C12 muscle cells that have and do not have dysferlin. Muscle cells are natural cellular targets of Aeromonas during necrotizing soft-tissue infections.

      Reviewer 2

      Major

      *2. ESCRT complex was shown to play a role in plasma membrane repair following mechanical damage or perforin treatment of cells (Jimenez 2014, and Ritter, 2022). Whether ESCRT is important in aerolysin pore repair can be assessed by knocking out the Chmp4b gene or overexpressing dominant-negative mutant of VPS4a, E228Q. *

      We plan to use a previously characterized (Lin 2005 PMID: 15632132) inducible system (TRex cells) to express the dominant negative VPS4b E235Q in cells. We plan to pulse cells for 2 h with 1 ug/mL doxycycline one day prior to the assay. This pulse time and dose strikes a balance between cell death due to non-functional ESCRT, and compromising ESCRT function. Then we will challenge parental cells (TRex) or TRex cells expressing VPS4b E235Q with toxin and measure lysis. We also plan to compare plus/minus doxycycline as a further control. We will also use fluorescent toxins to compare binding across cell types.

      One caveat on the ESCRT work is that ESCRT has an essential role in MVB formation, and ESCRT effects might be due to perturbation of protein/lipid flux through this system in addition to their recruitment to the plasma membrane. Even with knockdowns and overexpression, it can be challenging to interpret some of the pleiotropic effects of altering the ESCRT complex. While we do not contest the role for ESCRT in plasma membrane repair, we suspect the role for ESCRT will be more complicated than previously appreciated. Exploring these possibilities further than our proposed experiment is beyond the scope of this manuscript.

      Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer 1

      *Major: The authors conclusions contradict established results, which they cite. Yet experimental conditions are not similar in two ways: toxin concentration-wise and toxin treatment duration-wise. *

      We agree with the reviewer that there were differences in experimental design between our study and the other cited studies. Due to the cited differences, our results, Gonzalez et al and Larpin et al are not necessarily contradictory on most points. Our conclusions differ from Gonzalez et al in that we do not think K+ efflux drives repair in the first hour, and differ from Larpin et al in that we observe Ca2+ flux after aerolysin challenge. Along with the toxin variables discussed below, we also discussed the potential cell type differences between the studies that may account for the discrepancy. We have now included these additional differences in our manuscript on line 435 for Larpin et al and lines 423-425 for Gonzalez.

      Our study set out to do something distinct from the prior studies. The prior studies did not compare the efficacy of distinct membrane repair mechanisms to the same toxin because that was not their study aim. Hence, our goal is not to prove the prior literature wrong, but to contribute to a better understanding of the immediate membrane repair events triggered by aerolysin. We argue that the significance of our contribution is this comparative approach to membrane repair, which has not previously been done, and our finding that aerolysin engages distinct, but overlapping mechanisms compared to CDCs. We have updated our significance to better convey our advance, which is explained on lines 99-102, 128, 519-525.

      *While we appreciate the efforts of the authors to standardize the concentration of toxins used based on hemolytic units, we note that the concentrations used are very much higher than in the other studies cited. Indeed, based on table 1, materials and methods, and the various experiments, aerolysin has a LC50 of approximately 200 HU/ml, which corresponds to about 2 ug/ml. This is approximately 200x more concentrated than for example in Gonzalez et al 2011 and Larpin et al. 2021. It makes the validity of direct comparison with those studies questionable. *

      We agree with the reviewer that the toxin concentrations are different from prior studies. This is why we argue hemolytic activity needs to be reported along with toxin mass.
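The relationship between toxin mass and hemolytic activity can be made concrete with a short calculation. The sketch below is illustrative only: it uses the approximate figures quoted in the reviewer's comment (an LC50 of about 200 HU/mL corresponding to about 2 ug/mL) and the 2 to 20 ng/mL range cited for Gonzalez et al 2011 elsewhere in this response. The function names are hypothetical and are not part of any published analysis pipeline.

```python
def specific_activity(hu_per_ml: float, ug_per_ml: float) -> float:
    """Hemolytic units per microgram of toxin protein."""
    return hu_per_ml / ug_per_ml


def mass_for_activity(target_hu_per_ml: float, hu_per_ug: float) -> float:
    """Toxin mass (ug/mL) needed to reach a target hemolytic activity."""
    return target_hu_per_ml / hu_per_ug


def fold_more_concentrated(ug_per_ml: float, other_ng_per_ml: float) -> float:
    """How many times more toxin mass one dose contains than another."""
    return (ug_per_ml * 1000.0) / other_ng_per_ml


# Approximate values quoted in the reviewer's comment (assumed, not measured here).
activity = specific_activity(200.0, 2.0)
print(f"specific activity: {activity:.0f} HU/ug")

# Mass equivalent of a 62 HU/mL dose, assuming the same specific activity.
print(f"mass for 62 HU/mL: {mass_for_activity(62.0, activity):.2f} ug/mL")

# Compare 2 ug/mL against both ends of the 2-20 ng/mL range.
for other in (2.0, 20.0):
    print(f"vs {other:g} ng/mL: {fold_more_concentrated(2.0, other):.0f}x")
```

Because specific activity differs between preparations, the same mass can correspond to very different hemolytic activity, which is the reason for reporting both quantities.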

      One potential explanation for this difference is purification method. We do nickel NTA purification from whole bacterial lysates, instead of from the periplasm. It is possible that the most active aerolysin precipitates early or is otherwise lost in our purification process, which accounts for both the lower toxin specific activity and lack of toxin precipitation during trypsin activation that we observe. To control for impurities, we purified two preps of our aerolysin to >90% purity after nickel beads. However, we did not observe a significant change in specific activity or cytotoxic activity. We interpret this finding to suggest there was a trade-off between improved specific activity due to increased purity and loss of specific activity due to toxin inactivation during the extended purification process.

      We have included a new figure (Fig S10) showing our toxin purification and activity.

      *We noticed that the authors activate pro-aerolysin at high concentration (in the range of 1 to 5 mg/ml) and at room temperature. In our experience, under these concentration, activation leads to immediate oligomerization and massive precipitation. The final concentration of active toxin is thus unknown. *

      When we titrated the trypsin to determine the optimal concentration of trypsin to use, we did not observe oligomerization/precipitation (Fig S10B). If there was precipitation of aerolysin after trypsin treatment, we would expect a difference in cytotoxicity between pro-aerolysin and aerolysin treatment. We did not observe significant differences in cytotoxicity between pro-aerolysin and activated aerolysin (see Figs 1-2). Finally, we measured hemolytic activity on trypsin-activated toxin, so any precipitation would be expected to occur prior to assessing hemolytic activity. Thus, we argue our use of hemolytic activity measured after trypsin activation mitigates this risk.

      * The authors keep their cells in toxin-containing medium for the whole duration of the experiments, typically 45 minutes. This is in stark contrast with 45 seconds to 3 minutes transient exposure to toxin in Huffman et al 2004. *

      We agree this is one of the differences. We also note Huffman et al examined cells 6 or 28 h later. While we ruled out the impact of MAP kinases on membrane repair occurring within 30 min of toxin challenge, we make no claims about their ability to promote cell survival at later time points. We have clarified these differences in the manuscript (line 461).

      *The authors do not report binding and oligomerization assays of the toxins. The only figure showing a western blot (fig. 7) is of low quality and shows unexpected observations. Aerolysin Y221G mutant is expected to bind and oligomerize. Yet, no band is present at about 250 kDa (expected oligomer) or at about 47 kDa (monomer). In addition, in aerolysin lanes (1 and 2) the oligomer is saturated, seems to be covering three lanes, indicating a possible spill-over.*

      We performed binding studies in Fig S3C and Fig S5. For Fig 7, in the original blot, the cell lysate is a wider band than the MV band, but there are only two bands, which remained in their respective lanes. We have now included another independent biological replicate of the aerolysin blot as Supplementary Fig S7D, which shows clear demarcation between cell lysate and MV pellet. This blot was not included in the main figure because in the process of stripping and reprobing for all of the targets, we lost detection of our penultimate targets. We agree with the reviewer that oligomer bands for the Y221G were very faint, and we expected them to be stronger. In the new blot (Fig S7D), some oligomer can be detected. As a result, we are hesitant to risk over-interpreting these findings.

      *Finally, while the patch repair hypothesis is interesting, it is unclear why the authors decided to overexpress dysferlin in cell lines that normally do not express it. Sure, there is a repair phenotype but this phenotype is artificially introduced. Dysferlin is not expressed at all in HeLa cells.*

      One challenge with membrane repair is the difficulty perturbing the system due to redundancies. While loss-of-function experiments are important, gain-of-function experiments also add confidence to the system. The simplest way to perform a gain-of-function experiment is to add a well-known patch repair protein to a well-characterized cell line lacking it. Thus, exogenous expression of dysferlin enables us to test the hypothesis that increasing patch repair enhances repair against the toxins.

      We have included this rationale now in the manuscript, lines 366-369

      *Furthermore, dysferlin is not expressed in epithelial cells, which are the prime target of aerolysin. Why then focus on this protein? *

      We chose dysferlin because it is well-characterized as a patch repair protein, whose defect causes Limb-Girdle Muscular Dystrophy 2B and Miyoshi Myopathy. Additionally, setting up this assay enables future work to probe the role of individual dysferlin domains in patch repair.

      *Minor: The graphic legends should be boxed out to be clearly separated from the data. In Figure 4A, it is mixed up with the data.*

      This has been corrected.

      *Some western blots are saturated, e.g. B-actin in figure 4B. Full blots should be provided.*

      We have added full western blots as requested as Supplementary Figs S11-12.

      *In the methods, aerolysin sublytic dose for HeLa cells is specified at 62 HU/ml. In figure 5C and D, 31 HU/ml kills more than 50% of HeLa cells. This is not compatible.*

      Even when controlling by hemolytic activity, and toxin prep, we find some variability in toxin activity between assays. For the live cell experiments, 62 HU/mL remained sublytic despite the higher activity in the flow cytometry assays. We controlled for death in our live cell imaging experiments, by including TO-PRO. This confirmed the toxin was at a sublytic dose in those experiments.

      We included a new figure S10C to show the variation in LC50 per assay as a function of toxin specific activity. We have clarified that the sublytic dose was for live cell imaging experiments, lines 640-641.

      *Figure 2A and B have quite different LC50 for starting conditions (± 200 HU/ml in A, 600-700 HU/ml in B). Why is it so different? Y-axis has a linear scale in A and a logarithmic scale in B. It would make comparison easier to have the same scale in both panels.*

      We agree there is variability between assays. We note that toxin doses also vary in other manuscripts that report toxin mass. For example, aerolysin varies by 10-fold (2 – 20 ng/mL) between figures in Gonzalez et al 2011. We interpret this variation as a common challenge for toxin studies. We mitigate this challenge by including controls for each assay so the relative change can be assessed. We provide additional transparency by including Fig S10 to show batch-to-batch variability of both our toxin preps and assays.

      We have changed the scale to linear in Fig 2.

      *The letters denoting statistically significant groups are sometimes unclear. For example in Figure 1A and B, PFO belongs to group a and b simultaneously. What does this mean?*

      Samples that share letters are not statistically distinct from each other. In the example cited, PFO is not statistically different from any bar labeled a, nor from any bar labeled b. While confusing at first, the alternative is a mess of stars and bars.

      This has been explained in lines 981-985.

      *In Figure 8, aerolysin has an LC50 in cells overexpressing GFP-Dysferlin of approximately 1700 HU/ml in A and of approximately 400 HU/ml in B. Why is it so different?*

      This is due to inter-assay variation. We include controls for each assay to ensure the trend remains consistent.

      *In Figure S1, it is unclear what the plots « all events » vs « single cells » mean.*

      We have clarified these plots.

      *In the discussion, the authors write « First, survival did not correlate with overexpression, which would be expected if dysferlin acted as Ca2+ sink ». What is meant? GFP-dysferlin overexpression does correlate with survival in Figure 1A.*

      We meant that the extent of Dysferlin expression did not correlate with survival. If Dysferlin acted as a calcium sink, cells expressing 100x dysferlin levels should be more resistant than cells expressing 1x dysferlin levels. If Dysferlin needs to serve a cellular function, the brightest cells may not be more resistant (or even be less resistant due to aggregates, etc). We checked to see if the brightest Dysf+ cells had better survival than the dimmest Dysf+ cells. They did not. However, all Dysf+ cells had better survival than Dysf- cells.

      We have updated the manuscript (lines 496-498) to reflect these changes.

      Significance

      *General assessment: The study strength lies in the several possible protection mechanisms that are tested. The weaknesses lie in the contradictions of the results reported here with established mechanisms, *

      We disagree with the reviewer that findings that contradict previously proposed mechanisms are a weakness for significance. Instead, we argue this is a strength of our study’s significance. Replication of prior studies’ conclusions using distinct experimental conditions is critical for the reproducibility and rigor of the underlying science, and may give new insights into toxin biology. While we acknowledge the differences in approach, these differences narrow the prior mechanisms that may have been assumed to be widely applicable. The finding that they cannot be replicated in our system suggests one or more of the differences between the studies may drive a critical aspect of aerolysin biology. For example, the Ca2+ difference with Larpin et al could be due to a cellular Ca2+ channel present in HeLa cells that is absent in THP.1/U937 cells.

      This distinction is expected to spur additional research in the aerolysin field.

      * Advance: The study contradicts previously established results but the experimental conditions used here are quite different to those used in the earlier studies, which makes the comparison quite difficult. As such it does not really fill a gap. *

      We have rephrased the significance to better convey both the gap our study fills in membrane repair and the advance that it has made. See lines 99-102, 128, 519-525.

      *Audience: The study will be of interest to a specialized audience.*

      Given the emerging broad importance of membrane repair in response to endogenous pore-forming toxins, and the large gaps in the field of membrane repair, we respectfully disagree with the reviewer. We have revised our significance statements to better convey this broad appeal. See lines 99-102, 128, 519-525.

      Reviewer 2

      Major

      *3. I find the optimisation of lysin concentrations and data presentation quite confusing. I eventually understood, what was done, but I feel that the authors should be able to transform the data and plots so these are more accessible to a reader, eg a simple dose/time-response curves would be very helpful in that respect. For example, in Figure S1E, why does aerolysin appear to be less cytotoxic after 24 hrs than after 1 hr. In principle, I would expect to observe an additive effect, i.e. cell death at 1, 3, 6, 12, and 24 hrs should add to 100%; however, if 100% cells die at 500HU/ml, how can more cells die after 24hrs? Or am I missing something in the experimental design/data presentation? *

      We agree that presenting the results from cytotoxicity can be challenging. We use LC50 in the main text because it is easiest to understand. However, we provide all dose-response curves underlying those numbers in the supplemental data. We recently published our approach to assays and data analysis (Haram et al PMID: 36373947) to make it easier to understand.

      In Fig S1E, each time point is a distinct assay. In contrast to the approach suggested by the reviewer, where we read the plate at different timepoints, we used different replicates to generate the time points. As a result, the % will not add to 100. Instead, we observe that the majority of cell death occurs in the first hour. We have clarified our discussion of Fig S1E, lines 154-155.

      At 24 h, it is possible that cell growth interfered with the assay. The plate has a finite surface area. If control cells are confluent near the start of the assay, but toxin-treated cells are not due to cell death by aerolysin, the growth rates may not be equal. Since our focus is on proximal membrane repair events, and not on late signaling events, pursuing this further is beyond the scope of the current manuscript.

      *I also wonder whether using haemolytic units is appropriate (it may well be, if justified), given that the toxins used here have various membrane-binding properties. Wouldn't it make more sense to compare the cytotoxicity using nucleated cells? *

      We agree with the reviewer on the need for standardization, and do compare cytotoxicity using nucleated cells (HeLa). Our first level of standardization is the use of hemolytic units instead of toxin mass. This normalizes toxin activity to the ability to kill human red blood cells, which are widely accepted as having minimal membrane repair mechanisms. This gives us a baseline activity, and allows us to control for toxin impurities/differences between toxin preps/toxins. We prefer cytotoxicity over membrane binding for our baseline because it is a functional assay.

      After this first level of standardization, we compare the cytotoxicity in HeLa cells. This is one reason why the majority of our assays are performed in HeLa cells: we know how they behave at different toxin doses in our hands, the cells are easy to use, and we can standardize assays in the lab. We included HeLa cells as a control in Fig 5 to show the standardization requested by the reviewer. We split Fig 1 up differently to better convey the results.

      *1. The authors use "sublytic" concentrations of aerolysin (64HU) throughout most of the paper, but according to Figure S1C, 50% cells died at that concentration after 1hr, suggesting that when the cells were investigated over a shorter period of time, they were already dying - it's almost like the cells had life support turned off, but still being investigated as though they survived aerolysin treatment. This needs to be clarified or reassessed.*

      We agree with the reviewer that we did not track cell survival beyond 45 min in our live cell imaging assays. We labeled cells as ‘surviving >45 min’ to acknowledge the fact that these cells could have died at 46, 47, 60, or 600 min after the experiment ended. We focused on time points earlier than 45 min because proximal membrane repair mechanisms are expected to have occurred in that time, and had time to complete. We have updated the manuscript on lines 214-215.

      We next considered the reviewer’s excellent point that the cells alive at 30-40 min could be executing a cell death program. If this were the case, then based on our FACS data (Fig S1C), we would predict ~50% of total cells would be dead by 1 h. From Fig 3A, ~35% of the cells died in the first 45 min. From the remaining 65%, we would predict another 15% dying from this programmed cell death pathway, which would be 15/65 = ~25% of the surviving cells. We did not notice 1/4 of the surviving cells behaving distinctly. For example, the large error bars in 3H are due to a range of cell behaviors that we could not easily subgroup. For individual cells (shown in Figs 6 and 7), there is similarly no clear demarcation of 1/4 of the cells. While we see a gap with pro-aerolysin, that is ~1/3 of the cells (not the expected 1/4), and it is not repeated with aerolysin. While we can’t rule out a cell death program contributing to the top or bottom 1/4 of our results, removing the top or bottom 25% of data points would not alter our major conclusions from the live cell imaging. If a programmed cell death pathway that occurs in the 30-90 min range is identified for aerolysin, it would be interesting to see how that pathway changes repair kinetics. However, that would require identification of the death pathway.
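The estimate above is a conditional fraction: of the cells still alive at 45 min, how many would have to die by 1 h to reach the total predicted from the FACS data. A minimal sketch of that arithmetic follows, using only the approximate percentages given in this response. The function name is hypothetical.

```python
def late_death_fraction(dead_by_60_min: float, dead_by_45_min: float) -> float:
    """Fraction of 45-min survivors expected to die between 45 and 60 min."""
    survivors = 1.0 - dead_by_45_min
    late_deaths = dead_by_60_min - dead_by_45_min
    return late_deaths / survivors


# ~50% dead by 1 h (Fig S1C) and ~35% dead by 45 min (Fig 3A), as quoted above.
fraction = late_death_fraction(0.50, 0.35)
print(f"{fraction:.1%} of surviving cells")  # 15/65, i.e. roughly one quarter
```

The exact ratio is slightly under one quarter, so a distinct subgroup of roughly that size is what would be expected in the live cell imaging data if such a death program were active.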

      *1. What effect does the addition of 150mM KCl have on the plasma membrane, trafficking/repair - wouldn't the plasma membrane be depolarised? There were a number of papers by John Cidlowski in mid 2000s, where his team explored the effect of potassium supplementation on apoptosis - this may be worth exploring.*

      We thank the reviewer for suggesting these interesting papers. We have explored these papers, and our understanding of them is as follows. Franco et al 2008 PMID: 18940791 shows that ferroptosis is independent of high extracellular K+. This contrasts with Fas-dependent apoptosis, which is suppressed by high extracellular K+. This is consistent with the Cidlowski group’s other work (eg Ajiro et al 2008 PMID: 18294629) and Cohen’s group (eg Cain et al 2001 PMID: 11553634) showing that apoptotic DNA degradation performs better at low K+, and extracellular K+ interferes with apoptosis. Similarly, other papers have shown that NLRP3-activated pyroptosis can be blocked by addition of extracellular K+. Depletion of intracellular K+ inhibits endocytosis and other vesicle trafficking pathways.

      While these are good papers, they do not directly relate to our K+ finding, which is that blocking K+ efflux via elevated extracellular K+ levels has no impact on aerolysin-mediated killing. Therefore, to stay focused on the repair pathways, we opted not to include these papers to avoid distracting the reader from our key points.

      *1. Figure 3 and accompanied text: it would be more informative to show all the data rather than breaking it down to 45 min. In my view, *

      We have added histograms to show when individual cells died during the assay as supplemental Fig S3E. We used the three bins for the exact reason articulated by the reviewer: we wanted to consider cells that died fast vs slow differently. However, in order to interpret the data, a cutoff of 5 min was chosen as optimal. While we agree with the reviewer that the 5 min death could be dismissed, we presented the data to avoid questions about why we omitted those data.

      *1. I am curious whether EGTA diffuses into the cytosol through aerolysin pores. If so, then unlike BAPTA-am it would affect Ca inside and outside the cell.*

      We agree with the reviewer this is an interesting question. While EGTA might diffuse into the cytosol, its binding properties suggest it would be unsuitable to block cytoplasmic Ca2+ transients (see Nakamura 2019 PMID: 31632263). BAPTA binds to Ca2+ ~40x faster than EGTA, which enables it to capture Ca2+ prior to Ca2+-binding proteins. In contrast, EGTA is thought to be too slow to sequester intracellular Ca2+ before Ca2+-binding proteins. While EGTA might perturb Ca2+ close (

      *Are the authors confident that in the absence of extracellular calcium (EGTA treatment), aerolysin formed the pores at all? Have they looked, for example, at intracellular Na/K, or have any other evidence of membrane disruption? *

      Prior structural studies suggest that Ca2+ is not required for aerolysin pore formation. For example, Iacovache et al (2011) PMC3136475 induce oligomerization with low salt and pH 2+. Cryo-EM from the same group (Iacovache et al 2016 PMID: 27405240), showed pore formation under similar conditions.

      In Fig S3, aerolysin kills in the presence of EGTA at higher concentrations, suggesting that it can form pores when EGTA is present. Also, in Fig 2D, we used Tyrode’s buffer, which was made without Ca2+ or EGTA. We added the indicated amounts of Ca2+ in, and observed a reduction in lysis at low [Ca2+]. This argues against EGTA interfering with toxin oligomerization/pore formation because EGTA was not present, and the toxin still failed to kill.

      We have updated the manuscript (lines 203-205) to emphasize this point.

      *1. Figure 6 (and some other): I find the designation of statistical significance (a-f) quite confusing, as it is unclear which comparisons are statistically different. Looking at Figure S5, there was no difference between the effect of Annexin depletion on the toxicity of the three lysins.*

      Samples sharing the same letter are NOT statistically different from each other. This is done to avoid a mess of stars and bars with multiple comparisons. This has now been explained in lines 981-985.
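The letter convention can be read mechanically: two groups differ significantly only when their labels have no letter in common. The sketch below illustrates that rule. The group labels are hypothetical examples chosen to mirror the "a", "ab", "b" pattern raised by the reviewers, not values taken from any figure, and the function name is an assumption.

```python
def statistically_distinct(letters_a: str, letters_b: str) -> bool:
    """Two groups differ only if their letter labels share no letter."""
    return set(letters_a).isdisjoint(letters_b)


# Hypothetical labels for three groups on one plot.
labels = {"group 1": "a", "group 2": "ab", "group 3": "b"}

names = list(labels)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        differ = statistically_distinct(labels[first], labels[second])
        verdict = "different" if differ else "not different"
        print(f"{first} vs {second}: {verdict}")
```

In this example the group labeled "ab" is not different from either neighbor, while the groups labeled "a" and "b" are different from each other.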

      For Fig 6/ Fig S5 (now S6), there was a statistically significant difference in LC50 between control siRNA and Annexin knockdowns for SLO. We agree that visually the dose-response curve in Fig S6B looks similar. However, we note that the x-axis is a log2 scale, and the control line is distinct over the 250-1000 region. When we calculate the LC50, these differences give different LC50 values. Over multiple reps, these differences were consistent enough to be statistically different.

      Significance

      *The paper attempts to address an interesting question of aerolysin pore repair, and it is interesting from the perspective of a potential difference between various pore-forming proteins. *

      We agree with the reviewer and thank the reviewer for this assessment.

      *The study will be potentially interesting to a broad audience of biochemists/cell biologists and microbiologists working in the field of pore-forming proteins/virulence factors.*

      We agree with the reviewer and thank the reviewer for this assessment.

      Reviewer 3

      *Major comments In the first instance, the authors use a method of assaying the specific lytic activity of aerolysin in comparison to a number of different CDCs. Whilst it is acknowledged that these methods have been published in peer-review papers previously (e.g. Ray et al., Toxins, 2018), it would be great to have more information of how the specific activity is derived. Currently there is a convoluted method that makes a number of assumptions such as, but not limited to, 1) the number of dead cells measured in the FACS experiments is proportional to the activity of the different classes of PFPs however the authors do not show how they account for PFPs leading to loss of cells into debris which would involve a total cell count and *

      We thank the reviewer for raising these concerns. We tested these assumptions in our previous papers. We compared the FACS assays to other assays that measure total cells (i.e. MTT assay), and found that the FACS assay corresponds with the MTT findings. These findings were published in Keyel et al 2011 PMID: 21693578 and Ray et al 2018.

      Loss of countable events to debris is detected in our assay as saturation of cell death at a number under 100%. Since we perform dose-response curves, we can determine when the killing saturates. This is why loss of countable events does not change our ability to accurately calculate LC50.
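The point about saturation can be illustrated with a small curve fit. The following is a minimal sketch and not the authors' analysis (which is described in Haram et al): it assumes a three-parameter logistic curve whose upper plateau may sit below 100%, and fits it by brute-force least-squares grid search in plain Python. The function names, parameter grids, and synthetic doses are all assumptions chosen for illustration. The fitted LC50 is taken here as the dose giving half of the fitted plateau, which is also an assumption.

```python
import math


def logistic(dose, top, lc50, hill):
    """Percent specific lysis predicted at `dose` (same units as `lc50`)."""
    return top / (1.0 + (lc50 / dose) ** hill)


def fit_dose_response(doses, lysis, steps=70):
    """Grid-search least-squares fit; returns (top, lc50, hill)."""
    low, high = math.log2(min(doses)), math.log2(max(doses))
    best = None
    for top in range(50, 101, 5):          # candidate plateaus, in percent
        for i in range(steps + 1):         # candidate LC50s on a log2 grid
            lc50 = 2.0 ** (low + (high - low) * i / steps)
            for k in range(2, 17):         # candidate Hill slopes 0.5 to 4.0
                hill = k * 0.25
                sse = sum(
                    (logistic(d, top, lc50, hill) - y) ** 2
                    for d, y in zip(doses, lysis)
                )
                if best is None or sse < best[0]:
                    best = (sse, top, lc50, hill)
    return best[1], best[2], best[3]


# Synthetic two-fold dilution series (HU/mL) with killing that plateaus at 90%.
doses = [32, 64, 128, 256, 512, 1024, 2048, 4096]
observed = [logistic(d, 90, 256, 2.0) for d in doses]

top, lc50, hill = fit_dose_response(doses, observed)
print(f"plateau {top}%, LC50 {lc50:.0f} HU/mL, slope {hill}")
```

Because the plateau is fitted rather than fixed at 100%, lost events at high doses shift the estimated plateau and leave the midpoint of the curve largely unchanged. A dedicated nonlinear least-squares routine would normally replace the grid search in practice.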

      2) how the inflection or linear point is identified on individual experiments (e.g. Supp. Fig. 1B, 2A, 2B, 3A, 3B to name a few) and how reliable these points are (e.g showing the data points with model sigmoidal (?) curve and corresponding R values).

      This had been calculated manually in the prior version of the manuscript. To address the reviewer’s concern and to improve data quality, we reanalyzed all of our data by fitting our dose-response curves to logistic models, and determining the LC50 using that model. An in-depth explanation of our approach was just published in Haram et al PMID: 36373947, which we now cite (line 821).

      *Furthermore, the batch-to-batch variability of protein samples presented in table 1 may be an issue where inactive but folded protein can affect the formation of homo-oligomer pores so more effort to reduce the effects of batch variation would be integral to the foundation of this paper. Given that aerolysin has a very different action on cells then this new characterisation should be provided regardless of what has been previously published by the authors on the activity of CDCs on the cells.*

      We agree with the reviewer that batch-to-batch variability is a key concern for pore-forming toxins. To address the concern of batch-to-batch variability and toxin purity, we have added Supplemental Fig S10. In Fig S10C, D, we plot the LC50 against specific activity of each toxin prep when used against control cells. We found a statistical difference in LC50 between two of our toxin preps, but not between any of the others. Notably, there was no association between increasing specific activity and LC50.

      Furthermore, we tested the impact of impurities on our toxin prep. While we purify most toxins only using His-beads (obtaining ~40% purity) (Fig S10B), we purified two toxin preps to higher purity (>90%) (Fig S10A). We did not observe differences in LC50 between these toxin preps. The specific activity for these toxins did not increase. We interpret that finding to indicate the gain in specific activity for purity was offset by the loss of specific activity due to prolonged toxin purification.

      *• Can the authors provide the raw data for the total FACS observations (scatterplot for all events) and show that there is no significant loss of cells? Or at least there is accountability of the cells?*

      Our stop conditions were to collect at least 10,000 gated events instead of running for a set period of time/set volume to determine cell density. We provide example scatterplots in Fig S1A.

      * - Can the authors provide more information about how the linear regression on Supp. Fig. 1B and other experiments showing the model sigmoidal curve performed such that this work is more reproducible? *

      We agree with the reviewer that using logistic modeling would strengthen the work. To address this concern, we reanalyzed all of our data and switched to logistic modeling. This improved reproducibility for many figures. Changes that add or remove statistical significance to results include Fig 4A, loss of significance between Ca2+/DMSO and BAPTA/DMSO, Fig 6C, loss of significance for siRNA knockdown of A6 vs scrambled for ILY, and Fig 8A/B, gain of statistical significance for GFP-Dysf protecting SLO. We have updated our results accordingly.

      *The SEMs of some data points (specific lysis LC50 scatterplots, for e.g. Fig. 2C, 4A, 4C, 8A and fMAX plots, for e.g. Fig. 3B) may not be apparently representative of the skew (e.g. and individual values (including outliers). A clarification of the statistical analysis behind the results may benefit in a clearer understanding of how the SEMs were calculated and presented in the main figures. Also, further elaboration on the meaning of the lettering in the scatterplots (denoted as a, b, c etc.) across the main figures may help improve the interpretation of the data.*

      The SEMs were calculated and the graphs generated in GraphPad. To address the reviewer's concern, we have switched all places where we plotted individual data points to median with no error bars. This will enable the reader to judge skew, outliers, etc without reliance on error bars.

      We have now further elaborated on the lettering in the scatterplots. Samples sharing the same letter are NOT statistically different from each other. This is done to avoid a mess of stars and bars with multiple comparisons. This has now been explained in lines 981-985.

      *Secondly, the authors present interesting results on the significance of Ca2+ on aerolysin's mechanism behind lytic activity and introduces dysferlin-mediated patch repair as the primary cellular resistance mechanism against aerolysin mediated lysis. Results from Figure 2-4, indicate that extracellular Ca2+ plays a role in aerolysin's function and cell lysis (aerolysin triggers influx of extracellular Ca2+). However, the results presented in figure 8 suggest an impairment of dysferlin translocation from the cytosol to the plasma membrane upon removal of extracellular Ca2+. If this were the case, wouldn't dysferlin impairment sensitise cells to aerolysin? Thus, in these sets of experiments it seems that Ca2+ is a confounding factor.*

      We agree that Ca2+ is a confounding factor, which is one reason we aimed to better define membrane repair mechanisms in response to different pore-forming toxins. Our interpretation is that Ca2+ triggers a death pathway that overcomes repair, and that aerolysin toxicity is due to the activation of this pathway. In this case, the impairment of Ca2+-dependent pathways does not reduce survival because the extent of damage is reduced/not present. Figuring out this death pathway is beyond the scope of the present manuscript, but is one future direction in which we are interested. This would also account for differences observed in different cell lines.*

      • Can the authors further elaborate on how the function of dysferlin in protecting cells against aerolysin contrasts to how aerolysin kills cells? *

      We have added the requested discussion to our manuscript, lines 519-525.

      *Finally, it is also interesting to see that cells deploy different resistance mechanisms between different families of pores. In saying that, the usage of CDCs seems to be inconsistent between each set of results. For example, intermedilysin (ILY) was used in the siRNA knockdown experiments but not in others such as Ca2+ influx assays, while PFO was only used for the initial set of results. A comment on this would benefit in understanding the rationale for selecting certain CDCs for each set of experiments. *

      We thank the reviewer for raising this point. We used SLO as the primary CDC in all the experiments because it is the CDC we have best characterized and have extensively published on. We included PFO in initial experiments to give readers a better idea of how multiple CDCs compare to aerolysin in target cells. However, since we’ve previously published on PFO, including it for later experiments would have increased cost and time of experiments without providing new knowledge.

      We used ILY because it binds to the GPI-anchored protein human CD59, so its binding determinant is more similar to aerolysin, which binds GPI-anchored proteins. We included it where practical to determine the extent to which targeting may change repair responses. Since ILY does not bind to murine cells, it was omitted from experiments using murine cells.

      We have added the rationale to the manuscript on lines 138-140.*

      Minor comments Results (Nucleated cells are more sensitive to aerolysin and CDCs) - A statement of the EC50 values of aerolysin and CDCs from the haemolytic assays would be beneficial to compare activities between the two pores. *

      The hemolytic activity is defined as the EC50 for the toxin in human red blood cells. The specific activity enables comparison of toxin activity, which is reported in Table 1. We have now added Supplementary Fig S10 which further plots the aerolysin and SLO specific activities against LC50 so that the reader can better assess batch-to-batch variability. In this study, we did not use enough batches of the other toxins to make this analysis useful for them.

      * - Figure 1A: As stated in the introduction, pro-aerolysin exists as a precursor that is functionally inactive unless activated by trypsin, furin or potentially other proteases. It would benefit the reader if an explicit statement were made about this activity and how it may come about in HeLa and 3T3 cells. Why is pro-aerolysin not shown in the Casp 1/11-/- BMDM cells? *

      The cell surface furin activity that activates aerolysin is not well-characterized across different cell types. We have revised the manuscript (line 76) to indicate these activities are present on the cell membrane.

      We omitted pro-aerolysin from the Casp1/11-/- BMDM because we performed those experiments earlier in the study before we started including pro-aerolysin. Based on the other results, we judged that the time and resource costs of adding pro-aerolysin in this system outweighed the gain to the story.

      * - Figure 1C: It was stated that "Casp 1/11 -/- Mo were ~100 fold more sensitive to pro-aerolysin and aerolysin compared to PFO and SLO" but did not show the activity for pro-aerolysin in these cells. *

      We thank the reviewer for catching this typo, and have corrected this statement (line 172).

      * - Supp fig 1E: Shouldn't 24 hr incubation of aerolysin to HeLa cells result in 100% specific lysis? *

      We agree with the reviewer that these results were surprising. At 24 h, it is possible that cell growth interfered with the assay. The assay well has a finite surface area. If control cells are confluent near the start of the assay, but toxin-treated cells are not due to cell death by aerolysin, the growth rates between control and experimental wells may not be equal. Since our focus is the proximal membrane repair events, and not the late signaling events, pursuing this further is beyond the scope of the current manuscript.

      * (Delayed calcium flux kills aerolysin-challenged cells) - What is the intracellular concentration of K+ normally in cells? Similarly, what is the intracellular concentration of Ca2+? *

      Intracellular K+ is ~140 mM (see Ajiro et al 2008 PMID: 18294629), while cytosolic Ca2+ is ~100 nM at rest.

      * - Figure 2C: Based on the description in the methods and results, both buffers are supplemented with 2 mM Ca2+ but one buffer (RPMI) shows more killing with SLO and ILY. Does this mean that both buffers contain 2 mM CaCl2? If so, what are the other potential reasons why one buffer enabled greater potency in CDCs? *

      RPMI has 0.4 mM Ca2+ prior to Ca2+ supplementation. However, the 2.4 mM Ca2+ did not provide improved protection compared to RPMI alone (See Fig 2 in Ray et al 2018).

      We suspect the various amino acids added to RPMI promote membrane integrity and account for the difference from Tyrode’s buffer. Glycine has previously been implicated in promoting membrane repair, but at higher concentrations than are present in RPMI (0.133 mM in RPMI vs the mM concentrations used to protect cells). Whether other amino acids also protect, and why they protect, is beyond the scope of the present work.

      * - Figure 3H: The data for aerolysin (WT) would greatly benefit for comparison to the inactive mutant (and indicate the sustained Ca2+ increase). *

      We have added this comparison, and updated the figure legend, line 1015.

      * - Supplementary Video V1: The addition of Triton X-100 permeabilises cells; however, this wasn't evident in (A). - Video V2: Similar to previous comment on Supplementary Video V1 (for B). *

      In V1A, the video was cut short to fit the play time with other videos. After addition, the Triton takes a few minutes to diffuse to the cells and permeabilize them. In V2B, the cells do become permeabilized, as shown by loss of the Ca2+ dye. The cells are out of focus, which is why the nuclear TO-PRO signal is not detected.*

      (Calcium influx does not activate MEK-dependent repair) - Figure 4A: Effective ionic concentration inside and outside cell is increased (if intracellular Ca2+ becomes chelated); therefore, Ca2+ may enter the cell by passive diffusion or transport by other intrinsic Ca2+ channels. *

      There is already a very steep concentration gradient for Ca2+. The cytosolic Ca2+ is ~0.1 uM, compared with growth medium at 400 uM or assay buffer at 2400 uM. Chelation of the intracellular Ca2+ is not expected to increase Ca2+ import from outside the cell.*
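
      The size of the gradient can be checked with a few lines of arithmetic. The sketch below uses only the concentrations quoted above; the function names are hypothetical, and treating chelation as lowering cytosolic free Ca2+ all the way to zero is a deliberately extreme simplifying assumption that ignores membrane potential and channel gating.

```python
# Values quoted in the response, expressed in nM so the arithmetic is exact.
CYTOSOLIC_CA_NM = 100                # ~0.1 uM at rest
EXTRACELLULAR_CA_NM = {
    "growth medium": 400_000,        # 400 uM
    "assay buffer": 2_400_000,       # 2400 uM
}


def fold_gradient(outside_nm, inside_nm):
    """Ratio of extracellular to cytosolic Ca2+ concentration."""
    return outside_nm / inside_nm


def relative_gain_from_chelation(outside_nm, inside_nm):
    """Fractional increase in (outside - inside) if cytosolic free Ca2+ fell to zero."""
    before = outside_nm - inside_nm
    after = outside_nm  # hypothetical limit: chelator removes all free cytosolic Ca2+
    return (after - before) / before


for name, outside in EXTRACELLULAR_CA_NM.items():
    print(name,
          fold_gradient(outside, CYTOSOLIC_CA_NM),
          relative_gain_from_chelation(outside, CYTOSOLIC_CA_NM))
```

      The gradient is about 4,000-fold in growth medium and 24,000-fold in assay buffer, and even complete chelation of cytosolic Ca2+ would change the concentration difference by well under 0.1%.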

      (Caveolar endocytosis does not protect cells from aerolysin) - Figure 5C: What is the purpose of using HeLa cells as a control? *

      We included HeLa cells to demonstrate the toxin was active and to rule out batch-to-batch variability as one interpretation of the reduced killing of differentiated 3T3-L1 cells.

      * - "..with Alexa Fluor 647 conjugated pro-aerolysin K244C" - this should be introduced earlier as it was initially mentioned in Supp. Figure 3C. *

      We have now introduced this earlier, at line 190 instead of line 300.

      * - Murine fibroblasts were used earlier (Figure 1). Following from this result (where the WT can be used as a positive control), can MEFs be used instead of adipocytes to see whether caveolar endocytosis plays any role in cellular resistance? *

      The 3T3-L1 cells are murine fibroblasts prior to differentiation. Since they can also be differentiated into adipocytes, we used them instead of MEFs. The other reasons we used them include the availability of Cavin knockout cells, and the extensive caveolae present in adipocytes. We included analysis of 3T3-L1 cells prior to differentiation in Fig 5B.

      * - Further comment on the increased resistance of K5 knockout would benefit on the mechanism of aerolysin-mediated cytolysis. *

      We agree further characterization of this line would be interesting in the future. At the present, however, any further comment would be speculative on our part. Since the resistance was not replicated in the second CRISPR line, we suspect it is either an unexpected mutation(s) in the cell line that arose during routine cell culture, or off-target effect(s) from the CRISPR used to generate the line.

      * (Annexins minimally resist aerolysin) - Supplementary video V3 - it seems that annexin A6 is recruited to the membrane, to a greater extent (and also quicker) than SLO. This suggests that annexin recruitment is a cellular response against aerolysin challenge. *

      We agree with the reviewer that annexins are recruited to the membrane during repair. However, individual knockdown did not enhance death. This is one reason we believe functional studies (i.e. cytotoxicity) are necessary when studying the cell biology of repair events. Recruitment of a protein and its promotion of repair may be two different things.

      In V3, three of the SLO-challenged cells have translocated by the time focus is restored. In contrast, the first aerolysin-challenged cells translocate at ~10 min. One complicating factor is that A6 cycles back off the membrane with the SLO challenge.

      * o SLO also shows A6 recruitment (arrows pointed). However, supplementary figure 6B does not clearly illustrate this. *

      Given the 45 min time scale, the rapid initial membrane enrichment is hard to see on the graph.

      * - As annexin A1 is sensitive to calcium, further comment on the significance of intracellular/extracellular calcium in annexin A1 recruitment and aerolysin challenge would explain observations in Figure 4A. *

      We have updated the manuscript, line 242 to include annexins and dysferlin as Ca2+-binding proteins in our discussion of intracellular calcium.*

      (Patch repair protects cells from aerolysin) - Supplementary video V4 - the intensity decreases for the inactive mutant; is this due to lysis? *

      We included TO-PRO in the experiment to rule out lysis. Since the cells remain in focus, we interpret the lack of TO-PRO to indicate no cellular lysis.

      *- The next paragraph sounds like a contradiction: "GFP-dysferlin localized to the plasma membrane and vesicles independently of extracellular Ca2+ (Fig 8C D, Video V5) o Followed by "To study the Ca2+ dependency of dysferlin, we removed extracellular Ca2+ with 2 mM EGTA and challenged with sublytic toxin doses...found less depletion of dysferlin from cytosol". *

      We thank the reviewer for pointing out our unclear language. In the second section, we intended to refer to dysferlin positive vesicles. We have rephrased the manuscript (lines 388-395) to clarify that we are focused on Ca2+-dependence of vesicle fusion, not steady-state.*

      (Methods) - Table 1: The values presented in the methods section are, overall, confusing and require clarification. *

      We have added Fig S10, and discussion of toxin activity and purity in the methods (lines 634-641) to provide further clarity on toxin activity.

      * o 10-fold difference in SLO and PFO WT - do the authors think this might change the interpretation between different figures? *

      We do not. The reason is that we changed the membrane affinity between SLO and PFO (Ray 2018), and this switches the properties of the respective toxins without changing their yields.

      * o Understood how the haemolytic activity was calculated (referred to work in 2012), but how was the haemolytic unit originally derived? *

      It was derived as a measure of activity for toxins by determining the EC50 in RBCs for a given toxin. Since species type of RBC and other factors can change the reported activity, we have normalized to using human red blood cells. This lets us assay human-specific toxins like ILY along with other toxins.

      * o How were these values (from table 1) derived to toxin concentrations used for killing nucleated cells? *

      Full discussion of our assay was recently published in Haram et al 2022 PMID: 36373947. For the cytotoxicity assays, we use the hemolytic activity. Suppose from Table 1, the toxin stock is 1.5 x10^5 HU/mL. Then to prepare a 2x working toxin stock, we dilute the toxin to 4 x10^3 HU/mL (this is a 1 in 37.5 dilution). To get the range of concentrations used in the dose response curve, we perform a 2-fold serial dilution. Finally we mix equal volumes of toxin and cells, giving us the final 1x toxin activity (2 x10^3 HU/mL for the highest concentration in this example).
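
      The worked example above can be reproduced as a short calculation. The stock activity, the 2x working concentration, and the 2-fold dilution step come from the text; the function name and the number of dilution points are illustrative assumptions.

```python
def dilution_series(stock_hu_per_ml, working_2x_hu_per_ml, n_points):
    """Return the stock dilution factor, the 2x serial dilutions, and the final 1x activities.

    n_points (the length of the dose-response series) is an assumed parameter.
    """
    dilution_factor = stock_hu_per_ml / working_2x_hu_per_ml
    # 2-fold serial dilution of the 2x working stock.
    two_x = [working_2x_hu_per_ml / 2 ** i for i in range(n_points)]
    # Mixing equal volumes of toxin and cells halves each concentration.
    final_1x = [c / 2 for c in two_x]
    return dilution_factor, two_x, final_1x


factor, two_x, final_1x = dilution_series(1.5e5, 4e3, n_points=4)
print(factor)     # 1 in 37.5 dilution of the stock
print(two_x)      # 2x working concentrations, HU/mL
print(final_1x)   # final activities on cells, HU/mL
```

      The highest final activity is half of the 2x working stock, matching the 2 x10^3 HU/mL in the example.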

      * o Therefore, an EC50 haemolytic curve showing the activities for all toxins would greatly facilitate in understanding the derivation of values for table 1.*

      The hemolytic unit already incorporates the EC50 hemolytic curve. 1 HU is the EC50 of the toxin in the human RBCs.

      * - Flow cytometry assay: What is meant by gating out the debris? And would debris also contribute to the count in dead cells? *

      We illustrate our gating strategy in Fig S1. The debris falls in the front left corner of the plot, and includes electronic noise, non-cellular debris and cellular fragments. Since one cell could give rise to multiple pieces of debris, we exclude the debris from analysis.

      * o What was added as the high PI control? *

      In Fig S1A, the high dose of toxin was used for maximal killing. In our cell populations, there is a low level (2-5%) of dead cells that serve as a control for PI staining. In the past, we’ve used 0.01% triton to validate permeabilization of the cells. We have also compared PI uptake with MTT assays (Keyel et al 2011, Ray et al 2018) to confirm that the PIhigh cells are dead.

      *Elaborating reviewer #2's comment 7 regarding the addition of EDTA: with respect to measuring the binding of fluorescently labelled aerolysin, how can the authors differentiate between fully functional pores versus prepores/incomplete pores? *

      This requires electron microscopy, which is beyond the scope of our current study. However, prior work and Fig 2D show that aerolysin forms pores without the need for Ca2+ (see next point).

      How else can the authors validate whether aerolysin remains functional in the presence of EDTA?

      Prior structural studies suggest that Ca2+ is not required for aerolysin pore formation. For example, Iacovache et al (2011) PMC3136475 induce oligomerization with low salt and pH 2+. Cryo-EM from the same group (Iacovache et al 2016 PMID: 27405240), showed pore formation under similar conditions.

      In Fig S3, aerolysin kills in the presence of EGTA at higher concentrations, suggesting that it can form pores when EGTA is present. Also, in Fig 2D, we used Tyrode’s buffer, which was made without Ca2+ or EGTA. We added the indicated amounts of Ca2+ in, and observed a reduction in lysis at low [Ca2+]. This argues against EGTA interfering with toxin oligomerization/pore formation because EGTA was not present, and the toxin still failed to kill.

      We have updated the manuscript (lines 203-205) to emphasize this point.

      Significance

      *While the work has investigated in-depth cellular resistance mechanisms, the significance and benefits of this study are unclear. For example, the authors have used different human cell lines to dissect how these cells are affected by different pores but have not stated the significance and potential benefit of studying these cell lines. Further elaboration in this aspect may increase the relevance of the study, to an audience who is interested in the field of infection and disease. *

      We have updated our significance to better convey our advance, which is explained on lines 99-102, 128, 519-525. We also added benefits of testing the cell lines chosen on lines 167-168, and 277-278. We plan to add muscle cells to address the dysferlin points, which has relevance to necrotizing soft-tissue infections.

      Description of analyses that authors prefer not to carry out

      Not applicable

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      Summary

      This body of work by Thapa & Keyel explores the differences in cellular resistance mechanisms between two different pore families (aerolysin versus CDCs). Herein, the authors were able to elucidate the toxin activities across a variety of different nucleated cells, using the haemolytic assay as a reference for normalising activity. Their findings revealed that, in general, aerolysins were relatively more potent than CDCs at damaging certain nucleated cell lines. Furthermore, the authors performed an exploration of different resistance mechanisms, including MEK-dependent repair, annexins, and patch repair by dysferlin. The work provides some supporting evidence that patch repair is the main mechanism that cells deploy to prevent aerolysin-mediated cytotoxicity. Overall, the amount of work that was put in to craft the manuscript was impressive and the manuscript showed potential prospects in further investigating 1) mode of aerolysin killing in nucleated cells and 2) the role of patch repair and function of dysferlin in cellular resistance against aerolysin.

      Major comments

      In the first instance, the authors use a method of assaying the specific lytic activity of aerolysin in comparison to a number of different CDCs. Whilst it is acknowledged that these methods have been published in peer-reviewed papers previously (e.g. Ray et al., Toxins, 2018), it would be great to have more information on how the specific activity is derived. Currently there is a convoluted method that makes a number of assumptions such as, but not limited to: 1) that the number of dead cells measured in the FACS experiments is proportional to the activity of the different classes of PFPs (however, the authors do not show how they account for PFPs leading to loss of cells into debris, which would require a total cell count), and 2) how the inflection or linear point is identified in individual experiments (e.g. Supp. Fig. 1B, 2A, 2B, 3A, 3B to name a few) and how reliable these points are (e.g. showing the data points with the model sigmoidal (?) curve and corresponding R values).

      Furthermore, the batch-to-batch variability of protein samples presented in table 1 may be an issue where inactive but folded protein can affect the formation of homo-oligomer pores so more effort to reduce the effects of batch variation would be integral to the foundation of this paper. Given that aerolysin has a very different action on cells then this new characterisation should be provided regardless of what has been previously published by the authors on the activity of CDCs on the cells.

      • Can the authors provide the raw data for the total FACS observations (scatterplot for all events) and show that there is no significant loss of cells? Or at least there is accountability of the cells?
      • Can the authors provide more information about how the linear regression on Supp. Fig. 1B and other experiments showing the model sigmoidal curve performed such that this work is more reproducible?

      The SEMs of some data points (specific lysis LC50 scatterplots, for e.g. Fig. 2C, 4A, 4C, 8A and fMAX plots, for e.g. Fig. 3B) may not be apparently representative of the skew (e.g. and individual values (including outliers). A clarification of the statistical analysis behind the results may benefit in a clearer understanding of how the SEMs were calculated and presented in the main figures. Also, further elaboration on the meaning of the lettering in the scatterplots (denoted as a, b, c etc.) across the main figures may help improve the interpretation of the data.

      Secondly, the authors present interesting results on the significance of Ca2+ on aerolysin's mechanism behind lytic activity and introduce dysferlin-mediated patch repair as the primary cellular resistance mechanism against aerolysin-mediated lysis. Results from Figures 2-4 indicate that extracellular Ca2+ plays a role in aerolysin's function and cell lysis (aerolysin triggers influx of extracellular Ca2+). However, the results presented in Figure 8 suggest an impairment of dysferlin translocation from the cytosol to the plasma membrane upon removal of extracellular Ca2+. If this were the case, wouldn't dysferlin impairment sensitise cells to aerolysin? Thus, in these sets of experiments it seems that Ca2+ is a confounding factor.

      • Can the authors further elaborate on how the function of dysferlin in protecting cells against aerolysin contrasts to how aerolysin kills cells?

      Finally, it is also interesting to see that cells deploy different resistance mechanisms between different families of pores. In saying that, the usage of CDCs seems to be inconsistent between each set of results. For example, intermedilysin (ILY) was used in the siRNA knockdown experiments but not in others such as Ca2+ influx assays, while PFO was only used for the initial set of results. A comment on this would benefit in understanding the rationale for selecting certain CDCs for each set of experiments.

      Minor comments

      Results

      (Nucleated cells are more sensitive to aerolysin and CDCs)

      • A statement of the EC50 values of aerolysin and CDCs from the haemolytic assays would be beneficial to compare activities between the two pores.
      • Figure 1A: As stated in the introduction, pro-aerolysin exists as a precursor that is functionally inactive unless activated by trypsin, furin or potentially other proteases. It would benefit the reader if an explicit statement were made about this activity and how it may come about in HeLa and 3T3 cells. Why is pro-aerolysin not shown in the Casp 1/11-/- BMDM cells?
      • Figure 1C: It was stated that "Casp 1/11 -/- Mo were ~100 fold more sensitive to pro-aerolysin and aerolysin compared to PFO and SLO" but did not show the activity for pro-aerolysin in these cells.
      • Supp fig 1E: Shouldn't 24 hr incubation of aerolysin to HeLa cells result in 100% specific lysis?

      (Delayed calcium flux kills aerolysin-challenged cells)

      • What is the intracellular concentration of K+ normally in cells? Similarly, what is the intracellular concentration of Ca2+?
      • Figure 2C: Based on the description in the methods and results, both buffers are supplemented with 2 mM Ca2+ but one buffer (RPMI) shows more killing with SLO and ILY. Does this mean that both buffers contain 2 mM CaCl2? If so, what are the other potential reasons why one buffer enabled greater potency in CDCs?
      • Figure 3H: The data for aerolysin (WT) would greatly benefit for comparison to the inactive mutant (and indicate the sustained Ca2+ increase).
      • Supplementary Video V1: The addition of Triton X-100 permeabilises cells; however, this wasn't evident in (A).
      • Video V2: Similar to previous comment on Supplementary Video V1 (for B).

      (Calcium influx does not activate MEK-dependent repair)

      • Figure 4A: Effective ionic concentration inside and outside cell is increased (if intracellular Ca2+ becomes chelated); therefore, Ca2+ may enter the cell by passive diffusion or transport by other intrinsic Ca2+ channels.

      (Caveolar endocytosis does not protect cells from aerolysin)

      • Figure 5C: What is the purpose of using HeLa cells as a control?
      • "..with Alexa Fluor 647 conjugated pro-aerolysin K244C" - this should be introduced earlier as it was initially mentioned in Supp. Figure 3C.
      • Murine fibroblasts were used earlier (Figure 1). Following from this result (where the WT can be used as a positive control), can MEFs be used instead of adipocytes to see whether caveolar endocytosis plays any role in cellular resistance?
      • Further comment on the increased resistance of K5 knockout would benefit on the mechanism of aerolysin-mediated cytolysis.

      (Annexins minimally resist aerolysin)

      • Supplementary video V3 - it seems that annexin A6 is recruited to the membrane, to a greater extent (and also quicker) than SLO. This suggests that annexin recruitment is a cellular response against aerolysin challenge.
        • SLO also shows A6 recruitment (arrows pointed). However, supplementary figure 6B does not clearly illustrate this.
      • As annexin A1 is sensitive to calcium, further comment on the significance of intracellular/extracellular calcium in annexin A1 recruitment and aerolysin challenge would explain observations in Figure 4A.

      (Patch repair protects cells from aerolysin)

      • Supplementary video V4 - the intensity decreases for the inactive mutant; is this due to lysis?
      • The next paragraph sounds like a contradiction: "GFP-dysferlin localized to the plasma membrane and vesicles independently of extracellular Ca2+ (Fig 8C D, Video V5) o Followed by "To study the Ca2+ dependency of dysferlin, we removed extracellular Ca2+ with 2 mM EGTA and challenged with sublytic toxin doses...found less depletion of dysferlin from cytosol".

      (Methods)

      • Table 1: The values presented in the methods section are, overall, confusing and require clarification.
        • 10-fold difference in SLO and PFO WT - do the authors think this might change the interpretation between different figures?
        • Understood how the haemolytic activity was calculated (referred to work in 2012), but how was the haemolytic unit originally derived?
        • How were these values (from table 1) derived to toxin concentrations used for killing nucleated cells?
        • Therefore, an EC50 haemolytic curve showing the activities for all toxins would greatly facilitate in understanding the derivation of values for table 1.
      • Flow cytometry assay: What is meant by gating out the debris? And would debris also contribute to the count in dead cells?
        • What was added as the high PI control?

      Referees cross-commenting

      Elaborating reviewer #2's comment 7 regarding the addition of EDTA: with respect to measuring the binding of fluorescently labelled aerolysin, how can the authors differentiate between fully functional pores versus prepores/incomplete pores? How else can the authors validate whether aerolysin remains functional in the presence of EDTA?

      Significance

      The work presents a foundation to further investigate into the mechanism of aerolysin function, following the discovery of the role of extracellular Ca2+ in its activity. As aforementioned, the role of dysferlin in resisting aerolysin also has potential, but the limitations of this work were discussed including the absence of performing a dysferlin knockout, although performing this experiment may help to strengthen the current finding.

      While the work has investigated in-depth cellular resistance mechanisms, the significance and benefits of this study are unclear. For example, the authors have used different human cell lines to dissect how these cells are affected by different pores but have not stated the significance and potential benefit of studying these cell lines. Further elaboration in this aspect may increase the relevance of the study, to an audience who is interested in the field of infection and disease.

      Section for special notes to the editor:

      My major area of expertise and contribution to this paper is in the analysis and interpretation of activity (lytic) assays.

    1. Author Response

      Reviewer #1 (Public Review):

      Neuronal tissues are very complex and are composed of a large number of neuronal types. With the advent of single-cell sequencing, many researchers have used this technology to generate atlases of neuronal structures that would describe in detail the transcriptome profiles of the different cell types. Along these lines, in this manuscript, the authors present single-cell transcriptomic data of the fruitless-expressing neurons in the Drosophila male and female central nervous systems. The authors initially compare cell cluster composition between male and female flies. They then use the expression of known markers (such as Hox genes and KC neuronal markers) to annotate several of their clusters. Then, they look in detail at the expression of different terminal neuronal genes in their transcriptomic data: first, they look into neurotransmitter-related genes and how they are expressed in the fruitless-expressing neurons; they describe in detail these populations that they then verify the expression patterns by looking into genetic intersections of Fru with different neurotransmitter-related genes. Then, they look at Fru-neurons that express circadian clock genes, different neuropeptides and neuropeptide receptors, and different subunits of acetylcholine receptors. Finally, they look into genes that are differentially expressed between male and female neurons that belong to the same clusters. They find a large number of genes; through GO term enrichment analysis, they conclude that many IgSF proteins are differentially expressed, so they look into their expression in Fru-neurons in more detail. Finally, they compare transcription factor expression between male and female neurons of the same cluster and they identify 69 TFs with cluster-specific sex-differential expression.

      In general, the authors achieved their goal of generating and presenting a large and very useful dataset that will definitely open a large number of research avenues and has already raised a number of interesting hypotheses. The data seem to be of good quality and the authors present a different aspect of their atlas.

      The main drawback is that many of the analyses are very superficial, resulting in the manuscript being handwavy and unsupported. The manuscript would benefit by reducing the number of "analyses" to the ones that are also in vivo validated and by discussing some of the drawbacks that are inherent to their experimental procedure.

      scRNA-seq studies generate atlases that are descriptive, by their nature. Therefore, we decided to keep interesting gene-expression analyses in the paper that are based on the scRNA-seq results, especially for the discoveries that point to exciting avenues for future pursuit. We reduced the text as suggested.

      1) The authors treat their male, female, and full datasets as three different samples. At the end of the day, these are, for the most part, equivalent neuronal types. The authors should decide to a) either only use the full dataset and present all analyses in this, or b) give a clear correspondence of male and female clusters onto the full ones.

      In this paper, all the analyses presented are on the full data set, with some links to the male or female data sets included. We now make clear that the full data set is the focus of the paper (lines 137-141). We provide the male and female data sets for our reader, with the individual Seurat objects uploaded to GEO, to make it easy for the reader to do follow-up analyses using the same criteria we used. We think this is helpful for our research community. We also compare the male and female clusters to the full data set using ClustifyR and report which clusters in the male or female data set analyses correspond to those in the full data set (Source data 2), as suggested by the reviewer, though ClustifyR has some limitations based on our evaluation of this tool for other annotations (see below).

      2) Most of their sections are heavily reliant on marker genes. In fact, in almost every section they mention how many of their genes of interest are marker genes. This depends heavily on specific cutoffs, making the conclusions fragile. Similarly, GO terms are used selectively and are, in many cases, vague (such as “signaling”, “neurogenesis”, “translation”).

      We evaluated marker genes, as those provide molecular identities to the clusters, given that by definition they are significantly more highly expressed in a specific cluster compared to all clusters. We used a Wilcoxon rank sum test with the following parameters in Seurat: (min.pct=0.25, logfc.threshold=0.25), which resulted in all called marker genes having p values < 0.05. We did not use more stringent criteria given that most of the marker gene analyses are descriptive, and it is important to capture a broad range of genes. Our criteria are similar to Ma et al. 2021 (PMID: 33438579) and Corrales et al. 2022 (PMID: 36289550). In the text, we refer to the top 5 marker genes in several analyses, though these marker genes have a much more significant enrichment. We agree with the reviewers’ criticisms regarding the cluster-specific GO-term analyses in the text and those have been removed from the manuscript.
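      The marker-gene criteria above can be illustrated with a minimal sketch in plain Python. It is not the Seurat implementation: the fold-change formula (log2 ratio of means with a pseudocount of 1), the rule that a gene must be detected in at least min_pct of cells in either group, and the toy expression values are all assumptions made for illustration. The rank sum p value uses the normal approximation with a tie correction and no continuity correction.

```python
import math


def passes_prefilter(in_cluster, out_cluster, min_pct=0.25, logfc_threshold=0.25):
    """Apply the detection-rate and fold-change pre-filters to one gene."""
    pct_in = sum(v > 0 for v in in_cluster) / len(in_cluster)
    pct_out = sum(v > 0 for v in out_cluster) / len(out_cluster)
    # Assumption: detected in at least min_pct of cells in either group.
    if max(pct_in, pct_out) < min_pct:
        return False
    mean_in = sum(in_cluster) / len(in_cluster)
    mean_out = sum(out_cluster) / len(out_cluster)
    # Simplified log2 fold change with a pseudocount of 1 (an assumption).
    log_fc = math.log2((mean_in + 1.0) / (mean_out + 1.0))
    return log_fc >= logfc_threshold


def rank_sum_p_value(group_a, group_b):
    """Two-sided Wilcoxon rank sum p value (normal approximation, tie-corrected)."""
    n_a, n_b = len(group_a), len(group_b)
    n = n_a + n_b
    if n_a == 0 or n_b == 0 or n < 2:
        return 1.0
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # average of 1-based ranks i+1 .. j
        rank_sum_a += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_size = j - i
        tie_term += tie_size ** 3 - tie_size
        i = j
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    mean_u = n_a * n_b / 2.0
    var_u = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return 1.0  # all values identical: no evidence of a difference
    z = (u_a - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical expression of one gene in a cluster and in all other cells.
cluster_cells = [3, 4, 5, 2, 6, 4]
other_cells = [0, 0, 1, 0, 0, 0, 1, 0]
is_candidate = passes_prefilter(cluster_cells, other_cells)
p_value = rank_sum_p_value(cluster_cells, other_cells)
```

      Because both thresholds act before the test, loosening or tightening them changes which genes are tested at all, which is the sensitivity to cutoffs the reviewer raises.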

      3) A few of the results are not confirmed in vivo. The authors should add a Discussion section where they discuss the inherent issues of their analyses. Are there clusters of low quality? Are there many doublets?

      We have added discussion around these topics to the conclusions section of the manuscript and the results, when appropriate.

      On the same note, their clusters are obviously non-homogeneous (i.e. they house more than one cell type). This could obviously affect the authors' cluster-specific sex-differential expression, as differences could also be attributed to the differential composition of the male and female subclusters.

      We discuss this potential limitation in the discussion of sex differences in gene expression (Lines 959-961).

      4) Immunostainings are often unannotated and, in some cases especially in the Supplement, they are blurry. The authors should annotate their images and provide better images whenever possible.

      We appreciate this being pointed out and have provided higher resolution figures. The issue was that we exceeded the file size limit for manuscript submission on initial submission.

      5) I believe that the manuscript would benefit significantly by being heavily reduced in size and being focused on in vivo rigorously confirmed observations.

      We have addressed this comment by removing some of the analyses.

      Reviewer #3 (Public Review):

      This paper uses single-cell transcriptome sequencing to identify and characterize some of the neuronal populations responsible for sex-specific behaviour and physiology. This question is of interest to many biologists, and the approach taken by the authors is productive and will lead to new insights into the molecular programs that underpin sexually dimorphic development in the CNS. The dataset produced by the authors is of high quality, the analyses are detailed and well described, and the authors have made substantial progress toward the identification and characterization of some of the neuron populations. At the same time, many other cell types whose existence is suggested by this dataset remain to be identified and matched to specific neuron populations or circuits. We expect the value of this dataset to increase as other groups begin to follow up on the data and analyses reported in this paper. In general, the value of this paper to the field of Drosophila neurobiology will be high even if it is published in close to its present form. On the other hand, the current manuscript does not succeed in presenting the key take-home messages to a broader audience. A modest effort in this direction, especially re-writing the Conclusions section, will greatly enhance the accessibility and broader impact of this paper.

      While the biological conclusions reached by the authors are generally robust and of high interest, we believe that some conclusions are not sufficiently supported by the analyses that have been performed so far and need to be reexamined and confirmed. A major question concerns the authors' ability to distinguish a shared cell type with sex-biased gene expression from a pair of closely related, sex-limited cell types. There appear to be many cases that fall into this grey area, and the current analysis does not provide an objective criterion for distinguishing between sex-specific and sexually dimorphic clusters. Below we suggest some technical approaches that could be used to examine this issue. A second problem, which we do not believe to be fatal but that needs to be discussed, concerns potential differences in developmental timing and cell cycle phase between males and females, and how these differences might impact the inferences of sexual dimorphism in cell numbers and gene expression. Finally, we identify several areas, including the expression of transcription factors in different neuronal populations, that we believe could be described in more biologically insightful ways.

      For our review, we focus on three levels of evaluation:

      1). Is the dataset of high quality, useful to a large number of people, well annotated, and clearly described?

      The data appear to be high quality. The authors use reasonable neuronal markers to infer that 99% of their cells are neuronal in origin, suggesting extremely low levels of contamination from non-neuronal cells. Moreover, the gene/UMIs detected per cell are high relative to what has been reported in previous Drosophila scRNA-seq neuron papers (e.g. Allen et al., 2020). The cluster annotations are incomplete - which is not surprising, given the complexity of the cell population the authors are working with. 46 of the 113 clusters in the full dataset are named based on published expression data, gene ontology enrichments of cluster marker genes, and overlap with other CNS single cell datasets. This leaves rather a lot outstanding. It is probably unrealistic to aim for a 100% complete annotation of this dataset. But if we're thinking about how this dataset might be used by other researchers, in most cases the validation that a given cluster corresponds to a real, distinct neuron subpopulation will be left to the user.

      A major comment we have about the quality of the dataset relates to how doublets are identified and dealt with. The presence of doublets, an unavoidable byproduct of droplet-based scRNAseq protocols (like the 10x protocol used by the authors), could affect the clustering or at least bias the detection of marker genes. In large clusters, one might expect the influence of doublets on marker gene detection to be diluted, but in smaller clusters it could cause more significant problems. In extreme cases, a high proportion of doublets can produce artifactual clusters. The potential for problems is particularly high in cases where the authors identify cells with hybrid properties, such as clusters 86 and 92, which the authors describe as being serotonergic, glutamatergic, and peptidergic. Currently, the authors filter out cells with high UMI/gene counts, but it's unclear how many are removed based on these criteria, and cells can naturally vary in these values so it is not clear to us whether this approach will reliably remove doublets. That said, we acknowledge that by limiting their 'FindMarkers' analysis to genes detected in >25% of cells in a cluster the authors are likely excluding genes derived from doublets that contaminate clusters in low (but not high) numbers. We think it would be useful for the authors to report the number of cells that are filtered out because they met their doublet criteria and compare this value to the number of expected doublets for the number of cells they recovered (10x provides these figures). We would also recommend that the authors trial a doublet detection algorithm (e.g. DoubletFinder) on the unfiltered datasets (that is, unfiltered at the top end of the UMI/gene distribution). Does this identify the same cells as doublets as those the authors were filtering out?

      We appreciate this suggestion and have now added results from the doublet detection algorithm DoubletFinder to our manuscript. Please see the above response in the editorial comments. We provide a table in Figure 1 – supplement 1 that indicates the number of cells removed by our filtering criteria. In the discussion (Lines 1098-1102), we acknowledge that our data set may contain additional doublets that were not removed by our filtering criteria, and we have also provided a new table in Source data 2 indicating the number of potential doublets identified by DoubletFinder that are present in each cluster.
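      The comparison the reviewer asks for can be sketched as follows. The per-1,000-cell rate of 0.008 is a commonly quoted rule of thumb for droplet-based protocols and is used here only as a placeholder assumption; it should be replaced by the figure in the relevant 10x documentation. The cell counts and barcodes are hypothetical.

```python
def expected_doublets(cells_recovered, rate_per_1000_cells=0.008):
    """Return (expected doublet rate, expected doublet count).

    Assumes the doublet rate grows linearly with the number of cells
    recovered; the default rate is a placeholder, not a vendor figure.
    """
    doublet_rate = rate_per_1000_cells * cells_recovered / 1000.0
    return doublet_rate, doublet_rate * cells_recovered


def flag_overlap(threshold_barcodes, tool_barcodes):
    """Compare cells removed by UMI/gene thresholds with cells flagged by a tool."""
    filtered, flagged = set(threshold_barcodes), set(tool_barcodes)
    shared = filtered & flagged
    union = filtered | flagged
    return {
        "shared": len(shared),
        "only_threshold": len(filtered - flagged),
        "only_tool": len(flagged - filtered),
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical run with 10,000 recovered cells.
rate, count = expected_doublets(10000)
# Hypothetical barcodes flagged by each approach.
summary = flag_overlap(["AAAC", "AAAG", "AACT"], ["AAAG", "AACT", "AAGG"])
```

      If the number of cells removed by thresholds is far below the expected count, or the overlap with the tool's calls is low, that would suggest residual doublets remain in the clustered data.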

      2). What is the value of this study to its immediate field, Drosophila neurobiology? Are the annotation and analysis of specific cell clusters as precise and insightful as they could be? Has all the most important and novel information been extracted from this dataset?

      This is the part that we are least qualified to assess, since we, unlike the authors, are not neurobiologists. We hope some of the other referees will have sufficient expertise to evaluate the paper at this level.

      One thing we noticed (more on that in Part 3) is that the authors rely on JackStraw plots and clustree plots to identify the optimal combination of PCs and resolution to guide their clustering. This represents a relatively objective way of settling on clustering parameters. However, in a number of the UMAPs it looks like there are sub clusters that go undiscussed. E.g. in Fig. 2E clusters 1 and 3 are associated with smaller, distinct clusters and the same is true of clusters 2 and 6 in Fig 4b. Given that the authors are attempting to assemble a comprehensive atlas of fru+ neurons, it seems important for them to assess (at least transcriptomically) whether these are likely to represent distinct subpopulations.

      We appreciate these comments and address this above in the editorial comments section.

      3). How interesting, and how accessible is this paper to people outside of the authors' immediate field? What does it contribute to the "big picture" science?

      Here, we think the authors missed an important opportunity by under-utilizing the Conclusions section. The manuscript has a combined "Results and Discussion" section, where the authors talk about their identification and analysis of specific cell clusters / cell types. Frankly, to a non-specialist this often reads like a laundry list, and the key conclusions are swamped by a flood of details. This is not to criticize that section - given the complexity and potential value of this dataset, we think it is entirely appropriate to describe all these details in the Results and Discussion. However, the Conclusions section does not, in its present form, pull it all back together. We recommend using that section to summarize the 5-8 most important high-level conclusions that the authors see emerging from their work. What are the most important take-home messages they want to convey to a developmental biologist who does not work on brains, or to a neurobiologist who does not work on Drosophila? The authors can enhance the value of this paper by making it more interesting and more accessible to a broader audience.

      We appreciate this suggestion and made changes to the conclusions section to address this comment.

    1. Author Response

      Reviewer #3 (Public Review):

      Yamada et al. utilize the full strength of Drosophila neural circuit approaches to investigate second-order conditioning. The new insights into the mechanisms of how a learned cue can act as reinforcement are relevant beyond the fly field and have the potential to spark broad interest. The main conclusions of the authors are justified and the experiments, to my understanding, are well done.

      Some minor aspects must be addressed. To avoid misunderstandings a clear distinction should be made between those experiments using real world sugar and those using artificial activation of dopamine neurons as reward. For example, the proposed teacher - student model is mostly based on the work established with artificial activation.

      We split Figure 1 and made two separate figures. The new Figure 1 displays experiments with only real sugar or optogenetic activation of sugar receptor neurons (new data), whereas the new Figure 2 shows mostly experiments with direct DAN activations. This new figure arrangement makes a clear distinction between experiments with sugar and DAN activation, and allows readers to compare them more easily. We also modified the second paragraph of the discussion to clarify this point.

      To emphasize the generality of the model, it might help to provide some further evidence using real world sugar approaches, especially since the only known sugar-reward driven plasticity is reported in the student (γ5β′2a) but not the teacher compartments. In this line, it would be useful to extend the functional interference used during the sugar experiments beyond the α1 compartment.

      In response to the reviewer’s comment, we added new data in Figure 2D to show that blocking PAM-DANs in γ4, γ5 and β′2a compartments impairs second-order conditioning following odor-sugar first-order conditioning. In contrast to blocking α1 DANs, blocking those non-α1 PAM-DANs did not impair one-day first-order memory (Figure 2D), which further strengthens our model of differential requirement of compartments for first-order and second-order memory formation.

      We think transient blocks of individual DAN cell types during second-order conditioning after odor-sugar conditioning will be informative to map second-order memories to specific compartments in naturalistic settings. For the reasons detailed above, however, we will need to develop a new method of transient perturbation for that.

      We would also point out that, to our knowledge, sugar-reward-driven plasticity has not been fully demonstrated in MBON-γ5β′2a. Owald et al., 2015 Neuron (10.1016/j.neuron.2015.03.025) showed a reduced CS+ odor response after odor-sugar conditioning in MBON-β′2mp (their Fig 3). However, they could not investigate the plasticity of MBON-γ5β′2a because the magnitude of odor response was too low (their Figure S3).

      Further, general statements about the compartments, for example for γ5 and α1, might need adjustment since the tools used, the respective driver lines, often don't label all dopamine neurons in one specific compartment. In fact, functional heterogeneity among dopamine neurons innervating the γ5 compartment has been recently established (sugar-reward, extinction) and might apply here.

      To clarify the point that we are manipulating a subset of DANs in each compartment, we added “cell count” information in Figure 2A. Also, we made Figure 4-figure supplement 2 to show which subtypes of DANs are connected with SMP108.

      Lastly, I would like to recommend that the authors discuss alternative feedback pathways that might serve similar or parallel functions.

      Despite these minor points, the study is impressive.

      Figure 4C shows several candidate interneurons that may have similar functions as SMP108. For instance, CRE011 may acquire enhanced response to reward-predicting odor as an outcome of reduced inhibition from MBON-γ5β′2a, and send excitatory inputs to DANs.

      In Figure 4-figure supplement 3, we made additional scatter plots to visualize other outlier cell types in terms of their connectivity with MBONs and DANs.

    1. The development of larp in the United States followed a trajectory common to larp across the world. The currently dominant forms of larp developed out of TRPGs, such as D&D. However, its evolution intersects with other practices as well: theater, parlor games, simulations like the Model United Nations, and, especially, historical reenactment. The Society for Creative Anachronism (SCA), a medieval reenactment organization started in 1966, was a significant influence, but United States Civil War reenactment was also important. It appears that the modern form of larp emerged in many places, nearly simultaneously across the country

      I wonder what the main differences are between North American larping and the type that occurs abroad. I rarely think about activities like this on a global scale, and I usually focus on what we do domestically. That is a goal of mine: to learn more about how activities and games may differ depending on where they are played.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      We thank all the Reviewers for their highly constructive reviews. Below, I have pasted the Reviewer’s comments in black and my replies in red, for easy reading.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In their study, Zhu and colleagues study how the centrosome proteins Spd-2 and Cnn in Drosophila recruit gamma-tubulin complexes to centrosomes, which is an important step in mitotic spindle formation. The authors make use of mutant flies and RNAi and find that the two factors Spd-2 and Cnn together are responsible for mitotic centrosomal accumulation of gamma-tubulin. By inactivating Spd-2 or Cnn separately, the authors show that Cnn appears to recruit the large share of mitotic gamma-tubulin pool by its CM1-domain. Interestingly, this involves only gamma-TuSCs (subcomplexes of gamma-TuRC) and not gamma-TuRCs. A smaller pool is recruited by Spd-2, and this pool depends on gamma-tubulin complex proteins that are only present in pre-assembled, complete gamma-TuRCs. This suggests that Drosophila makes microtubule nucleation templates in two ways. First, as in yeast, by direct recruitment of gamma-TuSCs to mitotic centrosomes, where additionally oligomerization needs to happen. And second, by recruitment and activation of preassembled gamma-TuRCs. Inactivation of both Cnn- and Spd-2 pathways abolishes mitosis-specific gamma-tubulin recruitment, resulting in low, but not complete loss of gamma-tubulin at centrosomes. The authors show that these low-gamma-tubulin centrosomes are still able to organize microtubules, but these microtubules have different dynamicity. Inspired by existing literature in flies and other model organisms, the authors identify Msps/Xmap215 as an important nucleation factor in this scenario.

      Major points:

      1) The authors use fly embryos with mutant Grip71, Grip75 and Grip163 alleles, which are central to the study. Most conclusions are based on the assumption that some mutants contain only gamma-TuSC, whereas wildtype cells contain a mix of gamma-TuSC and gamma-TuRC. It would be important to show sucrose gradient analyses of extracts to confirm the expected presence/absence of gamma-TuSC/gamma-TuRC.

      We agree that it would be nice to perform sucrose gradient analysis of γ-tubulin mutants in different mutant backgrounds, but unfortunately this is not as easy as the Reviewer may think. To clarify, we have used larval brain cells (not embryos) for the analysis of γ-tubulin recruitment to centrosomes. We cannot use embryos because most mutant combinations are lethal beyond larval stages, meaning that mutant adult females are not available for embryo collection (embryos use maternally loaded proteins and mRNA and so it is the genotype of the mother that is important). Performing sucrose gradients with larval brain extracts would be extremely challenging, if not impossible, because a relatively large amount of starting material is required for sucrose gradient centrifugation, and manually dissecting and preparing hundreds if not thousands of larval brains is unrealistic, especially as mutant larvae are rare.

      Given that we are not able to carry out these experiments, we have modified the text to include the caveat that some higher-order complexes may partially form in certain mutants. For example, in relation to the ability of Grip71 to recruit γ-TuSCs in cnn,grip75,grip163 mutants, the text now reads: “Thus, Spd-2 appears to recruit a very small amount of γ-TuSCs (which may, or may not, be present as larger assemblies due to an association with Grip128-γ-tubulin) via Grip71 (i.e. the recruitment that occurs in cnn,grip75GCP4,grip163GCP6 cells), but its recruitment of γ-tubulin complexes relies predominantly on the GCP4/5/4/6 core.”

      Nevertheless, the most important conclusion is that Cnn can recruit γ-TuSCs independent of pre-formed cytosolic γ-TuRCs and this is based on the finding from one particular mutant – the spd-2,grip71,grip75,grip128,grip163 mutant – where γ-tubulin levels at mitotic centrosomes are only very slightly reduced compared to single spd-2 mutants (Figure 1B). This conclusion is based on three assumptions that we argue are all very reasonable:

      Assumption 1: flies depleted of 2, if not all 3, GCP4/5/4/6 core components (grip75,grip128,grip163) do not have a functioning GCP4/5/4/6 core. The Grip75GCP4 allele is a null mutant and is combined with a deficiency chromosome that depletes the whole Grip75GCP4 gene, and the Grip163GCP6 allele is a very strong depletion allele and is also combined with a deficiency chromosome that depletes the whole Grip163GCP6 gene. Even if the efficiency of the RNAi against Grip128GCP5 were poor, it would be hard to form a GCP4/5/4/6 core without Grip75GCP4 and in the near absence of Grip163GCP6 (which together provide 3 of the 4 molecules of the complex, including the outermost ones).

      Assumption 2: cells depleted of the GCP4/5/4/6 core cannot assemble cytosolic γ-TuRCs. This is reasonable given that even individual depletion of Grip75GCP4, Grip128GCP5 or Grip163GCP6 already strongly reduces the presence of cytosolic γ-TuRCs (Vogt et al., 2006; Vérollet et al., 2006). In spd-2,grip71,grip75,grip128,grip163 mutant brain cells, the only γ-TuRC protein not targeted, except for the γ-TuSC components, is Actin (Mozart 1 is expressed only in testes (Tovey et al., 2018) and Mzt2 does not exist in flies). In Xenopus and humans, Actin appears to facilitate γ-TuRC assembly via interactions with a GCP6-N-term-Mzt1 module, and so it would be unlikely to allow γ-TuSC assembly into higher-order complexes without GCP6 (i.e. Grip163GCP6) and Mzt1.

      Assumption 3: Were Cnn not able to recruit γ-TuSCs independently of pre-formed γ-TuRCs, we would expect a much stronger reduction in γ-tubulin recruitment to centrosomes in spd-2,grip71,grip75,grip128,grip163 mutant cells. It is reasonable to assume, even without sucrose gradients, that the assembly of γ-TuRCs is strongly impeded in spd-2,grip71,grip75,grip128,grip163 mutant cells. Nevertheless, γ-tubulin is still recruited to centrosomes at ~66% compared to ~77% in spd-2 single mutant cells. While statistically significant (as stated in the updated manuscript), this reduction would surely be much greater were Cnn not able to recruit γ-TuSCs.

      In the absence of experimental data, we have therefore made these arguments in the main text by making some text modifications and adding a new paragraph, as follows:

      “…the centrosomes in spd-2,grip71,grip75GCP4,grip128GCP5-RNAi,grip163GCP6 mutant cells had ~66% of the γ-tubulin levels found at wild-type centrosomes, only slightly lower than ~77% in spd-2 mutants alone (Figure 1A,B). Thus, the recruitment of γ-tubulin to mitotic centrosomes that occurs in the absence of Spd-2, i.e. that depends upon Cnn, does not appear to require Grip71 or the GCP4/5/4/6 core.

      While we cannot rule out that residual amounts of GCP4/5/4/6 core components in spd-2,grip71,grip75GCP4,grip128GCP5-RNAi,grip163GCP6 mutant cells may support a certain level of γ-TuSC oligomerisation in the cytosol, we favour the conclusion that Cnn can recruit γ-TuSCs directly to centrosomes in the absence of the GCP4/5/4/6 core for several reasons: First, the alleles used for grip71 and grip75GCP4 are null mutants, and the allele for grip163GCP6 is a severe depletion allele (see Methods), and even individual mutations in, or RNAi-directed depletion of, Grip75GCP4, Grip128GCP5 or Grip163GCP6 are sufficient to strongly reduce the presence of cytosolic γ-TuRCs (Vogt et al., 2006; Vérollet et al., 2006). Second, spd-2,grip71,grip75GCP4,grip128GCP5-RNAi,grip163GCP6 mutant cells are depleted for all structural γ-TuRC components except for γ-TuSCs and Actin (note that Mozart1 (Mzt1) is not expressed in larval brain cells (Tovey et al., 2018) and that Mzt2 does not exist in flies). In human and Xenopus γ-TuRCs, Actin supports γ-TuRC assembly via interactions with a GCP6-N-term-Mzt1 module (Liu et al., 2019; Wieczorek et al., 2019, 2020; Zimmermann et al., 2020; Consolati et al., 2020), and so Actin alone is unlikely to facilitate assembly of γ-TuSCs into higher order structures. Third, our data agree with the observation that near complete depletion of Grip71, Grip75GCP4, Grip128GCP5, and Grip163GCP6 from S2 cells does not prevent γ-tubulin recruitment to centrosomes (Vérollet et al., 2006). Fourth, given the strength of mutant alleles used, one would have expected a much larger decrease in centrosomal γ-tubulin levels in spd-2,grip71,grip75GCP4,grip128GCP5-RNAi,grip163GCP6 mutant cells were Cnn not able to directly recruit γ-TuSCs to centrosomes.
      Thus, our finding that Cnn can still robustly recruit γ-tubulin to centrosomes in spd-2,grip71,grip75GCP4,grip128GCP5-RNAi,grip163GCP6 mutant cells strongly suggests that Cnn can recruit γ-TuSCs to centrosomes without a requirement for them to first assemble into higher-order complexes.”

      2) Given the advantage of the CnnΔCM1 separation of function mutant, I do not understand why it is not used throughout the study. Instead, full Cnn loss is used, which results in strongly reduced Spd-2 levels (Figure 2C,D). Are the observed differences between wild-type and mutants in Figure 2-5 dependent on defective PCM or do they also occur in a CnnΔCM1 background?

      This is a good point, and we agree that it would have been “cleaner” to use the CnnΔCM1 mutant in these experiments. The reason that the CnnΔCM1 mutant was not used is that this mutant allele was made only after we had already generated the multi-allele stocks and performed most of the other experiments in Figures 2-5. It would have taken a long time to go back and generate fly stocks containing the CnnΔCM1 allele instead of the cnn null mutant allele. As we have shown that the CnnΔCM1 mutant cannot recruit any γ-tubulin, we don’t believe that using this mutant would change the results regarding recruitment of γ-tubulin by the spd-2 pathway i.e. when we have examined γ-tubulin recruitment in the cnn mutant background (Figure 2). Nevertheless, in terms of the efficiency to which microtubules can be nucleated in the absence of γ-tubulin complexes, which was examined in a cnn,grip71,grip163 mutant background, it is likely that using a cnnΔCM1,grip71,grip163 mutant background would better maintain Spd-2 in the PCM and thus better allow Msps and Mei-38 to stimulate microtubule nucleation. We may therefore find that microtubules can be nucleated even more efficiently in the absence of γ-TuRCs. Note that we do state this caveat in the paper. That said, performing the experiments would not be essential to conclude that microtubules can be nucleated independent of γ-TuRCs, which is the main point of this part of the paper.

      Should the Reviewer and Editor deem it necessary, we will generate CnnΔCM1,grip71,grip163 lines to test whether or not γ-tubulin can be recruited to mitotic centrosomes under these conditions, and, if no γ-tubulin is recruited, we will generate CnnΔCM1,grip71,grip163,Jup-mCherry lines to test the ability of these centrosomes to nucleate microtubules (using the CherryTemp). Please note, however, that this would be several months of difficult fly genetics and data collection and we would therefore appreciate it if you consider the cost/benefit ratio when deciding whether you expect these data.

      3) Statistical tests should support the conclusions in the text. If the authors claim differences between different genetic backgrounds (e.g. that spd2-mutants only have ~77% of gamma-tubulin at mitotic centrosomes compared to wild-type), statistical tests must compare mutant mitosis vs. wild-type mitosis.

      We agree. We have now carried out the appropriate statistical tests and included them in the new version of the paper. For more detail, see the response to Reviewer 2 point 2.

      4) While Cnn, grip71, grip163 mutants do not accumulate gamma-tubulin at centrosomes in mitosis, they still have low levels of centrosomal gamma-tubulin. It is therefore misleading to refer to "gamma-tubulin negative centrosomes".

      This is a fair point. While we suspect this small fraction of γ-tubulin is non-functional in regard to microtubule nucleation i.e. it is the interphase pool of γ-tubulin and interphase centrosomes do not organise microtubules, we agree that referring to them as "gamma-tubulin negative centrosomes" is misleading. We have now changed the text to refer to them simply as “cnn,grip71,grip163 mutant centrosomes” or “cnn,grip71,grip163 centrosomes”.

      Minor points:

      1) The abstract states that gamma-TuRC is a catalyst of microtubule nucleation. By definition, a catalyst takes part in a reaction but is not part of the final product. Although our knowledge of the nucleation mechanism is still incomplete, mechanistic studies suggest a non-catalytical mechanism since gamma-TuRC was found to stay attached to the microtubule end after nucleation (Consolati et al. 2020, Wieczorek et al. 2020).

      We have now removed any reference to the γ-TuRC being a catalyst.

      2) CnnΔCM1 flies: genotyping data should be provided besides describing gRNAs.

      We are not entirely sure what the Reviewer means here. We had already stated in the main text and methods that the deletion region spanned from R98 to D167. For further clarity, we have now included the word “inclusive” in both the main text and the methods: main text: “We therefore used CRISPR combined with homology-directed repair to delete the CM1 domain (amino acids 98-167, inclusive) from the endogenous cnn gene…”; Methods: “R98 to D167, inclusive”. Please do let us know if further information is required.

      3) Is it important to combine spd-2 with all four mutants, grip75 grip128 grip163 and grip71? What about spd-2 grip71 cells and spd-2 grip75 grip128 grip163 cells? Should that not have the same effect?

      This comment relates to Major point 1, as our main conclusion (that Cnn can recruit γ-TuSCs) is only possible when combining spd-2 with all four mutants, i.e. targeting all γ-TuRC-specific proteins is the most likely way to deplete as many pre-formed γ-TuRCs as possible. Depleting only Spd-2 and Grip71 would leave fully assembled γ-TuRCs in the cytosol, as assembly does not require Grip71. Depleting Spd-2, Grip75, Grip128, and Grip163 would prevent cytosolic γ-TuRC assembly, but there is a possibility that Grip71 may still act as a link between γ-TuSCs and Cnn. It was therefore necessary to deplete Spd-2, Grip75, Grip128, Grip163, and Grip71.

      4) CM1-containing factors are the only known factors able to directly bind and activate gamma-TuRC. How do the authors envision activation of gamma-TuRC in the absence of Cnn?

      This is a good question but remains unanswered. Phosphorylation of γ-TuRCs is the most obvious possibility. For example, Aurora A phosphorylates NEDD1 (homologue of Grip71) to promote microtubule nucleation (Pinyol et al., 2013). NME7 kinase has been shown to increase the activity of purified γ-TuRCs (Liu et al., 2014). Other γ-TuRC components are also phosphorylated, but the consequences on γ-TuRC activity are not known. Another possibility is that TOG proteins indirectly promote the closing of the γ-TuRCs while adding tubulin dimers onto γ-tubulin (Thawani et al., 2020).

      5) Do the authors think that each identified pathway to microtubule nucleation (i.e. Spd-2/gamma-TuRC, Cnn/gamma-TuSC, Msps/mei38) as revealed by mutant genetic backgrounds contributes to a similar extent to overall nucleation capacity also in an unperturbed genetic background?

      Another good question, but it is very difficult to answer. Our view is that when γ-TuRCs are present and active they will likely dominate microtubule nucleation, out-competing the ability of TOG domain proteins to stimulate microtubule nucleation independently of γ-TuRCs. Nevertheless, TOG proteins will likely help promote microtubule nucleation from γ-TuRCs when both are present, as has been previously shown in vitro (Thawani et al., 2018; King et al., 2020; Consolati et al., 2020) and in fission yeast (Flor-Parra et al., 2018). We also believe that both Spd-2 and Cnn γ-TuRC recruitment pathways will contribute simultaneously. Another question is whether Cnn recruits γ-TuRCs instead of γ-TuSCs when γ-TuRCs are present in the cytosol. We assume this will depend on Cnn’s affinity for γ-TuRCs versus γ-TuSCs and on the relative levels of γ-TuRCs and γ-TuSCs in the cytosol.

      6) How does CM1 mediate binding to gamma-TuRC? Using recombinant Cnn fragments, the authors find that a Cnn triple mutant (R101Q, E102A and F115A) no longer binds gamma-tubulin, suggesting these residues together mediate binding to gamma-tubulin complexes. However, it is not tested to what extent R101, E102 and F115 individually contribute to gamma-tubulin binding. Does the binding mode in Drosophila resemble more the one in humans or in budding yeast? Also, was this done with extracts from Grip71, Grip75, Grip128RNAi, Grip163 embryos or normal embryos?

      In future, we will test the relative contributions of R101, E102 and F115, but for this study we wanted only to show that the CM1 domain was necessary for Cnn binding (hence why we directly mutated all three residues). We apologise for not stating that the IPs were carried out using wild-type embryo extracts – we have now included this information in the main text and methods.

      7) Figure 2C: Should the green channel not correspond to Spd-2?

      Thank you for pointing out this mistake – now corrected.

      8) I suggest to reconsider the color-coding of graphs. While the colored background of the dot plots in Figure 1 and 2 are a matter of taste, the coloring of graphs in Figure 4F-H is confusing. Here, genetic backgrounds of fly lines are colored in the same way as the microscopy channels in Figure 4A-E, but they do not belong together.

      We have now modified the colour-coding of images/graphs in Figure 4A-E as suggested.

      9) A tacc mutant allele is used in experiments, but is not further described. Please provide the necessary background information.

      We thank the reviewer for pointing this out. We had also forgotten to include the msps alleles used. The information for msps and tacc is now included in the methods.

      10) The authors assess spindle quality in Cnn, grip71, grip163 cells and show that spindle quality worsens with ectopic msps. For comparison it would be good to compare spindle quality side by side with a wild-type situation.

      This data is now included in Figure S4A,B.

      11) Introduction: "[...], however, as they depend upon each other for their proper localisation within the PCM and act redundantly." - Sentence is incomplete.

      I think this was just to do with how we had phrased the sentence (the position of “however” was confusing). We have now rephrased the sentence: “It is complicated, however, to interpret the individual role of these proteins in the recruitment of γ-tubulin complexes, as they depend upon each other for their proper localisation within the PCM and act redundantly”.

      12) Introduction: "Cnn contains the highly conserved CM1 domain (Sawin et al., 2004), which binds directly to γ-tubulin complexes in yeast and humans (Brilot et al., 2021; Wieczorek et al., 2019)". - Choi et al 2010 should also be cited here.

      This citation has been added.

      13) Results: "Typically, interphase centrosomes have only ~5-20% of the γ-tubulin levels found at mitotic centrosomes, [...]". - Citation is needed

      We now cite our Conduit et al., 2014 paper.

      14) The authors should discuss that Msps was found to act non-redundantly with gamma-tubulin in interphase nucleation (Rogers, MBC, 2008), contrary to the conclusions in the current manuscript.

      Thank you for pointing this out. We have now modified the relevant part of the discussion to read:

      “TOG domain and TPX2 proteins have been shown to work together with γ-TuRCs (or microtubule seed templates) to promote microtubule nucleation (Thawani et al., 2018; Flor-Parra et al., 2018; Gunzelmann et al., 2018b; Consolati et al., 2020; King et al., 2020; Wieczorek et al., 2015). Consistent with this, co-depletion of γ-tubulin and the Drosophila TOG domain protein Msps did not delay non-centrosomal microtubule regrowth after cooling compared to single depletions in interphase S2 cells (Rogers et al., 2008). Nevertheless, several studies, mainly in vitro, have shown that TOG and TPX2 proteins can also function independently of γ-TuRCs to promote microtubule nucleation (Roostalu et al., 2015; Woodruff et al., 2017; Schatz et al., 2003; Slep and Vale, 2007; Ghosh et al., 2013; Thawani et al., 2018; King et al., 2020; Zheng et al., 2020; Tsuchiya and Goshima, 2021; Imasaki et al., 2022). Our data suggest that, unlike from non-centrosomal sites in interphase S2 cells, Msps can promote γ-TuRC-independent microtubule nucleation from centrosomes in mitotic larval brain cells. This difference may reflect the ability of centrosomes to concentrate Msps at a single location.”

      **Referees cross-commenting**

      This is a good paper in my opinion, they need to add some controls though, to determine the expected presence/absence of gTuSC/gTuRC in the different mutants. An important advance is the finding that gTuSC can function as nucleator in parallel to gTuRC, depending on the recruitment mechanism. Different recruitment mechanisms, nucleation templates, and regulatory strategies co-exist and provide complex regulation and robustness to nucleation/spindle assembly.

      We thank the Reviewer for their thorough and constructive review. We hope they will agree to allow publication without us having to perform the sucrose gradient experiments that, as discussed above, will be very difficult, if not impossible, to carry out.

      Reviewer #1 (Significance (Required)):

      This is a very well-executed study and the data is presented clearly. However, some findings would benefit from additional experiments to substantiate the main interpretations. If these points are addressed, the study would provide an important conceptual advance in the field, namely that animal cells may rely on two different gamma-tubulin complexes for nucleation at mitotic centrosomes, gamma-TuSC and gamma-TuRC, which differ not only in their composition of GCP proteins but also the mode of recruitment to the centrosome. The findings will be of interest to all cell biologists.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary This paper sets out to further our understanding of how two proteins, Cnn and Spd-2, independently recruit g-Tubulin ring complexes (g-TuRC) to mitotic centrosomes in Drosophila cells. It uses some robust classical genetics to generate mutants to reduce/remove GCP4/5/6, Dgrip71 and Cnn and Spd-2 from cells, monitoring the consequences using live imaging.

      It begins by showing that Cnn can recruit g-Tubulin independently of the core g-TuRC components or Dgrip71, and that a mutant Cnn lacking the CM1 domain cannot, strongly suggesting that, similarly to other organisms, the CM1 domain is essential for this function.

      It then demonstrates that Spd-2, in contrast, cannot localise g-Tubulin in the absence of the g-TuRC components or Dgrip71.

      In the second half of the paper, they then use this tool as a proxy for centrosomes that completely lack mitotic g-Tubulin recruitment, in order to explore spindle assembly in the absence of centrosomal g-Tubulin. They show that microtubules and spindles are still nucleated but do so with different dynamics. This section is particularly convincing, given the use of the live de/repolymerisation assays using the CherryTemp device.

      Finally, the authors visualise spindle formation in the absence of centrosomal g-Tubulin, alongside a number of other MT associated proteins, including Msps.

      Major Comments 1. The claims and conclusions relating to the first half of the paper are supported by the data, but they need to be caveated by a clear explanation of the alleles used. Some are well-characterised mutant lines but have they been previously shown to completely remove the associated protein products? For the RNAi lines, do the authors have evidence (via Western blots) that these remove the protein products? It is not necessary that they show Western blots for all the lines, and it does not invalidate the major conclusions that the fly line carrying mutations in cnn, grip71, grip163 completely fails to localise g-Tubulin to mitotic centrosomes. However, they need to help the reader understand much more clearly whether these lines are complete nulls and, consequently this may impact the strength of their interpretation of the relationship between Grip163 versus Grip75, discussed both at the end of the relevant section and in the Discussion.

      We appreciate the reviewer’s concern and have now included a detailed description in the Methods section of the alleles we use and their known effect on protein levels (pasted below for convenience). We have also included western blots for cnn and spd2 mutants to show the absence of detectable protein in larval brains. Unfortunately, we could not provide western blots for the other mutants, as we don’t have working antibodies for these proteins (although for Grip71 we did make an antibody and did western blots that showed the absence of protein in grip71 mutants, but this antibody has now been commercialised and so the western blot is published on the CRB website: https://crbdiscovery.com/polyclonal-antibodies/anti-grip71-antibody/). Nevertheless, protein levels for the grip75, grip163, msps and tacc mutants have been shown previously (now cited in the new text). We have also modified the main text to allow the reader to better understand whether proteins are completely absent or strongly reduced. In response to the specific comment about interpreting the relationship between Grip163 and Grip75, as we mention in the new methods section, the Grip75 allele is a null mutant while the Grip163 mutant is a severe depletion; thus, the fact that the Grip163 mutant has a stronger effect on γ-tubulin recruitment is not due to a stronger depletion.

      New text in methods: “For spd-2 mutants, we used the dspd-2Z35711 mutant allele, which carries an early stop codon resulting in a predicted 56aa protein. Homozygous dspd-2Z35711 mutant flies lack detectable Spd-2 protein on western blots and so the allele is therefore considered to be a null mutant (Giansanti et al., 2008). This allele no longer produces homozygous flies (which is common for mutant alleles kept as balanced stocks for many years), so we combined dspd-2Z35711 with a deficiency that includes the entire spd-2 gene (dspd-2Df(3L)st-j7). On western blots, there was no detectable Spd-2 protein in extracts from dspd-2Z35711 / dspd-2Df(3L)st-j7 hemizygous mutant brains (Figure S4B). For cnn mutants, we combined the cnnf04547 and cnnHK21 mutant alleles. The cnnf04547 allele carries a piggyBac insertion in the middle of the cnn gene and is predicted to disrupt long Cnn isoforms, including the centrosomal isoform (Cnn-C or Cnn-PA) (Lucas and Raff, 2007). This mutation is considered to be a null mutant for the long Cnn isoforms (Lucas and Raff, 2007; Conduit et al., 2014). The cnnHK21 allele carries an early stop codon after Cnn-C’s Q78 (Vaizel-Ohayon and Schejter, 1999) and affects both long and short Cnn isoforms – it is considered to be a null mutant (Eisman et al., 2009; Chen et al., 2017a). On western blots, there was no detectable Cnn-C protein in cnnf04547 / cnnHK21 hemizygous mutant brains (Figure S4A). For Grip71, we used the grip71120 mutant allele, which is a result of an imprecise p-element excision event that led to the removal of the entire grip71 coding sequence except for the last 12bp; it is considered to be a null mutant (Reschen et al., 2012). We combined this with an allele carrying a deficiency that includes the entire grip71 gene (grip71Df(2L)Exel6041). On western blots, there is no detectable Grip71 protein in grip71120 / grip71df6041 hemizygous mutant brains (see blots on CRB website, which were performed by us).
For Grip75GCP4, we used the grip75175 mutant allele, which carries an early stop codon after Q291. Homozygous grip75175 mutant flies lack detectable Grip75GCP4 protein on western blots and so the allele is therefore considered to be a null mutant (Schnorrer et al., 2002). We combined this with an allele carrying a deficiency that includes the entire grip75GCP4 gene (grip75Df(2L)Exel7048). In the absence of a working antibody, we have not confirmed the expected absence of Grip75GCP4 protein in grip75175 / grip75Df(2L)Exel7048 hemizygous mutant flies on western blots. For Grip128GCP5, we used the UAS-controlled grip128-RNAiV29074 RNAi line, which is part of the VDRC’s GD collection, and drove its expression using the Insc-Gal4 driver (BL8751), which is expressed in larval neuroblasts and their progeny. In the absence of a working antibody, we have not confirmed the absence or reduction of Grip128GCP5 protein on western blots. RNAi was used for grip128GCP5 as its position on the X chromosome made generating stocks with multiple alleles technically challenging. For Grip163GCP6, we used the grip163GE2708 mutant allele, which carries a p-element insertion between amino acids 822 and 823 (total protein length is 1351aa) and behaves as a null or strong hypomorph mutant (Vérollet et al., 2006). We combined this with an allele carrying a deficiency that includes the entire grip163GCP6 gene (grip163Df(3L)Exel6115). In the absence of a working antibody, we have not confirmed the absence or reduction of Grip163GCP6 protein in grip163GE2708 / grip163Df(3L)Exel6115 hemizygous mutant flies on western blots. For Msps, we used the mspsp and mspsMJ15 mutant alleles. The mspsp allele carries a p-element insertion within, or close to, the 5’ UTR of the msps gene and results in a strong reduction, but not elimination, of Msps protein on western blots (Cullen et al., 1999). 
The mspsMJ15 allele was generated by re-mobilisation of the p-element (the genetic consequence of which has not been defined) and also results in a strong reduction, but not elimination, of Msps protein on western blots (Cullen et al., 1999; Lee et al., 2001). For TACC, we used the taccstella allele, which contains a p-element insertion of unknown location but that results in no detectable TACC protein on western blots (Barros et al., 2005). For Mei-38, we used the UAS-controlled mei-38-RNAiHMJ23752 RNAi line, which is part of the NIG’s TRiP Valium 20 collection, and drove its expression using the Insc-Gal4 driver (BL8751). In the absence of a working antibody, we have not confirmed the absence or reduction of Mei-38 protein on western blots. RNAi was used for mei-38 as its position on the X chromosome made generating stocks with multiple alleles technically challenging. Moreover, the only available mutant of mei-38 affects a neighbouring gene.”

      I have an issue with the statistics in Figures 1 & 2. I realise the t-tests in Figure 1 show the significant differences between g-Tubulin recruitment to centrosomes in interphase and mitosis, in order to demonstrate the difference between the Spd-2;Grip combination line in (B) and the Spd-2; CnnCM1 double mutant in (D). But in doing so, it draws attention to the fact that there is no similar t-test between mitotic g-Tubulin recruitment to centrosomes in WT, Spd-2 and the Spd-2;Grip combination lines. This lack of stats between conditions is further confused by the language used in the text: In the Figure legend, the authors claim mitotic centrosomal g-Tubulin levels between WT, Spd-2 and the Spd-2;Grip combination lines are "similar", and in the text they say: the spd-2 Grip combination line had g-Tubulin "similar to the levels found at spd-2 mutants alone". But then they give numbers - an average of 77% of wild type for spd2 and 66% of wild type for the spd-2 Grip combination. I'm sure if they did a t-test they would find a significant difference between these conditions. This doesn't invalidate the thrust of what they're claiming, but they do need to be consistent in language, analysis and interpretation.

      We agree that we should have performed a statistical comparison between the γ-tubulin levels for “WT mitosis” vs “spd2 mitosis” and for “spd-2 mitosis” vs “spd2,grip71,grip75,grip128,grip163 mitosis” (Figure 1B). We have now done this and found statistically significant differences in both cases. We have included the new p-values in the figure and modified the main text to read: “In fact, the centrosomes in these spd-2,grip71,grip75GCP4,grip128GCP5-RNAi,grip163GCP6 mutant cells had ~66% of the γ-tubulin levels found at wild-type centrosomes, only slightly lower than the levels found at spd-2 mutants alone (Figure 1A,B).”; and we have modified the legend to read: “A one-way ANOVA with a Sidak’s multiple comparisons test was used to make the comparisons indicated by p values in the graph. Note that there is only a small reduction in mitotic centrosomal γ-tubulin levels in spd-2 mutants and in spd-2, grip71,grip75GCP4, grip128GCP5-RNAi,grip163GCP6 mutants, showing that Cnn can still efficiently recruit γ-tubulin complexes to mitotic centrosomes when only γ-TuSCs are present.” Note that due to performing comparisons multiple times with the same data sets, it was necessary to use a one-way ANOVA with a Sidak’s multiple comparisons test (rather than paired t-tests).

      For Figure 1D, we did not compare WT mitosis vs cnn∆CM1,spd-2 mitosis, as the point here was to test whether there was an increase from interphase to mitosis in cnn∆CM1,spd-2 mutants and we wanted to maintain the statistical power of using a paired t-test (one is more likely to detect differences with a paired t-test than with a multiple comparisons ANOVA, making the conclusion that there is no difference between interphase and mitotic cnn∆CM1,spd-2 centrosomes even more solid).

      Similarly, in Figure 2, it would be better to assess any statistically significant difference between mitotic accumulation of g-Tubulin between fly lines, rather than accumulation between interphase and mitosis (which is pretty clear cut). This would help to clarify whether differences between loss of grip subunits are merely additive or synergistic. Again, this doesn't invalidate the overall result that concomitant loss of cnn, grip71 and grip163 completely abolishes mitotic centrosomal accumulation of g-Tubulin, but it is a more complete analysis of the extant data.

      As for Figure 2, we respectfully disagree that we should make comparisons between genotypes instead of, or in addition to, making comparisons between interphase and mitotic centrosomes within the same genotype. This is because we will lose statistical power by performing a multiple comparisons test. Indeed, if we were to compare both within and between selected genotypes (14 comparisons in total), then we lose the statistically significant differences between interphase and mitotic centrosomes in cnn,grip75,grip163 (p=0.04) and cnn,grip71,grip75 (p=0.08) genotypes, when there clearly appears to be a difference (as stated by the Reviewer). Given that the point of this experiment is to elucidate which proteins are required to allow maturation from interphase to mitosis, rather than which combination of mutations has the stronger effect, we feel that maintaining the paired t-test analysis is more appropriate.
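      The loss of power described above follows from how a multiple-comparisons correction inflates each p-value as the number of comparisons grows. A minimal sketch, assuming the standard single-step Šidák adjustment p_adj = 1 − (1 − p)^m applied to a raw p-value; the function name and the example p-value are hypothetical and are not taken from the manuscript's data. A Sidak post-test after a one-way ANOVA, as run in statistics packages, also uses a pooled variance estimate, so this illustrates the principle only and does not reproduce the authors' analysis.

```python
def sidak_adjust(p_raw: float, n_comparisons: int) -> float:
    """Single-step Sidak adjustment of one raw p-value: 1 - (1 - p)^m."""
    if not 0.0 <= p_raw <= 1.0:
        raise ValueError("p_raw must lie in [0, 1]")
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return 1.0 - (1.0 - p_raw) ** n_comparisons


if __name__ == "__main__":
    # Hypothetical raw p-value, chosen for illustration only.
    p_raw = 0.01
    for m in (1, 2, 14):
        print(f"m={m:2d}  adjusted p = {sidak_adjust(p_raw, m):.4f}")
```

      The same raw p-value yields a larger adjusted p-value as m increases from 1 to 14, which is why a difference that is significant in a single paired comparison can fall below the significance threshold once many between-genotype comparisons are added.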

      One OPTIONAL experiment that would significantly improve the study would be similar CherryTemp live imaging of the cells lacking both centrosomal g-Tubulin and Msps. Currently the manuscript finishes with a fixed analysis of MT de/repolymerisation in these cells, which provides evidence that Msps has a role in MT nucleation in the absence of centrosomal g-Tubulin-nucleated MTs, but very little else can be concluded.

      We would love to do this experiment but the genetics are complicated. We would have to generate stocks containing a cnn,grip71,GFP-PACT triple allele chromosome II and a grip163,msps,Jupiter-mCherry triple allele chromosome III. While live data would provide interesting insights into the dynamics of microtubules nucleated in the absence of γ-TuRCs and reduction of Msps, our fixed analysis is at least sufficient to implicate Msps in γ-TuRC-independent microtubule organisation.

      1. There is, perhaps surprisingly, no mention of Augmin in the paper. Augmin has been shown to recruit g-TuRC to pre-existing MTs, through the grip71 subunit (Chen et al., 2017). So, presumably, in cnn, grip71, grip163, g-Tubulin cannot be recruited to pre-existing MTs either? This could add impact to the results - in that it implies the MT nucleation seen in the absence of cnn, grip71 and grip163 actually reflects, not just loss of centrosome function, but also loss of Augmin function. Mentioning this in the discussion could help increase the impact of the paper.

      We apologise for this oversight. Indeed, it is perfectly possible that Grip71/Augmin-mediated amplification of microtubules during microtubule re-growth from centrosomes could influence the difference in recovery rates between control and mutant centrosomes. We have now modified the results section to read:

      “Our data suggest that microtubules are more resistant to cold-induced depolymerisation when they have been nucleated independently of γ-TuRCs, but that microtubules are nucleated more efficiently when γ-TuRCs are present. However, it must be considered that, due to the loss of Cnn from centrosomes in the cnn,grip71,grip163 mutant cells, general PCM levels are reduced, likely reducing the levels of any protein involved in γ-TuRC-independent microtubule nucleation. Moreover, Grip71 is necessary for γ-TuRC recruitment to microtubules, most likely via the Augmin complex (Reschen et al., 2012; Chen et al., 2017b; Dobbelaere et al., 2008; Vérollet et al., 2006), enabling microtubules to be nucleated from the sides of pre-existing microtubules. Thus, the potential for Augmin-mediated amplification of centrosome-nucleated microtubules in control cells may also contribute to the increased microtubule recovery speed in control cells. Importantly, however, both of these caveats make it even clearer that microtubules can be nucleated independently of γ-TuRCs from mitotic centrosomes in Drosophila.”

      Minor comments 1. The cnn, grip71, grip163 mutant image in Fig3 B after 40 min cooling appears to have 4 centrioles. Is this a cell that exited and re-entered mitosis?

      Cnn mutant cells often have centrosome segregation problems, resulting in cells with variable numbers of centrioles (Conduit et al., 2010b, Current Biology). We have now mentioned this in the legends for Figure 3, Figure 4, and Figure S4.

      Methods should contain more detail on the de/repolymerisation live imaging analysis (including the numbers of cells contributing to the analysis) and techniques such as exponential curve fitting.

      We have now included this information in the methods and updated this information in the figure legend (to include cell numbers, not just centrosome numbers, and to indicate that GraphPad Prism was used to generate the models).
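      For readers unfamiliar with exponential curve fitting, the idea can be sketched as follows. This is a minimal illustration, assuming a one-phase decay y = y0·exp(−k·t) fitted by ordinary least squares on ln(y). GraphPad Prism fits such models by nonlinear regression, so this log-linear shortcut is a stand-in and not the authors' procedure; the function name and the synthetic data are hypothetical.

```python
import math


def fit_one_phase_decay(times, values):
    """Fit y = y0 * exp(-k * t) by least squares on ln(y).

    Returns (y0, k). All values must be strictly positive.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (t, y) pairs of equal length")
    logs = [math.log(v) for v in values]  # raises ValueError if v <= 0
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("times must not all be identical")
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    slope = sxy / sxx                     # slope of ln(y) against t is -k
    intercept = mean_l - slope * mean_t   # intercept is ln(y0)
    return math.exp(intercept), -slope


if __name__ == "__main__":
    # Synthetic, noise-free signal (not data from the study).
    ts = [0, 1, 2, 3, 4, 5]
    ys = [100.0 * math.exp(-0.5 * t) for t in ts]
    y0, k = fit_one_phase_decay(ts, ys)
    print(f"y0 = {y0:.2f}, k = {k:.3f}, half-life = {math.log(2) / k:.3f}")
```

      The log transform weights small values more heavily than a direct nonlinear fit does, so on noisy fluorescence data the two approaches can give different rate constants.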

      P5 para 2 - "GPC4/5/4/6" should read "GCP4/5/6"

      We actually use the GCP4/5/4/6 nomenclature throughout as it represents the 2 copies of GCP4 to one copy of GCP5 and GCP6 in the complex, as well as the order of these molecules.

      Fig legend 1 - "error bar" should read "scale bar"

      Thanks, now corrected

      Reviewer #2 (Significance (Required)):

      The experimental approach (genetics and cell biology) taken in this manuscript is very appropriate and the experiments are of high quality. It uses the strengths of Drosophila to cleverly engineer flies to pull apart the relationship between two different ways to recruit the main MT nucleator, g-Tubulin, to mitotic centrosomes. This is an important advance for the specific research field of centrosome biology.

      By generating a fly that completely fails to localise g-Tubulin to mitotic centrosomes, the paper is able to explore whether MTs and the mitotic spindle can form in its absence. Again, there is very high quality imaging and image analysis, using a commercially available (but very cool) fast heating/cooling apparatus - the CherryTemp to explore the dynamics of MT generation. The limitation to this approach, though, is that g-Tubulin itself is still present and presumably able to nucleate MTs in the cytosol or elsewhere, albeit inefficiently. As such, it adds to a body of centrosomal and cell division research, rather than adding a highly significant conceptual advance.

      Similarly, the finding that Msps is involved in nucleating MTs in the absence of centrosomal g-Tubulin, via fixed analysis, supports other work, rather than moving the field forwards.

      Overall, assuming the caveats mentioned in the major comments are dealt with, I see this as a robust and very well carried out piece of research, that will be of interest to those investigating the broad field of cell division

      My field of expertise is Drosophila cell division

      We thank the Reviewer for their thorough and constructive review. We hope that the reviewer may agree with us and the other Reviewers that revealing the complexity of γ-TuRC recruitment and microtubule nucleation at centrosomes, particularly the finding that different types of γ-tubulin complexes are recruited to centrosomes by different tethering proteins, provides an important conceptual advance.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Centrosomes are complex and it has been appreciated for some time that they likely nucleate microtubules by more than one mechanism. However, what these mechanisms exactly are, and which are the most significant has not been clear. A major contributor to centrosomal microtubule nucleation is the tubulin isoform gamma-tubulin (g-tubulin), which is present in two complexes, a smaller gTuSC that contains gamma tubulin along with GCP2 and 3 and a larger g-tubulin ring complex (gTuRC) whose assembly additionally requires GCP4/5/6. A second high-level question has been whether the centrosome has any g-tubulin-independent microtubule nucleation mechanisms. In this manuscript, the authors use a collection of mutants and RNAi conditions in the Drosophila brain to generate a picture of centrosomal microtubule nucleation pathways. They show that there are two g-tubulin-dependent and a third g-tubulin-independent microtubule nucleation pathways. They show that the first g-tubulin-dependent pathway depends on the CM1 domain of the centrosomal PCM matrix protein, Centrosomin (Cnn) and on the gTuSC components GCP2/3, but not on the components specifically required for gTuRC assembly. The second g-tubulin-dependent pathway depends on Spd-2 (and not Cnn) and requires the gTuRC-specific components and NEDD1/Grip71. By inhibiting both of these pathways, the authors also show that there is a robust g-tubulin-independent microtubule nucleation pathway. Overall, this work has the potential to be an impactful contribution from a conceptual point-of-view. I would be excited to recommend publication if the major comments below, particularly points 1 and 2, could be addressed.

      1. The experiment in Fig. 2B examines what is required for Spd-2 to recruit g-tubulin to mitotic centrosomes that lack Cnn. This panel should include a cnn mutant-only control, for which the readers are currently referred to an older paper from 2014. Without repeating this control in parallel to one of the conditions in this panel, it is impossible to say whether the addition of the grip71 mutation has any effect on g-tubulin levels.

      This is a good point. We will perform a cnn vs cnn,grip71 experiment and include this data in the new version of the paper. This will take a couple of months, as this will involve growing up fly lines, performing the necessary crosses, microscopy, data analysis, and manuscript updating.

      2. The experiment in Fig. 2B is in the background of a Cnn loss-of-function mutation in which centrosomal Spd-2 is at just under 40% of its levels in brains with Cnn (according to Fig. 2D). So the Spd-2 doing the recruiting is the non-Cnn-dependent population. The authors should also do one experiment in the background of their Cnn-CM1delete mutant or their Cnn CM1 g-tubulin recruitment mutant, because these backgrounds would be expected to have normal amounts of Cnn matrix and normal levels of Spd-2. Comparing the amount of g-tubulin recruitment in a cnn loss-of-function mutant to that in a cnn-CM1delete mutant would reveal whether the Cnn-bound Spd-2 can contribute to g-tubulin recruitment in the same way that the Cnn-independent Spd-2 can. These two populations could easily differ in their ability to recruit g-tubulin. Also, is it clear that these two pathways can act in parallel (i.e. that assembly of the Cnn matrix around the centriole does not mask the ability of Cnn-independent Spd-2 to recruit g-tubulin)? Thus, there are three possibilities, all interesting, for the outcome of this experiment. The Cnn-CM1delete mutant/Cnn-CM1 g-tubulin recruitment mutants could: (1) recruit less g-tubulin than the cnn loss-of-function mutant (if Cnn matrix assembly inhibits the Cnn-independent Spd-2 pathway), (2) recruit the same amount of g-tubulin as the cnn loss-of-function mutant (if the Cnn matrix does not inhibit the Cnn-independent Spd-2 pathway but Cnn-dependent Spd-2 does not itself recruit g-tubulin), or (3) recruit more g-tubulin than the cnn loss-of-function mutant (if both the Cnn-dependent and Cnn-independent Spd-2 can recruit g-tubulin).

      These are very interesting points that we have not considered before. As the reviewer suggests, we will perform an analysis of γ-tubulin levels at centrosomes in cnnnull vs cnn∆CM1 to test the ability of Cnn-dependent and Cnn-independent populations of Spd-2 to recruit γ-tubulin. This should take ~2 months.

      3. The paper needs a summary model figure that the field can understand. The current model in Fig. 2E does not suffice in this regard. It would be nice to have this model appear at the end of the paper to outline the 3 pathways for centrosomal microtubule nucleation outlined by the work. Maybe have an arc for the centrosome at the bottom of the figure and show arrows from the gTuSC to the Cnn CM1 domain, from the gTuRC to the Cnn CM1 domain, and from the gTuRC to Spd-2, or something like this. How you draw this could be impacted by the experiment outlined above in point 2. Also, there would be a g-tubulin-independent pathway in the figure. Not everyone reads papers carefully, and you want people to be able to get the takeaway message at a glance.

      We have now completely modified the Figure and moved it to the end of the paper (new Figure 5). We thank the Reviewer for this suggestion as it really does provide a clearer message for the reader.

      4. The authors show that this pathway is modulated by loss of Minispindles (Msps)-but as this is a critical microtubule assembly factor, it seems likely that Msps loss might modulate all of the pathways. From the data in Figure 4, my main takeaway would be that Msps is not the central player in the g-tubulin independent nucleation pathway. It might make the paper more impactful to end the story after Fig. 4, move the current Fig. 5 to the supplement and add a nice model figure at the end.

      We agree that Msps may play a role beyond microtubule nucleation, including plus end growth, and that this may also influence the efficiency of spindle formation in cnn,grip71,grip163,msps mutants. Nevertheless, our microtubule regrowth data in original Figure 5A clearly show that Msps is a key player in the g-tubulin independent nucleation pathway at centrosomes. Perhaps the Reviewer missed this point as the data was in Figure 5 and not Figure 4. Moreover, the original Figure 5E shows that the effect of depleting Msps in addition to cnn, grip71 and grip163 is specific to cells containing centrosomes i.e. if Msps played a significant role in microtubule regulation beyond its role at centrosomes, then one would expect spindle formation to be worse when comparing mutant cells that lack centrosomes. Nevertheless, we now realise it would be better to include the microtubule regrowth from centrosomes data for cnn,grip71,grip163 vs cnn,grip71,grip163,msps in Figure 4, and move the spindle assembly data from original Figure 5C-E to a new supplementary Figure (Figure S4). We then end the paper on a model figure in new Figure 5.

      Minor comments: 5. In Fig. 1E the sequence labels are confusing. Please label each sequence on the left with the residue numbers in the corresponding endogenous protein that are shown in the alignment.

      You are absolutely right, I’m not sure why our labelling was like that. Now corrected.

      In Fig. 1F, please label with location of molecular weight markers

      Now added.

      Reviewer #3 (Significance (Required)):

      Repeating my text from above. Centrosomes are complex and it has been appreciated for some time that they likely nucleate microtubules by more than one mechanism. However, what these mechanisms exactly are, and which are the most significant, has not been clear. A major contributor to centrosomal microtubule nucleation is the tubulin isoform gamma-tubulin (g-tubulin), which is present in two complexes, a smaller gTuSC that contains gamma-tubulin along with GCP2 and 3 and a larger g-tubulin ring complex (gTuRC) whose assembly additionally requires GCP4/5/6. A second high-level question has been whether the centrosome has any g-tubulin-independent microtubule nucleation mechanisms. In this manuscript, the authors use a collection of mutants and RNAi conditions in the Drosophila brain to generate a picture of centrosomal microtubule nucleation pathways. They show that there are two g-tubulin-dependent pathways and a third, g-tubulin-independent, microtubule nucleation pathway. They show that the first g-tubulin-dependent pathway depends on the CM1 domain of the centrosomal PCM matrix protein, Centrosomin (Cnn) and on the gTuSC components GCP2/3, but not on the components specifically required for gTuRC assembly. The second g-tubulin-dependent pathway depends on Spd-2 (and not Cnn) and requires the gTuRC-specific components and NEDD1/Grip71. By inhibiting both of these pathways, the authors also show that there is a robust g-tubulin-independent microtubule nucleation pathway. Overall, this work has the potential to be an impactful contribution from a conceptual point of view. I would be excited to recommend publication if the major comments, particularly points 1 and 2, could be addressed.

    1. The prior for the slope is a lot easier now. We can simply specify a normal distribution with a mean of 0 and a standard deviation equal to the size of the effect we deem likely, together with a lower bound of 0 and upper bound of 1.
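      As a rough illustration of what such a bounded prior implies, the sketch below draws from a normal distribution with mean 0, truncated to the interval [0, 1], using plain rejection sampling. It is not the model code the quoted passage refers to: the function name `sample_truncated_normal` is made up for this example, and the standard deviation of 0.5 is an assumed placeholder for "the size of the effect we deem likely".

```python
import random


def sample_truncated_normal(mean, sd, lower, upper, n, seed=0):
    """Draw n values from Normal(mean, sd) restricted to [lower, upper].

    Uses simple rejection sampling: draw from the untruncated normal and
    keep only the draws that fall inside the bounds.
    """
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        x = rng.gauss(mean, sd)
        if lower <= x <= upper:
            draws.append(x)
    return draws


# Assumed placeholder: sd = 0.5 stands in for the effect size deemed likely.
prior_draws = sample_truncated_normal(mean=0.0, sd=0.5, lower=0.0, upper=1.0, n=10_000)
prior_mean = sum(prior_draws) / len(prior_draws)

if __name__ == "__main__":
    share_large = sum(d > 0.5 for d in prior_draws) / len(prior_draws)
    print(f"prior mean of the slope: {prior_mean:.3f}")
    print(f"share of prior draws above 0.5: {share_large:.3f}")
```

      Because the mean sits on the lower bound, the resulting prior is a half-normal cut off at 1: most of its mass lies near zero and larger slopes become progressively less plausible.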

      Update: I was wrong on the below, the SD is not 1 here, because it's the SD for the residual term in the linear model, not the SD for the raw outcome variable.

      Previous comment:...

      I’m ‘worried’ that if you give it data you know has sigma=1, but you allow it to choose any combination of beta and sigma, you may be getting it to give a weird posterior to both of the parameters, in a way you know can’t make sense, in order to find the most likely parameters for the weird geocentric model you imposed.

      on the other hand I would have thought that it would tend to converge to a sigma=1 anyways as the most likely, as that is ‘allowed’ by your model

      My take is that the cauchy prior you impose in that part is heliocentric; well, let me expand on this. I think you know that the true std deviation of the ‘standardized heights from this population’ is 1. What you don’t know is whether it is indeed normal (i.e., whether family = gaussian is right here). Thus it might be finding ‘a sigma far from 1 is likely’ under this model, because that makes your ‘skewed’ or ‘fat tailed’ data seem more likely under the normal prior. A better approach might be to allow a different distribution with some sort of ‘skew’ parameter, but imposing that the sd must be 1.

    1. Author Response

      Reviewer #2 (Public Review):

      "The cellular architecture of memory modules in Drosophila supports stochastic input integration" is a classical biophysical compartmental modelling study. It takes advantage of some simple current injection protocols in a massively complex mushroom body neuron called MBON-a3 and compartmental models that simulate the electrophysiological behaviour given a detailed description of the anatomical extent of its neurites.

      This work is interesting in a number of ways:

      • The input structure information comes from EM data (Kenyon cells) although this is not discussed much in the paper

      • The paper predicts a potentially novel normalization of the throughput of KC inputs at the level of the proximal dendrite and soma

      • It claims a new computational principle in dendrites, this didn’t become very clear to me

      Problems I see:

      • The current injections did not last long enough to reach steady state (e.g. Figure 1FG), and the model current injection traces have two time constants but the data only one (Figure 2DF). This does not make me very confident in the results and conclusions.

      These are two important but separate questions that we would like to address in turn.

      As for the first, in our new recordings using cytoplasmic GFP to identify MBON-alpha3, we performed both a 200 ms current injection and prolonged recordings of 400 ms to reach steady state (for all 4 new cells 1’-4’). For comparison with the original dataset we mainly present the raw traces for 200 ms recordings in Figure 1 Supplement 2. In addition, we now provide a direct comparison of these recordings (200 ms versus 400 ms) and did not observe significant differences in tau between these data (Figure 1 Supplement 2 K). This comparison illustrates that the 200 ms current injection reaches a maximum voltage deflection that is close to the steady state level of the prolonged protocol. Importantly, the critical parameter (tau) did not change between these datasets.

      Regarding the second question, the two different time constants, we thank the reviewer for pointing this out. Indeed, while the simulated voltage follows an approximately exponential decay which is, by design, essentially identical to the measured value (τ ≈ 16 ms, from Table 1; see Figure 1 Supplement 2 for details), the voltage decays and rises much faster immediately following the onset and offset of the current injections. We believe that this is due to the morphology of this neuron. Current injection, and voltage recordings, are at the soma, which is connected to the remainder of the neuron by a long and thin neurite. This ’remainder’ is, of course, in linear size, volume and surface (membrane) area much larger than the soma, see Fig 2A. As a result, a current injection will first quickly charge up the membrane of the soma, resulting in the initial fast voltage changes seen in Fig 2D,F, before the membrane in the remainder of the cell is charged, with the cell’s time constant τ.

      We confirmed this intuition by running various simplified simulations in Neuron which indeed show a much more rapid change at step changes in injected current than over the long-term. Indeed, we found that the pattern even appears in the simplest possible two-compartment version of the neuron’s equivalent circuit which we solved in an all-purpose numerical simulator of electrical circuitry (https://www.falstad.com/circuit). The circuit is shown in Figure 1. We chose rather generic values for the circuit components, with the constraints that the cell capacitance, chosen as 15pF, and membrane resistance, chosen as 1GΩ, are in the range of the observed data (as is, consequently, its time constant which is 15ms with these choices); see Table 1 of the manuscript. We chose the capacitance of the soma as 1.5pF, making the time constant of the soma (1.5ms) an order of magnitude shorter than that of the cell.

      Figure 1: Simplified circuit of a small soma (left parallel RC circuit) and the much larger remainder of a cell (right parallel RC circuit) connected by a neurite (right 100MΩ resistor). A current source (far left) injects constant current into the soma through the left 100MΩ resistor.

      Figure 2 shows the somatic voltage in this circuit (i.e., at the upper terminal of the 1.5pF capacitor) while a -10pA current is injected for about 4.5ms, after which the current is set back to zero. The combination of initial rapid change, followed by a gradual change with a time constant of ≈ 15ms is visible at both onset and offset of the current injection. Figure 3 shows the voltage traces plotted for a duration of approximately one time constant, and Fig 4 shows the detailed shape right after current onset.
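      The qualitative behaviour described here can be reproduced with a few lines of forward-Euler integration. The sketch below is not the authors' NEURON or circuit-simulator setup; it is a minimal two-node RC model under stated assumptions. The capacitances (1.5 pF and 15 pF), the 1 GΩ membrane resistance, the 100 MΩ neurite resistor and the -10 pA step are taken from the text. The 1 GΩ leak resistance of the soma is an assumption inferred from the quoted 1.5 ms somatic time constant, and all function and variable names are made up for this example. With two 1 GΩ leaks in parallel the slow time constant of the coupled circuit comes out shorter than 15 ms, so the sketch illustrates the two-phase shape rather than the exact numbers.

```python
import math


def simulate_two_compartment(i_inj=-10e-12, t_end=60e-3, dt=1e-6,
                             c_soma=1.5e-12, r_soma=1e9,
                             c_rest=15e-12, r_rest=1e9, r_neurite=100e6):
    """Forward-Euler integration of two parallel RC nodes joined by a resistor.

    Current is injected into the soma node. Returns (times, soma voltages)
    in seconds and volts. r_soma is an assumed value (see lead-in).
    """
    v_s = 0.0  # somatic voltage
    v_r = 0.0  # voltage of the remainder of the cell
    times, volts = [0.0], [0.0]
    n_steps = int(round(t_end / dt))
    for k in range(1, n_steps + 1):
        i_axial = (v_s - v_r) / r_neurite            # current through the neurite
        dv_s = (i_inj - v_s / r_soma - i_axial) / c_soma
        dv_r = (i_axial - v_r / r_rest) / c_rest
        v_s += dt * dv_s
        v_r += dt * dv_r
        times.append(k * dt)
        volts.append(v_s)
    return times, volts


def steady_state_soma_voltage(i_inj=-10e-12, r_soma=1e9, r_rest=1e9, r_neurite=100e6):
    """Steady state: the soma leak in parallel with (neurite + remainder leak)."""
    r_in = r_soma * (r_neurite + r_rest) / (r_soma + r_neurite + r_rest)
    return i_inj * r_in


times, v_soma = simulate_two_compartment()
v_ss = steady_state_soma_voltage()

if __name__ == "__main__":
    for t_ms in (0.1, 0.5, 5.0, 60.0):
        idx = int(round(t_ms * 1e-3 / 1e-6))
        print(f"t = {t_ms:5.1f} ms  V_soma = {v_soma[idx] * 1e3:.3f} mV")
    print(f"steady state      V_soma = {v_ss * 1e3:.3f} mV")
```

      The first fraction of a millisecond is dominated by charging of the small somatic capacitance through the neurite, after which the much larger capacitance of the rest of the cell sets the pace, which is the fast-then-slow pattern described above.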

      Figure 2: Somatic voltage in the circuit in Fig. 1 with current injection for about 4.5ms, followed by zero current injection for another ≈ 3.5ms.

      Figure 3: Somatic voltage in the circuit, as in Fig. 2 but with current injected for approx. 15ms.

      While we did not try to quantitatively assess the deviation from a single-exponential shape of the voltage in Fig. 2E, a more rapid change at the onset and offset of the current injection is clearly visible in this Figure. This deviation from a single exponential is smaller than what we see in the simulation (both in Fig 2D of the manuscript, and in the results of the simplified circuit here in the rebuttal). We believe that the effect is smaller in Fig. 2E because it shows the average over many traces. It is much more visible in the ’raw’ (not averaged) traces. Two randomly selected traces from the first of the recorded neurons are shown in Figure 2 Supplement 2 C. While the non-averaged traces are plagued by artifacts and noise, the rapid voltage changes are visible essentially at all onsets and offsets of the current injection.

      Figure 4: Somatic voltage in the circuit, as in Fig. 2 but showing only the time right after current onset, about 2.3ms.

      We have added a short discussion of this at the end of Section 2.3 to briefly point out this observation and its explanation. We there also refer to the simplified circuit simulation and comparison with raw voltage traces which is now shown in the new Figure 2 Supplement 2.

      • The time constant in Table 1 is much shorter than in Figure 1FG?

      No, these values are in agreement. To facilitate the comparison we now include a graphical measurement of tau from our traces in Figure 1 Supplement 2 J.

      • Related to this, the capacitance values are very low; maybe this can be explained by the model’s wrong assumption of tau?

      Indeed, the measured time constants are somewhat lower than what might be expected. We believe that this is because after a step change of the injected current, an initial rapid voltage change occurs in the soma, where the recordings are taken. The measured time constant is a combination of the ’actual’ time constant of the cell and the ’somatic’ (very short) time constant of the soma. Please see our explanations above.

      Importantly, the value for tau from Table 1 is not used explicitly in the model as the parameters used in our simulation are determined by optimal fits of the simulated voltage curves to experimentally obtained data.

      • That latter in turn could be because of either space clamp issues in this hugely complex cell or bad model predictions due to incomplete reconstructions, bad match between morphology and electrophysiology (both are from different datasets?), or unknown ion channels that produce non-linear behaviour during the current injections.

      Please see our detailed discussion above. Furthermore, we now provide additional recordings using cytoplasmic GFP as a marker for the identification of MBON-alpha3 and confirm our findings. We agree that space-clamp issues could interfere with our recordings in such a complex cell. However, our approach using electrophysiological data should still be superior to any other approach (e.g. picking textbook values). As we injected negative currents for our analysis, voltage-gated ion channels, at least, should not influence our recordings.

      • The PRAXIS method in NEURON seems too ad hoc. Passive properties of a neuron should probably rather be explored in parameter scans.

      We are a bit at a loss as to what is meant by the PRAXIS method being "too ad hoc." The PRAXIS method is essentially a conjugate gradient optimization algorithm (since no explicit derivatives are available, it makes the assumption that the objective function is quadratic). This seems to us a systematic way of doing a parameter scan, and the procedure has been used in other related models, e.g. the cited Gouwens & Wilson (2009) study.
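      To make the idea of a derivative-free fit concrete, the sketch below fits the amplitude and time constant of a single-exponential step response to a synthetic trace. It is deliberately not PRAXIS and not NEURON's fitting machinery: it swaps in a much simpler relative, cyclic coordinate search with a golden-section line minimisation along each parameter axis, which likewise needs only objective-function values. The synthetic trace (τ = 16 ms, steady state of -5 mV), the search bounds and all names are assumptions chosen for illustration.

```python
import math


def golden_section_min(f, lo, hi, tol=1e-8):
    """Locate the minimum of a unimodal 1-D function on [lo, hi] without derivatives."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:              # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                    # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


def fit_step_response(times_ms, volts_mv, sweeps=40):
    """Fit V(t) = v_inf * (1 - exp(-t / tau)) by alternating 1-D minimisations."""
    def sse(v_inf, tau):
        return sum((v_inf * (1.0 - math.exp(-t / tau)) - v) ** 2
                   for t, v in zip(times_ms, volts_mv))

    v_inf, tau = -1.0, 5.0       # deliberately poor starting guess
    for _ in range(sweeps):
        v_inf = golden_section_min(lambda x: sse(x, tau), -20.0, 0.0)
        tau = golden_section_min(lambda x: sse(v_inf, x), 1.0, 100.0)
    return v_inf, tau


# Synthetic, noise-free trace with assumed "true" parameters (illustration only).
times_ms = [float(t) for t in range(0, 201)]
volts_mv = [-5.0 * (1.0 - math.exp(-t / 16.0)) for t in times_ms]
v_inf_fit, tau_fit = fit_step_response(times_ms, volts_mv)

if __name__ == "__main__":
    print(f"fitted steady state: {v_inf_fit:.4f} mV, fitted tau: {tau_fit:.4f} ms")
```

      Searching along fixed coordinate axes converges slowly when parameters are strongly correlated; conjugate-direction methods such as the one discussed here address this by adapting the search directions as the fit proceeds.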

      Questions I have:

      • Computational aspects were previously addressed by e.g. Larry Abbott and Gilles Laurent (sparse coding); how do the findings here distinguish themselves from this work?

      In contrast to the work by Abbott and Laurent, which addressed the principal relevance and suitability of sparse and random coding for the encoding of sensory information in decision making, here we address the cellular and computational role that an individual node (KC>MBON) plays within the circuitry. As we use functionally and morphologically relevant data, this study builds upon the prior work but significantly extends the general models to a specific case. We think this is essential for the further exploration of the topic.

      • What is valence information?

      Valence information indicates whether a stimulus is good (positive valence, e.g. sugar in appetitive memory paradigms) or bad (negative valence, e.g. the electric shock in aversive olfactory conditioning). Valence information is provided by the dopaminergic system. Dopaminergic neurons are in direct contact with the KC>MBON circuitry and modify its synaptic connectivity when olfactory information is paired with a positive or negative stimulus.

      • It seems that Martin Nawrot’s work would be relevant to this work

      We are aware of the work by the Nawrot group that provided important insights into the processing of information within the olfactory mushroom body circuitry. We now highlight some of his work. His recent work will certainly be relevant for our future studies when we try to extend our work from an individual cell to networks.

      • Compactification and democratization could be related to other work like Otopalik et al 2017 eLife but also passive normalization. The equal efficiency in line 427 reminds me of dendritic/synaptic democracy and dendritic constancy

      Many thanks for pointing this out. This is in line with the comments from reviewer 1 and we now highlight these papers in the relevant paragraph in the discussion (line 442ff).

      • The morphology does not obviously seem compact, how unusual would it be that such a complex dendrite is so compact?

      We should have been more careful in our terminology, making clear that when we write ’compact’ we always mean ’electrotonically compact’, in the sense that the physical dimensions of the neuron are small compared to its characteristic electrotonic length (usually called λ). The degree to which a dendritic structure is electrotonically compact is determined by the interaction of morphology, size and conductances (across the membrane and along the neurites). We don’t believe that one of these factors alone (e.g. morphology) is sufficient to characterize the electrical properties of a dendritic tree. We have now clarified this in the relevant section.
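      For readers less familiar with the term, the textbook cable-theory definition behind it can be written as follows. This is the standard result for a uniform cylindrical neurite, not a formula taken from the manuscript, and the symbols d, R_m, R_a and ℓ are introduced here only for illustration.

```latex
\lambda = \sqrt{\frac{d}{4}\,\frac{R_m}{R_a}},
\qquad
L = \frac{\ell}{\lambda} \ll 1 \quad \text{(electrotonically compact)}
```

      Here d is the neurite diameter, R_m the specific membrane resistance (Ω·cm²), R_a the axial resistivity (Ω·cm) and ℓ the physical length. For a branched tree this holds segment by segment, which is one way to see why morphology, size and conductances together, rather than any one of them alone, determine compactness.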

      • What were the advantages of using the EM circuit?

      The purpose of our study is to provide a "realistic" model of a KC>MBON node within the memory circuitry. We started our simulations with random synaptic locations but wondered whether such a stochastic model is correct, or whether taking into account the detailed locations and numbers of synaptic connections of individual KCs would make a difference to the computation. Therefore we repeated the simulations using the EM data. We now address the comparison between random and realistic synaptic connectivity in Figure 4F. We do not observe a significant difference, but this may become more relevant in future studies if we compute the interplay between MBONs activated by overlapping sets of KCs. We simply think that utilizing the EM data gets us one step closer to realistic models.

      • Isn’t Fig 4E rather trivial if the cell is compact?

      We believe this figure is a visually striking illustration that shows how electrotonically compact the cell is. Such a finding may be trivial in retrospect, once the data is visualized, but we believe it provides a very intuitive description of the cell behavior.

      Overall, I am worried that the passive modelling study of the MBON-a3 does not provide enough evidence to explain the electrophysiological behaviour of the cell and to make accurate predictions of the cell’s responses to a variety of stochastic KC inputs.

      In our view our model adequately describes the behavior of the MBON with the most minimal (passive) model. Our approach tries to make the fewest assumptions about the electrophysiological properties of the cell. We think that, based on current knowledge, our approach is the best possible one, as thus far no active components within the dendritic or axonal compartments of Drosophila MBONs have been described. As such, our model reflects the current state of knowledge and explains the behavior of the cell very well. We aim to refine this model in the future if experimental evidence requires such adaptations.

      Reviewer #3 (Public Review):

      This manuscript presents an analysis of the cellular integration properties of a specific mushroom body output neuron, MBON-α3, using a combination of patch clamp recordings and data from electron microscopy. The study demonstrates that the neuron is electrotonically compact permitting linear integration of synaptic input from Kenyon cells that represent odor identity.

      Strengths of the manuscript:

      1) The study integrates morphological data about MBON-α3 along with parameters derived from electrophysiological measurements to build a detailed model.

      2) The modeling provides support for existing models of how olfactory memory is related to integration at the MBON.

      Weaknesses of the manuscript:

      The study does not provide experimental validation of the results of the computational model.

      The goal of our study is to use computational approaches to provide insights into the computation of the MBON as part of the olfactory memory circuitry. Our data is in agreement with the current model of the circuitry. Our study therefore forms the basis for future experimental studies; those would however go beyond the scope of the current work.

      The conclusion of the modeling analysis is that the neuron integrates synaptic inputs almost completely linearly. All the subsequent analyses are straightforward consequences of this result.

      We do, indeed, find that synaptic integration in this neuron is almost completely linear. We demonstrate that this result holds in a variety of different ways. All analyses in the study serve this purpose. These results are in line with the findings by Hige and Turner (2013), who demonstrated that synaptic integration at PN>KC synapses is also highly linear. As such, our data point to conservation of this feature at the next node of this circuit.

      The manuscript does not provide much explanation or intuition as to why this linear conclusion holds.

      We respectfully disagree. We demonstrate that this linear integration results from the combination of the size of the cell and its biophysical parameters, mainly the conductances across and along the neurites. As to why it holds, our main argument is that results based on the linear model agree with all known (to us) empirical results, and this is the simplest model.

      In general, there is a clear takeaway here, which is that the dendritic tree of MBON-α3 in the lobes is highly electrotonically compact. The authors did not provide much explanation as to why this is, and the paper would benefit from a clearer conclusion. Furthermore, I found the results of Figures 4 and 5 rather straightforward given this previous observation. I am sceptical about whether the tiny variations in, e.g. Figs. 3I and 5F-H, are meaningful biologically.

      Please see the comment above as to why we believe the neuron is electrotonically compact: a model with this assumption agrees well with empirically found results.

      We agree that the small variations in Fig 5F-H are likely not biologically meaningful. We state this now more clearly in the figure legends and in the text. This result is important to show, however. It is precisely because these variations are small compared to the voltage differences between different numbers of activated KCs (Fig 5D) or different levels of activated synapses (Fig 5E) that we can conclude that a 25% change in either synaptic strength or number can represent clearly distinguishable internal states, and that both changes have the same effect. It is important to show these data, to allow the reader to compare the differences that DO matter (Fig 5D,E) and those that DON’T (Fig 5F-H).

      The same applies to Fig 3I. The reviewer is entirely correct: the differences in the somatic voltage shown in Figure 3I are minuscule, less than a microvolt, and it is very unlikely that these differences have any biological meaning. The point of this figure is exactly to show this! It is to demonstrate quantitatively the transformation of the large differences between voltages in the dendritic tree into the nearly completely uniform voltage at the soma. We feel that this shows very clearly the extreme "democratization" of the synaptic input!

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2022-01573

      Corresponding author(s): Helder Maiato and Niels Galjart

      1. General Statements

      The murine microtubule (MT) plus-end tracking protein CLASP1 has been extensively examined in cultured cells, revealing an important function for this protein in mitosis and the regulation of MT dynamics. Here we describe a major in vivo phenotype of Clasp1 knockout (KO) mice: we find that these mice die at birth due to respiratory problems. In the first version of our manuscript we tried to link this in vivo phenotype of the KO mice to CLASP1’s major roles in cultured cells, including mitosis, and we therefore included multiple results, obtained in cultured cells and in different organs.

      We thank the reviewers for their thoughtful and constructive criticisms and for their judgment that our study is - in principle - worthy of publication. Based on suggestions by reviewers #2 and #3 we have decided to focus the revised manuscript on the lung phenotype of the Clasp1 KO mice, and on a possible cause for this phenotype. We believe that our new analysis, which was partly driven by the remarks of the reviewers, is revealing a mechanism for why the mice die at birth. This mechanism suggests a role for CLASP1 in controlling epithelial and endothelial cell differentiation in the neonatal lung, and in particular protein secretion in AT2 alveolar cells.

      2. Description of the planned revisions

      General remarks

      We believe our new RNA-Seq analysis (explained in detail below, in point 3 “Description of incorporated revisions”) strongly suggests that four essential lung cell types (i.e. AT1 and AT2 cells, endothelial cells and immune cells) fail to properly differentiate in Clasp1 KO embryos. In particular AT2 cell differentiation and functioning are hampered in the KO mice.

      Brief summary of planned experiments and table of old and new Figures

      To support our new findings we will stain sections of wild type and KO lung with a selected set of antibodies and other reagents. To help the reviewers we have made a table with original Figures and Figures for the revision.

      | Figure Number | Original Figure | Fate of original | Revision Figure |
      |---|---|---|---|
      | 1 | Targeted inactivation of the Clasp1 allele | Remains | Targeted inactivation of the Clasp1 allele |
      | 2 | Clasp1 KO mice show reduced rib-cage and delayed ossification | Minor revision | Clasp1 KO mice show reduced rib-cage and delayed ossification (Statistics will be added) |
      | 3 | Innervation of the diaphragm is affected in Clasp1 KO mice from E14.5-E18.5 | Moved to Supp | Newborn Clasp1 KO lungs show a drastic reduction in air inflation |
      | 4 | Neurite outgrowth, branching capacity and microtubule dynamics are altered in Clasp1 KO neurons | Removed | Histological and immunological examination of the Clasp1 KO lungs demonstrating decreased air space |
      | 5 | Histological and immunological examination of the Clasp1 KO lungs demonstrating decreased air space | Moved Up (4) | Histo-morphological analysis of the developing lung throughout embryonic development (E14.5-PN1) |
      | 6 | Transcriptome analysis of wild type and Clasp1 KO lungs | Major revision | Transcriptome analysis of wild type and Clasp1 knockout lungs reveals differentiation defects in four major lung cell types (New data added, old data moved to Supp) |
      | 7 | Loss of Clasp1 alters the ratio of alveolar type I and type II cells in the lungs | Major revision | Cellular analysis of Clasp1 knockout lungs (New data will be added) |
      | 8 | - | - | Role of Clasp1 in AT2 function (New data will be added) |
      | S1 | Incidental cell division defects in mouse embryonic fibroblasts derived from Clasp1 knockout mice | Removed | Innervation of the diaphragm is affected in Clasp1 knockout mice from E14.5-E18.5 |
      | S2 | Ultra-structural analysis of diaphragms | Remains | Ultra-structural analysis of diaphragms |
      | S3 | Newborn Clasp1 knockout lungs show a drastic reduction in air inflation | Moved to Main (3) | Cellular analysis of late stage gestation mouse lungs |
      | S4 | Histo-morphological analysis of the developing lung throughout embryonic development (E14.5-PN1) | Moved to Main (5) | Exogenous administration of glucocorticoids promotes lung maturation and partially rescues postnatal lethality |
      | S5 | Cellular analysis of late stage gestation mouse lungs | Moved Up (S3) | Analysis of signature genes and cell type signatures of the mouse and human lung |
      | S6 | Exogenous administration of glucocorticoids promotes lung maturation and partially rescues postnatal lethality | Moved Up (S4) | Transcriptome analysis of wild type, Clasp1, and Mll3 knockout E18.5 lungs |

      Below we react to specific comments of the reviewers, describing in more detail which experiments will be carried out and why we will do these experiments.

      Specific remarks to the comments of the reviewers

      Reviewer #1.

      Comment:

      p.17: Aqp5 expression was decreased in mutant lungs as shown by RNA-seq data and RT-qPCR. However, immunolabelling with T1a does not show a decrease in the number of Type I pneumocytes (Fig. 7D). According to the data presented, it is difficult to conclude that CLASP1 is involved in Type I pneumocyte differentiation.

      A cell count should be done for Figure 7D. Immunolabeling with more markers for Type I pneumocytes, including AQP5 Ab, should be performed to determine if the decreased Aqp5 RNA expression correlates with less Type I cells. GSEA signature has to be confirmed by additional analyses.

      Answer:

      Given the flat appearance of the T1a-positive cells (see old Figure 7E) it is difficult to carry out a quantification for T1a (which is Pdpn). We will perform new IF experiments to examine AT1/2 cell numbers using additional markers (e.g. Hopx for AT1).

      Comment:

      p.17: The same comments can be made for Type II pneumocytes and SpC expression.

      Answer:

      We actually did do an Sftpc (Pro-SPC) count (see old Figure 7E), which reveals that the number of Sftpc-expressing cells is up in the Clasp1 KO. At first sight this seems surprising, given that Chil1 (a top AT2 signature gene at E18.5) is virtually absent from Clasp1 KO lungs. However, our new GSEA analysis (new Figure 6) shows that of all the E18.5 AT2 signature genes (403 genes in total) the majority are down-regulated, including Chil1 and 4 other top signature genes, whereas some genes are up, including Sftpc. Combined with the fact that we observe more Pro-SPC-expressing cells in the Clasp1 KO lung, we hypothesise that AT2 cell numbers are up compared to wild type, giving rise to higher mRNA counts of some genes in the RNA-Seq, and that differentiation of AT2 cells is significantly hampered, giving rise to lower expression of many AT2 signature genes in the RNA-Seq. By contrast, all AT1 signature genes are either down or not affected (see new Figure 6). We interpret this as evidence that AT1 cell numbers are down. The same goes for endothelial cells (EC, see new Figure 6). We will perform additional IF experiments to examine this hypothesis.

      Reviewer #3.

      Comment:

      T1α-positive cells should be quantified (Figure 7D). From the images, the number of T1α+ cells in Clasp1 KO is not consistent with the qPCR result showing markedly reduced Aqp5 transcript levels in Clasp1 KO. It is unclear whether the reduction in Aqp5 is due to impaired water channel function, as the authors suggest, or instead due to a reduced number of AT1 cells; further investigation should be conducted.

      Answer:

      Please see our answer to reviewer #1 above. To summarise, we now have evidence that AT1 cell numbers are down. We will perform additional IF experiments to examine this hypothesis.

      Comment:

      Additional AT1 markers (Hopx, Ager, Clic5 and Rage) should be assessed by qPCR and immunostaining to determine the effect of Clasp1 knockout on AT1 cells.

      Answer:

      Please see our answer to reviewer #1 above. To summarise, we will perform new IF experiments to examine AT1/2 cell numbers using additional markers (e.g. Hopx for AT1).

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      General remarks

      As explained in detail below, we believe that our new RNA-Seq analysis has uncovered a mechanism underlying the severe lung phenotype of Clasp1 KO mice, and that it has revealed the major cell types affected in embryonic Clasp1 KO lungs.

      Brief summary of experiments

      In the first version of the manuscript we used Gene Set Enrichment Analysis (GSEA, see https://www.gsea-msigdb.org/gsea/index.jsp) to compare our RNA-seq results to publicly available scRNA-Seq datasets of cell type signature gene sets, which contain cluster marker genes for cell types identified in single-cell sequencing studies of human tissue. As stated in our manuscript, this revealed “enrichment of alveolar epithelial type I cells and lung capillary intermediate cells in WT lungs ….”. However, the analysis was restricted to what is available in the Gene Set Enrichment Analysis database of the University of California San Diego. Thus, we could only compare our embryonic mouse lung data to adult human lung scRNA-Seq data.

      We recently discovered publicly available scRNA-Seq datasets of the mouse lung (see https://research.cchmc.org/pbge/lunggens/mainportal.html and https://lungcells.app.vumc.org). The data in these portals are not part of the common GSEA sets of the University of California San Diego. In particular, the LGEA web portal is very easy to use and data can be downloaded for individual applications. In the new version of our manuscript we compared our RNA-Seq data to scRNA-Seq data of the embryonic mouse lung, focussing on E18.5. We first overlaid differentially expressed genes in Clasp1 KO lungs with LGEA E18.5 scRNA-seq gene signatures for different cell types, and we subsequently compared all the genes in our dataset with the gene signature lists, using custom-built gene signature sets and the GSEA software. In addition, we interrogated LGEA to find out which signature genes are specifically turned on from E16.5-E18.5 in the different cell types in the developing mouse lung. We found, for example, that Chil1, which is the most severely down-regulated gene in our Clasp1 KO RNA-Seq, is a very prominent AT2 signature gene; Chil1 is hardly expressed at E16.5 and prominently comes up at E18.5.
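
The overlay step described above can be sketched as a simple set intersection. This is a minimal illustration only: the function name `overlay`, the variable names, and the short gene lists are placeholders built from genes named in this response, and they are not the LGEA signatures or the contents of Table S2.

```python
# Toy overlay of deregulated genes with cell-type signature gene sets.
# All gene lists here are illustrative placeholders, not real LGEA data.

def overlay(degs, signatures):
    """Return, per cell type, the sorted DEGs that are also signature genes."""
    deg_set = set(degs)
    return {
        cell_type: sorted(deg_set & set(genes))
        for cell_type, genes in signatures.items()
    }

# Hypothetical inputs using gene names mentioned in this response.
deregulated_in_ko = ["Chil1", "Aqp1", "Pdpn"]
signatures = {
    "AT1": ["Aqp1", "Aqp5", "Hopx", "Pdpn"],
    "AT2": ["Chil1", "Sftpc"],
}

hits = overlay(deregulated_in_ko, signatures)
for cell_type, genes in hits.items():
    print(cell_type, len(genes), genes)
```

Counting hits per cell type in this way only indicates which signatures are represented among the deregulated genes; the ranked comparison of all genes still requires the GSEA software mentioned above.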

      Our combined analysis strongly suggests that four cell types (AT1, AT2, endothelial cells (EC), and immune cells (IC)) are affected in their differentiation in the Clasp1 KO lung, and that this defect occurs in the later stages of lung development (from E16.5 onward). As the top five differentially down-regulated genes in KO lungs (including Chil1) are all top signature genes of AT2 cells, these data strongly suggest that it is this cell type that is most affected in the KO. A Metascape analysis (which includes a GO enrichment analysis, see also our specific answer to comments of reviewer #3 below) is consistent with the scRNA-Seq comparison and suggests, among others, that the secretory pathway might be hampered in the Clasp1 KO. This analysis furthermore indicates that cholesterol metabolism might be affected in the Clasp1 KO, which bears relevance to our dexamethasone rescue experiments.

      Specific remarks to the comments of the reviewers

      Reviewer #1.

      Comment:

      p.6: What is the justification to mention Nfib, Pdpn and Ndst1 mutant mice in the introduction? Do these genes have any cellular/molecular/functional relation with CLASP1?

      Answer:

      We initially wanted to provide examples of genes important for lung maturation, whose absence in knockout mice leads to lung collapse. Of the examples provided, Pdpn (which is equal to the marker T1α) bears a relation with our data in that it is down-regulated in Clasp1 KO lungs (see Table S2, RNA-Seq); furthermore, we examined T1α localisation in IF stainings (see old Figure 7E). In the new version of the manuscript we modified this Introduction section, to better align with our recent results, and to introduce the papers mentioned by reviewer #3 (Nelson et al., 2017; doi:10.1242/dev.154823, Li, J. et al., Dev Cell, 44, 297-312 e5.), who points out that pressure plays an important role in lung development. In the Li et al manuscript Pdpn is mentioned as being expressed at E16.5 in so-called Id2+ cells, together with Sftpc. These cells are proposed to be the precursors of the AT1/2 epithelial cells that arise later.

      Comment:

      p.8: It is mentioned that CLASP1 is expressed in secretory cells of the lung. Which ones? Is CLASP1 expressed in nerves, muscle cells and/or fibroblasts of the diaphragm? This information is important given the phenotypes described.

      Co-immunolabelling experiments should be done.

      Answer:

      We apologize for our incorrect phrasing. With respect to the lung, we now state that “CLASP1 is expressed in the endothelium of blood vessels, as well as in all cells lining the airways of mouse lungs at E18.5 (Fig. 1A)”.

      Comment:

      p.11: To identify the cause of the respiratory failure, the authors looked at the innervation pattern of the phrenic nerve in the diaphragm. Mutants present decreased branching but larger nerve extensions covering a wider innervated area and fewer neuromuscular junctions. Despite the decreased innervation of the diaphragm, its morphology is normal, as is the ultra-structure of the sarcomeres, suggesting a mild phenotype rather than the cause of death of the mutants as suggested by the authors (p.20).

      Diaphragmatic muscle activity should be measured to establish if the contractile activity of the diaphragm is affected. This might support the statement of the authors.

      Answer:

      We thank the reviewer for these observations. We agree with the reviewer and have toned down our conclusions in this section. We now simply describe the innervation pattern because we believe it is interesting, and we tentatively conclude that it may contribute to the severe respiratory phenotype which is primarily due to impaired AT1/2, EC, and IC differentiation.

      Comment:

      p.13: The authors examined lung from mutants. Mutant lungs do not float and they are collapsed at birth. However, lung morphology appears normal and myofibroblasts, ciliated cells and Club cells are present as shown by IHC labeling. No difference in proliferation and apoptosis was reported.

      It would have been more informative to do BrdU/EdU immunolabeling for proliferation in order to see if differences occur in specific cell types of the lung. It is not clear why the authors have limited their IHC analysis to these three specific cell types. A complete analysis should be done.

      Answer:

      As described above (general remarks), we compared our RNA-Seq data to publicly available scRNA-Seq data from the developing mouse lung (see new Figure 6). These comparisons reveal which cell types are affected in the Clasp1 KO lung (AT1/2, EC, IC), and which process might be hampered.

      Comment:

      p.14: The authors proposed a delay in lung development according to lung morphology that appears more collapsed starting at E15.5.

      Measurement of branching would allow this delay to be quantified. Since cell differentiation occurs ~E16.5, analysis of the onset of cell types can also support a delay in lung development.

      Answer:

      As described above (general remarks), we compared our RNA-Seq data to publicly available scRNA-Seq data from the developing mouse lung (see new Figure 6). This not only revealed which cell types are affected in the Clasp1 KO lung, but also suggests that a differentiation block occurs at E16.5 to E18.5. For example, Chil1, a top AT2 signature gene of E18.5, is hardly expressed at E16.5 and is strongly upregulated at E18.5. This gene fails to become up-regulated in the Clasp1 KO, indicating that epithelial precursor cells have problems differentiating to AT2 type cells. By contrast, Id2, a marker of precursor epithelial cells, is normally expressed in the Clasp1 KO, and two genes that are co-expressed with Id2 in these precursor cells (Pdpn and Sftpc) are slightly down and up, respectively, in the Clasp1 KO. Thus, while our lung morphology studies might suggest early defects, our RNA-Seq indicates that specific defects occur during the late terminal saccular stage, i.e. from E16.5 onward. We therefore agree with Negretti et al (2021, doi: 10.1242/dev.199512, Discussion section) who state: the developmental stages of the lung are largely founded on histologically descriptive features. While this is important, such a categorization often results in debate regarding the function and identity of cell types within the boundaries of each stage. By contrast transcriptome analysis suggests that different cell types commit to change asynchronously during development, suggesting that the timing of the saccular-to-alveolar transition is fluid and highly cell-type specific.

      As shown by Li et al (2018, doi.org/10.1016/j.devcel.2018.01.008) mechanical forces contribute to embryonic lung alveolar epithelial cell differentiation. Interestingly, RNA-Seq data from Nelson et al (2017; doi:10.1242/dev.154823) suggest that CLASP1 is a “pressure sensing gene” (see also below, our answer to comments of reviewer #3). Thus, Clasp1 KO lungs might fail to properly sense pressure, which could explain, at least in part, the observed failure in epithelial differentiation.

      Comment:

      p.15: Finally, the authors conclude this section by "these data support a direct role for CLASP1 in lung maturation".

      Which direct role? How? This sentence appears premature according to the data presented. The authors should look at microtubule dynamics in lung cells from mutant embryos to see if a link exists between the proposed role of the protein and the lung phenotype observed.

      Answer:

      The reviewer is correct: knockout studies cannot demonstrate a direct role of a protein in a perturbed process. We have therefore removed the word “direct” from this phrase.

      Comment:

      p.15: The authors attempted to rescue the defective lung maturation phenotype by treating pregnant females with dexamethasone at late gestational stages. Around 10% of mutants survive for more than 45 minutes to 2 hrs compared to 20-30 minutes for mutants obtained from untreated mothers (p.9). Even though it is an intriguing result, the very small number of "survivors" makes it very difficult to reach a conclusion.

      This section should be shortened.

      Answer:

      Our new Metascape analysis, which will be presented in the new Figure 8, suggests that cholesterol metabolism is affected in the Clasp1 KO mice. Cholesterol is an important component of mammalian cell membranes, of both alveolar and lamellar body surfactant, and it is a precursor of vitamin D and steroid hormones. A cholesterol defect would explain the partial rescue by dexamethasone in the Clasp1 KO, i.e. dexamethasone can rescue a steroid hormone defect but it cannot rescue other defects (e.g. surfactant production). Given these new results we decided to leave the section on glucocorticoids as it is and come back to it when we discuss the Metascape result in the revised manuscript.

      Comment:

      p.16: To determine which molecular mechanisms are responsible for the lung defect, the authors performed RNA-seq analysis on E18.5 lung specimens. The number of genes with significant differential expression was low and the highest scores were cathepsin E for the upregulated gene and chitinase-like 1 for the downregulated gene.

      Are these two genes known for their role in lung development? Please describe.

      Answer:

      The Ctse gene, which encodes Cathepsin E, is indeed the most upregulated gene in the Clasp1 KO. Although it is up-regulated in all three KO mice, Ctse expression is quite low (normalised counts: ~2 in KO, up from ~0.2 in WT). Based on the comment of this reviewer we examined Ctse expression in the scRNA-Seq lung repositories, but we could not find any description, presumably because its expression is too low (scRNA-Seq has difficulty catching low abundance genes), consistent with our data. Furthermore, there is not much literature on the role of Cathepsin E in the lung. We therefore decided to remove any mention of Ctse in the manuscript. By contrast, the expression and function of Chil1 are described in detail.

      Comment:

      p.16: Except for the fact that Chil1 is also downregulated in mutant lungs for the H3K4 methyltransferase Mll3 gene, it is not clear why the authors compared these 2 sets of data.

      Can CLASP1 and MLL3 interact? How? Did the authors look at the list of genes that are commonly differentially expressed? Does it provide some clues on the mechanisms? The RNA-seq data should be analyzed more deeply.

      Answer:

      The reviewer is correct: we compared the Mll3 (i.e. Kmt2c) RNA-Seq dataset because Chil1 is down-regulated in the Mll3 KO lung at E18.5, like in the Clasp1 KO. To examine a possible relation between Mll3 and Clasp1 in more detail, we overlaid the differentially expressed genes from the Mll3 dataset with the custom-built gene signature dataset of E18.5 lung (described above). The data suggest that Mll3 knockout affects AT1 differentiation (see new Supplementary Figure S6C). This mode of action is clearly different from that of CLASP1, and since Mll3 is nuclear and CLASP1 is cytoplasmic we do not believe these proteins interact. Given our new and exciting data on the Clasp1 KO lung phenotype, we moved the Mll3 data to the new Supplementary Figure 6, and we only briefly touch upon these data in the manuscript.

      Comment:

      p.16: There is also a Clasp2 gene with a more restricted expression pattern. Clasp2 mutant mice either die from hemorrhages or survive. It is not clear why the RNA-seq data of the lungs from Clasp2-/- mice are presented since no lung phenotype is mentioned for these mice. How is the lack of change in Chil1 expression in Clasp2 mutant lungs informative?

      This should be clarified or the data should be removed.

      Answer:

      The reviewer is correct, i.e. in light of our new findings (Chil1 is a top signature gene of E18.5 AT2 cells) it makes little sense to include the Clasp2 KO RNA-Seq data, as these were generated in adult mouse lungs. We therefore removed these data from the manuscript.

      Comment:

      p.31: The authors mentioned a role for CLASP1 in the mesenchyme.

      What are the experiments and data that support this sentence?

      Answer:

      We thank the reviewer for this remark, we have no evidence for a role of CLASP1 in the mesenchyme and have removed this phrase.

      Comment:

      How do the authors reconcile their observation of CLASP1 expression in lung secretory cells (p.8) with their conclusion of defective Type I cell differentiation (p.17)?

      Answer:

      We apologize for our incorrect phrasing. With respect to the lung, we now state that “CLASP1 is expressed in the endothelium of blood vessels, as well as in all cells lining the airways of mouse lungs at E18.5 (Fig. 1A)”.

      Reviewer #2.

      Comment:

      Fig. 3. There is not a lot of detail how the analysis in B-E was done, and no primary data for the synaptic defects.

      Answer:

      We have removed these data from the manuscript.

      Reviewer #3.

      Comment:

      1. The authors showed significant reduction in the rib cage size and abnormal diaphragm innervation in Clasp1 KO. Mechanical properties play a crucial role in regulating lung development and maturation. So changes in intrathoracic space and pressure are a major limiting factor that impairs lung development and maturation (Nelson et al., 2017; doi:10.1242/dev.154823, Li, J. et al., Dev Cell, 44, 297-312 e5.).

      Answer:

      We thank the reviewer for these interesting papers and observations.

      Nelson et al (2017; doi:10.1242/dev.154823) devised a method to culture lung-on-a-chip where they can induce pressure in culture. They apply this to examine lung development and they also do RNA-Seq. Interestingly, they find that Clasp1 is down-regulated at high pressure compared to low pressure (log2FC 0.5, Clasp1 goes down ~1.5 fold in high pressure). Thus Clasp1 appears to be a “pressure-responsive gene”. However, Nelson et al examine gene expression at much earlier time points than we do (E12-14 versus E18.5). In our view it therefore makes little sense to compare RNA-Seq data.
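
As a quick check on the arithmetic above, a log2 fold change can be converted back to a linear fold change. The helper name `fold_change` is ours, and taking the absolute value is an assumption made so that the result reads as a magnitude regardless of the sign convention used for the comparison.

```python
# Convert a log2 fold change into a linear fold-change magnitude.
def fold_change(log2fc):
    """Return 2 ** |log2fc|, i.e. how many fold the two conditions differ."""
    return 2.0 ** abs(log2fc)

fc = fold_change(0.5)
print(round(fc, 2))
```

A log2FC magnitude of 0.5 therefore corresponds to a change of roughly 1.4 fold, which is of the same order as the rounded figure quoted above.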

      Li et al (2018 doi.org/10.1016/j.devcel.2018.01.008) show that mechanical forces help to control embryonic lung alveolar epithelial cell differentiation. More specifically, mechanical force from amniotic fluid inhalation ensures AT1 cell differentiation, whereas FGF10-mediated ERK1/2 signaling induces a protrusive structure in some cells that protects from mechanical force-caused flattening to specify AT2 fate. They conclude that future AT2 cells can “embed” into mesenchyme by exerting an acto-myosin based force and hence they can keep their cuboidal shape. The differentiation of the two cell types occurs at different time points, E16.5 for AT2, and E17.5 for AT1. In this manuscript they also mention that Id2+ tip cells express pro-SPC and Pdpn (which are up and down, respectively, in Clasp1 KO). These Id2+ cells would be the AT1/2 progenitors.

      We believe that a smaller ribcage in the Clasp1 KO does not necessarily have to be a cause of increased pressure on the lung, if the lung is also smaller. Nonetheless, since CLASP1 is a “pressure-responsive gene”, Clasp1 KO lungs might experience aberrant pressure sensing (in addition to a possible pressure difference due to a smaller ribcage). This different sensing predicts altered differentiation pathways, which is exactly what we see. We have modified the revised version of the manuscript to reflect these thoughts and observations.

      Comment:

      Since CLASP1 was found to be highly expressed in the lung endothelium (Figure 1A), this suggests the importance of CLASP1 in the lung vasculature. GSEA analysis also showed significant downregulation of genes from the lung capillary intermediate 1 cell signature gene set in Clasp1 KO (Figure 7G). Extensive crosstalk between the lung endothelium and other lung cell types is critical for the regulation of lung development. However, no further investigation was carried out to elucidate this.

      Answer:

      We have performed a new comparison, which is extensively discussed above and shows that EC are affected in the Clasp1 KO lungs, as predicted by this reviewer. We will discuss crosstalk between cell types in the new version of the manuscript.

      Comment:

      Analysis of RNA-Seq data needs to be re-written. Pathway or GO enrichment was not performed. Although the authors have identified a number of key DEGs, only Chil1 was investigated. It is also unclear how it led the authors to identify Mll3 KO experiment on the Omnibus repository. A list of overlapped genes between Mll3 KO dataset and Clasp1 KO dataset were not provided. Aqp5 (AT1 marker gene) that authors claimed to be significantly reduced in Clasp1 KO is not on the DEGs list (Table S2).

      Answer:

      We initially focused on Chil1 because its expression is almost completely abrogated in all three Clasp1 KO lungs. The identification of the Mll3 dataset was coincidental; we mentioned it because Chil1 is also affected in these KO mice. A Venn diagram of overlapping significantly deregulated genes in both datasets is shown in the new Figure S6 of the revised manuscript. However, this analysis has been superseded by the new comparison with scRNA-Seq data from the LGEA web portal. As extensively explained above, this new analysis provides a satisfying explanation for the lack of Chil1 in Clasp1 KO lungs. We also performed a Metascape analysis (which includes pathway and GO enrichment analyses), which will be included in the revised version of this manuscript. Finally, the reviewer is correct that Aqp5 is not in the DEGs list; this is because the adjusted p-value did not reach the required significance. We nevertheless showed its RNA-Seq values, first because the unadjusted p-value is significant, second, because RT-PCR experiments confirm it to be down-regulated, and third, because Aqp1 (another AT1 marker) is also deregulated (with an adjusted p-value that is significant). In the revised manuscript we will examine Aqp5 levels by IF staining.
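
To illustrate how a gene can have a significant raw p-value yet miss the adjusted threshold, here is a minimal sketch of the Benjamini-Hochberg adjustment. It is an assumption that a correction of this type underlies the adjusted p-values referred to above, and the p-values below are invented for illustration, not taken from our dataset.

```python
# Minimal Benjamini-Hochberg (FDR) adjustment on a list of raw p-values.
# The input values are hypothetical and serve only to show the effect.

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.0001, 0.03, 0.04, 0.5, 0.9]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f"raw={p:.4f} adjusted={q:.4f}")
```

In this toy example the second gene has a raw p-value below 0.05 but an adjusted value above it, which is the situation described for Aqp5.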

      Comment:

      There is a lack of cohesion between the experimental findings presented in the paper and the RNA Seq data analysis. Pathway or GO enrichment was not performed for the DEGs the authors identified. This would help identify the key functions of the deregulated genes in Clasp1 KOs and provide a fuller picture of what pathways/biological processes are dysregulated in the absence of CLASP1. Instead, the authors have focused on one single gene, Chil1, in the subsequent analysis. The authors infer that overlapped DEGs between Mll3 KO and Clasp1 KO mean that the same cell types or signalling pathways are affected in embryonic lungs of Mll3 and Clasp1 KO; this is an overinterpretation. A list showing the overlap in DEGs between Mll3 KO dataset and Clasp1 KO dataset should be provided.

      Answer:

      We have improved our RNA-Seq analysis and we have performed a Metascape analysis, which includes pathway and GO enrichment analyses. Results are shown in the new Figures 6 and 8. The Metascape analysis indicates which pathways/biological processes are deregulated in the absence of CLASP1. We observe, for example, defects in endocytosis and cholesterol metabolism. Given the new data, we decided to pay less attention to the Mll3-CLASP1 comparison.

      Minor comments:

      1. Figure 1A - please label the specific cell types to aid visualisation.
      2. Figure 6B - present the Log2FC for KO vs WT instead of WT vs KO to facilitate data visualisation and interpretation
      3. Figure 6E - provide the overlapping genes in a list and include it as a supplementary table
      4. Figure 7D and 7F - Quantification is needed
      5. The statistical tests used should be added to the figure legends.
      6. There is some wording in the manuscript that is either unclear or inaccurate, please carefully check the manuscript. e.g. manuscript refers to alveolization- I would recommend changing this to the more widely used terms alveolarization or alveologenesis. The manuscript refers to 'catastrophe rate'- this term needs to be defined.

      Answers:

      1. This has been done.
      2. This has been done.
      3. This panel has been moved to a Supplementary Figure; as the analysis is now less relevant, we will not provide the list.
      4. This will be done.
      5. This has been/will be done.
      6. This has been done. The term “catastrophe rate” has been removed.

      4. Description of analyses that authors prefer not to carry out

      General remarks

      Based on the comments of reviewers #2 and #3 we have decided to fully focus our revised manuscript on the lung phenotype of the Clasp1 KO mice. We still do show the results on the ribcage (Figure 2) and diaphragm (Figure S2) because they might enhance the severity of the lung phenotype. We have decided not to carry out extra “non-lung” experiments.

      Specific remarks to the comments of the reviewers

      Reviewer #1.

      Comment:

      p.10: Homozygous mutants are smaller. The authors reported minor skeletal phenotypes: a small rib cage and delayed ossification in the sternum and occipital bone.

      The number of specimens analyzed was not mentioned, making it difficult to establish whether these observations are important. Stats should be included.

      Answer:

      Whereas the results of Figure 1H, I (growth deficits at E15.5 and PN1) are based on analysis of multiple animals, the embryonic skeleton data presented in Figure 2 are based on single mouse comparisons, i.e. one WT and one KO. Given the obvious growth deficit in the KO (Figure 1H, I) and the fact that gross morphological observation did not reveal a specific body part in the KO mice that is affected (Figure 1G), we were of the opinion that a representative comparison of the skeleton is allowed and we therefore kept Figure 2 intact. Since we focus in the revision on the lung phenotype, we have decided against examining the skeletons of more mice. We are willing to remove Figure 2, or make it Supplemental, if the reviewer feels that the skeletal phenotype is too prominently displayed.

      Comment:

      p.10: The authors established MEF used to study cell division. Multipolar spindles and additional centrosomes were detected in mutant cells.

      No stats were provided to establish if the differences in numbers are significant. According to the authors, the cell division defects may explain the smaller size of mutants. The authors should check proliferation in MEF. The sentence of conclusion is not well supported according to the data presented.

      Answer:

      Based on the advice of reviewer #2, who states “I think it would be best to better focus the paper on the lung phenotype”, we have decided to remove the mitotic data on MEFs.

      Comment:

      p.12: The authors looked at the growth capacity of motor neurons and dorsal root ganglion neurons and showed a reduced growth in both cases.

      How do the authors reconcile the observation made in the diaphragm in which nerve extensions are larger with the reduced growth capacity of neurons?

      Answer:

      We thank the reviewer for this remark, which is difficult to address, as CLASPs are expressed at different levels in neurons and as different isoforms, which may even have antagonistic functions. For example, in our recent publication (Sayas et al, 2019, DOI: 10.3389/fncel.2019.00005) we find through RNA-Seq that in cultured hippocampal neurons (3DIV) Clasp2β/γ levels are increased compared to Clasp2α mRNA and that both in hippocampal and in DRG neurons Clasp2 mRNA levels are higher than Clasp1. As CLASP2β/γ have a different function compared to CLASP2α, it is conceivable that absence of CLASP1 leads to different effects due to different CLASP2 activities. However, we recognize that these are speculations. Because of this and because reviewer #2 advises against inserting the neuronal data, we have decided to completely remove these results from the manuscript.

      Comment:

      p.12: The authors used cultured hippocampal neurons for imaging microtubule growth. According to the authors, the loss of CLASP1 deregulates microtubule dynamics.

      No explanation was provided to justify the use of hippocampal neurons. What is a catastrophe rate? What is the justification to study this parameter? What does it tell us about microtubule dynamics?

      Answer:

      Although we have decided to remove the neuronal data from the revised manuscript, we would like to address this comment nonetheless. Hippocampal neurons are often used in the field, hence they represent a “gold standard”. Furthermore, the techniques to examine microtubule dynamics are well established in this system. Dynamic microtubule behaviour is described using five parameters: growth rate of microtubules, shrinkage rate of microtubules, catastrophe and rescue frequencies (the conversion of growth to shrinkage or from shrinkage to growth, respectively), and pausing times. The marker used in our studies (EB3-GFP) accumulates at the ends of growing microtubules, allowing us to measure growth rate and the duration of a growth event. The latter is the inverse of the catastrophe frequency. Hence, using EB3-GFP we are able to examine two of the five parameters. Although this is not complete, the parameters do allow us to draw (speculative) conclusions. For example, a higher growth rate indicates that free tubulin concentration is higher, as tubulin concentration is a main determinant of growth rate. This in turn means that there are fewer microtubules (tubulin must come from somewhere). If this correlates with the catastrophe frequency (which should be higher) then one can conclude that CLASP1 is a microtubule-stabilising protein.
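
The relation between growth duration and catastrophe frequency can be made concrete with a small sketch. The track values are hypothetical, the function name `growth_parameters` is ours, and the calculation assumes that every tracked EB3-GFP growth event ends in a catastrophe, which is a simplification because a comet can also disappear when a microtubule pauses.

```python
# Derive growth rate and catastrophe frequency from EB3-GFP growth events.
# Each track is (displacement in micrometres, duration in seconds).
# Assumption: every growth event ends in a catastrophe.

def growth_parameters(tracks):
    """Return (growth rate, mean growth duration, catastrophe frequency)."""
    if not tracks:
        raise ValueError("at least one track is required")
    total_distance = sum(distance for distance, _ in tracks)
    total_time = sum(duration for _, duration in tracks)
    growth_rate = total_distance / total_time    # micrometres per second
    mean_duration = total_time / len(tracks)      # seconds per growth event
    catastrophe_frequency = 1.0 / mean_duration   # events per second of growth
    return growth_rate, mean_duration, catastrophe_frequency

# Hypothetical tracks, for illustration only.
tracks = [(2.0, 10.0), (3.0, 15.0), (1.0, 5.0)]
rate, duration, f_cat = growth_parameters(tracks)
print(rate, duration, f_cat)
```

Shorter growth events at the same growth rate would raise the catastrophe frequency, which is the pattern expected when a stabilising factor is lost.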

      Reviewer #2.

      Comment:

      Fig. S1. It would be good to indicate the number of cells / experiments analyzed. In panel D, there is only one multi-nucleated cell, which without further analysis does not mean much. The authors correlate this mitotic defect with smaller animal size although this connection is not at all conclusive. If both CLASPs are important for mitosis, do CLASP2 KOs have similar size defects? It is also mentioned above that CLASP1 KOs show microcephaly. Are there fewer neurons that might also be linked to a stem cell division defect? I understand that this is not the central point of the paper and important to include given previous work on CLASPs, but it would be good to discuss a little clearer. It seems the authors do not think this is the/a cause of the lung phenotype, but can that be completely excluded?

      Answer:

      Based upon suggestions of this reviewer (for example: “I think it would be best to better focus the paper on the lung phenotype”) we will not address this comment beyond a statement that Clasp2 knockout mice are indeed also smaller.

      Fig. 4. Please indicate n of cells / experiments and statistics in the figure legend. In panel B and C, it would help to include the time on the figure itself and to scale the y-axis the same to better illustrate differences. It is very hard to see much in panel D. The quantifications in E and F do not make sense. How can the total neurite length (average of many neurons?) be larger than the longest neurite length?

      The switch to MT dynamics in Fig. 4 is very abrupt and the relevance is unclear. Where were these kymographs located in the neuron (growth cones or neurites)? Primary data needs to be shown here. The changes in catastrophe frequency are not that large and I doubt this can be accurately measured from kymographs as shown. Yes, MTs are important in neurite growth, but the potential link here is very vague. Are similar changes in MT dynamics also seen in the MEFs?

      Minor:

      Answer:

      See above, we will not address these comments, since we will remove these data.

      Reviewer #3.

      Comment:

      The lung morphological difference and disrupted lung cell differentiation in Clasp1 KO could be secondary to the biomechanical defects. This is crucially important but is not addressed in this study, ex vivo lung culture may help to answer this question.

      Answer:

      While the experiments suggested by this reviewer are interesting, we do not have sufficient expertise (nor the equipment) to carry out such specialised experiments.

      Comment:

      CLASPs are known to regulate directed cell migration (Myer and Myers 2017, doi: 10.1242/bio.028571) and this is a key process required for lung morphogenesis. Experiments to address whether directed cell migration is affected should be conducted in Clasp1 KO mice.

      Answer:

      We agree that migration assays would be interesting to perform. However, again, we do not have the expertise to do such assays in the developing lung. Experiments in MEFs are possible, and indeed, we previously showed a role for CLASP2 in directed cell migration in MEFs (DOI: 10.1016/j.cub.2006.09.065). However, lung epithelial cells are different from MEFs, and we have shown that CLASPs have cell type- (and isoform-)specific functions. Reviewer #2 actually advised us to focus on the lung phenotype.

      Comment:

      Higher magnification images of staining for microtubule associated proteins in neurons is required to show the details of the defects.

      Answer:

      Based on the reviewers’ advice we decided to take out the neuronal data and focus the manuscript on the lung phenotype.

    1. Author Response

      Reviewer #2 (Public Review):

      The authors performed a series of impressive experiments to systematically establish each part of their CRISPRi method. They provided one of the most compact designs of a CRISPRi dual-guideRNA library, with genome-wide coverage; they confirmed a prior finding on the optimal repressor domain to generate a set of useful vectors for expressing the repressor; they showcased the usage of the system in multiple common cancer cell lines. The authors also took an important step towards providing a detailed and well-annotated protocol (in the supplementary materials) to help users of their methods. The items listed below would be helpful to further improve this work:

      First, while the dual guideRNA design is a useful development, the authors also noted a significant rate (~30%) of recombination between the two sgRNAs. This should be further discussed and evaluated in the manuscript to help readers understand the implication of this high recombination rate. For example, across replicate experiments or across cell types tested, would the recombination be stochastic, or might there be some bias in which guide is recombined? Are there any cell-type dependencies here in terms of the recombination rate? This would also help future users to decide if they would need to check for this effect during functional screening.

      We agree that recombination is an important limitation of dual-sgRNA screens. We included additional analyses and data in the revised manuscript to help readers understand the implications of the observed recombination.

      First, we performed growth screens using dual-sgRNA libraries in two additional cell lines (RPE1 and Jurkat) to address the potential cell type specificity of lentiviral recombination. We cloned a dual-sgRNA library targeting DepMap Common Essential genes (n=2291 dual-sgRNA elements). We transduced cells with this library, harvested cells at day 7 post-transduction, amplified sgRNA cassettes from extracted genomic DNA, and sequenced to quantify sgRNA recombination rates. We found similar recombination rates of dual-sgRNA constructs isolated from these three cell types (observed K562 recombination rate 29%; observed RPE1 recombination rate 26%; observed Jurkat recombination rate 24%).
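      As an illustration of the quantification step, a recombination rate can be computed from sequenced cassettes as sketched below. This is a minimal sketch, not the authors' pipeline: it assumes each read has already been mapped to a pair of sgRNA identifiers, and it counts a read as recombined whenever the pair is not one of the designed elements. The function name and the toy identifiers are hypothetical.

```python
from collections import Counter


def recombination_rate(read_pairs, designed_pairs):
    """Return the fraction of read pairs that do not match a designed element.

    read_pairs: iterable of (sgRNA_A, sgRNA_B) identifiers, one per sequenced cassette.
    designed_pairs: set of (sgRNA_A, sgRNA_B) tuples present in the cloned library.
    Assumption: any pairing absent from the design is treated as recombined.
    """
    counts = Counter(
        "intact" if pair in designed_pairs else "recombined" for pair in read_pairs
    )
    total = counts["intact"] + counts["recombined"]
    if total == 0:
        raise ValueError("no read pairs supplied")
    return counts["recombined"] / total


# Toy example with two designed elements and four sequenced cassettes.
designed = {("g1a", "g1b"), ("g2a", "g2b")}
reads = [("g1a", "g1b"), ("g1a", "g2b"), ("g2a", "g2b"), ("g2a", "g2b")]
rate = recombination_rate(reads, designed)  # one of four reads is mismatched
```

      Applied per cell line, a calculation of this kind would yield one rate per sample, comparable to the 24-29% values reported above.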

      Next, we compared the recombination rates of each dual-sgRNA element. Our expectation was that lentiviral recombination would be largely stochastic per element based on the known mechanism of lentiviral recombination previously discussed in Adamson et al. 2018 (https://www.biorxiv.org/content/10.1101/298349v1.full) given that the constant region between sgRNAs (400bp) far exceeds the length of sgRNA targeting regions (20bp). However, we would also expect apparent recombination rates to be artificially inflated for dual-sgRNAs with strong growth phenotypes, as the stronger growth phenotypes of unrecombined dual-sgRNAs compared to recombined dual-sgRNAs will lead to dropout of unrecombined dual-sgRNAs. To account for this bias, we began by comparing the recombination rate for non-targeting control dual-sgRNAs excluding those with growth phenotypes across replicates of our K562 screens. There was only a weak correlation between the recombination rate for non-targeting control dual-sgRNAs (r = 0.30; Figure 1 – Figure Supplement 1E). We next compared the recombination rates of all dual-sgRNA elements (both targeting and non-targeting) across replicates of our K562 screens. As expected, we observed that the recombination rate of elements was correlated across replicates (r = 0.77; Figure 1 – Figure Supplement 1F), and the recombination rate was strongly anticorrelated with the growth phenotype of dual-sgRNAs in K562 cells (r = -0.84; Figure 1 – Figure Supplement 1G). We have integrated these data into the manuscript.
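      The replicate comparison described above amounts to a per-element correlation. The sketch below computes a Pearson correlation coefficient between two replicates; it assumes Pearson's r is the statistic behind the reported values (the response does not say which estimator was used), and the per-element rates are invented toy numbers.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-element recombination rates in two replicate screens.
replicate_1 = [0.20, 0.25, 0.30, 0.40]
replicate_2 = [0.22, 0.24, 0.33, 0.38]
r = pearson_r(replicate_1, replicate_2)
```

      The same function could be applied to recombination rate versus growth phenotype, where the response reports a strong negative correlation.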

      Second, on the repressor development and evaluation. As the authors mentioned in the text, the expression level of the repressor can confound their conclusions on fitness/efficiency comparisons of CRISPR repressors. Thus, it would be helpful to perform protein-level validation using the cell lines they generated, such as a Western blot comparison, to rule out this potential issue.

      We agree that differences in expression levels of the effectors can confound comparisons and that Western Blotting for such differences would be valuable. That said, any such analyses would not substantively alter the main claim of our paper, which is that Zim3-dCas9 provides excellent on-target knockdown in the absence of non-specific effects on cell growth or gene expression. This finding is of immediate practical use to the community. By no means are we claiming that we eliminated all possible confounding factors nor do we think that it is possible to do so. To avoid overstating our findings, we had acknowledged in the discussion that expression levels may indeed be a confounding factor, we had noted in the methods section that the dCas9-MeCP2 effector uses a different coding sequence for dCas9, which may contribute to differences in expression, and we had noted that other effectors may prove useful in some settings. We have further emphasized that differences in expression levels may contribute to our results in the revised manuscript.

      This work would also benefit from including cell proliferation/viability measurement using the selected Zim3-dCas9 repressor in multiple cell lines, as it seems this was only done initially in K562 cells. As authors noted, the fitness effects of the CRISPR repressor would be a major concern when performing functional genomics screening, so such validation of fitness-neutrality of the repressor can be very helpful for potential users of their method and approach.

      To address this point, we assessed the proliferation of HepG2, HuTu-80, and HT29 cells expressing Zim3-dCas9. Expression of Zim3-dCas9 did not have a negative impact on proliferation in any of these cell types, providing further evidence that the Zim3-dCas9 will be broadly useful. We included these data in Figure 4 – Figure Supplement 2 in the revised manuscript. That said, we cannot rule out that expression of Zim3-dCas9 may be detrimental in other cell types. Indeed, we want to emphasize that users should evaluate both on-target knockdown and lack of non-specific effects of effectors in new cell models before proceeding to large-scale experiments. The assays and protocols we describe are ideally suited for this purpose. We have further emphasized this point in the discussion section to guide users.

      Third, a major resource from this work, as the authors noted, is a suite of useful Zim3-dCas9 cell lines. The authors have performed a set of experiments to demonstrate the knockdown efficiency with dozens of guideRNAs. While this is a good initial validation, to really ensure the cell lines are performing as expected, a small scale screening in pooled fashion will be more convincing. This would be a setting more relevant for potential readers, given that pooled screening would likely be the most powerful application of these cell lines.

      While conducting the work described in this manuscript, we had used the Zim3-dCas9 RPE1 cell line for a large-scale pooled screen with single-cell RNA-seq readout (Perturb-seq, Replogle et al. 2022). Across greater than 2000 target genes, the median knockdown was 91.6%, which provides strong validation that Zim3-dCas9 performs as expected in this cell line. We had noted this point in the discussion section of our manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors focused on linking physiological data on theta phase precession and spike-timing-dependent plasticity to the more abstract successor representation used in reinforcement learning models of spatial behavior. The model is presented clearly and effectively shows biological mechanisms for learning the successor representation. Thus, it provides an important step toward developing mathematical models that can be used to understand the function of neural circuits for guiding spatial memory behavior.

      However, as often happens in the Reinforcement Learning (RL) literature, there is a lack of attention to non-RL models, even though these might be more effective at modeling both hippocampal physiology and its role in behavior. There should be some discussion of the relationship to these other models, without assuming that the successor representation is the only way to model the role of the hippocampus in guiding spatial memory function.

      We thank the reviewer for the positive comments about the work, and for the detailed and constructive feedback. We agree with the reviewer that the manuscript will benefit from significantly more discussion of non-RL models, and we’ve detailed below a number of modifications to the manuscript to better incorporate prior work from the hippocampal literature, including the citations the reviewer has listed. Since our goal with this paper is to situate hippocampal phenomena in the context of an RL learning rule, this is really important and we appreciate the reviewer’s recommendations. We have added text (outlined in the point-by-point responses below) to the introduction and to the discussion that we hope better demonstrates the connections between the SR and existing computational models of hippocampus, and communicates clearly that the SR is not unique in capturing phenomena such as factorization of space and reward or capturing sequence statistics, but is rather a model that captures these phenomena while also connecting with downstream RL computations. Existing RL accounts of hippocampal representation often do not connect with known properties of hippocampus (as illustrated by the fact that TD learning was proposed in prior work to be the learning mechanism for SRs, even though this doesn’t have an obvious mechanism in HPC), so the purpose of this work is to explore the extent to which TD learning effectively overlaps with the well-studied properties of STDP and theta oscillations. In that sense, this paper is an effort to connect RL models of hippocampus to more physiologically plausible mechanisms rather than an attempt to model phenomena that the existing computational hippocampus literature could not capture.

      1) Page 1- "coincides with the time window of STDP" - This model shows effectively how theta phase precession allows spikes to fall within the window of spike-timing-dependent synaptic plasticity to form successor representations. However, this combination of precession and STDP has been used in many previous models to allow the storage of sequences useful for guiding behavior (e.g. Jensen and Lisman, Learning and Memory, 1996; Koene, Gorchetchnikov, Cannon, Hasselmo, Neural Networks, 2003). These previous models should be cited here as earlier models using STDP and phase precession to store sequences. They should discuss in terms of what is the advantage of an RL successor representation versus the types of associative sequence coding in these previous models.

      We agree that the idea of using theta precession to compress sequences onto the timescale of synaptic learning is a long-standing concept in sequence learning, and that we need to be careful to communicate what the advantages are of considering this in the RL context. We have added these citations to the introduction:

      “One of the consequences of phase precession is that correlates of behaviour, such as position in space, are compressed onto the timescale of a single theta cycle and thus coincide with the time-window of STDP O(20 − 50 ms) [8, 18, 20, 21]. This combination of theta sweeps and STDP has been applied to model a wide range of sequence learning tasks [22, 23, 24], and as such, potentially provides an efficient mechanism to learn from an animal’s experience – forming associations between cells which are separated by behavioural timescales much larger than that of STDP.” and added a paragraph to the discussion as well that makes this clear:

      “That the predictive skew of place fields can be accomplished with a STDP-type learning rule is a long-standing hypothesis; in fact, the authors that originally reported this effect also proposed a STDP-type mechanism for learning these fields [18, 20]. Similarly, the possible accelerating effect of theta phase precession on sequence learning has also been described in a number of previous works [22, 55, 23, 24]. Until recently [40, 41], SR models have largely not connected with this literature: they either remain agnostic to the learning rule or assume temporal difference learning (which has been well-mapped onto striatal mechanisms [37, 56], but it is unclear how this is implemented in hippocampus) [54, 31, 36, 57, 58]. Thus, one contribution of this paper is to quantitatively and qualitatively compare theta-augmented STDP to temporal difference learning, and demonstrate where these functionally overlap. This explicit link permits some insights about the physiology, such as the observation that the biologically observed parameters for phase precession and STDP resemble those that are optimal for learning the SR (Fig 3), and that the topographic organisation of place cell sizes is useful for learning representations over multiple discount timescales (Fig 4). It also permits some insights for RL, such as that the approximate SR learned with theta-augmented STDP, while provably theoretically different from TD (Section 5.8), is sufficient to capture key qualitative phenomena.”

      2) On this same point, in the introduction, the successor representation is presented as a model that forms representations of space independent of reward. However, this independence of spatial associations and reward has been a feature of most hippocampal models, that then guide behavior based on interactions between a reward representation and the spatial representation (e.g. Redish and Touretzky, Neural Comp. 1998; Burgess, Donnett, Jeffery, O'Keefe, Phil Trans, 1997; Koene et al. Neural Networks 2003; Hasselmo and Eichenbaum, Neural Networks 2005; Erdem and Hasselmo, Eur. J. Neurosci. 2012). The successor representation should not be presented as if it is the only model that ever separated spatial representations and reward. There should be some discussion of what (if any) advantages the successor representation has over these other modeling frameworks (other than connecting to a large body of RL researchers who never read about non-RL hippocampal models). To my knowledge, the successor representation has not been explicitly tested on all the behaviors addressed in these earlier models.

      We agree – a long-standing property of computational models in the hippocampal literature is a factorization of spatial and reward representations, and we have edited the text of the paper to make it clear that this is not a unique contribution of the SR. We have modified our description of the SR to better place it in the context of existing theories about hippocampal contributions to the factorised representations of space and goals, and included all citations mentioned here by adding the following text.

      We have added a sentence to the introduction:

      “However, the computation of expected reward can be decomposed into two components – the successor representation, a predictive map capturing the expected location of the agent discounted into the future, and the expected reward associated with each state [26]. Such segregation yields several advantages since information about available transitions can be learnt independently of rewards and thus changes in the locations of rewards do not require the value of all states to be re-learnt. This recapitulates a number of long-standing theories of hippocampus which state that hippocampus provides spatial representations that are independent of the animal’s particular goal and support goal-directed spatial navigation[27, 28, 23, 29, 30]”
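      The decomposition described in the quoted passage can be written compactly. The notation here is an assumption for illustration and is not taken from the manuscript: $T$ is the state-transition matrix under the agent's policy, $\gamma \in [0, 1)$ is the discount factor, $R(s')$ is the expected reward in state $s'$, and $M$ is the successor representation in its standard tabular form.

```latex
\begin{align}
V(s) &= \mathbb{E}\Big[\sum_{t=0}^{\infty} \gamma^{t} R(s_t) \,\Big|\, s_0 = s\Big]
      = \sum_{s'} M(s, s')\, R(s'), \\
M(s, s') &= \mathbb{E}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, \mathbb{1}[s_t = s'] \,\Big|\, s_0 = s\Big],
\qquad
M = \sum_{t=0}^{\infty} \gamma^{t} T^{t} = (I - \gamma T)^{-1}.
\end{align}
```

      Because $R$ enters only through the product $M R$, moving a reward changes $R$ while leaving $M$ untouched, which is the advantage the passage refers to.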

      We have also added a paragraph to the discussion:

      “The SR model has a number of connections to other models from the computational hippocampus literature that bear on the interpretation of these results. A long-standing property of computational models in the hippocampal literature is a factorisation of spatial and reward representations [27, 28, 23, 29, 30], which permits spatial navigation to rapidly adapt to changing goal locations. Even in RL, the SR is also not unique in factorising spatial and reward representations, as purely model-based approaches do this too [26, 25, 67]. The SR occupies a much more narrow niche, which is factorising reward from spatial representations while caching long-term occupancy predictions [26, 68]. Thus, it may be possible to retain some of the flexibility of model-based approaches while retaining the rapid computation of model-free learning.”

      3) Related to this, successes of the successor representation are presented as showing the backward expansion of place cells. But this was modeled at the start by Mehta and colleagues using STDP-type mechanisms during sequence encoding, so why was the successor representation necessary for that? I don't want to turn this into a review paper comparing hippocampal models, but the body of previous models of the role of the hippocampus in behavior warrants at least a paragraph in each of the introduction and discussion sections. In particular, it should not be somehow assumed that the successor representation is the best model, but instead, there should be some comparison with other models and discussion about whether the successor representation resembles or differs from those earlier models.

      We agree this was not clear. This is a nuanced point that warrants substantial discussion, and we have added a paragraph to the discussion (see the paragraph in the response to point 1 that begins “That the predictive skew of place fields can be accomplished…”).

      4) The text seems to interchangeably use the term "successor representation" and "TD trained network" but I think it would be more accurate to contrast the new STDP trained network with a network trained by Temporal Difference learning because one could argue that both of them are creating a successor representation.

      We now refer to these as “STDP successor features” and “TD successor features”. We have also replaced all references to “true successor representation/features” with “TD successor representation/features” and have edited the text at the beginning of the results section to reflect this:

      “The STDP synaptic weight matrix Wij (Fig. 1d) can then be directly compared to the temporal difference (TD) successor matrix Mij (Fig. 1e), learnt via TD learning on the CA3 basis features (the full learning rule is derived in Methods and shown in Eqn. 27). Further, the TD successor matrix Mij can also be used to generate the ‘TD successor features’...”
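      For readers unfamiliar with the TD baseline, the sketch below shows temporal difference learning of a successor matrix in the simplest tabular setting. It is an assumption-laden illustration rather than the manuscript's rule: the paper learns over CA3 basis features (its Eqn. 27), whereas this version uses one-hot states, and the function name, step size, and ring-shaped toy environment are all hypothetical.

```python
def td_successor_matrix(trajectory, n_states, gamma=0.5, alpha=0.1, n_passes=500):
    """Learn a tabular successor matrix M from a sequence of visited states.

    Each transition (s -> s_next) moves row M[s] toward the TD target
    onehot(s) + gamma * M[s_next].
    """
    M = [[0.0] * n_states for _ in range(n_states)]
    for _ in range(n_passes):
        for s, s_next in zip(trajectory[:-1], trajectory[1:]):
            for j in range(n_states):
                target = (1.0 if j == s else 0.0) + gamma * M[s_next][j]
                M[s][j] += alpha * (target - M[s][j])
    return M


# Toy environment: an agent walking deterministically around a 3-state ring.
gamma = 0.5
M = td_successor_matrix([0, 1, 2, 0], n_states=3, gamma=gamma)

# On this ring the exact discounted occupancy of the start state is
# 1 / (1 - gamma**3), which the learned entry should approach.
expected_self = 1.0 / (1.0 - gamma**3)
```

      Each row of the learned matrix sums to approximately 1 / (1 - gamma), the total discounted time the agent spends anywhere, which is a quick sanity check on any learned successor matrix.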

      Reviewer #2 (Public Review):

      The authors present a set of simulations that show how hippocampal theta sequences may be combined with spike time-dependent plasticity to learn a predictive map - the successor representation - in a biologically plausible manner. This study addresses an important question in the field: how might hippocampal theta sequences be combined with STDP to learn predictive maps? The conclusions are interesting and thought-provoking. However, there were a number of issues that made it hard to judge whether the conclusions of the study are justified. These concerns mainly surround the biological plausibility of the model and parameter settings, the lack of any mathematical analysis of the model, and the lack of direct quantitative comparison of the findings to experimental data.

      While the model uses broadly realistic biological elements to learn the successor representation, there remain a number of important concerns with regard to the biological plausibility of the model. For example, the model assumes that each CA3 cell connects to exactly 1 CA1 cell throughout the whole learning process so that each CA1 cell simply inherits the activity of a single CA3 cell. Moreover, neurons in the model interact directly via their firing rate, yet produce spikes that are used only for the weight updates. Certain model parameters also appeared to be unrealistic, for example, the model combined very wide place fields with slow running speeds. This leaves open the question as to whether the proposed learning mechanism would function correctly in more realistic parameter settings. Simulations were performed for a fixed running speed, thereby omitting various potentially important effects of running speed on the phase precession and firing rate of place cells. Indeed, the phase precession of CA1 place cells was not shown or discussed, so it is unclear as to whether CA1 cells produce realistic patterns of phase precession in the model.

      The fact that a successor-like representation emerges in the model is an interesting result and is likely to be of substantial interest to those working at the intersection between neuroscience and artificial intelligence. However, because no theoretical analysis of the model was performed, it remains unclear why this interesting correspondence emerges. Was it a coincidence? When will it generalise? These questions are best answered by mathematical analysis of the model (or a reduced form of it).

      Several aspects of the model are qualitatively consistent with experimental data. For example, CA1 place fields clustered around doorways and were elongated along walls. While these findings are important and provide some support for the model, considerable work is required to draw a firm correspondence between the model and experimental data. Thus, without a quantitative comparison of the place field maps in experimental data and the model, it is hard to draw strong conclusions from these findings.

      Overall, this study promises to make an important contribution to the field, and will likely be read with interest by those working in the fields of both neuroscience and artificial intelligence. However, given the above caveats, further work is required to establish the biological plausibility of the model, develop a theoretical understanding of the proposed learning process, and establish a quantitative comparison of the findings to experimental data.

      Thank you for the positive comments about the work, and for the detailed and constructive review. We appreciate the time spent evaluating the model and understanding its features at a deep level. Your comments and suggestions have led to exciting new simulation results and a theoretical analysis which shed light on the connections between TD learning, STDP and phase precession.

      We have incorporated a number of new simulations to tackle what we believe are your most pressing concerns surrounding the model’s biological plausibility. As such, we have extended the hyperparameter sweep (Supp. Fig 3) to include the phase precession parameters you recommended, as well as three new multipanel supplementary figures satisfying your recommendations (Supp. Figs. 1, 2 & 4). Collectively, these figures show that the specifics of our results, which as you pointed out might have been produced with biologically implausible values (place cell size, movement speed/statistics, weight initialisation, weight updating schedule and phase precession parameters), do not fundamentally depend on the specific values of these parameters: the mechanism still learns predictive maps close in form to the TD successor features. In the hyperparameter sweep, we do find that results are sensitive to specific parameter values (Supp. Fig 3), but that interestingly, the optimal values of these parameters are remarkably close to those observed experimentally. We have also written an extensive new theory section analysing why theta sequences plus STDP approximates TD learning. In addition, the methods section has been expanded and reordered to make some of the subtler aspects of our model (i.e. the mapping of rates-to-rates and weight fixing during learning) clearer.

      At a high level, regarding our claim of biological plausibility, we would like to clarify our intended contribution and give context to some responses below. We have added the following paragraph to the discussion in order to accurately represent the scope of our work:

      “While our model is biologically plausible in several respects, there remain a number of aspects of the biology that we do not interface with, such as different cell types, interneurons and membrane dynamics. Further, we do not consider anything beyond the most simple model of phase precession, which directly results in theta sweeps in lieu of them developing and synchronising across place cells over time [60]. Rather, our philosophy is to reconsider the most pressing issues with the standard model of predictive map learning in the context of hippocampus (e.g., the absence of dopaminergic error signals in CA1 and the inadequacy of synaptic plasticity timescales). We believe this minimalism is helpful, both for interpreting the results presented here and providing a foundation for further work to examine these biological intricacies, such as the possible effect of phase offsets in CA3, CA1 [61] and across the dorsoventral axis [62, 63], as well as whether the model’s theta sweeps can alternately represent future routes [64] e.g. by the inclusion of attractor dynamics [65].”

    1. Author Response:

      Reviewer #1 (Public Review):

      Chakrabarti et al study inner hair cell synapses using electron tomography of tissue rapidly frozen after optogenetic stimulation. Surprisingly, they find a nearly complete absence of docked vesicles at rest and after stimulation, but upon stimulation vesicles rapidly associate with the ribbon. Interestingly, no changes in vesicle size were found along or near the ribbon. This would have indicated a process of compound fusion prior to plasma membrane fusion, as proposed for retinal bipolar cell ribbons. This lack of compound fusion is used to argue against MVR at the IHC synapse. However, that is only one form of MVR. Another form, coordinated and rapid fusion of multiple docked vesicles at the bottom of the ribbon, is not ruled out. Therefore, I agree that the data set provides good evidence for rapid replenishment of the ribbon-associated vesicles, but I do not find the evidence against MVR convincing. The work provides fundamental insight into the mechanisms of sensory synapses.

      We thank the reviewer for the appreciation of our work and the constructive comments. As pointed out below, we now included this discussion (from line 679 onwards).

      We wrote:

      “This might reflect spontaneous univesicular release (UVR) via a dynamic fusion pore (i.e. ‘kiss and run’; Ceccarelli et al., 1979), which was suggested previously for IHC ribbon synapses (Chapochnikov et al., 2014; Grabner and Moser, 2018; Huang and Moser, 2018; Takago et al., 2019), and/or rapid undocking of vesicles (e.g. Dinkelacker et al., 2000; He et al., 2017; Nagy et al., 2004; Smith et al., 1998). In the UVR framework, stimulation by ensuing Ca2+ influx triggers the statistically independent release of several SVs. Coordinated multivesicular release (MVR) has been indicated to occur at hair cell synapses (Glowatzki and Fuchs, 2002; Goutman and Glowatzki, 2007; Li et al., 2009) and retinal ribbon synapses (Hays et al., 2020; Mehta et al., 2013; Singer et al., 2004) during both spontaneous and evoked release. We could not observe structures which might hint towards compound or cumulative fusion, neither at the ribbon nor at the AZ membrane under our experimental conditions. Upon short and long stimulation, RA-SVs as well as docked SVs even showed a slightly reduced size compared to controls. However, since some AZs harbored more than one docked SV per AZ in stimulated conditions, we cannot fully exclude the possibility of coordinated release of few SVs upon depolarization.”

      Reviewer #2 (Public Review):

      Chakrabarti et al. aimed to investigate exocytosis from ribbon synapses of cochlear inner hair cells with high-resolution electron microscopy with tomography. Current methods to capture the ultrastructure of the dynamics of synaptic vesicle release in IHCs rely on the application of potassium for stimulation, which constrains temporal resolution to minutes rather than the millisecond resolution required to analyse synaptic transmission. Here the authors implemented a high-pressure freezing method relying on optogenetics for stimulation (Opto-HPF), granting them both high spatial and temporal resolutions. They provide an extremely well-detailed and rigorously controlled description of the method, falling in line with previous use of such "Opto-HPF" studies. They successfully applied Opto-HPF to IHCs and had several findings at this highly specialised ribbon synapse. They observed a stimulation-dependent accumulation of docked synaptic vesicles at IHC active-zones, and a stimulation-dependent reduction in the distance of non-docked vesicles to the active zone membrane; while the total number of ribbon-associated vesicles remained unchanged. Finally, they did not observe increases in diameter of synaptic vesicles proximal to the active zone, or other potential correlates to compound fusion - a potential mode of multivesicular release. The conclusions of the paper are mostly well supported by data, but some aspects of their findings and pitfalls of the methods should be better discussed.

      We thank the reviewer for the appreciation of our work and the constructive comments.

      Strengths:

      While now a few different groups have used "Opto-HPF" methods (also referred to as "Flash and Freeze") in different ways and synapses, the current study implemented the method with rigorous controls in a novel way to specifically apply to cochlear IHCs - a different sample preparation than neuronal cultures, brain slices or C. elegans, the sample preparations used so far. The analysis of exocytosis dynamics of IHCs with electron microscopy with stimulation has been limited to being done with the application of potassium, which is not physiological. While much has been learned from these methods, they lacked time resolution. With Opto-HPF the authors were successfully able to investigate synaptic transmission with millisecond precision, with electron tomography analysis of active zones. I have no overall questions regarding the methodology as it was very thoroughly described. The authors also employed electrophysiology with optogenetics to characterise the optical stimulation parameters and provided a well-described analysis of the results with different pulse durations and irradiance - which is crucial for Opto-HPF.

      Thank you very much.

      Further, the authors did a superb job in providing several tables with data and information across all mouse lines used, experimental conditions, and statistical tests, including source code for the diverse analysis performed. The figures are overall clear and the manuscript was well written. Such a clear representation of data makes it easier to review the manuscript.

      Thank you very much.

      Weaknesses:

      There are two main points that I think need to be better discussed by the authors.

      The first refers to the pitfalls of using optogenetics to analyse synaptic transmission. While ChR2 provides better time resolution than potassium application, one cannot discard the possibility that calcium influx through ChR2 alters neurotransmitter release. This important limitation of the technique should be properly acknowledged by the authors and the consequences discussed, specifically in the context in which they applied it: a single sustained pulse of light of ~20ms (ShortStim) and of ~50ms (LongStim). While longer, sustained stimulation is characteristic for IHCs, these are quite long pulses as far as optogenetics and potential consequences to intrinsic or synaptic properties.

      We thank the reviewer for pointing this out. We would like to mention that upon 15 min high potassium depolarization, the number of docked SVs only slightly increased as shown in Chakrabarti et al., 2018, EMBO rep and Kroll et al. 2020 JCS, but it was not statistically significant. In the current study, we report a similar phenomenon, but here light induced depolarization resulted in a more robust increase in the number of docked SVs.

      To compare the data from the previous studies with the current study, we included an additional table 3 (line 676) now in the discussion with all total counts (and average per AZ) of docked SVs.

      Furthermore, in response to the reviewer’s concern, we now discuss the Ca2+ permeability of ChR2 in addition to the above comparison to our previous studies, which demonstrated very few docked SVs in the absence of K+ channel blockers and ChR2 expression in IHCs. We are not entirely certain whether the reviewer refers to potential dark currents of ChR2 (e.g. as an explanation for a depletion of docked vesicles under non-stimulated conditions) or to photocurrents, i.e. the influx of Ca2+ through ChR2 itself, and their contribution to the Ca2+ concentration at the active zone.

      However, regardless of this, we consider it unlikely that a potential contribution of Ca2+ influx via ChR2 evokes SV fusion at the hair cell active zone.

      First of all, we note that the Ca2+ affinity of IHC exocytosis is very low. As first shown in Beutner et al., 2001 and confirmed thereafter (e.g. Pangrsic et al., 2010), there is little if any IHC exocytosis for Ca2+ concentrations at the release sites below 10 µM. Two studies using CatCh, a ChR2 mutant with higher Ca2+ permeability than wildtype ChR2 (Kleinlogel et al., 2011; Mager et al., 2017), estimated a maximal intracellular Ca2+ increase below 10 µM, even at very negative potentials that promote Ca2+ influx along the electrochemical gradient or at high extracellular Ca2+ concentrations of 90 mM. In our experiments, IHCs were instead depolarized to values for which extrapolation of the data of Mager et al., 2017 indicates a submicromolar Ca2+ concentration. In addition, we and others have demonstrated powerful Ca2+ buffering and extrusion in hair cells (e.g. Tucker and Fettiplace, 1995; Issa and Hudspeth, 1996; Frank et al., 2009; Pangrsic et al., 2015). As a result, the hair cells efficiently clear even massive synaptic Ca2+ influx and establish a low bulk cytosolic Ca2+ concentration (Beutner and Moser, 2001; Frank et al., 2009). We reason that these clearance mechanisms efficiently counter any Ca2+ influx through ChR2. This will likely limit potential effects of ChR2-mediated Ca2+ influx on Ca2+-dependent replenishment of synaptic vesicles during ongoing stimulation.

      We have now added the following to the discussion (starting in line 620):

      “We note that ChR2, in addition to monovalent cations, also permeates Ca2+ ions, which poses the question whether optogenetic stimulation of IHCs could trigger release due to direct Ca2+ influx via the ChR2. We do not consider such Ca2+ influx to trigger exocytosis of synaptic vesicles in IHCs. Optogenetic stimulation of HEK293 cells overexpressing ChR2 (wildtype version) only raises the intracellular Ca2+ concentration up to 90 nM even with an extracellular Ca2+ concentration of 90 mM (Kleinlogel et al., 2011). IHC exocytosis shows a low Ca2+ affinity (~70 µM, Beutner et al., 2001) and there is little if any IHC exocytosis for Ca2+ concentrations below 10 µM, which is far beyond what could be achieved even by the highly Ca2+ permeable ChR2 mutant (CatCh: Ca2+ translocating channelrhodopsin, Mager et al., 2017). In addition, we reason that the powerful Ca2+ buffering and extrusion by hair cells (e.g., Frank et al., 2009; Issa and Hudspeth, 1996; Pangršič et al., 2015; Tucker and Fettiplace, 1995) will efficiently counter Ca2+ influx through ChR2 and thereby limit potential effects on Ca2+ dependent replenishment of synaptic vesicles during ongoing stimulation.”
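      The size of the gap argued in the passage above can be made explicit with a back-of-the-envelope calculation. The sketch below only restates the three concentrations quoted in the response (90 nM, 10 µM, ~70 µM); the variable and function names are illustrative, and the HEK293 value is taken at face value without assuming that it transfers quantitatively to IHCs.

```python
# Concentrations in molar, copied from the response text (not new measurements).
CHR2_CA_RISE_M = 90e-9      # ~90 nM rise reported for wildtype ChR2 in HEK293 cells
EXOCYTOSIS_FLOOR_M = 10e-6  # little if any IHC exocytosis below ~10 µM
EXOCYTOSIS_KD_M = 70e-6     # apparent Ca2+ affinity of IHC exocytosis (~70 µM)


def fold_below(level_m, reference_m):
    """Return how many times smaller level_m is than reference_m."""
    return reference_m / level_m


fold_vs_floor = fold_below(CHR2_CA_RISE_M, EXOCYTOSIS_FLOOR_M)
fold_vs_kd = fold_below(CHR2_CA_RISE_M, EXOCYTOSIS_KD_M)

print(f"ChR2-mediated rise is ~{fold_vs_floor:.0f}-fold below the 10 µM floor")
print(f"ChR2-mediated rise is ~{fold_vs_kd:.0f}-fold below the ~70 µM affinity")
```

      On these numbers the reported rise sits roughly two orders of magnitude below the concentration at which IHC exocytosis becomes detectable.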

      The second refers to the finding that the authors did not observe evidence of compound fusion (or homotypic fusion) in their data. This is an interesting finding in the context of multivesicular release in general, as well as specifically for IHCs. While the authors discussed the potential for "kiss-and-run" and/or "kiss-and-stay", it would be valuable if they could discuss their findings further in the context of the field for multivesicular release. For example, the evidence in support of the potential of multiple independent release events. Further, as far as such function-structure optical-quick-freezing methods, it is not unusual to not capture fusion events (so-called omega-shapes or vesicles with fusion pores); this is largely because these are very fast events (less than 10 ms), and not easily captured with optical stimulation.

      We agree with the reviewer that the discussion on MVR and UVR should be extended. We have now added the following paragraph to the discussion from line 679 onwards:

      “This might reflect spontaneous univesicular release (UVR) via a dynamic fusion pore (i.e. ‘kiss and run’; Ceccarelli et al., 1979), which was suggested previously for IHC ribbon synapses (Chapochnikov et al., 2014; Grabner and Moser, 2018; Huang and Moser, 2018; Takago et al., 2019) and/or rapid undocking of vesicles (e.g. Dinkelacker et al., 2000; He et al., 2017; Nagy et al., 2004; Smith et al., 1998). In the UVR framework, stimulation by ensuing Ca2+ influx triggers the statistically independent release of several SVs. Coordinated multivesicular release (MVR) has been indicated to occur at hair cell synapses (Glowatzki and Fuchs, 2002; Goutman and Glowatzki, 2007; Li et al., 2009) and retinal ribbon synapses (Hays et al., 2020; Mehta et al., 2013; Singer et al., 2004) during both spontaneous and evoked release. We could not observe structures which might hint towards compound or cumulative fusion, neither at the ribbon nor at the AZ membrane under our experimental conditions. Upon short and long stimulation, RA-SVs as well as docked SVs even showed a slightly reduced size compared to controls. However, since some AZs harbored more than one docked SV per AZ in stimulated conditions, we cannot fully exclude the possibility of coordinated release of few SVs upon depolarization.”

      Reviewer #3 (Public Review):

      Precise methods were developed to validate the expression of channelrhodopsin in inner hair cells of the Organ of Corti, to quantify the relationship between blue light irradiance and auditory nerve fiber depolarization, to control light stimulation within the chamber of a high-pressure freezing device, and to measure with good precision the delay between stimulation and freezing of the specimen. These methods represent a clear advance over previous experimental designs used to study this synaptic system and are an initial application of rapid high-pressure freezing with freeze substitution, followed by high-resolution electron tomography (ET), to sensory cells that operate via graded potentials.

      Short-duration stimuli were used to assess the redistribution of vesicles among pools at hair cell ribbon synapses. The number of vesicles linked to the synaptic ribbon did not change, but vesicles redistributed within the membrane-proximal pool to docked locations. No evidence was found for vesicle-to-vesicle fusion prior to vesicle fusion to the membrane, which is an important, ongoing question for this synapse type. The data for quantifying numbers of vesicles in membrane-tethered, non-tethered, and docked vesicle pools are compelling and important.

      We thank the reviewer for the appreciation of our work and the constructive comments.

      These quantifications would benefit from additional presentation of raw images so that the reader can better assess their generality and variability across synaptic sites.

      The images shown for each of the two control and two experimental (stimulated) preparation classes should be more representative. Variation in synaptic cleft dimensions and numbers of ribbon-associated and membrane-proximal vesicles do not track the averaged data. Since the preparation has novel stimulus features, additional images (as the authors employed in previous publications) exhibiting tethered vesicles, non-tethered vesicles, docked vesicles, several sections through individual ribbons, and the segmentation of these structures, will provide greater confidence that the data reflect the images.

      Thank you very much for pointing this out. We have now included more details in the supplemental figures and in the text.

      Specifically, we added:

      • More details about the morphological sub-pools (analysis and images):

        -We now show, for each condition, a sequence of images with different tethering states of membrane-proximal SVs together with examples of docked and non-tethered SVs, as we did in Chakrabarti et al., 2018 (Fig. 6-figure supplement 2, line 438). Moreover, for each condition we selected one further tomogram and depict two additional virtual sections (Fig. 6-figure supplement 2).

        -Moreover, we present a more detailed quantification for the different morphological sub-pools. For the MP-SV pool, we analyzed the SV diameters and the distances to the AZ membrane and PD of the different SV sub-pools separately; we have now included this information in Fig. 7. For the RA-SVs, we additionally analyzed the morphological sub-pools and the SV diameters in the distal and the proximal ribbon part, as done in Chakrabarti et al., 2018. We have now added a new supplement figure (Fig. 7-figure supplement 2, line 558) and a supplementary file 2.

      • We replaced the virtual section in panel 6D. In the old version, it appeared that the ribbon was contacting the membrane, and we realized that this virtual section was not representative: the ribbon was not actually in direct contact with the AZ membrane, and a presynaptic density was still visible adjacent to the docked SVs. To avoid potential confusion, we selected a different virtual section of the same tomogram and now also indicate the presynaptic density as a graphical aid in Fig. 6.

      The introduction raises questions about the length of membrane tethers in relation to vesicle movement toward the active zone, but this topic was not addressed in the manuscript.

      We apologize for not stating this sufficiently clearly. We have now rephrased this sentence, which reads:

      “…and seem to be organized in sub-pools based on the number of tethers and to which structure these tethers are connected.”

      Seemingly quantification of this metric, and the number of tethers especially for vesicles near the membrane, is straightforward. The topic of EPSC amplitude as representing unitary events due to variation in vesicle volume, size of the fusion pore, or vesicle-vesicle fusion was partially addressed. Membrane fusion events were not evident in the few images shown, but these presumably occurred and could be quantified. Likewise, sites of membrane retrieval could also be marked. These analyses will broaden the scope of the presentation, but also contribute to a more complete story.

      Regarding the presence/absence of membrane fusion events we agree with the reviewer that this should be clearly addressed in the MS. We would like to point out that we

      (i) did not observe any omega shapes at the AZ membrane, which we also mention in the MS. We can also report that we could not see them in data sets from previous publications (Vogl et al., 2015, JCS; Jung et al., 2015, PNAS).

      (ii) now clarify our observations on potential SV-SV fusion events in the discussion from line 688 onwards:

      “We could not observe structures which might hint towards compound or cumulative fusion, neither at the ribbon nor at the AZ membrane under our experimental conditions. Upon short and long stimulation, RA-SVs as well as docked SVs even showed a slightly reduced size compared to controls. However, since some AZs harbored more than one docked SV per AZ in stimulated conditions, we cannot fully exclude the possibility of coordinated release of few SVs upon depolarization.”

      Furthermore, we agree with the reviewer that a complete presentation of endo-exocytosis structural correlates is very important. However, we focused our study on exocytosis events and therefore mainly analyzed membrane proximal SVs at active zones.

      Nonetheless, in response to the reviewer’s comment, we have now included a quantification of clathrin-coated (CC) structures. We determined the occurrence of CC vesicles (CCVs) and CC invaginations within 0-500 nm of the PD. We measured the diameter of the CCVs and their distance to the membrane and the PD. We found only very few CC structures in our tomograms (now added as a table to the results section, Supplementary file 1). Sites of endocytic membrane retrieval are likely located in the peri-active zone area or even beyond. We did not observe obvious bulk endocytosis events that were connected to the AZ membrane. However, we do observe large endosome-like vesicles that we did not quantify in this study. More details were presented in two of our previous studies (Kroll et al., 2019 and 2020), although under different stimulation conditions.

      Overall, the methodology forms the basis for future studies by this group and others to investigate rapid changes in synaptic vesicle distribution at this synapse.

      Reviewer #4 (Public Review):

      This manuscript investigates the process of neurotransmitter release from hair cell synapses using electron microscopy of tissue rapidly frozen after optogenetic stimulation. The primary finding is that in the absence of a stimulus very few vesicles appear docked at the membrane, but upon stimulation vesicles rapidly associate with the membrane. In contrast, the number of vesicles associated with the ribbon and within 50 nm of the membrane remains unchanged. Additionally, the authors find no changes in vesicle size that might be predicted if vesicles fuse to one-another prior to fusing with the membrane. The paper claims that these findings argue for rapid replenishment and against a mechanism of multi-vesicular release, but neither argument is that convincing. Nonetheless, the work is of high quality, the results are intriguing, and will be of interest to the field.

      We thank the reviewer for the appreciation of our work and the constructive comments.

      1) The abstract states that their results "argue against synchronized multiquantal release". While I might agree that the lack of larger structures is suggestive that homotypic fusion may not be common, this is far from an argument against any mechanisms of multi-quantal release. At least one definition of synchronized multiquantal release posits that multiple vesicles are fusing at the same time through some coordinated mechanism. Given that they do not report evidence of fusion itself, I fail to see how these results inform us one way or the other.

      We agree with the reviewer that the discussion on MVR and UVR should be extended. It is important to point out that we do not claim that the evoked release is mediated by one single SV. As discussed in the paper (line 672), we consider that our optogenetic stimulation of IHCs triggers the release of more than 10 SVs per AZ. This is in line with previous reports of several SVs fusing upon stimulation. This type of evoked MVR is probably mediated by the opening of Ca2+ channels in close proximity to each SV Ca2+ sensor. We indeed sometimes observed more than one docked SV per AZ upon long optogenetic stimulation, which could reflect that possibility. However, given the absence of large structures directly at the ribbon or the AZ membrane that could suggest the compound fusion of several SVs prior to or during fusion, we argue against compound MVR at IHCs. As mentioned above, we added the following to the discussion (from line 679 onwards).

      We wrote:

      “This might reflect spontaneous univesicular release (UVR) via a dynamic fusion pore (i.e. ‘kiss and run’; Ceccarelli et al., 1979), which was suggested previously for IHC ribbon synapses (Chapochnikov et al., 2014; Grabner and Moser, 2018; Huang and Moser, 2018; Takago et al., 2019) and/or rapid undocking of vesicles (e.g. Dinkelacker et al., 2000; He et al., 2017; Nagy et al., 2004; Smith et al., 1998). In the UVR framework, stimulation by ensuing Ca2+ influx triggers the statistically independent release of several SVs. Coordinated multivesicular release (MVR) has been indicated to occur at hair cell synapses (Glowatzki and Fuchs, 2002; Goutman and Glowatzki, 2007; Li et al., 2009) and retinal ribbon synapses (Hays et al., 2020; Mehta et al., 2013; Singer et al., 2004) during both spontaneous and evoked release. We could not observe structures which might hint towards compound or cumulative fusion, neither at the ribbon nor at the AZ membrane under our experimental conditions. Upon short and long stimulation, RA-SVs as well as docked SVs even showed a slightly reduced size compared to controls. However, since some AZs harbored more than one docked SV per AZ in stimulated conditions, we cannot fully exclude the possibility of coordinated release of few SVs upon depolarization.”

      2) The complete lack of docked vesicles in the absence of a stimulus followed by their appearance with a stimulus is a fascinating result. However, since there are no docked vesicles prior to a stimulus, it is really unclear what these docked vesicles represent - clearly not the RRP. Are these vesicles that are fusing or recently fused or are they ones preparing to fuse? It is fine that it is unknown, but it complicates their interpretation that the vesicles are "rapidly replenished". How does one replenish a pool of docked vesicles that didn't exist prior to the stimulus?

      In response to the reviewer’s comment, we would like to note that we indeed reported very few docked SVs in wild type IHCs at resting conditions without K+ channel blockers in Chakrabarti et al., 2018, EMBO Rep and in Kroll et al., 2020, JCS. In both studies, a solution without TEA and Cs was used for the experiments (resting solution Chakrabarti: 5 mM KCl, 136.5 mM NaCl, 1 mM MgCl2, 1.3 mM CaCl2, 10 mM HEPES, pH 7.2, 290 mOsmol; control solution Kroll: 5.36 mM KCl, 139.7 mM NaCl, 2 mM CaCl2, 1 mM MgCl2, 0.5 mM MgSO4, 10 mM HEPES, 3.4 mM L-glutamine, and 6.9 mM D-glucose, pH 7.4). Similarly, our current study shows very few docked SVs in the resting condition even in the presence of TEA and Cs. Based on the results presented in ‘Response to reviewers Figure 1’, we assume that the scarcity of docked SVs under control conditions is not due to depolarization induced by a solution containing 20 mM TEA and 1 mM Cs but is rather representative of the physiological resting state of IHC ribbon synapses. After 15 min of high-potassium depolarization, the number of docked SVs increased only slightly, as shown in Chakrabarti et al., 2018 and Kroll et al., 2020, and the increase was not statistically significant. In the current study, we report a similar phenomenon, but here depolarization resulted in a more robust increase in the number of docked SVs.

      To compare the data from the previous studies with the current study, we have now included an additional Table 3 (line 676) in the discussion with all total counts (and averages per AZ) of docked SVs.

    1. In our way of delivering orders we emphasise explaining the context two levels up. I may tell my soldiers to raid a compound, but I would also tell them that the reason for this is to create a distraction so that the Colonel can divert the enemy away from a bridge, and that the reason the Brigadier wants the Colonel to divert the enemy is so that the bridge is easier to cross. Not only do the soldiers then know why it’s important to raid the compound (so that others can cross the bridge), but they know that if for some reason they can’t raid the compound, creating any other diversion or distraction will do in a pinch, and if they can’t do that they can still try to do something to make it easier to cross the bridge. It lets everyone adapt to change as it happens without additional instruction if they aren’t able to get in touch with me. Again I think tech could possibly learn from that.

      def

    1. Author Response:

      eLife assessment

      This paper reports a useful set of results that uses a reduced network model based on a previously published large-scale network model to explain the generation of theta-gamma rhythms in the hippocampus. Combining the detailed and reduced models and comparing their results is a powerful approach. However, the evidence for the main claim that CCK+ basket cells play a key role in theta-gamma coupling in the hippocampus is currently incomplete.

      We thank the reviewers for their thorough and thoughtful notes, and we are pleased that there is acknowledgement of the combination of models as a powerful approach.  We agree with many of the comments made and we intend to address them in subsequent revisions. 

      In particular, we think that our ‘narrative’ as presented was perhaps not as clear as it could have been, based on the somewhat different comments from the reviewers (R#1 and #3).  That is, we created a reduced population rate model based on the theta/gamma generation hypotheses from the detailed model and then explored the PRM in more detail to predict cellular contributions.  The goal was not to validate the original detailed model per se (R#1) nor to do a fitting of parameters in the PRM directly from the detailed model (R#3).  Rather, it was to obtain a set of parameter values in PRM that would be in accordance with the hypotheses of the detailed model that could be fully explored to derive cellular-based predictions that could help design experiments to understand theta/gamma rhythms.
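      To make concrete what a population rate model of this general kind computes, here is a minimal sketch of a generic four-population rate network integrated with forward Euler. It is not the authors' PRM: the sigmoid transfer function, the weight matrix, the external drives and the time constants are all placeholder assumptions chosen only so that the code runs, and the fourth population is left unnamed because this text does not name it.

```python
import numpy as np

# Population labels; "fourth" is a placeholder for the cell type not named here.
POPULATIONS = ["PYR", "PV+BC", "CCK+BC", "fourth"]


def simulate_rates(weights, drive, tau, dt=1e-4, steps=5000):
    """Integrate tau * dr/dt = -r + sigmoid(W r + drive) with forward Euler.

    weights[i, j] is the (hypothetical) coupling from population j to i.
    Returns an array of shape (steps, n_populations) with rates in [0, 1].
    """
    weights = np.asarray(weights, dtype=float)
    drive = np.asarray(drive, dtype=float)
    tau = np.asarray(tau, dtype=float)
    rates = np.zeros(len(drive))
    history = np.empty((steps, len(drive)))
    for k in range(steps):
        total_input = weights @ rates + drive
        target = 1.0 / (1.0 + np.exp(-total_input))  # sigmoid transfer function
        rates = rates + dt * (target - rates) / tau  # Euler step
        history[k] = rates
    return history


# Placeholder parameters (assumptions, not values from the manuscript):
# column 0 (PYR) is excitatory, the other columns are inhibitory.
W = np.array([
    [ 2.0, -3.0, -1.0, -1.0],
    [ 3.0, -1.0, -4.0,  0.0],
    [ 0.0, -2.0, -1.0,  0.0],
    [ 2.0,  0.0,  0.0,  0.0],
])
external_drive = np.array([0.5, 0.2, 1.0, 0.0])
time_constants = np.array([0.010, 0.005, 0.005, 0.010])  # seconds

rates = simulate_rates(W, external_drive, time_constants)
print(dict(zip(POPULATIONS, np.round(rates[-1], 3))))
```

      Exploring such a model then amounts to rerunning `simulate_rates` while varying individual entries of the weight matrix, which is cheap compared with rerunning a detailed network.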

      Responses specific to the Reviewers are given below.

      Reviewer #1 (Public Review):

      This paper investigates potential mechanisms underlying the generation of hippocampal theta and gamma rhythms using a combination of several modeling approaches. The authors perform new simulation experiments on the existing large-scale biophysical network model previously published by Bezaire et al. Guided by their analysis of this detailed model, they also develop a strongly reduced, rate-based network model, which allows them to run a much larger number of simulations and systematically explore the effects of varying several key parameters. The combined results from these two in silico approaches allow them to predict which cell types and connections in the hippocampus might be involved in the generation and coupling of theta and gamma oscillations.

      In my view, several aspects of the general methodology are exemplary. In the current work as well as several earlier papers, the authors are re-using a large-scale network model that was originally developed in a different laboratory (Bezaire et al., 2016) and that still represents the state-of-the-art in detailed hippocampal modeling. Such model reuse is quite rare in computational neuroscience, which is rather unfortunate given the amount of time and effort required to build and share such a complex model. Very often, and also, in this case, the original publication that describes a detailed model provides only limited validation and analysis of model behavior, and the re-use of the same model in later studies represents a great opportunity to further examine and validate the model.

      Combining detailed and simplified models can also be a powerful approach, especially when the correspondence between the two is carefully established. Matching results from the two models, in this case, allow strong arguments about key mechanisms of biological phenomena, where the simplified model allows the identification and characterization of necessary and sufficient components, while the detailed model can firmly anchor the models and their predictions to experimental data.

      On the other hand, I have several major concerns about the implementation of these approaches and the interpretation of the results in the current study. First of all, the detailed model of Bezaire et al. is considered strictly equivalent, in all of its relevant details, to biological reality, and no attempt is made to verify or even discuss the validity of this assumption, even when particular details of the model are apparently critical for the results presented. I see this as a fundamental limitation of the current work - the fact that the Bezaire et al. model is the best one we have at the moment does not automatically make it correct in all its details, and features of the model that are essential for the new results certainly deserve careful scrutiny (preferably via detailed comparison with experimental data).

      An important case in point is the strength of the interactions between specific neuronal populations. This is represented by different quantities in the detailed and simplified model, but the starting point is always the synaptic weight (conductance) values given by Bezaire et al. (2016), also listed in Tables 2 and 3 of the current manuscript. Looking at these parameters, one can identify a handful of connections whose conductance values are much higher than those of the other connections, and also more than an order of magnitude higher (50-100 nS) than commonly estimated values for cortical synapses (normally less than about 5 nS, except for a few very special types of synapse such as the hippocampal mossy fibers). Not surprisingly, several of these connections (such as the pyramidal cell to pyramidal cell connections, and the CCK+BC to PV+BC connections) were found to be critical for the generation and control of theta and gamma oscillations in the model. Given their importance for the conclusions of the paper, it would be essential to double-check the validity of these parameter values. In this context, it is worth noting that, unlike the anatomical parameters (cell numbers and connectivity) that had been carefully calculated and discussed in Bezaire and Soltesz (2013), biophysical parameters (the densities of neuronal membrane conductances and synaptic conductances) in Bezaire et al. (2016) were obtained by relatively simple (partly manual) fitting procedures whose reliability and robustness are mostly unknown. Specifically for synaptic parameters in CA1, a more systematic review and calculation were recently carried out by Ecker et al. (2020); their estimates for the synaptic conductances in question are typically much lower than those of Bezaire et al. (2016) and appear to be more in line with widely accepted values for cortical (hippocampal) synapses.

      Furthermore, some key details concerning the construction of the simplified rate model are unclear in the current manuscript. The process of selecting cell types and connections for inclusion in the rate model is described, and the criteria are mostly clear, although the results are likely to be heavily affected by the problems discussed above, and I do not understand why the strength of external input was included among the selection criteria for cell types (especially if the model is meant to capture the internal dynamics of the isolated CA1 region). However, the main issue is that it remains unclear how the parameters of the rate model (the 24 parameters in Table 4) were obtained. The authors simply state that they "found a set of parameters that give rise to theta-gamma rhythms," and no further explanation is provided. Ideally, the parameters of the rate model should be derived systematically from the detailed biophysical model so that the two models are linked as strongly as possible; but even if this was not the case, the methods used to set these parameters should be described in detail.

      An important inaccuracy in the presentation of the results concerns the suggested coupling of theta and gamma oscillations in the models. Although the authors show that theta and gamma oscillations can be simultaneously present in the network under certain conditions, actual coupling of the two rhythms (e.g., in the form of phase-amplitude coupling) is not systematically characterized, and it is therefore not clear under what conditions real coupling is present in the two models (although a probable example can be seen in Figure 1C(ii)).

      The Discussion of the paper states that gamma oscillations in the model(s) are generated via a pure interneuronal (ING) mechanism. This is an interesting claim; however, I could not find any findings in the Results section that directly support this conclusion.

      Finally, although the authors write that they can "envisage designing experiments to directly test predictions" from their modeling work, no such experimental predictions are explicitly identified in the current manuscript.

      As noted above, our goal was not to validate the original detailed model but to carry out further analysis of the Bezaire model in its re-use, since, as noted by this Reviewer, the original publication was limited in validation and analysis. Further validation/extensions of Bezaire et al can be carried out given their acknowledged limitations (some as mentioned by the Reviewer). However, as noted, more detailed models of CA1 microcircuitry now exist (Ecker et al 2020), and it would be interesting to examine whether and how these more detailed models might express theta/gamma rhythms. In essence, we completely agree that not all the details of the Bezaire et al model are automatically correct. We were using it as a biological proxy, albeit an imperfect one. However, it is able to produce theta/gamma rhythms using parameter values that are experimentally derived in many ways (Bezaire & Soltesz 2013), and with minimal tuning, and thus our assumption is that it captures a potential ‘biological balance’ to generate these rhythms. Hence, we carried out additional simulations and explorations to derive generation hypotheses that are “applied” in the development of the reduced population rate model (PRM). The “ING” aspect is due to CCK+BCs and PV+BCs firing coherent gamma rhythms that are imposed onto the PYR cell population as mentioned in the Results. Without PYR input, they still fire coherent gamma rhythms. Experiments in which theta/gamma rhythms are characterized (CFC, frequencies) with and without the presence of CCK+BCs would allow the main prediction of the modeling work to be explored, i.e., whether CCK+BCs are essential for the existence of these coupled rhythms. We know from Dudok et al that there are alternating sources of perisomatic inhibition, but how they might control theta/gamma rhythms has not been explored to the best of our knowledge.

      We will more fully describe our process for PRM parameters in subsequent revisions as well as formally apply CFC metrics.
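      As one illustration of what a formal CFC metric could look like, the sketch below computes a normalized mean-vector-length index of phase-amplitude coupling on synthetic signals. This is an assumption about one possible metric, not the analysis the authors will use: the band limits (4-12 Hz for theta, 30-80 Hz for gamma), the FFT-based band selection and the function names are all illustrative choices.

```python
import numpy as np


def analytic_band(x, fs, lo, hi):
    """Band-limit x to [lo, hi] Hz and return its analytic signal via the FFT.

    Only positive frequencies inside the band are kept and doubled, which
    yields the analytic signal of the band-passed trace.
    """
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), d=1.0 / fs)
    keep = (freqs >= lo) & (freqs <= hi)
    return np.fft.ifft(2.0 * spectrum * keep)


def mean_vector_length(x, fs, phase_band=(4.0, 12.0), amp_band=(30.0, 80.0)):
    """Normalized mean vector length: 0 means no coupling, larger means stronger."""
    phase = np.angle(analytic_band(x, fs, *phase_band))
    amplitude = np.abs(analytic_band(x, fs, *amp_band))
    return np.abs(np.mean(amplitude * np.exp(1j * phase))) / np.mean(amplitude)


# Synthetic test signals: 10 s at 1 kHz, 8 Hz "theta" and 40 Hz "gamma".
fs = 1000
t = np.arange(10 * fs) / fs
theta = np.sin(2 * np.pi * 8 * t)
gamma = np.sin(2 * np.pi * 40 * t)

coupled = theta + 0.5 * (1.0 + theta) * gamma  # gamma amplitude follows theta
uncoupled = theta + 0.5 * gamma                # both rhythms present, no coupling

mi_coupled = mean_vector_length(coupled, fs)
mi_uncoupled = mean_vector_length(uncoupled, fs)
print(f"coupled: {mi_coupled:.3f}, uncoupled: {mi_uncoupled:.3f}")
```

      The two test signals contain the same rhythms, so the index separates mere co-occurrence of theta and gamma from actual coupling, which is the distinction Reviewer #1 asks for.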

      Reviewer #2 (Public Review):

      The goal of this study is to find a minimal model that produces both theta and gamma rhythms in the hippocampus CA1, based on the full-scale model (FSM) of Bezaire et al, 2016. The FSM here is treated as equivalent to biological data. This seems to be a second part of a study that the same authors published in 2021, and is extensively cited here. The study reduces the FSM to a neural rate model with 4 neurons, which is capable of producing both rhythms. This model is then simulated and its parameter dependencies are explored.

      The authors succeed in producing a rate model, based on 4 neuron types, that captures the essence of the two rhythms. This model is then analyzed at a descriptive level to claim that the synapse from one interneuron type (CCK) to another (PV+) is more effective than its reciprocal counterpart (PV+ to CCK synapse) to control theta rhythm frequency.

      The results fall short on several fronts:

      The conclusions rely exclusively on the assumption that the FSM is in fact able to faithfully reflect the biological circuits involved, not just in its output, but in response to a variety of perturbations. Although the authors mention and discuss this assumption, in the end, the reader is left with a (reduced) model of a (complex) model, but no real analysis based on this reduction. In fact, the reduced model is treated in a manner that could have been done with the full one. Thus the significance of the work is greatly reduced not by what the authors do, but by what they fail to do, which is to properly analyze their own reduced model. Consequently, the impact of this study on the field is minimal.

      Related to the first point, throughout the manuscript, multiple descriptive findings, based on the authors' observations of the model output, are presented as causal relationships. Even the main finding of the study (that one synapse has a larger effect on theta than another) is not quantified, but just simply left as a judgment call by the authors and reader of comparing slopes on graphs.

      We agree with this Reviewer that analysis of the PRM is needed, and it is currently underway. It will hopefully help us understand what ‘balances’ are essential for theta/gamma rhythm expression. However, the overall goal of this work was not to “find” a minimal model per se, but rather to determine how theta/gamma rhythms in the hippocampus are generated (hence building on previous works). To that end, it was important to use the detailed model (a biological proxy, albeit imperfect; see response to Reviewer #1) to obtain hypotheses on which the PRM is based. We do not envisage the minimal model as a ‘replacement’ for the detailed model in general; rather, we aim to show that using a combination approach (detailed model and/or experimental observations with ‘derived’ reduced models) allows us to gain insight into cellular contributions to rhythm generation. Quantification of observations will be added in subsequent revisions.

      Reviewer #3 (Public Review):

      While full-scale and minimal models are available for the CA1 hippocampus, both exhibiting theta and gamma rhythms, it is not fully clear how inhibitory cells contribute to rhythm generation in the hippocampus. This paper aims to address this question by proposing a middle ground: a reduced model of the full-scale model. The reduced model is derived by selecting neural types that ablations show to be essential for theta and gamma rhythms. A study of the reduced model proposes particular inhibitory cell types (CCK+BC cells) that play a key role in inhibitory control mechanisms of theta rhythms and theta-gamma coupling rhythms.

      Strengths:<br /> The paper identifies neural types contributing to theta-gamma rhythms, models them, and provides analysis that derives control diagrams and identifies CCK+BC cells as key inhibitory cells in rhythm generation. The paper is clearly written and approaches are well described. Simulation data is well depicted to support the methodology.

      Weaknesses:<br /> The derivation methodology of the reduced model is hypothesis-based, i.e. it is based on the selection of cell types and on showing through ablation simulations that these need to be included. Then the reduced model is fitted. While this approach has merit, it could "miss" cell types or fail to capture the particular balance between all types. In particular, it is not known what "error" is introduced by considering the reduced model. As a result, the control plots (Fig. 5 and 6) might be deformed or very different. An additional weakness is that while the study predicts control diagrams and identifies CCK+BC cell types as key controllers, experimental data to validate these predictions are not provided. This weakness is admissible, in my opinion, since these recordings are not easy to obtain and the paper focuses on computational investigation rather than computationally guided experiments.

      This Reviewer has provided a succinct description of our work which we will leverage in subsequent revisions as we more fully describe our process – thank you. We agree with the Reviewer that we could ‘miss’ cell types and not capture particular balances etc., as we based our PRM on hypotheses from the detailed model. Our PRM and its reference parameter values are ‘designed’ based on hypotheses from our set of explorations of the detailed model, and we were able to determine particular predictions that can be experimentally explored. Subsequent theoretical analyses will help us understand the required ‘balances’ but as noted above (see response to Reviewer #2), we are not aiming for a minimal model (in general), but rather to use such a combined approach (detailed model and/or experimental observations with ‘derived’ reduced models) to come up with (cellular-based) predictions underlying theta/gamma generation. As noted by this Reviewer, specific inhibitory cell recordings are not easy to obtain and we hope our work would help with computationally guided experiments – i.e., even though the reduced model may ‘miss’ other aspects, it would hopefully capture some aspects that are biologically salient for consideration in experimental design and future detailed model explorations.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper describes the results of a MEG study where participants listened to classical MIDI music. The authors then use lagged linear regression (with 5-fold cross-validation) to predict the response of the MEG signal using (1) note onsets (2) several additional acoustic features (3) a measure of note surprise computed from one of several models. The authors find that the surprise regressors predict additional variance above and beyond that already predicted by the other note onset and acoustic features (the "baseline" model), which serves as a replication of a recent study by Di Liberto.
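      The model comparison summarised above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the helper names `lagged_design` and `cv_correlation` are hypothetical, and a small ridge penalty is added for numerical stability. It is not the authors' pipeline; it only shows how a baseline lagged regression and a baseline-plus-surprise regression can be scored by held-out correlation over five folds.

```python
import numpy as np

def lagged_design(features, n_lags):
    """Stack time-lagged copies of each feature column (lags 0..n_lags-1)."""
    n_samples, n_features = features.shape
    X = np.zeros((n_samples, n_features * n_lags))
    for lag in range(n_lags):
        cols = slice(lag * n_features, (lag + 1) * n_features)
        X[lag:, cols] = features[:n_samples - lag]
    return X

def cv_correlation(X, y, n_folds=5, ridge=1.0):
    """Mean held-out Pearson r of a ridge regression over contiguous folds."""
    folds = np.array_split(np.arange(len(y)), n_folds)
    scores = []
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        Xtr, ytr = X[train], y[train]
        # Closed-form ridge solution (the penalty value is an assumption).
        w = np.linalg.solve(Xtr.T @ Xtr + ridge * np.eye(X.shape[1]), Xtr.T @ ytr)
        scores.append(np.corrcoef(X[test_idx] @ w, y[test_idx])[0, 1])
    return float(np.mean(scores))

# Synthetic stand-in for one sensor: responses to note onsets, scaled by surprise.
rng = np.random.default_rng(0)
n = 2000
onsets = (rng.random(n) < 0.1).astype(float)
surprise = onsets * rng.gamma(2.0, 1.0, n)
kernel = np.hanning(10)  # hypothetical response kernel
signal = np.convolve(onsets, kernel)[:n] + 0.5 * np.convolve(surprise, kernel)[:n]
signal += rng.normal(0.0, 1.0, n)

baseline = lagged_design(onsets[:, None], n_lags=10)
full = lagged_design(np.column_stack([onsets, surprise]), n_lags=10)

r_base = cv_correlation(baseline, signal)
r_full = cv_correlation(full, signal)
print(f"baseline r = {r_base:.3f}, baseline + surprise r = {r_full:.3f}")
```

      In this toy setting the surprise regressor carries variance that the onset regressor cannot explain, so the held-out correlation rises when it is added. With real recordings, the size of that gain is exactly what the reviewer's first concern is about.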

      They compute note surprisal using four models (1) a hand-crafted Bayesian model designed to reflect some of the dominant statistical properties of Western music (Temperley) (2) an n-gram model trained on one musical piece (IDyOM stm) (3) an n-gram model trained on a much larger corpus (IDyOM ltm) (4) a transformer DNN trained on a mix of polyphonic and monophonic music (MT). For each model, they train the model using varying amounts of context.

      They find that the transformer model (MT) and long-term n-gram model (IDyOM ltm) give the best neural prediction accuracy, both of which give ~3% improvement in predicted correlation values relative to their baseline model. In addition, they find that for all models, the prediction scores are maximal for contexts of ~2-7 notes. These neural results do not appear to reflect the overall accuracy of the models tested since the short-term n-gram model outperforms the long-term n-gram model and the music transformer's accuracy improves substantially with additional context beyond 7 notes. The authors replicate all these findings in a separate EEG experiment from the Di Liberto paper.

      Overall, this is a clean, nicely-conducted study. However, the conclusions do not follow from the results for two main reasons:

      1) Different features of natural stimuli are almost always correlated with each other to some extent, and as a consequence, a feature (e.g., surprise) can predict the neural response even if it doesn't drive that response. The standard approach to dealing with this problem, taken here, is to test if a feature improves the prediction accuracy of a model above and beyond that of a baseline model (using cross-validation to avoid over-fitting). If the feature improves prediction accuracy, then one can conclude that the feature contributes additional, unique variance. However, there are two key problems: (1) the space of possible features to control for is vast, and there will almost always be uncontrolled-for features (2) the relationship between the relevant control features and the neural response could be nonlinear. As a consequence, if some new feature (here surprise) contributes a little bit of additional variance, this could easily reflect additional un-controlled features or some nonlinear relationship that was not captured by the linear model. This problem becomes more acute the smaller the effect size since even a small inaccuracy in the control model could explain the resulting finding. This problem is not specific to this study but is a problem nonetheless.

      We understand the reviewer’s point and agree that it indeed applies not exclusively to the present study, but likely to many studies in this field and beyond. We disagree, however, that it constitutes a problem per se. We maintain that the approach of adding a feature, observing that it increases cross-validated prediction performance, and concluding that therefore the feature is relevant, is a valid one. Indeed, it is possible and even likely that not all relevant features (or non-linear transformations thereof) will be present in the control/baseline model. If a to-be-tested feature increases predictive performance and therefore explains relevant variance, then that means that part of what drives the neural response is non-trivially related to the to-be-tested feature. The true underlying relationship may not be linear, and later work may uncover more complex relationships that subsume the earlier discovery, but the original conclusion remains justified.

      Importantly, we wish to emphasize that the key conclusions of our study primarily rest upon comparisons between regression models that are by design equally complex, such as surprise-according-to-MT versus surprise-according-to-IDyOM and comparisons across different context lengths. We maintain that the comparison with the Baseline model is also important, but even taking the reviewer’s worry here into account, the comparison between different equally-complex regression models should not suffer from it to the same extent as a model-versus-baseline comparison.

      2) The authors make a distinction between "Gestalt-like principles" and "statistical learning" but they never define what is meant by this distinction. The Temperley model encodes a variety of important statistics of Western music, including statistics such as keys that are unlikely to reflect generic Gestalt principles. The Temperley model builds in some additional structure such as the notion of a key, which the n-gram and transformer models must learn from scratch. In general, the models being compared differ in so many ways that it is hard to conclude much about what is driving the observed differences in prediction accuracy, particularly given the small effect sizes. The context manipulation is more controlled, and the fact that neural prediction accuracy dissociates from the model performance is potentially interesting. However, I am not confident that the authors have a good neural index of surprise for the reasons described above, and this limits the conclusions that can be drawn from this manipulation.

      First of all, we would like to apologize for any unclarity regarding the distinction between Gestalt-like and statistical models. We take Gestalt-like models to be those that explain music perception as following a restricted set of rules, such as that adjacent notes tend to be close in pitch. In contrast, as the reviewer correctly points out, statistical learning models have no such a priori principles and must learn similar or other principles from scratch. Importantly, the distinction between these two classes of models is not one we make for the first time in the context of music perception. Gestalt-like models have a long tradition in musicology and the study of music cognition dating back to (Meyer, 1957). The Implication-Realization model developed by Eugene Narmour (Narmour, 1990, 1992; Schellenberg, 1997) is another example for a rule-based theory of music listening, which has influenced the model by David Temperley, which we applied as the most recently influential Gestalt-model of melodic expectations in the present study. Concurrently to the development of Gestalt-like models, a second strand of research framed music listening in light of information theory and statistical learning (Bharucha, 1987; Cohen, 1962; Conklin & Witten, 1995; Pearce & Wiggins, 2012). Previous work has made the same distinction and compared models of music along the same axis (Krumhansl, 2015; Morgan et al., 2019a; Temperley, 2014). We have updated the manuscript to elaborate on this distinction and highlight that it is not uncommon.

      Second, we emphasize that we compare the models directly in terms of their predictive performance both of upcoming musical notes and of neural responses. This predictive performance is not dependent on the internal details of any particular model; e.g. in principle it would be possible to include a “human expert” model where we ask professional composers to predict upcoming notes given a previous context. Because of this independence of the relevant comparison metric on model details, we believe comparing the models is justified. Again, this is in line with previously published work in music (Morgan et al., 2019a), language, (Heilbron et al., 2022; Schmitt et al., 2021; Wilcox et al., 2020), and other domains (Planton et al., 2021). Such work compares different models in how well they align with human statistical expectations by assessing how well different models explain predictability/surprise effects in behavioral and/or brain responses.

      Third, regarding the doubts on the neural index of surprise used: we respond to this concern below, after reviewer 1’s first point to which the present comment refers (the referred-to comment was not included in the “essential revisions” here).

      Reviewer #2 (Public Review):

      This manuscript focuses on the basis of musical expectations/predictions, both in terms of the basis of the rules by which these are generated, and the neural signatures of surprise elicited by violation of these predictions.

      Expectation generation models directly compared were gestalt-like, n-gram, and a recently developed Music Transformer model. Both shorter and longer temporal windows of sampling were also compared, with striking differences in performance between models.

      Surprise (defined as per convention as negative log prior probability of the current note) responses were assessed in the form of evoked response time series, recorded separately with both MEG and EEG (the latter in a previously recorded freely available dataset). M/EEG data correlated best with surprise derived from musical models that emphasised long-term learned experiences over short-term statistical regularities for rule learning. Conversely, the best performance was obtained when models were applied to only the most recent few notes, rather than longer stimulus histories.

      Uncertainty was also computed as an independent variable, defined as entropy, and equivalent to the expected surprise of the upcoming note (sum of the probability of each value times surprise associated with that note value). Uncertainty did not improve predictive performance on M/EEG data, so was judged not to have distinct neural correlates in this study.
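      The two definitions above (surprise as the negative log probability of the observed note, uncertainty as the expected surprise) can be made concrete with a short sketch. The predictive distribution below is hypothetical, and base-2 logarithms (bits) are an assumption, since the review does not state the base.

```python
import math

def surprise(probs, note):
    """Surprise of the observed note: negative log2 probability (bits)."""
    return -math.log2(probs[note])

def entropy(probs):
    """Uncertainty: expected surprise over all candidate notes (bits)."""
    return sum(p * -math.log2(p) for p in probs.values() if p > 0)

# Hypothetical model prediction for the next MIDI pitch given the context.
next_note_probs = {60: 0.5, 62: 0.25, 64: 0.25}

print(surprise(next_note_probs, 60))  # 1.0
print(surprise(next_note_probs, 64))  # 2.0
print(entropy(next_note_probs))       # 1.5
```

      Surprise depends on which note actually occurred, whereas entropy is fixed before the note arrives, which is why the two can be entered as separate regressors.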

      The paradigm used was listening to naturalistic musical melodies.

      A time-resolved multiple regression analysis was used, incorporating a number of binary and continuous variables to capture note onsets, contextual factors, and outlier events, in addition to the statistical regressors of interest derived from the compared models.

      Regression data were subjected to non-parametric spatiotemporal cluster analysis, with weights from significant clusters projected into scalp space as planar gradiometers and into source space as two equivalent current dipoles per cluster.

      General comments:

      The research questions are sound, with a clear precedent of similar positive findings, but numerous unanswered questions and unexplored avenues.

      I think there are at least two good reasons to study this kind of statistical response with music: firstly that it is relevant to the music itself; secondly, because the statistical rules of music are at least partially separable from lower-level processes such as neural adaptation.

      Whilst some of the underlying theory and implementation of the musical theory are beyond my expertise, the choice, implementation, fitting, and comparison of statistical models of music seem robust and meticulous.

      The MEG and EEG data processing is also in line with accepted best practice and meticulously performed.

      The manuscript is very well-written and free from grammatical or other minor errors.

      The discussion strikes a brilliant balance of clearly laying out the interim conclusions and advances, whilst being open about caveats and limitations.

      Overall, the manuscript presents a range of highly interesting findings which will appeal to a broad audience, based on rigorous experimental work, meticulous analysis, and fair and clear reporting.

      We thank the reviewer for their detailed and positive evaluation of our manuscript.

      Reviewer #3 (Public Review):

      The authors compare the ability of several models of musical predictions in their accuracy and in their ability to explain neural data from MEG and EEG experiments. The results allow both methodological advancements by introducing models that represent advancements over the current state of the art and theoretical advancements to infer the effects of long and short-term exposure on prediction. The results are clear and the interpretation is for the most part well reasoned.

      At the same time, there are important aspects to consider. First, the authors may overstate the advancement of the Music Transformer with the present stimuli, as its increase in performance requires a considerably longer context than the other models. Secondly, the Baseline model, to which the other models are compared, does not contain any pitch information on which these models operate. As such, it's unclear if the advancements of these models come from being based on new information or the operations it performs on this information as claimed. Lastly, the source analysis yields some surprising results that don't fit with previous literature. For example, the authors show that onsets to notes are encoded in Broca's area, whereas it should be expected more likely in the primary auditory cortex. While this issue is not discussed by the authors, it may put the rest of the source analysis into question.

      While these issues are serious ones, the work still makes important advancements for the field and I commend the authors on a remarkably clear and straightforward text advancing the modeling of predictions in continuous sequences.

      We thank the reviewer for their compliments.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      Rodríguez-Real, Huertas and colleagues here explore the roles of centrosomes in DNA damage responses, focussing on DNA repair activities. They show that centrosome depletion by PLK4 inhibition leads to reduced levels of homologous recombination and increased nonhomologous end-joining, along with altered level of nuclear focus formation by DNA repair proteins. Knockdown of genes that encode components of centriolar subdistal appendages (SDAs) cause reduced levels of RPA foci, with CRISPR-generated CEP170 heterozygotes also showing defects in focus formation. Knockdown of CEP170 impairs homologous recombination, although NHEJ activities are unaffected. Some increase in sensitivity to DNA damaging agents is seen in CEP170- or centriole-deficient cells, albeit with a modest effect size. CEP170 status is shown to affect mutational signatures and patient prognosis in different cancer samples.

      While the experiments are generally well-presented and controlled, the effects seen are not large, so that the conclusions that the authors draw are not entirely substantiated by the data presented, even without the suggestion of a mechanism. There are several additional experiments and clarifications that I consider necessary to provide appropriate support for the phenomenon.

      Major points

      1. The lack of cell cycle arrest or phenotype in the U2OS cells after a week's treatment with centrinone is somewhat surprising, given their p53 status. The initial description of centrinone showed a distinct impact on U2OS proliferation, albeit after 2 weeks' treatment (although the present paper shows robust impact on centriole numbers after only 1 week in centrinone). It would be useful to know the percentage of mitotic cells, or if there is any increased cell death observed at this stage of treatment.
      2. In the I-SceI assays, were transduction efficiencies or apoptosis within the experiment impacted by centrinone treatment? If not, it would be useful to state that this was examined and that there were no confounding effects; having only normalised data does not allow the reader to exclude these potential confounding factors.
      3. The authors present binary data for a given type of nuclear focus (positive or negative for RPA/ BRCA1/ RAD51), while the supporting images show altered numbers/ intensities. For example, the BRCA1 signals shown in Fig. 3D are less readily distinguished than they are in Fig. 1D. These data should be reconsidered: it is possible that these observations reflect different kinetics of focus formation, rather than a change in IRIF formation capacity. Numbers and a timecourse should be provided, with details of how these are quantitated provided in the Methods.
      4. Are the BRCA1 and RAD51 results seen with centrinone treatment of U2OS cells recapitulated in the Saos-2 and RPE1 lines?
      5. Some additional analysis is needed of the extent to which cells are sensitised to DNA damaging treatments by CEP170 deficiency or centrinone treatment. It should be confirmed that these experiments were performed in biological triplicate, rather than a technical triplicate (within a single experiment); if this is not the case, these experiments should be done in triplicate. Analysing p53-deficient hTERT-RPE1 clones, Kumar et al. (NAR Cancer 2020 PMID: 33385162) showed <10% survival with 100 ng/ml NCS. Hustedt et al. (Genes Dev 2019 PMID: 31467087) showed just over 50% survival with 10 nM CPT treatment, although their data for IR were comparable to the current study. Given the wide variation that these assays seem to incur, the extent to which a ≈20% difference in clonogenic survival is biologically significant may be limited. A rescue of the CEP170 siRNA, and/ or washout in the centrinone experiment would make these data more convincing. The knockdown of CEP170 in Figure 4 should be correctly labelled (not as CEP170+/-); given that the authors have generated CEP170 heterozygotes in Figure 2, this is potentially confusing.
      6. Direct data for the (centrosomal) phosphorylation of CEP170 are limited; it has not been demonstrated that the S637A mutants are fully functional in terms of the centrosome functions of CEP170, so that the conclusion regarding a requirement for centrosomal CEP170 phosphorylation is not sufficiently supported by the available data. The CEP170-dependent changes in RPA focus positive cell percentages shown in Figure 3 are not very marked. The relevant sections should be revised, or the authors should include additional experiments showing directly a phosphorylation of CEP170.
      7. It is difficult to interpret the mutational spectrum data and their significance. These should be compared with data for mutations in NDEL1 mutant cells, and/or other SDA components.
      8. The Kaplan-Meier curves data are intriguing, but their interpretation is highly speculative, given that there are no data on treatment groups included in this study. It is unclear whether other genes that affect SDAs might also impact survival (in the same, or different cancers), so the presentation of those patient groups where CEP170 status impacted survival seems selective, given the ubiquity of HR and centrosomes. These data would be better included as Supplemental information.
      9. The independence of p53 status/ responsiveness of the system is a crucial aspect of this study. Sir et al. (JCB 2013 PMID: 24297747) showed no DNA repair defect in centrosome-deficient chicken DT40 cells. This paper is very relevant to the current study and should be discussed. Similarly, the work by Lambrus et al (JCB 2015 PMID: 26150389) should also be considered.

      Minor points

      1. References for the RPE1 TP53/ SAS6 mutant cell lines should be provided (or controls for their generation presented).
      2. Fig S1K should correct its x-axis to reflect the time intervals correctly.
      3. Fig 2D should show blow-ups of the centrosomes.
      4. To avoid any potential confusion, it would be helpful to indicate in the Figure proper which cells are used for the various analyses.
      5. The 'basal side' of the centriole is not a standard term- this should be clarified. This may be confusing, given the role of centrioles in the basal body.
      6. The consideration of Seckel syndrome seems somewhat speculative at this stage in the exploration of this phenomenon.

      Referees cross commenting I think the comments from Reviewers #2 and #3 are reasonable and justified; there is good convergence between the comments that we all made and I have no issues to raise in this cross-commentary.

      Significance

      Strengths: Much previous work linking centrosomes and DNA damage responses has addressed cell cycle and checkpoint roles of the centrosome, so that a direct role in (nuclear) DNA repair is intriguing. Limitations: The present study shows a relatively moderate impact of centrosome defects on DNA repair, without a clear mechanism. There are some technical details that should be addressed. The relatively limited sensitization to DNA damaging treatments caused by centrosome deficiency questions the biological significance of the phenomenon.

      Advance: The current study presents some new findings that potentially show DNA repair defects resulting from the loss of centrioles (or SDA proteins). This has not been demonstrated to date.

      Audience: The idea of subdistal appendage components contributing to homologous recombinational repair of DNA damage is of potential interest to several fields, ranging from basic centrosome biology through translational to clinical cancer research.

      Reviewer's expertise: basic/ cell biology.

    1. Author Response

      eLife Assessment:

      This manuscript follows the still unanswered concept of 'original antigenic sin' and shows the existence of a 24-year periodicity of the immune response against influenza H3N2. The valuable work suggests a long-term periodicity of individual antibody response to influenza A (H3N2) within a city. But, to substantiate their argument, the authors would need to provide additional supporting data.

      Thank you for your comments. We have performed additional analyses and included those results in the revision to support our findings.

      Specifically, we included a sensitivity analysis that predicted phases by fitting models with 35- and 6-year periodicities, which were found to provide poorer predictions than the 24-year periodicity used in our main results (Figure 4 – figure supplement 1).
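      As a rough illustration of this kind of sensitivity check, the sketch below fits a sinusoid at each candidate period by least squares and compares the residual error. The data are synthetic, the helper `fit_fixed_period` is hypothetical, and the actual models used in the revision may differ; this only shows the general idea of comparing fixed-period fits.

```python
import numpy as np

def fit_fixed_period(t, y, period):
    """Least-squares fit of y ~ a + b*sin(wt) + c*cos(wt) at a fixed period.

    Returns the fitted phase (radians) and the residual sum of squares.
    """
    w = 2 * np.pi * t / period
    A = np.column_stack([np.ones_like(t), np.sin(w), np.cos(w)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    # b*sin + c*cos = R*sin(wt + phase) with phase = atan2(c, b)
    phase = float(np.arctan2(coef[2], coef[1]))
    return phase, rss

# Synthetic individual profile: a 24-year cycle plus measurement noise.
rng = np.random.default_rng(2)
t = np.arange(0.0, 48.0, 2.0)  # hypothetical years since the first strain
y = np.sin(2 * np.pi * t / 24 + 0.7) + rng.normal(0.0, 0.3, t.size)

results = {period: fit_fixed_period(t, y, period) for period in (6, 24, 35)}
for period, (phase, rss) in results.items():
    print(f"period {period:>2} y: phase = {phase:+.2f} rad, RSS = {rss:.2f}")
```

      On this toy profile, the fit at the simulated 24-year period leaves the smallest residual error, which mirrors the comparison described in the response.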

      We also generated an antigenic map showing the locations of our tested strains, and compared the paired antigenic distances of A(H3N2) strains (including our tested strains). These results (Figure 1 – figure supplement 3) suggested that the tested strains spanned the circulation of A(H3N2) since its emergence and covered the antigenic space of the virus well.

      Reviewer #1 (Public Review):

      The authors suggest that there is a long-term periodicity in individual antibody responses to influenza A (H3N2). This interesting periodicity does appear to be present. Although the authors assume that the periodicity is driven by pre-existing antibody responses, they could provide more supporting data and discuss other possibilities.

      Thank you for your comments. Please find our point-by-point responses below.

      1) The authors can investigate whether the periodicity reflects an epidemic/invasion record of A(H3N2) within Guangzhou or the surrounding city, e.g., the yearly numbers of flu-infected people can be referred to.

      Thank you for your comments. We aimed to investigate the periodicity in individual level antibody responses, so we made several efforts to minimize the impacts of population level A(H3N2) activity in our analyses. In particular, we removed the average activity at population level (i.e., strain-specific intercepts), to minimize the impact of higher circulation of a certain strain on the periodicity.

      In our simulations, we tested models that incorporated only population level activity without cross-reactions (Figure 3B, I), which did not recover the observed periodicity. In the models that included both population level activity and cross-reactions, we found that less predictable population level activity (i.e., less regular annual epidemics) would increase the variation in individual-level long-term periodicity (Figure 3G-H). We also found that measured periodicities did not vary substantially between baseline and follow-up (~3-4 years later). These results suggest that local epidemics may have only limited impact on the observed periodicity in individuals’ antibody responses, while the cross-reactions between previously encountered and currently circulating strains may be the main drivers.

      To address this comment, we added a paragraph in discussion (lines 336-342):

      “In this study, we did not explore the interactions between individual level antibody responses with population level A(H3N2) activity (e.g., epidemic sizes). We minimized the impacts from population level by performing the Fourier analysis with individual departures from population average and validating the results with data from the Vietnam cohort. Simulation results further suggested that the population level virus activity alone was not able to recover the observed periodicity, though epidemics with less regularity seemed to increase the variability in individual-level periodicity in the presence of broad cross-reactions (Figure 3G-H).”
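      The approach described in this paragraph (Fourier analysis of individual departures from the population average) can be sketched as follows. The sketch rests on several assumptions that are not taken from the study: titers are simulated, strains are evenly spaced two years apart so that a plain FFT applies, and `dominant_period` is a hypothetical helper.

```python
import numpy as np

def dominant_period(residuals, dt):
    """Period (in units of dt) with the largest spectral power, ignoring frequency zero."""
    centered = residuals - residuals.mean()
    power = np.abs(np.fft.rfft(centered)) ** 2
    freqs = np.fft.rfftfreq(len(centered), d=dt)
    k = int(np.argmax(power[1:])) + 1  # skip the zero-frequency bin
    return 1.0 / freqs[k]

rng = np.random.default_rng(1)
years = np.arange(1968, 2016, 2)  # 24 hypothetical isolation years
n_people = 50

# Shared strain effect (population-level signal) plus an individual 24-year
# cycle with a random phase, plus assay noise. All values are simulated.
strain_effect = rng.normal(3.0, 1.0, years.size)
phases = rng.uniform(0.0, 2 * np.pi, n_people)
cycle = np.sin(2 * np.pi * (years[None, :] - 1968) / 24 + phases[:, None])
titers = strain_effect[None, :] + cycle + rng.normal(0.0, 0.3, (n_people, years.size))

# Remove the population average per strain (strain-specific intercepts).
residuals = titers - titers.mean(axis=0, keepdims=True)

periods = np.array([dominant_period(r, dt=2.0) for r in residuals])
median_period = float(np.median(periods))
print(f"median dominant period: {median_period:.1f} years")
```

      Subtracting the per-strain average removes the shared strain effect, so what remains in each row is the individual-specific component that the spectrum is computed on. Real strain panels are unevenly spaced in time, which a plain FFT does not handle, so this is only a simplified picture.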

      2) The authors can consider whether the participants are recently/previously vaccinated and/or infected with flu. The remaining antibodies may reflect a long memory but may show a recent activation.

      Thank you for your comments. We agree with the reviewer that the observed seroconversion to the circulating strains may reflect responses to recent re-exposures. Given the low influenza vaccine coverage in our cohort (1.3%, 10 out of 777) and in China in general (<5% [3, 4]), we believe that our observed periodicity and seroconversion patterns were unlikely to be caused by recent influenza vaccinations.

      We think that the observed seroconversions to circulating strains between our baseline and follow-up were likely due to pervasive exposure to A(H3N2) (or reinfections for those whose exposures developed into infections). Using the same data set, we previously reported that 98% and 74% of participants experienced a 2- and 4-fold rise, respectively, to any of the 21 tested A(H3N2) strains [5].

      As the reviewer and previous studies suggested, the antibody responses could reflect long-term memories that were activated after recent exposures [1, 6]. We generated our hypothesis based on this feature, and aimed to characterize the periodicity that may arise from the interactions between long-term memories and newly generated antibodies.

      We incorporate the re-infection mechanism in our simulations, with and without subsequent cross-reactions with previously exposed distant strains (Figure 3I). Results indicate that reinfection alone cannot recover the observed long-term periodicity (Figure 3A), while reinfection plus the resulting cross-reactions can recover such long-term periodicity (Figure 3D). Therefore, we believe that the repeated exposures or re-infections would not affect our reported periodicity, while they may be drivers of continuous formulation of the life-course antibody profiles and the observed periodicity. Of particular note is the consistency of measured periodic behaviour at baseline and follow up (~3-4 years later).

      To address this comment, we reported the vaccination status of our participants when introducing the data (lines 127-129) and in the discussions (lines 280-282 and 313-315):

      “Only 0.6% (n = 5) of participants self-reported influenza vaccinations between the two visits, therefore, the observed changes in HI titers between the two visits were likely due to natural exposures.”

      “Due to the low influenza coverage in our participants and in China in general, the observed seroconversions likely reflected antibody responses after natural exposures during the study period.”

      “Particularly, our simulation results suggested that model including repeated exposures or population level A(H3N2) activity alone did not recover the long-term periodicity (Figure 3).”

      3) The strains inducing high HI titers may have similar mutations and may be reactive to the same antibodies. What are the mutation frequencies among 21 A(H3N2) strains?

      Thank you for your comments. We selected the 21 tested strains to cover the span of A(H3N2) circulation since 1968 and its antigenic diversity. We prioritized strains that were included in vaccine formulations and that were tested to create the antigenic map by Fonville et al. [1].

      We reproduced the antigenic map (up to strains isolated in 2010) by Fonville et al. [1] and compared the antigenic locations of our tested A(H3N2) strains (Figure 1—figure supplement 3). The 21 strains (or the antigenic clusters to which they belong, if the strains were not used for the map) largely tracked the antigenic evolution of A(H3N2) since its emergence in 1968, with a reported rate of change of 0.778 units in antigenic space per year [1, 2].

      We further calculated the pairwise antigenic distances between strains tested in the antigenic map, which were highly correlated with the time intervals between the isolation of the two strains. The figure also suggested that our tested strains cover the time spans and antigenic distances shown in the original antigenic map. In addition, our observed periodicity was identified in individual time series of residuals, from which the shared virus responses and assay measurement effects have been removed (Figure 1). Therefore, we believe that specific mutations are likely to have a limited impact on our findings.

      To address this comment, we included the reproduced antigenic map showing the locations of the tested strains and their pairwise antigenic distances in Figure 1—figure supplement 3 and referenced it in the main text (line 127).
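
      The comparison of pairwise antigenic distance with isolation-time interval described above can be sketched as follows. This is only an illustrative sketch, not the analysis code used for the figure supplement: the function name, the use of Euclidean distance on two-dimensional map coordinates, and the use of a Pearson correlation are all assumptions made for the example.

```python
from itertools import combinations

import numpy as np


def pairwise_distance_vs_interval(coords, years):
    """Compare antigenic distance with isolation-time gap for all strain pairs.

    coords: (n_strains, n_dims) antigenic map coordinates (hypothetical input).
    years:  (n_strains,) isolation years.
    Returns the time gaps, the Euclidean antigenic distances, and their
    Pearson correlation (an assumed choice of correlation measure).
    """
    coords = np.asarray(coords, dtype=float)
    years = np.asarray(years, dtype=float)
    pairs = list(combinations(range(len(years)), 2))
    gaps = np.array([abs(years[i] - years[j]) for i, j in pairs])
    dists = np.array([np.linalg.norm(coords[i] - coords[j]) for i, j in pairs])
    r = np.corrcoef(gaps, dists)[0, 1]
    return gaps, dists, r
```

      With strains that drift along a straight line at a constant rate, the distances are exactly proportional to the time gaps and the correlation is 1; real map coordinates would scatter around such a trend.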

      Reviewer #2 (Public Review):

      This is a well-thought-out, clearly exposed article. It builds upon the platform of 'original antigenic sin' (OAS), a notion first developed from studying individuals infected with influenza. According to OAS, the initial infection will set the dominant immune response targets (antigens) that immune cells will recognize, such that infection with a related strain will cause a strong response focused mainly against the initially infecting strain, that then goes on to protect against the new-infecting strain. This study builds off this idea, showing that as strains become increasingly antigenically distant as inferred by the time between strain appearance, the cross-protection can drop to a point where it needs to be invigorated with a potentially new response. The potential biological mechanisms behind this aren't discussed, but a model is built that conveys the potential for 'relative risk' of an individual over the course of the life, based essentially on when one was born.

      Thank you for your comments. We expanded our introduction to include more of the biological mechanisms, especially those related to original antigenic sin.

      “Antibodies mounted against a specific influenza virus decay (in either absolute magnitude or antigenic relevance) after exposure until re-exposure or infection to an antigenically similar virus occurs, whereupon back-boosting of antibodies acquired from previous infections (e.g., activation of memory B cells) can occur, as well as updating antigen specific antibodies to the newly encountered infection (e.g., activation of naïve B cells.” (lines 80-84)

      “Original antigenic sin (OAS) is a widely accepted concept describing the hierarchical and persistent memory of antibodies from the primary exposure to a pathogen in childhood. Recent studies suggested that non-neutralizing antibodies acquired from previous exposures can be boosted and may blunt the immune responses to new influenza infections.” (lines 92-97)

      The basic premise was to measure from serum influenza haemagglutinin-inhibition (HI) titers of 21 strains of influenza A (H3N2) - related strains causing disease at various times over a period of some 40 years- from a diverse set of ≈800 participants of various ages, at two time points, spaced 2 yr apart. The authors then calculated the HI titer for the 21 strains for each individual. From this, each participant's age, their age at the time of a strain's development, and when a strain emerged were used to assess whether there was periodicity to immune responses by performing a splined Fourier transform for each individual and then examining the composite pattern across time for HI titers. The authors propose that on average there is a 24-year periodicity to immune responses to influenza strains, such that after the initial infection, cross-reactivity reduces to the point where it may be less meaningful for protection over around 24-year, and suggests activation of a 'new' immune response might be required to control the more distant strain involved in the response at that time. The periodicity was longer than would be predicted if age were not a factor involved in the HI titer patterns across time. Further, variability in the periodicity was shown to involve broad cross-reactivity between strains and narrow cross-reactivity in more highly-related (closer in time) strains, individual HI titer, and periodic population fluctuations. In the literature, viral strains are estimated to mutate to the point of losing 50% cross-reactivity with a T1/2 of approximately 2.5 yr, which would make the inferred lifespan plausible but perhaps surprisingly long, implying there are immune feedback parameters that influence periodicity. The authors also use an independent cohort of approximately 150 individuals from a separate, published, study to validate some findings revealed in the primary data set.

      Thank you for your comments and sorry for the confusion. We agree with the reviewer that the onward protection conferred by cross-protection should be shorter than the 24-year periodicity that was identified in the retrospective antibody responses. We hope to clarify that we identified the long-term periodicity by retrospectively investigating individual antibody profiles, which result from multiple previous exposures and from the immunity and cross-reactions that arose from these exposures. Therefore, the long-term periodicity is a retrospective characterization and should not be directly interpreted as the duration of onward protection.

      As shown in Figure 4A, the 24-year periodicity consists of phases when individuals’ titers are higher (phase I & II) and lower (phase III & IV) than the population average. As such, the duration of onward protection may be shorter than the entire period. Assuming that protection decreases with lower titer levels, the onward protection is expected to decrease in phase II and to take 1-6 years to drop from its furthest point to the population average. This is consistent with findings that homotypic cross-protection against PCR-confirmed infections lasts up to about five seasons (lines 291-293), but whether such protection is driven by the decline of cross-reactions still needs further investigation.

      To address this comment, we rephrased our discussion to make the interpretation less confusing (lines 285-287):

      “Of note, the long-term periodicity is a retrospective characterization of individual antibody profiles that arose from multiple exposures and cross-protection, which should not be directly interpreted as the duration of onward protection conferred by the existing antibodies.”

      Strengths: Overall, the study is well executed and the patterns that are visually apparent in Figure 1A (the 'raw' data) are built on to inform a model of the potential breadth of cross-reactivity in a given individual at any given time after birth, integrated with the influenza strains to which they are most likely to have been first exposed. It is a complex thing to make sense of data involving many individuals who could be infected or vaccinated at any and variable points in time over the course of their life, but the authors derive a model that probabilistically accounts for possible infection events, so controls for this nicely, or at least to a degree that is practicable.

      Thank you for your supportive comments. We hope to clarify that we identified the long-term periodicity using the residuals of individual HI titers after extracting the population activity that is visually noticeable in Figure 1A. By doing this, we hope to minimize the impacts of population level A(H3N2) activity and laboratory measurements on individual antibody responses (Figure 1C; detailed methods in lines 396-412).

      Questions related to the main limitation: The level of math in this paper makes it hard for a basic biologist to critique the approach, but the argued points are intriguing. Foremost, in the final part of the paper the authors move from building a model to testing its potential to predict HI titers in the final quarter strains of the study period, placing individuals into one of four phases: I) early increasing to high titer response, II) waning response phase where they are returning back to the average population-level response against a strain, III) sub-par response against a strain and then reinitiation of HI titers in phase IV. Pleasingly this shows a good correlation between individuals' ages and their predicted phase. However, while the fit predicts phase well in Fig 4C and 4D, it looks to perform less adequately in Fig 4B.

      1) Why is this?

      Thank you for your comments and sorry for the confusion. In Figure 4B, we aimed to characterize and predict the position, rather than the amplitude, in the individual time series of residuals. Therefore, we fitted the model using only harmonic terms (i.e., sine and cosine functions; Equation 12 on page 26) [7], although we believe there may be other factors that affect the observations but were not included in the model. The predictions from the model inform the position and velocity of harmonic oscillators rather than the amplitude or extent of the wave; therefore, the predictions did not exactly fit the observations.

      To address this comment, we expanded the corresponding methods to make this clear (lines 661-663):

      “Of note, we fitted the model aiming to estimate the position of the harmonic oscillators and did not consider for other non-harmonic factors, therefore the model may not fully capture the variations of the data.”
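
      To illustrate what fitting with only harmonic terms means, a minimal sketch is given below. It is not the model of Equation 12: it assumes a single sine and cosine pair with a fixed 24-year period and an ordinary least-squares fit, and the function names are hypothetical. Such a fit recovers where an individual sits in the cycle, but it cannot reproduce variation in the residuals that is not harmonic.

```python
import numpy as np


def fit_harmonic(years, residuals, period=24.0):
    """Least-squares fit of residual ~ a*sin(w*t) + b*cos(w*t), period fixed.

    years:     isolation years of the tested strains.
    residuals: individual titer residuals for those strains.
    Returns the coefficients (a, b).
    """
    t = np.asarray(years, dtype=float)
    y = np.asarray(residuals, dtype=float)
    omega = 2.0 * np.pi / period
    design = np.column_stack([np.sin(omega * t), np.cos(omega * t)])
    (a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    return a, b


def phase_angle(a, b, year, period=24.0):
    """Angle of the fitted oscillator at a given year, in [0, 2*pi).

    Uses a*sin(w*t) + b*cos(w*t) = A*sin(w*t + phi) with phi = atan2(b, a).
    """
    omega = 2.0 * np.pi / period
    return (omega * year + np.arctan2(b, a)) % (2.0 * np.pi)
```

      The angle returned by `phase_angle` describes the position in the cycle only; mapping it onto phases I to IV would require the phase boundaries defined in the manuscript, which are not reproduced here.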

      2) Another point for consideration is that the time between samplings (2010-2012) is comparatively short, given a 24-yr predicted periodicity. What would happen to the predictions if the periodicity were 35-yr or 6-yr? Would the model fail to call individuals accurately in these cases?

      Thank you for your comments. We repeated our predictions in Figure 4F-G assuming a 35-year and a 6-year periodicity, respectively, as suggested. Results suggested that model predictions assuming either a 35-year or a 6-year periodicity did not outperform those assuming a 24-year periodicity (Figure 4—figure supplement 1). For instance, the observed proportion of seroconversion to circulating strains in each cohort had correlation coefficients of 0.49 (p-value = 0.05), 0.63 (p-value = 0.02) and -0.12 (p-value = 0.69) with the predicted proportion in phase IV when assuming a 35-, 24- and 6-year periodicity, respectively.

      We also hope to clarify that we investigated the predictive potential of the long-term periodicity from two perspectives. In addition to using the periodicity to predict seroconversions between baseline and follow-up, we also predicted the phase of each individual in 2012 using only HI titers against strains that were isolated before 2002. Our results suggested that our 10-year-ahead predictions correlated well with observations (Figure 4C).

      To address this comment, we also included the results of analyses using alternative 35- and 6-year periodicity as Figure 4—figure supplement 1, and reported in the main text (lines 262-264).

      3) Similarly, if the samples were taken further apart, would the model still be effective at predicting phase?

      Thank you for your comments. We hope to clarify that we collected two cross-sectional serum samples, and that we identified the long-term periodicity and predicted phase using the sera collected at each visit separately. For instance, in our sensitivity analysis using sera collected at follow-up (Figure 1—figure supplement 1), we found a long-term periodicity similar to that identified using the baseline sera (Figure 1), despite pervasive exposures during this time period (the time separating samples varied from 3 to 4 years). In addition, the Vietnam data comprised sera collected over six consecutive years. These data showed a similar long-term periodicity (Figure 2—figure supplement 5).

      For the phase prediction, we used residuals of HI titers against 14 historical strains that were isolated between 1968 and 2002, and predicted the phase for the strain that was isolated in 2012. This prediction relied purely on the periodic pattern of the time series, without information on strains isolated in the 10 years prior to 2012. Therefore, the prediction was made 10 years ahead and correlated well with observations from the complete time series, further supporting that there may be an intrinsic cycle in individual antibody responses and that this cycle is fairly stationary and predictable.

    1. This unique positioning made us view our work with renewed purpose, greater creativity, and a sense of urgency. The experience also underscored what we have always known—that diversity, equity, and inclusion are core to the realization of the institutional mission on each one of our campuses.

      Even though the pandemic set us back as a world, it also helped us realize how much more needs to be done. It also allowed us to think deeply and develop ways to address other inequities we may face as a nation.

    1. Author Response

      Reviewer #1 (Public Review)

      [...] One potential issue is that the high myelination signal is associated with the compartment in V2 (pale stripes) which was not functionally defined itself but by the absence of specific functional activations. No difference was reported between those stripes that were defined functionally. Other explanations for the differential pattern of a qMRI signals, e.g. ROI distribution for presumed pale stripes is not evenly distributed (more foveal), ROIs with low activations due to some other factor show higher myelin-related signals, cannot be excluded based on the analysis presented.

      Indeed, it would have been advantageous to directly functionally delineate pale stripes in V2. Since we were not able to achieve this by fMRI, we needed an indirect method to infer pale stripe contributions in the analysis. We also added a statement in the discussion section to emphasize this more (p. 9, lines 286–288).

      Furthermore, differences in myelination between thin and thick stripes were not tested, since we did not have a concrete hypothesis on this. Despite the conflicting findings of stronger myelination in dark or pale CO stripes in the literature, no histological study has reported myelination differences between dark CO thin and thick stripes. Therefore, our primary interest and hypothesis lay in comparing the myelination of thin/thick stripes with that of pale stripes using MRI.

      Thank you very much for this comment about potential other sources of differential qMRI parameter patterns. Indeed, based on the original analysis we could not exclude that the absence of functional activation around the foveal representation may have biased our analysis. We therefore added a supporting analysis, in which we excluded the region around the foveal representation from the analysis. The excluded cortical region was kept consistent between participants by excluding the same eccentricity range in all maps. We added more details in the results section of the revised manuscript (p. 8, lines 189–202). In Figure 5-Supplement 1 and Figure 5-Supplement 3, results from this supporting analysis are shown which reproduced the primary findings from the main analysis, particularly the relatively higher myelination of pale stripes.

      ROI definitions solely based on fMRI activation amplitude have additional limitations. However, we find it unlikely that a small fMRI effect size and low contrast-to-noise ratio (i.e. a stochastic cause of low statistical parameter values/“activation”) have impacted the results, since Figure 3 shows that we could achieve a high degree of reproducibility for each participant.

      We would note that the fact that we found consistent differences across MPM and MP2RAGE sessions makes some potential artifacts driving the differences unlikely. We also find it unlikely that systematic cerebral blood volume differences between stripes would have driven the results. A higher local blood volume would lead to increased BOLD responses but also to a higher R1 value due to the deoxy-hemoglobin induced relaxation, which is opposite to the observation of higher activity in the thick/thin stripes but lower R1 values.

      Further studies using other functional metrics (e.g. VASO, ASL etc.) may help us to even more clearly demonstrate specificity but were out of the scope of this already rather extensive study. Although we have added extensive further analyses in the revised manuscript such as controlling for foveal effects or registration performance, we did not see a possibility to fully exclude a systematic bias that might potentially be caused by unknown factors.

      Another theoretical and practical issue is the question of "ground truth" for the non-invasive qMRI measures, as the authors - as their starting point - roundly dismiss direct histological tissue studies as conflicting, rather than take a critical look at the merit of the conflicting study results and provide a best hypothesis. If so, they need to explain better how they calibrate their non-invasive MR measurements of myelin.

      We agree and have now further elaborated on the limits of specificity of the R1 and R2* signal as cortical myelin marker (p. 2, lines 68–88; p. 6, line 163; p. 8, line 216; p. 9, lines 257–260). However, we still think that it is important for the reader to appreciate the conflicting results in histological studies using staining methods for myelin, which adds to the study’s background.

      We did not intend to give the impression that MRI provides the missing ground-truth to adjudicate histological controversies, but that it provides an alternative and additional view on the open questions. We changed the introduction to better reflect the aspect that the study offers a unique view by providing myelination proxies and functional measures in the same individual, which allows for direct comparison and investigation of structure-function relationships (see p. 2, lines 68–70; p. 3, lines 93–95), which is not accessible to any other approach. Nevertheless, we would like to note that R1 has been well established as a myelin marker under particular conditions (Kirilina et al., 2020; Mancini et al., 2020; Lazari and Lipp, 2021). It has also been widely used for cortical myelin mapping across a variety of populations, systems and field strengths. We added this statement to the introduction (see p. 2, lines 82-85). We note that we excluded volunteers with pathologies or neurological disorders from the study and their mean age was about 28 years. Thus, we had conditions comparable to previous (validation) studies.

      Because of the contradictory findings of histological studies, we could not further finesse the hypothesis beyond our previous a priori hypothesis that we expected differences in the myelin sensitive MRI metrics between the thin/thick versus pale stripes. To improve the contextual understanding, we added a paragraph in the discussion section covering in more depth how the MRI results relate to known histological findings (see pp. 8–9, lines 216–240).

      While this paper makes an important contribution to the question of the association of specific myelination patterns defining the columnar architecture in V2, it is not entirely clear whether the authors can fully resolve it with the data presented.

      Indeed, we agree that non-invasive aggregate measures, such as the R1 metrics, offer limited specificity, which precludes a fully conclusive inference about cortical myelination. We have further emphasized this on several occasions in the text (see p. 2, lines 68–88; p. 6, line 163; p. 8, line 216; p. 9, lines 257–260). Since the correspondence of cortical myelin levels and R1 (and other metrics) is an active area of research, we expect that the understanding, sensitivity and specificity of R1 to cortical myelination will further improve. We note that the use of qMRI is a substantial advance over the weighted MRI typically used, which suffers from a lack of specificity due to instrumental idiosyncrasies and varying measurement conditions.

      Reviewer #2 (Public Review)

      [...] Unfortunately, this particular study seems to fall into an unhappy middle ground in terms of the conclusions that can be drawn: the relaxometry measures lack the specificity to be considered "ground truth", while the authors claim that the literature lacks consensus regarding the structures that are being studied. The authors propose that their results resolve whether or not stripes differ in their patterns of myelination, but R1 lacks the specificity to do this. While myelin is a primary driver of relaxation times in cortex, relaxometry cannot be considered to be specific to myelin. It is possible that the small observed changes in R1 are driven by myelin, but they could also reflect other tissue constituents, particularly given the small observed effect sizes. If the literature was clear on the pattern of myelination across stripes, this study could confirm that R1 measurements are sensitive to and consistent with this pattern. But the authors present the work as resolving the question of how myelination differs between stripes, which over-reaches what is possible with this method. As it stands, the measured differences in R1 between functionally-defined cortical regions are interesting, but require further validation (e.g., using invasive myelin staining).

      We agree that we have inadvertently overstated the specificity of R1 on several occasions in the text. We therefore toned down the statements concerning the correspondence between R1 and myelin throughout the manuscript (e.g. see p. 2, lines 68–88; p. 6, line 163; p. 8, line 216; p. 9, lines 257–260).

      We also removed the phrase that gave the impression that MRI can conclusively resolve the conflicting results found in histological studies. In the Introduction, we changed the corresponding paragraph by emphasizing the alternative view, which can be obtained from MRI by the possibility to investigate structure-function relationships in the living human brain, which would not be possible by invasive myelin staining (see p. 2, lines 68–70; p. 3, lines 93–95).

      We acknowledge that – perhaps aside from electron microscopy – all common markers have shortcomings, which limit their specificity. For example, classic histology is not quantitative and has produced conflicting results. It also faces the very fundamental issue that the composition of myelin, e.g., its lipid composition (González de San Román et al., 2018), varies significantly across the brain and within brain areas. Thus, we regard the different invasive/non-invasive measures as complementary. R1 adds to this arsenal of measures and can be acquired non-invasively. It has been shown to be a reliable myelin marker under certain circumstances. It follows the known myeloarchitecture patterns of the human brain, which was also checked for the data of the present study (see Figure 4 and Appendix 2). It is responsive to traumatic changes (Freund et al., 2019), development (Whitaker et al., 2016; Carey et al., 2018; Natu et al., 2019) and plasticity (Lazari et al., 2022). Since we studied healthy volunteers with no known pathologies who were sampled randomly from the population, we believe that the previous results generally apply and suggest sufficient specificity of the R1 marker. Of course, we cannot fully exclude bias due to unknown factors that have not yet been investigated or discovered by validation studies. However, in this case we expect that the systematic differences between stripe types would remain an important result, most likely pointing to another interesting biological difference between stripes.

      While more research is needed to clarify the precise role of R1 for cortical myelin, we think that the meaningful determination of quantitative MR parameters within one cortical area is still interesting for the neuroscientific community.

      Moreover, the results make clear that R1 differences are not sufficiently strong to provide an independent measure of this structure (e.g., for segmentation of stripe). As such, one would still require fMRI to localise stripes, making it unclear what role R1 measures would play in future studies.

      Indeed, the small effect sizes observed in the present study still require a functional localization with fMRI. We expected small effect sizes using R1 and R2* due to the known small inter-areal or intra-cortical differences of MRI myelin markers. Therefore, this study aimed at a proof-of-concept, investigating whether intra-areal R1 differences at the spatial scale of columnar structures can be detected using non-invasive MRI. Our study shows that these differences can be seen, but currently not at the single-voxel level. We anticipate that with further improvements in sequence development and scanner hardware, high-resolution R1 estimates with sufficient SNR can be acquired, making fMRI redundant (for this kind of investigation). Please see the reply to the next comment concerning the impact of using R1 in future studies.

      The Introduction concludes with the statement that "Whereas recent studies have explored cortical myelination ... using non-quantitative, weighted MR images... we showed for the first time myelination differences using MRI on a quantitative basis". As written, this sentence implies that others have demonstrated that simpler non-quantitative imaging can achieve the same aims as qMRI. Simply showing that a given method is able to achieve an aim would not be sufficient: the authors should demonstrate that this constitutes an important advance.

      Thank you for this comment. It goes to the heart of the concerns raised about specificity and sensitivity of MRI-based myelin metrics. We elaborate here on the main advantage of using qMRI in our current study and why it is more specific than weighted MR imaging. However, we emphasize that a thorough comparison between qMRI and weighted MRI is highly complex and refer to our recent review paper on qMRI for further details (Weiskopf et al., 2021), which are beyond the scope of our paper. The signal in weighted MRI, even when optimized for the tissue of interest, additionally depends on inhomogeneities in both the RF transmit and receive (bias) fields. Other methods, such as using a ratio image (T1w/T2w), can cancel out the receive field bias entirely (in the case of no subject movement between scans) but not the transmit field bias. This hampers the direct analysis and interpretation of signal differences between distant regions of the brain. For high-resolution imaging applications, the use of high magnetic fields such as 7 T is beneficial or even mandatory due to signal-to-noise ratio (SNR) penalties. With increasing field strength, these inhomogeneities also affect small regions such as V2. For these cases, qMRI is advantageous since it provides metrics which are free from these technical biases, significantly improving the specificity. As high-field MRI has the potential to non-invasively study the structure and function of the human brain at the spatial scale of cortical layers and cortical columns, we believe that the results of our current study, which successfully demonstrate the applicability of qMRI to robustly detect small differences at the level of columnar systems, are relevant for future studies in the field of neuroscience.

      We emphasized these considerations in the revised manuscript (see p. 9, lines 273–285).

      The study includes a very small number of participants (n=4). The advantage of non-invasive in-vivo measurements, despite the fact that they are indirect measures, should be that one can study a reasonable number of subjects. So this low n seems to undermine that point. I rarely suggest additional data collection, but I do feel that a few more subjects would shore up the study's impact.

      The present study was conducted in line with a deep phenotyping study approach. That is, we focused on acquiring highly reliable datasets on individuals. We did not intend to capture the population variance, which is often the goal of other group studies, since low-level and basic features such as stripes in V2 are expected to be present in all healthy individuals. Thus, we prioritized test-retest measurements for fMRI sessions and an alternative MP2RAGE acquisition over a larger number of individuals. This resulted in 6–7 scanning sessions on different days for each individual, summing up to 26 long scanning sessions in total. We also note that the sample size used is not smaller than in other studies with a similar research question. For example, another fMRI study investigating V2 stripes in humans used the same sample size of n=4 (Dumoulin et al., 2017).

      The paper overstates what can be concluded in a number of places. For example, the paper suggests that R1 and R2 are highly-specific to myelin in a number of places. For example, on p7 the text reads" "We tested whether different stripe types are differentially myelinated by comparing R1 and R2..." Relaxation times lack the specificity to definitively attribute these changes purely to myelin. Similarly, on p11: "Our study showed that pale stripes which exhibit lower oxidative metabolic activity according to staining with CO are stronger myelinated than surrounding gray matter in V2." This implies that the study directly links CO staining to myelination. In addition to using non-specific estimates of myelination, the study does not actually measure CO.

      We agree that we did not clearly point out the limitations of R1 myelin mapping. Therefore, we toned down the statements about the connection between cortical myelin and R1. The mentioned statements in the reviewer’s comment were changed accordingly (see p. 6, line 163; p. 11, lines 353–354). We also included a small paragraph to clarify the used terminology (color-selective thin stripes, disparity-selective thick stripes) in the manuscript (see p. 4, lines 110–114) to avoid the inadvertent conflation of CO staining and actually measured brain activity.

      I'm confused by the analysis in Figure 5. I can appreciate why the authors are keen to present a "tripartite" analysis (thick, thin, and pale stripes). But I find the gray curves confusing. As I understand it, the gray curves as generated include both the stripe of interest (red or blue plots) and the pale stripes. Why not just generate a three-way classification? Generating these plots in effect has already required hard classification of thin and thick stripes, so it is odd to create the gray plots, which mix two types of stripes. Alternatively, could you explicitly model the partial volume for a given cortical location (e.g., under the assumption that partial volume of thick and thin strips is indicated by the z-score) for the corresponding functional contrast? One could then estimate the relaxation times as a simple weighted sum of stripe-wise R1 or R2.

      Figure on weighted average of stripe-wise R1 and R2. (a) shows the weighted sum of R1 (de-meaned and de-curved) over all V2 voxels. z-scores from color-selective thin stripe experiments and disparity-selective thick stripe experiments were used as weights in the left and middle group of bars, respectively. An intermediate threshold of zmax=1.96 was used, i.e., final weights were defined as weights=(z-1.96). Weights with z<0 were set to 0. For pale stripes (right group of bars), we used the maximum z-score value from thin and thick stripe measurements. We then set all weights with z≥1.96 to 0 and used the inverse as final weights, i.e., weights = -1 * (max(z)-1.96). (b) shows the same analysis for R2. Error bars indicate 1 standard error of the mean.

      (1) Yes, indeed. We agree that modeling the partial volume of each compartment (thin, thick and pale stripes) in each V2 voxel would be the most elegant approach. However, we note that z-scores from thin and thick stripe experiments may not reflect the voxel-wise partial volume effect, since they are a purely statistical measure and not a partial volume model. Having said this, we think that this general approach can give some additional insights, and we provide results for a similar analysis here. We calculated the weighted sum of R1 and R2 values over all V2 voxels for each stripe compartment (thin, thick and pale stripes) independently (see above figure). For R1, we see the same pattern between stripe types as in the manuscript (Figure 5). Additionally, we show the differences here for each subject, which further demonstrates the reproducibility across subjects in our study. For R2, no clear pattern across subjects emerged, confirming the results in our manuscript. Since this analysis did not add relevant new information, we refrained from adding this figure to the manuscript, in order not to overload it.
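The weighting scheme described in the figure caption can be sketched as follows. This is a minimal illustration and not the analysis code used for the figure: the function names are hypothetical, the caption's rule for zeroing weights is read here as clipping negative weights to 0, and the weighted sum is normalized by the total weight (the figure is titled a weighted average, but the normalization is an assumption).

```python
Z_THRESHOLD = 1.96  # threshold stated in the figure caption


def stripe_weights(z_scores, threshold=Z_THRESHOLD):
    """Weights for thin or thick stripes: (z - threshold).

    Assumption: negative weights are clipped to 0.
    """
    return [max(z - threshold, 0.0) for z in z_scores]


def pale_weights(z_thin, z_thick, threshold=Z_THRESHOLD):
    """Weights for pale stripes: -1 * (max(z) - threshold).

    Voxels whose maximum z-score reaches the threshold get weight 0.
    """
    weights = []
    for zt, zk in zip(z_thin, z_thick):
        z_max = max(zt, zk)
        weights.append(0.0 if z_max >= threshold else -(z_max - threshold))
    return weights


def weighted_mean(values, weights):
    """Weighted average of voxel values (assumed normalization by total weight)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Toy example with three V2 voxels (made-up numbers, not study data).
z_thin = [3.96, 0.96, -0.04]
z_thick = [0.0, 0.46, 2.96]
r1 = [-0.02, 0.03, -0.01]  # de-meaned R1, arbitrary units

r1_thin = weighted_mean(r1, stripe_weights(z_thin))
r1_thick = weighted_mean(r1, stripe_weights(z_thick))
r1_pale = weighted_mean(r1, pale_weights(z_thin, z_thick))
```

With these toy values each compartment is dominated by a single voxel, so the three weighted means simply recover that voxel's R1. In real data every voxel below threshold contributes to the pale-stripe estimate in proportion to how far its maximum z-score lies below 1.96.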

      (2) In our current study, we were not primarily interested in investigating differences between thin/thick stripes and pale stripes. While histological analyses found differences (though not consistent ones) between CO dark stripes (more myelinated; Tootell et al., 1983) and CO pale stripes (more myelinated; Krubitzer and Kaas, 1989), no study has reported myelin differences between CO dark stripes. This does not fully exclude the possibility of myelination differences but suggests that if myelination differences between CO dark stripes existed, they would presumably be smaller than differences between CO dark and CO pale stripes. Thus, they would be even more difficult to demonstrate than the hypothesis of this manuscript.

      Therefore, we decided to directly test two compartments against each other instead of modeling all three compartments within a single model. In our analysis, we thereby loosely followed the analysis methods described in Li et al. (2019), which compared myelin differences between thin/thick and pale stripes in macaques. We note that this demonstrates further consistency, since it is not trivial that both thick and thin stripes show lower R1 values than the pale stripes. For example, the two comparisons could have shown no differences, or differences in opposite directions.

      (3) Just for clarification, the plots in Figure 5 show the comparison of R1 (or R2*) between two compartments in V2. The red (blue) curve includes the thin (thick) stripe of interest. The gray curve includes everything in V2 minus contributions from thick (thin) stripes of interest. If we take the thin stripe comparison as an example (Figure 5a), then red contains the thin stripes of interest while gray contains everything minus the thick stripes. Therefore, assuming a tripartite stripe arrangement, the gray curve contains both thin and pale stripe contributions.
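The two-compartment comparison described in (3) can be illustrated with a short sketch. It assumes that a stripe is defined by a z-score at or above some threshold, which is a simplification for illustration; the function names and the threshold value are hypothetical and are not taken from the manuscript.

```python
def compartment_masks(z_interest, z_other, threshold):
    """Return (interest_mask, rest_mask) over V2 voxels.

    interest_mask: voxels assigned to the stripe type of interest.
    rest_mask: everything in V2 minus the other stripe type, so under a
    tripartite arrangement it mixes pale stripes with the stripe of interest.
    """
    interest_mask = [z >= threshold for z in z_interest]
    rest_mask = [z < threshold for z in z_other]
    return interest_mask, rest_mask


def mean_in_mask(values, mask):
    """Plain average of the values selected by a boolean mask."""
    selected = [v for v, keep in zip(values, mask) if keep]
    if not selected:
        raise ValueError("mask selects no voxels")
    return sum(selected) / len(selected)


# Thin stripe comparison on four toy voxels (made-up numbers, not study data).
z_thin = [3.0, 0.5, 0.2, 2.5]
z_thick = [0.1, 2.8, 0.3, 0.4]
r1 = [1.0, 3.0, 2.0, 1.0]

thin_mask, rest_mask = compartment_masks(z_thin, z_thick, threshold=1.96)
r1_thin = mean_in_mask(r1, thin_mask)   # "red" curve compartment
r1_rest = mean_in_mask(r1, rest_mask)   # "gray" curve compartment
```

In this toy case the gray compartment contains the two thin-stripe voxels plus one voxel that exceeds the threshold in neither experiment, which mirrors the statement that the gray curve carries both thin and pale stripe contributions.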

      References

      Carey D, Caprini F, Allen M, Lutti A, Weiskopf N, Rees G, Callaghan MF, Dick F. Quantitative MRI provides markers of intra-, inter-regional, and age-related differences in young adult cortical microstructure. Neuroimage 2018; 182:429–440.

      Dumoulin SO, Harvey BM, Fracasso A, Zuiderbaan W, Luijten PR, Wandell BA, Petridou N. In vivo evidence of functional and anatomical stripe-based subdivisions in human V2 and V3. Sci Rep 2017; 7:733.

      Freund P, Seif M, Weiskopf N, Friston K, Fehlings MG, Thompson AJ, Curt A. MRI in traumatic spinal cord injury: from clinical assessment to neuroimaging biomarkers. Lancet Neurol 2019; 18:1123–1135.

      González de San Román E, Bidmon H-J, Malisic M, Susnea I, Küppers A, Hübbers R, Wree A, Nischwitz V, Amunts K, Huesgen PF. Molecular composition of the human primary visual cortex profiled by multimodal mass spectrometry imaging. Brain Struct Funct 2018; 223:2767–2783.

      Kirilina E, Helbling S, Morawski M, Pine K, Reimann K, Jankuhn S, Dinse J, Deistung A, Reichenbach JR, Trampel R, Geyer S, Müller L, Jakubowski N, Arendt T, Bazin P-L, Weiskopf N. Superficial white matter imaging: Contrast mechanisms and whole-brain in vivo mapping. Sci Adv 2020; 6:eaaz9281.

      Krubitzer LA, Kaas JH. Cortical integration of parallel pathways in the visual system of primates. Brain Res 1989; 478:161–165.

      Lazari A, Lipp I. Can MRI measure myelin? Systematic review, qualitative assessment, and meta-analysis of studies validating microstructural imaging with myelin histology. Neuroimage 2021; 230:117744.

      Lazari A, Salvan P, Cottaar M, Papp D, Rushworth MFS, Johansen-Berg H. Hebbian activity-dependent plasticity in white matter. Cell Rep 2022; 39:110951.

      Li X, Zhu Q, Janssens T, Arsenault JT, Vanduffel W. In Vivo Identification of Thick, Thin, and Pale Stripes of Macaque Area V2 Using Submillimeter Resolution (f)MRI at 3 T. Cereb Cortex 2019; 29:544–560.

      Mancini M, Karakuzu A, Cohen-Adad J, Cercignani M, Nichols TE, Stikov N. An interactive meta-analysis of MRI biomarkers of myelin. Elife 2020; 9:e61523.

      Natu VS, Gomez J, Barnett M, Jeska B, Kirilina E, Jaeger C, Zhen Z, Cox S, Weiner KS, Weiskopf N, Grill-Spector K. Apparent thinning of human visual cortex during childhood is associated with myelination. PNAS 2019; 116:20750–20759.

      Tootell RBH, Silverman MS, De Valois RL, Jacobs GH. Functional Organization of the Second Cortical Visual Area in Primates. Science 1983; 220:737–739.

      Weiskopf N, Edwards LJ, Helms G, Mohammadi S, Kirilina E. Quantitative magnetic resonance imaging of brain anatomy and in vivo histology. Nat Rev Phys 2021; 3:570–588.

      Whitaker KJ, Vértes PE, Romero-Garcia R, Váša F, Moutoussis M, Prabhu G, Weiskopf N, Callaghan MF, Wagstyl K, Rittman T, Tait R, Ooi C, Suckling J, Inkster B, Fonagy P, Dolan RJ, Jones PB, Goodyer IM, NSPN Consortium, Bullmore ET. Adolescence is associated with genomically patterned consolidation of the hubs of the human brain connectome. PNAS 2016; 113:9105–9110.