10,000 Matching Annotations
  1. Oct 2025
    1. Individualization theorists who describe changes in personal relationships tend to focus primarily on relationships between romantic partners or within the nuclear family (Bauman, 2003; Beck & Beck-Gernsheim, 1995). Moreover, within family studies, research addressing extended family relationships is relatively scarce

      Individualization theory has not been thoroughly tested by direct observation, because existing work does not focus on the extended family.

    2. According to individualization theory, the process of individualization has weakened family bonds in Western societies, beginning with the erosion of ties with extended family members

      Individualization theory: the theory that the increase in individual choice (along with families shifting from an instrumental to an emotional function) has caused extended family bonds to become unstable.

    1. patriarchal family structures, and African slavery in an attempt to peacefully coexist with the whites

      The attempt to replicate the actions of the powerful. Just like Haiti.

    2. homesteading,” “settling,” and “westward expansion,” in reality it was one of imperial conquest and consolidation

      This is a very particular kind of conquest, as the goal is to make it land for use by average people. It is also the only contemporary example I can think of where the imperialist and the conquered are in adjacent territories.

    3. On the heels of this rebellion, the British government decided to take direct control of India in 1858, making Queen Victoria Empress of India. This, however, did not improve conditions for the Indian people.

      This is more traditional colonialism

    4. Forty years later, however, power had shifted to Britain, whose possession of Industrial Revolution technology and weaponry enabled the British to easily defeat the Chinese in the first of two “Opium Wars,” and impose their will on that country, forcing the Chinese to continue to allow the importation of opium—grown in Britain’s colony of India—into China.

      This is very interesting because it isn't settler colonialism but rather the imposition of a one-sided trading relationship through military force.


    1. students to bring what they’ve learned into the workforce immediately, and that includes using new technologies.

      I think it's important to keep in mind that technology is a big part of society now, even in school, but it is also important to limit the use of AI.

    2. In a recently published study in Science Direct, University of California, Irvine, researchers surveyed 1,001 students to understand their usage and concerns around using ChatGPT. Among students who held concerns, the top themes were around ethics, quality, careers, accessibility and privacy or surveillance.

      This quote shows that some people do have success with AI and that it is an important part of the future.

    3. One-quarter of respondents indicated AI would not be relevant to their future career, and 17 percent said they don’t want the training at all.

      This quote shows how some people might be scared that AI will soon take over their lives and the world in an age of new technology.

    4. Over half (55) of those responses were flat-out refusal to engage with AI. A few said they don’t know how to use AI or are not familiar with the tool, which impacts their ability to apply appropriate use to coursework.

      This quote shows how many people don't know much about AI, and that they need to learn because the world is changing.

    5. some students shared in the Irvine study that they were worried a reliance on ChatGPT

      This quote shows that a bunch of students heavily rely on ChatGPT

    6. “I do not see any application in a chat bot. I spend more time fixing its mistakes than I would actually writing the thing,”

      If you have it do your whole assignment, you would spend more time trying to make it seem like AI didn't write it than you would just writing the assignment yourself.

    7. when, how or whether to use generative artificial intelligence to help with coursework

      I think it's good to use AI when you need your writing in a certain format or need help with punctuation, or even to get a summary of an article and its key points. I think those are good ways to use AI for coursework, but having it do the whole assignment is not helping.

    8. “I am afraid to be flagged, so I refrain from utilizing it at all,” a junior from Florida Gulf Coast University wrote in the Student Voice survey.

      This quote shows how fear of being flagged online can stop students from using AI resources.

    1. “Culturally proficient coaching intends for the person being coached to be educationally responsive to diverse populations.”

      This is an excellent goal, and becoming culturally proficient as a coach is a must in order to be effective.

    1. The Space Between. $75.00. Letterpress poster by Beautiful Angle. Poster was created in June 2023 in Tacoma, WA, USA.

      TEXT: The text says that the space between the physical and the spiritual is permeable. It then brings up music in heaven and wonders whether there is something similar here on earth, as if there were no boundary between earth and heaven. The next part is set in a different font, signaling its importance. It says "Reach down, hands of angels. Reach across, hands of humans." I feel this text shows the permeability between heaven and earth, almost as if the angels will guide humans. It reinforces this when it says "which were sent from heaven, to cut the ropes that bind me?", portraying the angels as saviors of humans. The text is set in an old typewriter font, which tells us that thin places have been around for a long time.

      COLOR: I feel the black-and-white image reinforces that. The two colors contrast with each other, and that simplicity signifies timelessness.

    2. TEXT: The text says that the space between the physical and the spiritual is permeable. It then brings up music in heaven and wonders whether there is something similar here on earth, as if there were no boundary between earth and heaven. The next part is set in a different font, signaling its importance. It says "Reach down, hands of angels. Reach across, hands of humans." I feel this text shows the permeability between heaven and earth, almost as if the angels will guide humans. It reinforces this when it says "which were sent from heaven, to cut the ropes that bind me?", portraying the angels as saviors of humans. The text is set in an old typewriter font, which tells us that thin places have been around for a long time.

      COLOR: I feel the black-and-white image reinforces that. The two colors contrast with each other, and that simplicity signifies timelessness.

    3. The Space Between

      Image: The image shows a person tied up and what seems to be an angel coming down to untie and rescue them from what looks like the Illuminati symbol in a cloud.

      Color: The entire picture is grey, which could represent the colorless life the person feels because they are bound; once they are unbound, maybe color will come back into their life.

    1. What divisions in Cherokee society are alluded to in the first petition?

      Some Cherokees did in fact sell their land and move, which essentially weakened everyone else.

    2. Our beloved Children and head men of the Cherokee Nation, we address you warriors in council. We have raised all of you on the land which we now have, which God gave us to inhabit and raise provisions

      I like how they establish their ethos here. They sort of speak for all mothers, calling on the authority of those who raised the warriors that are now in charge. It's clear that there is still a remnant of the former respect for women.

    3. Although it is not common for our sex to take part in public measures, we nevertheless feel justified in expressing our sentiments on any subject where our interest is as much at stake as any other part of the community

      They are now at the point where they have to justify their inclusion on the matter.

    4. be brought to a savage state again, for we have, by the endeavor of our Father the President, become too much enlightened to throw aside the privileges of a civilized life

      They seem to believe the rhetoric about them. Perhaps they actually were flourishing as farmers; I don't know.

    5. The land was given to us by the Great Spirit above as our common right, to raise our children upon, & to make support for our rising generations. We therefore humbly petition our beloved children, the head men & warriors, to hold out to the last in support of our common rights, as the Cherokee nation have been the first settlers of this land; we therefore claim the right of the soil

      This is a very cool debate to see. The mass scale, strictly defined individualism and liberalism vs. a smaller, more intimate and natural form of economy and governance based on solidarity and common belief.

    6. Cultivate and raise corn & cotton and your mothers and sisters will make clothing for you which our father the president has recommended to us all

      True. Monroe did push for assimilation. It's very interesting that they call him "our father the president".

    7. for it was the good will of our creator to place us here, and you know our father, the great president, will not allow his white children to take our country away

      This is very interesting. I wonder why they believed in Monroe like this.

    8. In 1835, a small unauthorized faction of Cherokee men ignored the above resolutions and signed the Treaty of New Echota, ceding the Cherokee lands in the southeast for new territory in what today is the state of Oklahoma.

      I never knew that this is what caused this event.

    9. men who engaged in commercial farming or purchased enslaved Africans to do that work for them constituted the kernel of an emerging elite class

      It's so interesting how this happens. I guess ultimately, no matter how successful you were as a non-white person, the shape of society was dictated by white people, and this is what success meant.


    1. For instance, Illinois generates more electricity from nuclear energy than any other state in the US, and it accounted for about 12% of US nuclear power generation in 2021 (EIA, 2022a)

      point towards nuclear power.

    2. So why are Illinois and Denmark appropriate cases to compare and find insights that might be mutually usable in both geographies? In this article, these two areas were selected because of their endeavors of being leading territories in creating clean energy systems – both territories have goals of 100% renewable energy or climate-neutral society by 2050. In terms of physical and economic characteristics mentioned in Table 2, in both territories, there is a high population density and high concentration of Gross Domestic Product (GDP), relatively good weather, flat surface, a big mass of water around, one single metropolitan area, an increasing awareness of environmental issues including flooding risks from climate change, and finally a huge dependence on the fossil fuels in the past (mostly coa

      Why compare Denmark to Chicago?

    1. Does it feel like the typebars are catching in the guide, or something binding deeper in the machine? If it's the guide (this is going to sound crazy), grasp the top of the typebar and sort of wiggle it side to side in the segment. Not overly forceful as you don't want to bend the segment slots, but just enough to allow the spring steel to realign. Test, and adjust if needed. It sounds kind of unhinged, but this is the fix for most American made machines that have bars getting stuck in the guide. I've done this with Royals, Coronas, and Underwoods in front of clients before and they look at me like I'm an idiot wizard. Ha If the top of the bar is seriously bent this won't work and you'll need a repair person to use some side alignment pliers, but if the bar is just tweaked it usually works very well with a bit of practice. My unsubstantiated belief of why this occurs is US companies using the same grade(if not the same company) of steel in their bars that tend to be a little softer than their European competitors. *You DON'T want to try this with an Olympia or other German made machines. 😅 If it's coming from deeper in the machine, check the linkages to make sure they're not tweaked and binding against each other. Hope this helps!

      Advice via Nashville Typewriter, a repair person: https://reddit.com/r/typewriters/comments/1o4qxvn/chasing_problems_stuck_keys/

    1. There is the empty chapel, only the wind’s home.

      After reading Weston's Perilous Chapel (Chapter XIII), Eliot's inclusion of an empty chapel takes on new significance. Weston explores the recurring motif of the "Perilous Chapel" within Grail legend and tells numerous stories of knights encountering a mysterious chapel on a voyage. Here, Eliot references an "empty chapel" as the speaker travels through a desolate landscape of rock and an absence of water (therefore connecting to a loss of fertility). In other words, "What the Thunder Said" is a further exploration of a physical wasteland. This dark mood and sense of ruin is taken beyond the land and into a manmade space, the chapel. It is described as "empty", without any windows to let in sunlight. It is also the home of the wind, highlighting a lack of spirit of divinity as a force from nature sweeps through the space. The lack of windows could also symbolize a disconnect between nature and man, as a key gateway between the two is absent. This is followed by a reference to "dry bones", a powerful image of ultimate death. This image stands in stark contrast to the chapel stories explored by Weston. For instance, Perceval rides through a storm before encountering the chapel. There, the building has been cursed by the black hand, and it is up to Perceval first to fight off the force and second to bless the chapel by spreading holy water over the walls. I interpreted this as a haunting force of death. Therefore, I believe Weston and Eliot explore similar themes of death and decay by different means. In other words, Eliot ties the glum setting of the wasteland to the absence of a higher power in the empty chapel.

    2. What is that sound high in the air Murmur of maternal lamentation

      The note for this line roughly translates to: "Already half Europe, at all events half Eastern Europe, is on the road to Chaos. In a state of drunken illusion she is reeling into the abyss and, as she reels, she sings a drunken hymn such as Dmitri Karamazov sang. The insulted citizen laughs that song to scorn, the saint and seer hear it with tears" (Hesse).

      In the framework of TWL, the saint and seer can indeed be read as Tiresias, the narrator and the ultimate observer of the collective human tragedy. He alone views the fall of humanity with profound compassion and insight: while others "laugh at the song to scorn", he mourns it.

      What fascinates me the most in these two lines is the description of the lamentation as "maternal." It intensifies the emotional and spiritual dimension of grief, for there is nothing more sacred and compelling than the cry of a mother for her child: Mary's witnessing and mourning of Jesus's crucifixion, the blight on earth caused by Demeter's lamentation of her daughter Persephone's abduction by Hades (again, creating A WASTE LAND), Achilles' mother grieving his prophesied death in the Trojan War, and Constance's emotional regression after the capture and death of her son in the Shakespearean play King John. In Hesse's analysis, Europe is now distilled into the archetype of the bereaved mother, mourning the cultural and spiritual fall of her people. Within Eliot’s poem, Tiresias’s lament thus acquires a maternal resonance. Having lived as both man and woman (and, according to myth, having even borne children), Tiresias embodies the universal capacity for empathy and suffering. His tears are not only those of a prophet who witnesses and foresees the suffering of his fellow species, but are the sacred image of a profoundly empathetic mother.

    3. Forgot the cry of gulls, and the deep sea swell And the profit and loss.

      This line describes what Phlebas the Phoenician has left behind in death: the sensory and commercial world. “Cry of gulls” appeals directly to the sense of hearing, but I would argue that this sound is so distinct and piercing that it somehow reaches other senses, and that its context engages other senses as well (following the gulls with one’s eyes, smelling the salt of the sea). “Profit and loss” alludes to Phlebas’ mercantile life. In his death these things are “forgotten,” which defines the futility of their pursuit. The verb “forgot” here is interesting. Instead of these things being just cut off in or with death, they are erased from the mind, while other things remain, persist, and live on. There is something slightly active about forgetting, though it is not under one’s control. But the forgetting occurs with the death, marking that event as some kind of turning point. This seems to align somewhat with De Quincey’s elevation of death to the realm of the sublime.

      The other thing that is forgotten is “the deep sea swell.” I think this speaks to the pushes and pulls of mortal life: temptations. A sea swell travels far across the sea’s surface and has a regular, rhythmic motion. It is the aftereffect of distant storms or strong winds that continues to move even when the surrounding sea appears calm. It is a residual pulse of the sea. Thus Eliot defines temptation as one of the sea’s roles. But Eliot then describes the same body as having purified, or as purifying, Phlebas, in the way it “picked his bones in whispers.” “Picked” suggests a slow, intentional process of removal, and “whispers” mirrors the quality of fading memory. So the sea also has a role in the forgetting of the aforementioned. Notably this is a current “under.”

      But the sea is an incredibly violent actor as well, claiming Phlebas’s life (see the rough draft of “Death by Water”)—perhaps for what it is to afterwards do? Adding to the ideas on temptation explored above, it must also be noted that the sea is the means by which travel and quest for knowledge occur (in the world of Eliot’s references), as well as that which seems to entice such questing. But this pursuit of knowledge (following the biblical arc) ends disastrously, at the hands of the sea—as seen with the story of Ulysses who addresses Dante, and as with Phlebas (whose fuller story is detailed in the earlier draft, see especially the first couple lines). There is a going against/beyond human bounds here—but I feel there is tension between this idea and humans reaching the life they “should” lead (this is actually kind of exactly embodied in Dante!!). I want to explore this further, along with the different, seemingly quite contradictory, roles of the sea in the poem.

    1. ♖🌐.🌌💬.🚧.🤝🌐 HyperPost Development ColabSpace -

      Create/curate self-documenting information pertaining to all development by the indy.player/actor 🎭/gyuri/ within the Web Hosted directory for the Peergos account: ♖/hyperpost, part of ongoing 👨‍💻development ♒stream

      drawing hands autopoiesis

    1. While the effects of such institutions in practice are never uniform or homogenous, looking closely at social dynamics in school allows us to see in sharp relief the ways in which school is fundamental to structuring society along racial and class lines.

      I think this is very interesting and also accurate, although my personal experiences have been a bit different. I did not feel that race was a huge factor in school until I came to college, likely because I went to a predominantly Hispanic school where most of the instructors were also Hispanic. But now at university, I see how race can play a huge role in the education system and how structures run along racial lines, at times unrecognized.

    2. dominant position in contemporary capitalist societies, as the only institution that has “the obligatory audience of the totality of the children in the capitalist social formation, eight hours a day for five or six days out of seven.”

      Basically saying that school is the only thing most people do collectively, and that they do it for a long time.

    3. “Latinos Lag behind in Academics,” he wrote that “Hispanic students” were not “pulling their weight” and attributed the cause to cultural factors, suggesting that Asian parents “push their children to move toward academic success,” while Latino parents “are well-meaning but less active

      Racial inequality has been so deeply ingrained in people of color that you are forced to compete not only against the oppressive system set up by white elites but also against other people of color.

    4. In the spring of 2005, Alhambra High received the Title I Academic Achievement Award, which recognizes schools with low-income student populations that have made progress in closing the achievement gap. Principal Russell Lee-Sung arranged a meeting with student government leaders to announce the good news. One of the students present at the meeting was Robin Zhou, an inquisitive, first-generation Chinese American senior, who began to wonder why this gap—a persistent discrepancy in test scores and grade point averages between Alhambra’s Asian American and Latina/o students—existed in the first place. The results of his speculations ran in Robin’s monthly school newspaper column, “Nerd Rants,” on March 22, 2005. Following the headline, “Latinos Lag behind in Academics,” he wrote that “Hispanic students” were not “pulling their weight” and attributed the cause to cultural factors, suggesting that Asian parents “push their children to move toward academic success,” while Latino parents “are well-meaning but less active.”

      This not only neglects the crucial role of poverty and institutional inequality, but also may exacerbate the conflicts among different ethnic groups within schools. It reminds us that the narrative about academic disparities needs to go beyond cultural attributions and shift towards a more comprehensive analysis of social structures.

    1. Alice says in her interview that even as a child she did not feel animosity toward the black kids that threw things at her bus, as she understood—likely because of the Japanese American experience with white racism—some of the complexities of a racial order where whites controlled school segregation or desegregation.

      Some children can be aware of the racism that is occurring while others are oblivious to it, because it is normalized in society. Those affected, as in this instance, are forced to accept it as reality even though they have done nothing to provoke it except be born their race. Those inflicting the damage at times aren't aware of what they're doing, especially children who are likely just copying the behavior of their parents. This is why education needs to be emphasized, to ensure all students are aware of their actions and of the history that took place.

    2. Social science research indicates that children are not born with racist interpretations or proclivities but learn racial interpretations and racial framing of the

      I totally agree with this. Many people start to discriminate against others because of their environment or the friends they make. If you are immersed in an environment with a lot of hatred toward another race, it is hard not to discriminate against that race yourself. Also, as children enter their youth, they may pick up new things such as smoking or vaping because they think copying others' behavior makes them look cool. I guess copying other people in discriminating against a group also makes some people feel cool. Hence, the problem lies in how to keep children from copying inappropriate behaviors in their youth. My answer is that schools should educate them in advance by telling them what is right and what is wrong.

    3. In the public mind Asian Americans are often synonymous with academic excellence, in part because their group scores on standardized tests and their college enrollment levels often exceed those of other groups, often including whites.

      Standardized tests are definitely a strength of most Chinese students. However, I would like to share a phenomenon I saw in my high school. I attended a top-tier international school in China, so most of the students were very smart and hard-working, and many of them achieved SAT scores of 1550 or higher. For students who usually excelled in school subjects, nobody doubted the authenticity of their scores. However, students who did not do well in school subjects but did very well on standardized tests were doubted. Once, in the bathroom, I heard people saying that a student must have cheated on the SAT because she was not good at school subjects, and that Asians are prone to cheating. The point I want to make is that there are rumors that Asians love cheating, but this is not true; it depends on the individual's values. When I hear people of my own race target our own race and play with the stereotype as a joke, I get very angry. The same is true for many Black people. I watched a movie about slavery, and the line that stood out most to me was “we as the blacks must stop using the N word ourselves before anyone else start to respect us”. This should apply to people of all races who want to win other people's respect.

    4. This demonstrates a poignant aspect of the reality of those who are not white in this society: they often have a complex relationship with whites, wanting to gain their friendship and approval even at the cost of humiliation.

      The emotional and social burden carried by many racialized individuals, who endure or downplay discrimination in order to be accepted, is unfortunate. It highlights how racism doesn’t only operate through open hostility but also through subtle dynamics of power and belonging. The desire for social inclusion can force individuals to internalize or minimize harm, showing how oppression is often sustained through everyday interpersonal interactions rather than just overt acts of hatred.

    5. Yet, her later comments did recognize the racial targeting, a slow shift in the interview that happened with numerous respondents.

      At first, what seems like an isolated or neutral event can later be recognized as part of a larger pattern of racial targeting. This delayed recognition shows how systemic racism works not just through direct acts, but through its ability to be hidden and passive.

    6. remember in fifth grade specifically ... I went home every single day in tears because people made fun of me every day. And it probably didn’t help that I didn’t have siblings. I wasn’t particularly tough at all. But I would go home every day in tears because I just felt different, and somebody made fun of me.... I felt that I was inadequate and unable to do anything because I was Asian. It was just the little things that kids would say.... And I would be well aware of the fact that I was different from them

      Early racist experiences can deeply affect a child's self-esteem and sense of belonging, in turn causing kids to internalize feelings of inadequacy.

    7. In the public mind Asian Americans are often synonymous with academic excellence, in part because their group scores on standardized tests and their college enrollment levels often exceed those of other groups, often including whites. One study found that whites, African Americans, Hispanics, and Native Americans perceived Asian Americans to be superior in college preparedness, motivation, and expectations of future career success. These perceptions of academic achievement come with a price, however, as Asian American students are also portrayed in the media and in much private discussion among non-Asians as robotic overachievers in the classroom who are nerdy, passive, or inept on a social level

      This passage reveals the double-edged effect of the "model minority" stereotype. Although the academic achievements of Asian students are often regarded as a group advantage, this label strips them of their individual differences and diversity, confining them to the category of "robots" or "lacking social skills".

    8. On December 3, 2009, fifty Asian American students were attacked on and around their South Philadelphia High School campus. Thirty of them sustained injuries serious enough to warrant a hospital visit. These Asian American students were targeted, and school officials had ignored their complaints of bullying and pleas for protection for years. School days were rough for these Asian American students, as their classmates routinely hurled racial epithets; pelted them with food; and beat, punched, and kicked them in school hallways and bathrooms. The students finally had enough after this day of massive attacks, and they staged a boycott for eight days. Media outlets highlighted the racial tensions between the mostly black student attackers and the Asian American victims.

      This incident highlights the school's failure in fulfilling its role as a safe environment, and also reveals the severity of racial discrimination in the campus setting. What is even more worthy of reflection is that Asian students have repeatedly expressed fear and unease, but their voices have long been ignored.

    9. In the public mind Asian Americans are often synonymous with academic excellence, in part because their group scores on standardized tests and their college enrollment levels often exceed those of other groups, often including whites.

      This tends to put a lot of pressure on students because of stereotypes like the model minority. As mentioned in the text, stereotypes about Asians include having high standardized test scores, which can make students feel forced to fit into this category. This leads to unhealthy habits that can affect their mental health and well-being. Many are oblivious to the dangers of imposing stereotypes, especially ones that hold people to such high standards.

    1. Text: The text seems like something a preacher would say, or something you could possibly even find in the Bible, so it is trying to give the sense of a "holy" message.

      Color: The whole page besides the door and the text is gold, and when I think of gold I think of the streets of gold in heaven described in the Bible, so I think the color is meant to show that this is a religious poster.

    1. Optional in-person attendance by team member for first 1-2 days (London/nearby)

      Is this the on-set accountant? If yes, why only 1-2 days? If not, what would he/she do?

    2. Verify all suppliers properly documented

      I don't understand what this means. Does it refer to supplier-specific information, the invoices, or the legitimacy of the supplier? How can a person be documented?

    3. Reduced Loss — Real-time submission and management by our team

      I don't see a point in retaining this one. It's already mentioned in the earlier bullets, why the duplication?

    4. Time Savings — We eliminate weeks of manual email searching● Team Efficiency — No more asking stakeholders to search through inboxes

      These bullets cover the same ground from slightly different perspectives. I'd suggest combining them into one point: "Workflow Efficiency - Save weeks of effort by automating email searches and empowering your team to focus on meaningful work, not inbox digging."

    5. Ecobride's expert team and proprietary platform delivers end-to-end compliance from BFI certification through HMRC submission.
      1. "From - through" is grammatically incorrect. Was it intended to mean that Ecobride handles everything from the start (BFI) to the end (claims)? The sentence is currently ambiguous.
      2. This section should show the product differentiation, but the content says otherwise. It sounds like what any other film-production accounting firm would do.


    1. Socrates Did Not Like Writing (This Is an Essay on Artificial Intelligence). August 20, 2025. Socrates was not sold on this business of writing. His main argument was that having ideas always at hand in a device external to the human mind would atrophy our memory: we would no longer make an effort to remember long epic poems or long lists of scientific facts. Nor would we make an effort to remember our own arguments on various matters. Everything would be out there, on paper or in stone, ready to be consulted whenever we felt like it. This, Socrates would have said, would give us a “simulation” of knowledge instead of letting us reach a “true” knowledge of things. Of course, I only know about this because one of Socrates's disciples, a certain Plato, wrote in his Phaedrus about what his teacher thought of writing. Despite Socrates's criticisms, writing triumphed as a technology: almost every society on the planet has adopted it, and a good part of our knowledge, our communications, and our life in general is based on this invention. This victory, despite the criticisms of “traditionalists” like Socrates, has been set in parallel with the current state of artificial intelligence: a new technology that has many critics but that will eventually prevail and completely change the way we live. I myself, in other conversations about other things, have turned to this story of Socrates and writing. I remember saying, in a workshop I gave many years ago, that social networks (with all the criticism they deserved and still deserve) would prevail as a technology and change the way we live (and yes, I am citing this not as a good example, but as an example of how one can use this argument for any innovation). Just as Socrates's criticisms could not stop the success of writing, we would not be able to stop the rise of social networks. But Socrates was right about one thing: writing did atrophy our memory. Not everyone's, of course, but it undoubtedly relegated the act of remembering to the background, both individually (someone with an eidetic memory, or with the oral knowledge of their people, is impressive, but not as respected as before) and collectively (after millennia of writing, there are fewer and fewer people out there reciting the Iliad, and fewer and fewer societies in which oral tradition matters). In exchange, however, writing opened up the possibility of knowing far more than a single human memory can hold. The great advances of science, philosophy, and literature (Western and Eastern, Southern and Northern) would not have been possible without writing, without the possibility of exchanging ideas across countries, continents, and centuries. A disciple of Plato, Aristotle, is sometimes described as one of the last people who knew everything there was to know. Not because he was abreast of all knowledge in general, but because in his time writing was not yet so popular and the amount of knowledge an individual could potentially access was still very limited. Perhaps he knew everything there was to know in his world, but that world was quite small. He was probably ignorant of knowledge from China or the Americas, but he could not know that he was ignorant of it. That is impossible to sustain now. No single person can hold all human knowledge in their head. But they do have access, potentially, to all of this knowledge: on the internet, in books, even in ChatGPT. Each format with its own errors and biases. For their part, social networks (in a broad sense that includes forums and blogs) atrophied our sense of inhabiting a shared reality. But in exchange they gave us the possibility of changing the power dynamics of information. Now “anyone” (in the Ratatouille sense) can make their voice heard, not just the gatekeepers of information we have been used to. This has its good and bad sides, but it has undoubtedly changed how we live and interact. One of the criticisms often leveled at generative artificial intelligence (which, as I explained in another post, is a very specific subset of AI), and one that I make myself, is that it will atrophy our ability to do and think about things critically. If you decide to program using only a chatbot (a practice known in English as “vibe coding”), you will constantly delegate not only the work but also the ability to learn how to do it. You will never learn to program well. You will not even know how to fix the errors that come out of that vibe coding, because you will not know how to identify them. The same can happen with any human activity delegated to an artificial intelligence: writing, composing or playing music, thinking through arguments, anything. Emily Bender, one of the authors of the famous academic paper “On the Dangers of Stochastic Parrots,” which argues that generative artificial intelligences are just machines that reproduce patterns (and therefore neither “understand” what they write nor “have consciousness”), recently argued on her blog that this delegation of skill learning is an opportunity cost. That is, in doing so one loses the alternative, which in this case is being able to do things ourselves (even mundane, dull things like sending a work email). Of course, many people use it anyway and will keep using it for activities that perhaps are not that important to them. We cannot deny that artificial intelligence is here to stay. The question is how it will stay. Unlike with writing, it is not clear what concrete benefit artificial intelligence can bring us that would justify its eventual omnipresence (and the atrophy that it implies). If absolutely everyone adopted it in every area of life, soon nobody would have skills. What is more, we could only access skills by paying the subscription price (which will inevitably be raised by the AI companies that are currently operating at a loss to build customer loyalty). Vibe coding works because there are people who know how to program. A programmer who knows what they are doing can ask an AI to write some code and can then review it and correct its inevitable* errors. Or they can correct the errors of people who do not know how to program but used a chatbot to write code. In fact, there is a whole industry of programmers dedicated to making these fixes. Many software companies are no longer hiring junior programmers, on the idea that someone can produce code à la vibe coding and a more expert programmer can then correct it. But what will they do when those expert programmers retire and the companies lose those skills? For now, many trust the artificial intelligence industry's promises of improvement*. But I contend that this sector, like all others, will eventually realize that having human skills is far more valuable. In fact, many companies have already realized it. And people will realize it too: even if the artificial intelligence industry is not in a bubble, and even if it does take over our whole lives, we will realize that acquiring skills is far more valuable than delegating them to a machine. Since I write for a living, I have often been asked whether I think I will be replaced by an artificial intelligence. I do not think so. Although many people will surely use these tools to write things, consider what would happen if all the text in the world were created by AI: the language models on which these tools are based would simply regurgitate, endlessly, other texts that, while coherent, are of low quality and dubious plausibility, and that were already regurgitated by another artificial intelligence. Eventually there would be a market for some human to come in and, at the very least, review, edit, do something with the text. To write. Writing was revolutionary, for all the reasons already mentioned; but artificial intelligence increasingly seems to be a “normal technology,” as Arvind Narayanan and Sayash Kapoor put it in an academic paper. A technology that will transform many things, but that is neither as utopian as its merchants paint it nor as dystopian as its fiercest critics claim. Rather, it is one more technology, which will have its uses and applications, its consequences and effects, but will not change all of society from head to toe. On her blog, Bender also argues that we can still, as a society, influence the impact that artificial intelligence may have on our lives. Writing is simple and, once invented, practically inevitable (as the science fiction writer Ted Chiang recounts in a short story about writing and memory). Artificial intelligence is very complex and has not yet shown us that it is justified enough to be inevitable, leaving its critics looking like Socrates. *The artificial intelligence industry argues that its product will improve so much that the errors will indeed become avoidable. I am not convinced by that argument. By: Pablo Medina Uribe

      As an architecture student, I find this text especially relevant because it raises a question that also runs through our field: to what extent delegating to technology can limit our own abilities. Just as Socrates feared that writing would atrophy memory, today we could say that artificial intelligence can atrophy the way we think and create. In architecture, depending entirely on digital tools or AI to design can make us lose the judgment, intuition, and spatial sensitivity that are built through experience and critical thinking. Pablo Medina Uribe's essay invites us to use technology as a support, not as a substitute for learning or for the creative process. JULIANA MORA

    1. Since the energy now depends on K, these levels are only $2J+1$ degenerate due to the $2J+1$ different M values that arise for each J value. The eigenfunctions $|J,M,K\rangle$ are the same rotation matrix functions as arise for the spherical-top case.

      Could emphasize that, since K is squared in the energy, the degeneracy is 2(2J+1) for |K| > 0, while it is 2J+1 for K = 0.
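
The counting in this note can be checked with a minimal sketch. It assumes the standard rigid symmetric-top energy in units where ħ = 1, E(J, K) = B·J(J+1) + (C − B)·K², which is not quoted in the excerpt above. The function name and the rotational constants B and C are hypothetical illustration values.

```python
from collections import Counter
from fractions import Fraction


def symmetric_top_degeneracies(j_max, b_const, c_const):
    """Count (J, K, M) states per energy level of a rigid symmetric top.

    Assumed energy expression (hbar = 1, not taken from the excerpt):
        E(J, K) = B * J * (J + 1) + (C - B) * K**2
    Returns a Counter keyed by (J, energy).
    """
    counts = Counter()
    for j in range(j_max + 1):
        for k in range(-j, j + 1):          # body-frame projection K
            energy = b_const * j * (j + 1) + (c_const - b_const) * k * k
            for _m in range(-j, j + 1):     # space-frame projection M
                counts[(j, energy)] += 1
    return counts


# Hypothetical rotational constants, exact rationals to avoid float grouping issues.
B = Fraction(1)
C = Fraction(1, 2)
levels = symmetric_top_degeneracies(2, B, C)
for (j, energy), degeneracy in sorted(levels.items()):
    print(f"J={j}  E={energy}  degeneracy={degeneracy}")
```

Because +K and −K give the same energy, every level with |K| > 0 collects twice as many states as the K = 0 level of the same J.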

    2. The rotational eigenfunctions and energy levels of a molecule for which all three principal moments of inertia are distinct (an asymmetric top) cannot easily be expressed in terms of the angular momentum eigenstates and the J, M, and K quantum numbers. However, given the three principal moments of inertia $I_a$, $I_b$, and $I_c$, a matrix representation of each of the three contributions to the general rotational Hamiltonian in Equation 4.3.5 can be formed within a basis set of the $\{|J,M,K\rangle\}$ rotation matrix functions. This matrix will not be diagonal because the $|J,M,K\rangle$ functions are not eigenfunctions of the asymmetric top $H_{rot}$. However, the matrix can be formed in this basis and subsequently brought to diagonal form by finding its eigenvectors $\{C_{n,J,M,K}\}$ and its eigenvalues $\{E_n\}$. The vector coefficients express the asymmetric top eigenstates as $\psi_n(\theta,\phi,\chi) = \sum_{J,M,K} C_{n,J,M,K}\,|J,M,K\rangle$. Because the total angular momentum $J^2$ still commutes with $H_{rot}$, each such eigenstate will contain only one J-value, and hence $\psi_n$ can also be labeled by a J quantum number: $\psi_{n,J}(\theta,\phi,\chi) = \sum_{M,K} C_{n,J,M,K}\,|J,M,K\rangle$. To form the only non-zero matrix elements of $H_{rot}$ within the $|J,M,K\rangle$ basis, one can use the following properties of the rotation-matrix functions: $\langle J,M,K|J_a^2|J,M,K\rangle = \langle J,M,K|J_b^2|J,M,K\rangle = \tfrac{1}{2}\langle J,M,K|J^2 - J_c^2|J,M,K\rangle = \tfrac{\hbar^2}{2}\,[J(J+1) - K^2]$, $\langle J,M,K|J_c^2|J,M,K\rangle = \hbar^2 K^2$, $\langle J,M,K\pm 2|J_a^2|J,M,K\rangle = -\langle J,M,K\pm 2|J_b^2|J,M,K\rangle = \tfrac{\hbar^2}{4}\,[J(J+1) - K(K\pm 1)]^{1/2}\,[J(J+1) - (K\pm 1)(K\pm 2)]^{1/2}$, and $\langle J,M,K\pm 2|J_c^2|J,M,K\rangle = 0$. Each of the elements of $J_c^2$, $J_a^2$, and $J_b^2$ must, of course, be multiplied, respectively, by $1/(2I_c)$, $1/(2I_a)$, and $1/(2I_b)$ and summed together to form the matrix representation of $H_{rot}$. The diagonalization of this matrix then provides the asymmetric top energies and wavefunctions.

      Should be rewritten to make clear the degeneracy due to the space-frame quantum number M_J.
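
The build-then-diagonalize procedure described in the passage can be sketched as follows. This is an illustration under stated assumptions, not the textbook's own implementation: units with ħ = 1; hypothetical rotational constants a = 1/(2I_a), b = 1/(2I_b), c = 1/(2I_c); the standard prefactors (a + b)/2 on the diagonal and (a − b)/4 on the K, K±2 couplings; and a plain cyclic Jacobi rotation solver swapped in so the example needs no external library. Since the Hamiltonian does not depend on M, only the K index is kept, and each eigenvalue is understood to be (2J+1)-fold degenerate in M.

```python
import math


def asymmetric_top_matrix(j, a, b, c):
    """H_rot = a*Ja^2 + b*Jb^2 + c*Jc^2 in the |J, K> basis, K = -j..j (hbar = 1)."""
    ks = list(range(-j, j + 1))
    n = len(ks)
    jj = j * (j + 1)
    h = [[0.0] * n for _ in range(n)]
    for i, k in enumerate(ks):
        # Diagonal: (a + b)/2 * [J(J+1) - K^2] + c * K^2
        h[i][i] = 0.5 * (a + b) * (jj - k * k) + c * k * k
        if i + 2 < n:
            # Coupling between K and K + 2 comes only from Ja^2 and Jb^2.
            off = 0.25 * (a - b) * math.sqrt(
                (jj - k * (k + 1)) * (jj - (k + 1) * (k + 2))
            )
            h[i][i + 2] = off
            h[i + 2][i] = off
    return h


def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=100):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n))
        if off < tol:
            break
        for p in range(n):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                cs, sn = math.cos(theta), math.sin(theta)
                for r in range(n):  # rotate columns p and q
                    arp, arq = a[r][p], a[r][q]
                    a[r][p] = cs * arp - sn * arq
                    a[r][q] = sn * arp + cs * arq
                for r in range(n):  # rotate rows p and q
                    apr, aqr = a[p][r], a[q][r]
                    a[p][r] = cs * apr - sn * aqr
                    a[q][r] = sn * apr + cs * aqr
    return sorted(a[i][i] for i in range(n))


# Hypothetical rotational constants for illustration only.
energies = jacobi_eigenvalues(asymmetric_top_matrix(1, 3.0, 2.0, 1.0))
print(energies)
```

For J = 1 the matrix is small enough to check by hand: the three eigenvalues should come out as b + c, a + c, and a + b, which is a useful sanity check on the prefactors assumed above.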

    1. Remember that head words are important because their features play a role in how the entire phrase functions within the sentence. That’s why we name the phrase after the category of its head word. One way to think of this is that the properties of the word carry over to the phrase. Looking at how this works in a tree diagram, we can think of the properties of the head word as percolating up from the individual word to the phrase. The following diagram represents this “percolation” by showing the edges between the head words and their parent nodes as arrows.[1]

      The head word determines how the rest of the phrase functions in the sentence.

    2. At this point, you may be ready to assume that Garth’s actually is a determiner, but that conclusion leads to some unfortunate consequences. First, we would have to say that any noun could change its part of speech simply by adding the genitive inflection. In other words, the category of determiner, which we have already described as containing a small number of words that have a principally grammatical function becomes an open-ended set. Further, this slot isn’t just occupied by genitive nouns. It can be occupied by entire phrases:

      Sometimes it's not the obvious answer.

    3. In sentence (4), stone is used as a mass noun; in sentence (5) it is used as a count noun. Additionally, some nouns that are typically either mass or count can be pressed into service the other way. For example, butter is typically a mass noun, and it seems strange to say two butters, but we can use it in a count sense in a sentence like the following:

      The noun can change its class depending on how the sentence is written

    1. The Snyder Act of 1924 granted Native Americans born in the United States full citizenship. The 15th Amendment, passed in 1870, granted all U.S. citizens the right to vote, but it was the Snyder Act that enabled Native Americans to vote.

      When I see stuff like this, it enrages me beyond anything else. How are you gonna kill and rob and take from an entire race of people, then say you found everything, and then NOT give them citizenship in their own land?

    2. Ironically, despite the Native American influence on America’s democracy, they were among the last Americans given the right to vote.

      This has gotta be the stupidest sentence I've ever read. The "Founders" took what they "discovered" and killed the people who were already here: MY ANCESTORS.

    1. Learning that is developmentally grounded and personalized. Learning experiences build on prior knowledge and experience, and account for learners’ active construction of new knowledge. Learning connects to who students are as well as to what they already know, attending to both cognitive and socioemotional realms, and school tasks are designed to be scaffolded according to students’ needs, intrinsically interesting based on their experiences, and appropriate to their level of development. Learning that is contextualized. Learning experiences recognize that people develop as they use the tools and symbols of their cultural contexts to make sense of the world and their experiences in it. Learning builds on students’ personal, cultural, and linguistic knowledge, and is embedded in meaningful contexts and applications. Learning is connected to students’ experiences and is based on a deep understanding of these contexts for development as well as ongoing communication and connection with parents, caregivers, communities, and the world beyond school.

      Having context, drawing on previous learning, and applying new knowledge are all crucial to growth.

    1. To other Gen Zers, the stare may signal a cool detachment, showing that they understand the irony, especially when selfies are paired with absurd captions or filters.

      This explains how detached gen Z is from the other generations

    2. The Gen Z stare is characterized by a neutral or blank facial expression. In photos, the Gen Z stare looks like direct, unblinking eye contact with the camera, with no smile or overt emotion. Unlike previous generations who often aimed to appear friendly, like they were having a good time, many Gen Zers seem to be adopting a flat, emotionally neutral look. It can be hard to tell whether it’s ironic or authentic—or both

      This paragraph in particular displays the emotion behind the Gen Z stare.

    3. major topic of peer-reviewed psychological research, related studies suggest it may reflect generational themes around identity, visibility, emotion regulation, and digital self-presentation.

      Reflects the struggles behind the Gen-Z stare.

    4. The Gen Z stare is characterized by a neutral or blank facial expression. In photos, the Gen Z stare looks like direct, unblinking eye contact with the camera,

      This explains how Gen Z members aren't showing good morals or smiling; they just have a plain, serious face.

    5. "Gen Z stare"—a deadpan, unsmiling facial expression often seen in selfies and social media videos—has attracted growing cultural and psychological interest.

      I wager TikTok was the first to point this out because it's a Gen-Z orientated platform. The introduction also describes what the Gen-Z stare means, which I appreciate.

    6. The so-called "Gen Z stare"—a deadpan, unsmiling facial expression often seen in selfies and social media videos—has attracted growing cultural and psychological interest. While it has not yet been a major topic of peer-reviewed psychological research, related studies suggest it may reflect generational themes around identity, visibility, emotion regulation, and digital self-presentation.

      This talks about how people have recently noticed that Gen Z is less likely to smile and more likely to be emotional.

    1. she found that although most were well assimilated into the dominant culture, most also had a strong sense of a racialized identity because whites constantly imposed the identity of “Asian foreigner” on them.

      The struggle over Asian American identity stems from the fact that assimilation does not equal acceptance. Even when individuals adapt to dominant cultural norms, systemic racism continues to mark them as outsiders. Racial identity is often shaped not by self-definition, but by how society chooses to perceive and categorize people.

    2. control

      this whole paragraph aims to show how a longstanding “white racial frame” sustains systemic racism and ranks racial groups, placing whites at the top and others, including Asian Americans, in lower positions.

    3. Virtually all Japanese are short. Japanese are likely to be stockier and broader-hipped than short Chinese.

      Rather than being harmless descriptions, these so-called “observations” reflect how racism is packaged as common sense. By presenting stereotypes as factual traits, the media helped cement racial hierarchies and justify exclusionary actions. This tactic made prejudice seem rational and inevitable, allowing systemic discrimination to thrive under the guise of neutrality.

    4. These students have never been taught Asian American history, or been privy to significant events that have shaped these communities in the United States

      When Asian American history is left untaught, it silences entire communities and their struggles, reinforcing a false narrative that their experiences are peripheral or insignificant. The absence of these stories in classrooms actively shapes how people understand race and power, allowing stereotypes to fill the gaps where real history should be.

    5. However, the white-constructed label of “model minority” awarded to Asian Americans does not protect them from prejudice and racism

      Rather than offering protection, it masks discrimination by suggesting Asian Americans don’t face real hardship. This narrative not only silences individual experiences with prejudice and mental health struggles but also reinforces existing racial hierarchies by setting Asian Americans apart from other marginalized groups. Ultimately, the label creates an illusion of privilege while deepening invisibility.

    6. It was just before going to bed and I, Rosalind S. Chou, could not get ahold of my friend Farrah

      The author finds herself living through the very statistics she studies. The blurred personal and professional boundaries underscore how systemic issues like racism and mental health disparities directly affect the researcher's own life and community.

    7. I slept terribly, and the next morning, I reached out to other friends to see if they had heard from her. I was nervous and anxious for hours. And then I got the phone call. When I answered, I got the news. Farrah had jumped out of a window that morning and was in a nearby hospital. I was stunned. In academia, we scholars are often taught to distance ourselves from our research, but this hit home. This was not a lecture in class where I discuss health disparities and how there is growing evidence that racism plays a role in disparate outcomes. This was someone close to me, in my inner circle

      The most astonishing aspect of this text lies in its breaking of the boundary between academic research and real life. The author, originally operating within a detached and objective framework - discussing health disparities and racism in the classroom - suddenly shifts to the genuine trauma experienced in intimate relationships.

    8. had a terrible, uneasy feeling in my stomach and I picked up the phone to hear panic in the voice on the other end of the line. “She jumped out the window. Farrah* jumped out of the window.” What I thought were irrational fears from the night before had, in fact, become a reality.

      This passage vividly depicts the sudden tragedy with great impact. The "unease" is like a harbinger, revealing the subtle connection between human intuition and reality.

  2. docdrop.org
    1. a) if deficient in respect, manners, and responsibility toward others, especially family members. Following from students' definition of education is the implicit notion that learning should be premised on authentic caring, to use Noddings' (1984) terminology. That is, learning should be premised on relation with teachers and other school adults having as their chief concern their students' entire well-being. In contrast to their teachers' expectations, Seguin youth prefer to be cared for before they care about school, especially when the curriculum is impersonal, irrelevant, and test driven. U.S.-born students, in particular, display psychic and emotional detachment from a schooling process organized arou

      I think there should be time dedicated to connecting teachers and students on a deeper level, because that gives the student a reason to care in the classroom and the teacher a reason to teach properly.

    2. icanization

      Although it won't let me highlight it, this whole idea that schools take away your culture and language is so true. As a kid I had a stutter, and instead of saying it was because I was a kid, my school said it was because I was learning Spanish and English at the same time. As a result, my parents, worried that I wouldn't have an adulthood in which white society accepted me, stopped teaching me Spanish.

    3. Regarding caring, teachers expect students to care about school in a technical fashion before they care for them, while students expect teachers to care for them before they care about school.

      I think this contradiction exposes the culture of the US and how that manifests in the classroom. Growing up in the US we are taught that we should only look out for ourselves, and that even when it comes to care, there is transaction involved. I agree with the students' perspective that teachers should really be caring for students and that will foster students' care for their studies, and I think this could be reached by actively combatting this culture of individualism. Classroom spaces should foster a collective culture that honors and takes into account everyone's individual experiences and where everyone is supporting each others' learning and growth together.

    1. Test

      The Dexamethasone Suppression Test measures whether your body can suppress cortisol after a synthetic glucocorticoid — and failure to suppress indicates stress-system overactivity, typical in melancholic depression.

  3. docdrop.org
    1. the narrow and essentially Eurocentric curriculum we provide for our teachers. At the university level, teachers are not being educated with the broad strokes necessary to prepare them properly for the twenty-first cent

      I believe myself to be very lucky here. Although I come from an extremely low-income community, I was able to attend a free charter school where, instead of history, I was taught ethnic studies, and my teacher taught about the ways white historians altered history to make themselves look nicer and more sophisticated.

    2. g less rather than teaching more can happen in several ways. Those who utilize "skills-based" approaches can

      I think of how in school there were so many comprehension tests that at a certain point it became less about who was learning the most and more about who could remember the most dates and names. I was learning nothing and just memorizing things. Without the ability to comprehend a text, reading is fruitless.

    3. To counter this tendency, educators must have knowledge of children's lives outside of school so as to recognize their strengths.

      This sentence really resonated with me as not only a teaching method, but as a practice that I think about in my day to day life. We should never assume, and should always strive to truly get to know the people and community around us, especially as a teacher. It's only through building genuine relationships and having care for those around us and learning about their lived day-to-day experiences that we can know how to move forward and support each other.

    1. In case of an accident, anyone seeing her dead on the highway would know at once that she was a lady.

      Set dressing to better represent the time period in which this takes place, since looking ladylike was a big deal back then.

  4. docdrop.org
    1. "Oh you sound White, you think you're White," they said. And the idea of sounding White was just so absurd to me .... So ninth grade was sort of traumatic in that I started listening to rap music, which I really just don't like. [I said] I'm gonna be Black, and it was just that stupid. But it's more than just how one acts, you know. [The other Black women there] were not into me for the longest time.

      I feel like I can connect to this on a deeper level because, although I have a lot of education under my belt and tons of experience in various areas, my speech has never been the most eloquent, and so I have often been seen as less intelligent than I truly am and can do nothing but act as though the belittling doesn't affect me.

    2. Mecklenburg School District in North Carolina, Roslyn Mickelson compared the placements of Black and White high school students who had similar scores on a national standardized achievement test they took in the sixth grade. More than half of the White students who scored in the ninetieth to ninety-ninth percentile on the test were enrolled in high school Advanced Placement (AP) or International Baccalaureate (IB) English, while only 20 percent of the Black students who also scored in the ninetieth to ninety-ninth percentile were enrolled in these more-rigorous courses. Meanwhile, 35 percent of White students whose test scores were below the seventieth percentile were taking AP or IB English.

      As a poli sci major, I've gone through study after study that highlights inequalities at a systemic level, and it is in a way astonishing to see the lengths to which these inequalities go to make sure communities of color are less likely to grow than white communities.

    3. Transition to this new understanding is typically precipitated by an event or series of events that force the young person to acknowledge the personal impact of racism.

      Racial awareness begins with lived experience; school incidents often serve as those "events".

    4. Why do Black youths, in particular, think about themselves in terms of race? Because that is how the rest of the world thinks of them.

      I think this line captures how external perception drives internal identity work.

    5. One thing that happens is puberty. As children enter adolescence , they begin to explore the question of identity, asking "Who am I? Who can I be?" in ways they have not done before.

      Tatum roots racial identity in the universal adolescent search for self, showing how race becomes part of that process.

    6. Because Black children are much more likely to be in the lower track than in the honors track in racially mixed schools, such apparent sorting along racial lines sends a message about what it means to be Black.

      Shows how institutional practices quietly reproduce racial hierarchies.

    7. Black girls who receive protective and affirming racial/ethnic socialization and beauty messages at home may be less likely to accept negative stereotype images as reflective of all black women or themselves

      It isn't only in the home that these stereotypes and beauty standards need to be unlearned and actively combatted through affirmation and love. Teachers at school are responsible not only for being aware of these stereotypes, but also for taking active steps to support Black girls through their intersectional and unique experiences as students.

    1. Chromosomes are thread-like structures in the nucleus of a cell that contain DNA, which is organized by proteins called histones. Humans typically have 46 chromosomes in 23 pairs, with 22 pairs of autosomes and one pair of sex chromosomes (XX for females and XY for males). Chromosomes carry genes that determine physical traits, and they are crucial for cell division and heredity


    1. When a prokaryotic cell reproduces, its circular chromosome replicates, and the cell divides in a process called binary fission (Figure 2.5). Replication usually begins at a specific place on the circular chromosome, called the origin of replication. The origins of the two newly replicated chromosomes move away from each other and toward opposite ends of the cell.


    1. EDMUND QUINCY.

      Edmund Quincy was an author, editor, and member of the Anti-Slavery Society. He edited several abolitionist journals such as The Abolitionist and The Liberator. The inclusion of his letter at the beginning of Brown's narrative lends credibility to the text, just like William Lloyd Garrison's letter at the beginning of Frederick Douglass's Narrative of the Life of Frederick Douglass.

    1. AI systems are powerful tools, but if not built carefully, they can reinforce societal biases and make unfair decisions. Ensuring fairness and equity in AI is not just a technical challenge, but also a responsibility towards the development of ethical AI.

      Learn practical strategies to design AI systems that avoid bias and ensure fairness. Discover techniques like diverse data, transparent algorithms, and robust evaluation pipelines to build ethical AI.

    1. eLife Assessment

      This useful study attempts to place an ancient maize sample from Bolivia, dated to the end of the Incan empire, in genetic and geographical context. The analyses show that this sample is most closely related to ancient Peruvian maize, but the data remain inadequate to determine the direction of dispersal and the extent of Inca influence over the genetic make up of the analyzed sample. There are additional deficiencies in the statistical analyses and selection inferences. The topic of the study would appeal to researchers studying maize dispersal and adaptation.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors describe a good quality ancient maize genome from 15th century Bolivia and try to link the genome characteristics to Inca influence. Overall, the revised manuscript is still below the standard in the field. While dating of the sample and the authentication of ancient DNA has been evidenced robustly, the downstream genetic analyses do not support the conclusion that genomic changes can be attributed to Inca influence. There is more story telling than story testing in this manuscript, analyses are not robust and possibly of very narrow interest.

      Strengths:

      Technical data related to the maize sample are robust. Radiocarbon dating strongly evidenced the sample age, estimated to be around 1474 AD. Authentication of ancient DNA has been done robustly. Spontaneous C-to-T substitutions, which are present in all ancient DNA, are visible in the reported sample with the expected pattern. Despite a low fraction of C-to-T at the 1st base, this number could be consistent with the cool and dry climate in which the sample was preserved. The distribution of DNA fragment sizes is consistent with expectations for a sample of this age.

      Weaknesses:

      (1) The geographic placement of the sample based on genetic data is not robust. To make use of the method correctly, it would be necessary to validate that genetic samples in this region follow the assumption of the 'isolation-by-distance' with dense sampling, which has not been done. Without this important information, we do not know if genetic similarity is influenced by demographic events and/or selection. The analysis does not provide robust evidence of sample connectivity.

      (2) The conclusion that Ancient Andean maize is genetically similar to European varieties and hence share similar evolutionary history is not well supported. The PCA plot in Fig. 4 merely represents sample similarity based on two components (jointly responsible for about 20% of variation explained). Contrary to the authors' conclusion, the direct test of similarity using outgroup f3 statistic does not support that European varieties are particularly closely related to ancient Andean maize. These levels of shared drift could be due to ancient Andean maize's relationship with other related groups, such as ancient or modern Brazil. A relationship test between multiple populations would be necessary to show significant direct relationship between ancient Andean maize and European maize.

      (3) The conclusion that selection detected in aBM sample is due to Inca influence has no support. Firstly, a selection signature can be due to environmental or any other factors. To disentangle those, the authors would need to generate the data for a large number of samples from similar cultural context and from a wide-ranging environmental context followed by a formal statistical test. Secondly, an allele frequency increase can be attributed to selection or demographic processes, and alone is not sufficient evidence for selection. The presented XP-EHH method seems unsuitable for a single individual. Overall, methods used in this paper raise some concerns: i) how accurate are allele-frequency tests of selection when only a single individual is used as a proxy for a whole population, ii) the significance threshold has been arbitrarily fixed to an absolute number based on other studies, but the standard is to use, for example, the top fifth percentile.
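
      The percentile-based cutoff mentioned in point ii) can be sketched as follows. This is only an illustration under stated assumptions: the scores are taken to be a plain list of per-SNP or per-window test statistics, and the names `empirical_threshold` and `outlier_indices` are hypothetical, not taken from the manuscript or from any selection-scan package.

```python
def empirical_threshold(scores, top_fraction=0.05):
    """Return the cutoff such that about `top_fraction` of scores lie at or above it."""
    if not scores:
        raise ValueError("scores must not be empty")
    ranked = sorted(scores, reverse=True)
    # Number of scores to keep in the upper tail (at least one).
    k = max(1, int(len(ranked) * top_fraction))
    return ranked[k - 1]


def outlier_indices(scores, top_fraction=0.05):
    """Indices of scores falling in the empirical upper tail."""
    cutoff = empirical_threshold(scores, top_fraction)
    return [i for i, s in enumerate(scores) if s >= cutoff]


if __name__ == "__main__":
    toy_scores = [float(x) for x in range(20)]  # hypothetical statistics
    print(empirical_threshold(toy_scores, top_fraction=0.25))
    print(outlier_indices(toy_scores, top_fraction=0.25))
```

      The cutoff here is derived from the distribution of the data at hand rather than copied from another study, which is the distinction the reviewer draws. An empirical tail still flags outliers in any dataset, so it does not by itself demonstrate selection.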

      In sum, this manuscript presents new data that seem to be of high quality, but the analyses are frequently inappropriate and/or over-interpreted.

    3. Reviewer #2 (Public review):

      I am glad to see a revised version of the manuscript. The authors have successfully handled some of my comments, but others require additional attention. In particular, the dataset seems quite robust and valuable to publish, and the descriptive analysis of its position relative to other modern and ancient genomes is generally sound. The selection analyses remain unsupported, and should be removed from the paper. In addition, I agree with the other reviewers and reiterate my comment that the Locator analysis is not robust.

      As I said in my original review, the XP-EHH method is not applicable to pseudohaploid variant calls in a single individual. This method is simply not appropriate to apply to the data at hand, as the method relies on knowledge of diploid genotypes, usually phased, and the results from this test are not robust. It is possible that the XP-EHH method could be extended to this data type or genotype likelihoods with extensive validation and conditioning on a large reference panel, but in general haplotype-based approaches have not been extensible to low-coverage pseudohaplotype datasets. At any rate, any off-the-shelf implementation is inappropriate and unsupported. I am sorry to be this negative about this analysis, but it cannot be used as presented; the results from using it in this way would be spurious by definition.

      In addition, identifying GO terms without statistical assessment of enrichment is not a robust analysis, nor is selecting genes with a high proportion of rare alleles without extensive additional contextualization based on the expectations of neutrality and deviations potentially tied to selection. For this reason, the two genes linked with height traits have no support here as genuinely being targets of selection. It is a frustrating reality for us in the ancient DNA field that small numbers of highly degraded genomes offer extremely limited scope for selection analyses, but that's the unfortunate state of play, and is the situation here.

      My other major critique remains the application of the Locator method. As Reviewer 1 notes, this method must be built on a densely sampled dataset with strong isolation by distance, which is not done here. The authors explained their approach with more detail in their response, but it is fundamentally inappropriate for this dataset. It does not add anything more than the f3 analysis, and creates a falsely precise inference of genetic-geographic origins that is not supported.

      Per authors' response to my previous recommendation 6, it is not advisable to re-map the reads after damage masking, and doing this with a conservative hard-masking approach will lead to a high mismatch rate and significant loss of reads in BWA. This could also exacerbate reference sequence bias which is already a major challenge for ancient DNA (see Gunther et al 2019 PLoS Genet). The correct approach is to map reads, mask or rescale for damage, and then proceed with the modified alignment file. In response to Reviewer 3's comment 3, the authors also refer to a "0 mismatch alignment" strategy. This is not concordant with the damage analysis, and if they truly do not allow mismatches this would be very inadvisable, as it would allow an extreme reference sequence bias.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Summary: 

      In this manuscript, the authors describe a good-quality ancient maize genome from 15th-century Bolivia and try to link the genome characteristics to Inca influence. Overall, the manuscript is below the standard in the field. In particular, the geographic origin of the sample and its archaeological context is not well evidenced. While dating of the sample and the authentication of ancient DNA have been evidenced robustly, the downstream genetic analyses do not support the conclusion that genomic changes can be attributed to Inca influence. Furthermore, sections of the manuscript are written incoherently and with logical mistakes. In its current form, this paper is not robust and possibly of very narrow interest. 

      Strengths: 

      Technical data related to the maize sample are robust. Radiocarbon dating strongly evidenced the sample age, estimated to be around 1474 AD. Authentication of ancient DNA has been done robustly. Spontaneous C-to-T substitutions, which are present in all ancient DNA, are visible in the reported sample with the expected pattern. Despite a low fraction of C-to-T at the 1st base, this number could be consistent with the cool and dry climate in which the sample was preserved. The distribution of DNA fragment sizes is consistent with expectations for a sample of this age. 

      Weaknesses: 

      Thank you for all your thoughtful comments. See below for comments on each.

      (1) Archaeological context for the maize sample is weakly supported by speculation about the origin and has unreasonable claims weighing on it. Perhaps those findings would be more convincing if the authors were to present evidence that supports their conclusions: i) a map of all known tombs near La Paz, ii) evidence supporting the stone tomb origins of this assemblage, and iii) evidence supporting non-Inca provenance of the tomb. 

      We believe we are clear about what information we have about context.  First, the intake records from the MSU Museum from 1890 are not as detailed as we would like, but we cannot enhance them. The mummified girl and her accoutrements, including the maize, came from a stone tower or chullpa south of La Paz, in what is now Bolivia. We do not know which stone chullpa, so a map would be of limited use.  The mortuary group is identified as Inca, but as we note the accoutrements do not appear of high status, so it is possible that she is not an elite.  Mud tombs are normally attributed to the local population, and stone towers to Inca or elites. We have clarified at multiple places in the text that the maize is from the period of Inca incursion in this part of Bolivia and have modified text to reflect greater uncertainty of Inca or local origin, but that selection for environmentally favorable characteristics had taken place.  Regardless, there are three 15th c CE or AD AMS ages on the maize, a cucurbita rind, and a camelid fiber.  The maize is almost certainly mid to late 15th century CE.

      (2) Dismissal of the admixture in the reported samples is not evidenced correctly. Population f3 statistic with an outgroup is indeed one of the most robust metrics for sample relatedness; however, it should not be used as a test of admixture. For an admixture test, the population f3 statistic should be used in the form: i) target population, ii) one possible parental population, iii) another possible parental population. This is typically done iteratively with all combinations of possible parental populations. Even in such a form, the population f3 statistic is not very sensitive to admixture in cases of strong genetic drift, and instead population f4 statistic (with an outgroup) is a recommended test for admixture. 

      We have removed “Our admixture f3-statistics test results suggest aBM is not admixed” in our revised manuscript. Since our goal here is to identify which group(s) has(have) the highest relatedness with aBM, the population f3 statistic with an outgroup is the most robust metric for this test and for supporting our conclusion here.
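
      A minimal sketch of the three-population form described in comment (2), assuming per-SNP allele frequencies are already available for each population. The function names and the toy frequencies are hypothetical, and this naive estimator omits the finite-sample bias correction and the block-jackknife standard errors that a real analysis would need.

```python
from itertools import combinations


def f3(target, source_a, source_b):
    """Naive f3(target; A, B): mean over SNPs of (c - a) * (c - b)."""
    terms = [(c - a) * (c - b) for c, a, b in zip(target, source_a, source_b)]
    if not terms:
        raise ValueError("no SNPs supplied")
    return sum(terms) / len(terms)


def scan_source_pairs(target_name, freqs):
    """Evaluate f3(target; A, B) for every pair of candidate parental populations."""
    candidates = sorted(name for name in freqs if name != target_name)
    return {
        (a, b): f3(freqs[target_name], freqs[a], freqs[b])
        for a, b in combinations(candidates, 2)
    }


if __name__ == "__main__":
    # Hypothetical allele frequencies at four SNPs (not real maize data).
    toy = {
        "target": [0.5, 0.5, 0.5, 0.5],
        "pop_A": [0.0, 1.0, 0.0, 1.0],
        "pop_B": [1.0, 0.0, 1.0, 0.0],
        "pop_C": [0.5, 0.5, 0.0, 1.0],
    }
    for pair, value in scan_source_pairs("target", toy).items():
        print(pair, value)
```

      In this form, a clearly negative value for some pair of sources is the signal of admixture in the target; a positive value is uninformative, especially under strong drift, which is why the reviewer points to f4 statistics as the recommended test.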

      (3) The geographic placement of the sample based on genetic data is not robust. To make use of the method correctly, it would be necessary to validate that genetic samples in this region follow the assumption of the 'isolation-by-distance' with dense sampling, which has not been done. Additionally, the authors posit that "This suggests that aBM might not only be genetically related to the archaeological maize from ancient Peru, but also in the possible geographic location." The method used to infer the location is based on pure genetic estimation. The above conclusion is not supported by this method, and it directly contradicts the authors' suggestion that the sample comes from Bolivia.  

      We understand that it is necessary to validate the 'isolation-by-distance' assumption with dense sampling. However, we did not do so because: 1) the ancient maize samples range in age from ~5000 BP to ~100 BP and were found in very different countries at different times; 2) isolation-by-distance is a population genetic concept, often used to test whether populations that are geographically farther apart are also more genetically different. Since we have only 17 ancient samples in total, our sample size is not sufficient for a population-level test.

      Regarding “It directly contradicts the authors' suggestion that the sample comes from Bolivia”: as we described in our manuscript, “Given the provenience of the aBM and its age, it is possible the samples were local or alternatively were introduced into western highland Bolivia from the Inca core area – modern Peru.” The sample records do show that the aBM sample was found in Bolivia, but we do not know where aBM originally came from before it was found there. To answer this question, we used locator.py to predict the geographic location from which aBM may have originally come; the predicted location is inside modern Peru and is also very close to the archaeological Peruvian maize.

      Therefore, our conclusion that “This suggests that aBM might not only be genetically related to the archaeological maize from ancient Peru, but also in the possible geographic location” does not contradict the fact that the sample was found in Bolivia.
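
      The isolation-by-distance check that comment (3) asks for can be illustrated with a short sketch: correlate pairwise geographic distance with pairwise genetic distance. Everything here is an assumption for illustration, including the function names, the input layout, and the toy coordinates; it is not the Locator method and not the authors' analysis.

```python
import math
from itertools import combinations


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def ibd_correlation(coords, genetic_dist):
    """Correlation between geographic and genetic distance over all sample pairs.

    coords: sample name -> (lat, lon)
    genetic_dist: frozenset({a, b}) -> genetic distance between samples a and b
    """
    pairs = list(combinations(sorted(coords), 2))
    geo = [haversine_km(coords[a], coords[b]) for a, b in pairs]
    gen = [genetic_dist[frozenset((a, b))] for a, b in pairs]
    return pearson(geo, gen)


if __name__ == "__main__":
    # Hypothetical samples along the equator with distance-proportional divergence.
    coords = {"s1": (0.0, 0.0), "s2": (0.0, 10.0), "s3": (0.0, 20.0)}
    gdist = {frozenset(("s1", "s2")): 1.0,
             frozenset(("s2", "s3")): 1.0,
             frozenset(("s1", "s3")): 2.0}
    print(ibd_correlation(coords, gdist))
```

      Because pairwise distances are not independent, the significance of such a correlation is usually assessed with a permutation scheme such as a Mantel test rather than a standard p-value, and with samples spanning thousands of years, age would confound the comparison, as the response itself notes.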

      (4) The conclusion that Ancient Andean maize is genetically similar to European varieties and hence shares a similar evolutionary history is not well supported. The PCA plot in Figure 4 merely represents sample similarity based on two components (jointly responsible for about 20% of the variation explained), and European samples could be very distant based on other components. Indeed, the direct test using the outgroup f3 statistic does not support that European varieties are particularly closely related to ancient Andean maize. Perhaps these are more closely related to Brazil? We do not know, as this has not been measured. 

      Our conclusion is “We also found that a few types of maize from Europe have a much closer distance to the archaeological maize cluster compared to other modern maize, which indicates maize from Europe might expectedly share certain traits or evolutionary characteristics with ancient maize. It is also consistent with the historical fact that maize spread to Europe after Christopher Columbus's late 15th century voyages to the Americas. But as shown, maize also has diversity inside the European maize cluster. It is possible that European farmers and merchants may have favored different phenotypic traits, and the subsequent spread of specific varieties followed the new global geopolitical maps of the Colonial era”.

      We understand your concern that two components explain only about 20% of the variation. However, Figure 2b of Grzybowski, M.W. et al., 2023 reports that “the first principal component (PC1) of variation for genetic marker data roughly corresponded to the division between domesticated maize and maize wild relatives is only 1.3%”. This shows that low variance explained is quite common in maize, especially when the datasets include landraces, hybrids, and wild relatives. For our maize dataset, we have archaeological maize data ranging from ~5,000BP to ~100BP, and we also have modern maize, which makes the genetic structure of our data more complicated. Therefore, we think our two components are the best explanation currently possible. We have also included a PCA plot based on components 1 and 3 in Fig4_PCA13.pdf; it does not show that the European samples are very distant.

      For “Perhaps these are more closely related to Brazil?”, thank you for this very good question, but we apologize that we cannot answer this question from our current study because our study focuses on identifying the location where aBM originally came from, establishing and explaining patterns of genetic variability of maize, with a specific focus on maize strains that are related to our current aBM. Thus, we will not explore the story between maize from Brazil and European maize in our current study.

      (5) The conclusion that long branches in the phylogenetic tree are due to selection under local adaptation has no evidence. Long branches could be the result of missing data, nucleotide misincorporations, genetic drift, or simply due to the inability of phylogenetic trees to model complex population-level relationships such as admixture or incomplete lineage sorting. Additionally, captions to Figure S3, do not explain colour-coding.  

      We have removed “aBM tends to have long branches compare to tropicalis maize, which can be explained by adaption for specific local environment by time.” in our revised manuscript.

      We have added the color-coding information under Fig. S3 in our revised manuscript.

      (6) The conclusion that selection detected in aBM sample is due to Inca influence has no support. Firstly, selection signature can be due to environmental or other factors. To disentangle those, the authors would need to generate the data for a large number of samples from similar cultural contexts and from a wide-ranging environmental context, followed by a formal statistical test. Secondly, allele frequency increase can be attributed to selection or demographic processes, and alone is not sufficient evidence for selection. The presented XP-EHH method seems more suitable. Overall, methods used in this paper raise some concerns: i) how accurate are allele-frequency tests of selection when only single individual is used as a proxy for a whole population, ii) the significance threshold has been arbitrary fixed to an absolute number based on other studies, but the standard is to use, for example, top fifth percentile. Finally, linking selection to particular GO terms is not strong evidence, as correlation does not imply causation, and links are unclear anyway. 

      In sum, this manuscript presents new data that seems to be of high quality, but the analyses are frequently inappropriate and/or over-interpreted. 

      Regarding your suggestion that “from similar cultural contexts and from a wide-ranging environmental context, followed by a formal statistical test”, we apologize that this cannot be done in our current study because we could not find other archaeological maize samples/datasets that are from similar cultural contexts.

      For “Secondly, allele frequency increase can be attributed to selection or demographic processes, and alone is not sufficient evidence for selection.” Yes, we agree, and that’s why we said it “inferred” the conclusion instead of “indicated”. Furthermore, we revised the whole manuscript following all reviewers’ comments and reorganized and reduced the part on selection on aBM.

      For “The presented XP-EHH method seems more suitable”, we do not think XP-EHH is the best method that could be used here because we only have one aBM sample, but XP-EHH is more suitable for a population analysis.

      For “Finally, linking selection to particular GO terms is not strong evidence, as correlation does not imply causation, and links are unclear anyway.”, as we described in our manuscript, our results “inferred” instead of “indicated” the conclusion.

      Reviewer #2 (Public review): 

      Summary: 

      The manuscript presents valuable new datasets from two ancient maize seeds that contribute to our growing understanding of the maize evolution and biodiversity landscape in pre-colonial South America. Some of the analyses are robust, but the selection elements are not supported. 

      Strengths: 

      The data collection is robust, and the data appear to be of sufficiently high quality to carry out some interesting analytical procedures. The central finding that aBM maize is closely related to maize from the core Inca region is well supported, although the directionality of dispersal is not supported. 

      Weaknesses: 

      Thank you for your comments and suggestions. See below for responses and explanations.

      The selection results are not justified, see examples in the detailed comments below. 

      (1) The manuscript mentions cultural and natural selection (line 76), but then only gives a couple of examples of selecting for culinary/use traits. There are many examples of selection to tolerate diverse environments that could be relevant for this discussion, if desired. 

      We have added related examples with references supported in our revised manuscript.  

      (2) I would be extremely cautious about interpreting the observations of a Spanish colonizer (lines 95-99) without very significant caveats. Indigenous agriculture and food ways would have been far more nuanced than what could be captured in this context, and the genocidal activities of the Europeans would have impacted food production activities to a degree, and any contemporaneous accounts need to be understood through that lens.  

      We agree with the first part of this comment and have softened our use of this particular textual material such that it is far less central to interpretation. While of interest, we cannot evaluate the impact of colonial European activities or observational bias for purposes of this analysis.

      (3) The f3 stats presented in Figure 2 are not set up to test any specific admixture scenarios, so it is unsupported to conclude that the aBM maize is not admixed on this basis (lines 201-202). The original f3 publication (Patterson et al, 2012) describes some scenarios where f3 characteristics associate with admixture, but in general, there are many caveats to this approach, and it's not the ideal tool for admixture testing, compared with e.g., f4 and D (abba-baba) statistics.  

      You make an important point that f3 statistics are not the ideal tool for admixture testing. Since our study goal here is to identify which group(s) has(have) the highest relatedness with aBM, the population f3 statistic with an outgroup is the most robust metric with which to do the test and to support our conclusion here. We have removed “Our admixture f3-statistics test results suggest aBM is not admixed” in our revised manuscript.

      (4) I'm a little bit skeptical that the Locator method adds value here, given the small training sample size and the wide geographic spread and genetic diversity of the ancient samples that include Central America. The paper describing that method (Battey et al 2020 eLife) uses much larger datasets, and while the authors do not specifically advise on sample sizes, they caution about small sample size issues. We have already seen that the ancient Peruvian maize has the most shared drift with aBM maize on the basis of the f3 stats, and the Locator analysis seems to just be reiterating that. I would advise against putting any additional weight on the Locator results as far as geographic origins, and personally I would skip this analysis in this case.  

      As we described in our manuscript, we have 17 archaeological samples in total. Please find more detailed information from the “geographical location prediction” section.

      We cannot add more ancient samples because these are all that we could find in previous publications. We would still like to keep this analysis because f3 statistics indicate genomic similarity, whereas the purpose of the locator.py analysis is to predict the location of origin of a genetic sample by comparing it to a set of samples of known geographic origin.

      (5) The overlap in PCA should not be used to confirm that aBM is authentically ancient, because with proper data handling, PCA placement should be agnostic to modern/ancient status (see lines 224-226). It is somewhat unexpected that the ancient Tehuacan maize (with a major teosinte genomic component) falls near the ancient South American maize, but this could be an artifact of sampling throughout the PCA and the lack of teosinte samples that might attract that individual.  

      We have removed “which supports the authenticity of aBM as archaeological maize” in our revised manuscript. The PCA was applied to maize samples only, so no teosinte samples were included in the analysis.

      (6) What has been established (lines 250-251) is genetic similarity to the Inca core area, not necessarily the directionality. Might aBM have been part of a cultural region supplying maize to the Inca core region, for example? Without a specific test of dispersal directionality, which I don't think is possible with the data at hand, this is somewhat speculative. 

      We added this and re-wrote this part in our revised manuscript.

      (7) Singleton SNPs are not a typical criterion for identifying selection; this method needs some citations supporting the exact approach and validation against neutral expectations (line 278). Without Datasets S2 and S3, which are not included with this submission, it is difficult to assess this result further. However, it is very unexpected that ~18,000 out of ~49,000 SNPs would be unique to the aBM lineage. This most likely reflects some data artifact (unaccounted damage, paralogs not treated for high coverage, which are extremely prevalent in maize, etc). I'm confused about unique SNPs in this context. How can they be unique to the aBM lineage if the SNPs used overlap the Grzybowski set? The GO results do not include any details of the exact method used or a statistical assessment of the results. It is not clear if the GO terms noted are statistically enriched.  

      We have added references 53 and 54 in our revised manuscript, and we also uploaded the Datasets S2 and S3.

      For “I'm confused about unique SNPs in this context. How can they be unique to the aBM lineage if the SNPs used overlap the Grzybowski set?”, as we described in our Materials and Methods section, “To achieve potential unique selection on aBM, we calculated the allele frequency for each SNPs between aBM and other archaeological maize, resulting in allele frequency data for 49,896 SNPs. Of these, 18,668 SNPs were unique to aBM.” Thus, the unique SNPs for aBM came from the comparison between aBM and other archaeological maize, and we did not use any modern maize data from the Grzybowski set.

      For “The GO results do not include any details of the exact method used or a statistical assessment of the results. It is not clear if the GO terms noted are statistically enriched.” We did not do GO term enrichment, so there are no statistical assessments for the results. Instead, we collected the GO term information for each gene by checking its biological process in MaizeGDB, and then summarized the results in Dataset S4.

      (8) The use of XP-EHH with pseudo haplotype variant calls is not viable (line 293). It is not clear what exact implementation of XP-EHH was used, but this method generally relies on phased or sometimes unphased diploid genotype calls to observe shared haplotypes, and some minimum population size to derive statistical power. No implementation of XP-EHH to my knowledge is appropriate for application to this kind of dataset. 

      We used the same XP-EHH method as in this publication: “Sabeti, P.C. et al. Genome-wide detection and characterization of positive selection in human populations. Nature 449, 913-918 (2007).” Specifically, in our analysis, the SNP information of modern maize was compared with that of ancient maize. The code is available at https://doi.org/10.5061/dryad.w6m905qtd.

      XP-EHH is a statistical method used in population genetics to detect recent positive selection in one population compared to another, and it has often been applied to large modern maize populations in previous research. In our study, we wanted to detect recent positive selection in modern maize compared to ancient maize; thus, we applied XP-EHH here. Although the population size of ancient maize is not large, it is the best method that we can apply to our dataset to detect recent selection on modern maize.

      Reviewer #3 (Public review): 

      Summary: 

      The authors seek to place archaeological maize samples (2 kernels) from Bolivia into genetic and geographical context and to assess signatures of selection. The kernels were dated to the end of the Incan empire, just prior to European colonization. Genetic data and analyses were used to characterize the distance from other ancient and modern maize samples and to predict the origin of the sample, which was discovered in a tomb near La Paz, Bolivia. Given the conquest of this region by the Incan empire, it is possible that the sample could be genetically similar to populations of maize in Peru, the center of the Incan empire. Signatures of selection in the sample could help reveal various environmental variables and cultural preferences that shaped maize genetic diversity in this region at that time. 

      Strengths: 

      The authors have generated substantial genetic data from these archaeological samples and have assembled a data set of published archaeological and modern maize samples that should help to place these samples in context. The samples are dated to an interesting time in the history of South America during a period of expansion of the Incan empire and just prior to European colonization. Much could be learned from even this small set of samples. 

      Weaknesses: 

      Many thanks for your comments and suggestions.  We have addressed these below and provided further explanation.

      (1) Sample preparation and sequencing: 

      Details of the quality of the samples, including the percentage of endogenous DNA are missing from the methods. The low percentage of mapped reads suggests endogenous DNA was low, and this would be useful to characterize more fully. Morphological assessment of the samples and comparison to morphological data from other maize varieties is also missing. It appears that the two kernels were ground separately and that DNA was isolated separately, but data were ultimately pooled across these genetically distinct individuals for analysis. Pooling would violate assumptions of downstream analysis, which included genetic comparison to single archaeological and modern individuals. 

      We did not carry out a morphological assessment of the samples or a comparison to morphological data from other maize varieties because we only have 2 aBM kernels, and we do not have other archaeological samples that could be used for comparison.

      For “It appears that the two kernels were ground separately and that DNA was isolated separately, but data were ultimately pooled across these genetically distinct individuals for analysis”, as stated in our Materials and Methods section, “Whole kernels were crushed in a mortar and pestle”; the two kernels were ground together before being sequenced.

      While morphological assessment of the sample would be interesting, most morphological data reported for maize are from microremains (starch, phytoliths, pollen) and this is beyond the scope of our study. Most studies of macrobotanical remains do not appear to focus solely on individual kernels, but instead on (or in combination with) cob and ear shape, which were not available in the assemblage.

      (2) Genetic comparison to other samples: 

      The authors did not meaningfully address the varying ages of the other archaeological samples and modern maize when comparing the genetic distance of their samples. The archaeological samples were as old as >5000 BP to as young as 70 BP and therefore have experienced varying extents of genetic drift from ancestral allele frequencies. For this reason, age should explicitly be included in their analysis of genetic relatedness. 

      We have revised the related part of our manuscript.

      (3) Assessment of selection in their ancient Bolivian sample: 

      This analysis relied on the identification of alleles that were unique to the ancient sample and inferred selection based on a large number of unique SNPs in two genes related to internode length. This could be a technical artifact due to poor alignment of sequence data, evidence supporting pseudogenization, or within an expected range of genetic differentiation based on population structure and the age of the samples. More rigor is needed to indicate that these genetic patterns are consistent with selection. This analysis may also be affected by the pooling of the Bolivian archaeological samples.  

      We do not think it is because of poor alignment of sequence data since we used BWA v0.7.17 with disabled seed (-l 1024) and 0 mismatch alignment. Therefore, there are no SNPs that could come from poor alignment. Please see our detailed methods description here: “For the archaeological maize samples, adapters were removed and paired reads were merged using AdapterRemoval [60] with parameters --minquality 20 --minlength 30. All 5′ thymine and 3′ adenine residues within 5nt of the two ends were hard-masked, where deamination was most concentrated. Reads were then mapped to soft-masked B73 v5 reference genome using BWA v0.7.17 with disabled seed (-l 1024 -o 0 -E 3) and a quality control threshold (-q 20) based on the recommended parameter [61] to improve ancient DNA mapping”.
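
      The hard-masking step quoted above can be illustrated with a minimal sketch. This is not the authors' code (which is not shown in the response); the function name `hard_mask_termini` and the choice of `N` as the masking character are assumptions made for illustration. It masks thymines near the 5′ end and adenines near the 3′ end of a read, the positions where deamination-derived miscoding is expected to concentrate.

```python
def hard_mask_termini(read: str, window: int = 5) -> str:
    """Mask T within `window` nt of the 5' end and A within `window` nt of the 3' end.

    Hypothetical helper for illustration; masked bases are replaced with 'N'.
    """
    bases = list(read.upper())
    n = len(bases)
    # 5' end: C->T deamination shows up as excess T, so mask every T here.
    for i in range(min(window, n)):
        if bases[i] == "T":
            bases[i] = "N"
    # 3' end: G->A deamination shows up as excess A, so mask every A here.
    for i in range(max(n - window, 0), n):
        if bases[i] == "A":
            bases[i] = "N"
    return "".join(bases)


if __name__ == "__main__":
    print(hard_mask_termini("TTAC" + "G" * 8 + "ATAA"))
```

      Masking every terminal T and A, rather than only confirmed damage, trades some true signal for the removal of likely miscoding lesions, which is why a damage plot made before this step would still show the damage curve.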

      Regarding “More rigor is needed to indicate that these genetic patterns are consistent with selection”: could you please be more specific about which method or approach we should use here? For example, are there methods from specific publications that could be referenced, or a specific tool that could be used?

      “This analysis may also be affected by the pooling of the Bolivian archaeological samples.” As we could not prove these two seeds came from two different individual plants, we do not think this analysis was affected by the pooling of the Bolivian archaeological samples.

      (4) Evidence of selection in modern vs. ancient maize: In this analysis, samples were pooled into modern and ancient samples and compared using the XP-EHH statistic. One gene related to ovule development was identified as being targeted by selection, likely during modern improvement. Once again, ancient samples span many millennia and both South, Central, and North America. These, and the modern samples included, do not represent meaningfully cohesive populations, likely explaining the extremely small number of loci differentiating the groups. This analysis is also complicated by the pooling of the Bolivian archaeological samples. 

      Yes, it is possible that ovule development might be a modern improvement. We re-wrote this part in our revised manuscript.

      Reviewer #1 (Recommendations for the authors): 

      My suggestion is to address the comments that outline why the methods used or results obtained are not sufficient to support your conclusions. Overall, I suggest limiting the narrative of Inca influence and framing it as speculation in the discussion section. Presenting conclusions of Inca influence in the title and abstract is not appropriate, given the very questionable evidence. 

      We agree and have changed the title to “Fifteenth century CE Bolivian maize reveals genetic affinities with ancient Peruvian maize”.

      Reviewer #2 (Recommendations for the authors): 

      (1) Line 74: Mexicana is another subspecies of teosinte; the distinction is between ssp. mexicana and ssp. parviglumis (Balsas teosinte), not mexicana and teosinte. 

      We have corrected this in our revised manuscript.

      (2) Line 100-102: This is a bit confusing, it cannot have been a symbol of empire "since its first introduction", since its introduction long predates the formation of imperial politics in the region. Reference 17 only treats the late precolonial Inca context, while ref 22 (which cites maize cultivation at 2450 BC, not 3000 BC) makes no reference to ritual/feasting contexts; it simply documents early phytolith evidence for maize cultivation. As such, this statement is not supported by the references offered.

      Lines 100-102: This point is well taken; the original prose was poor. We have revised this passage to remove the confusing statement and have corrected the age given for reference 22. The associated prose has been modified accordingly.

      We have corrected them to read “Indeed, in the Andes, previous research showed that under the Inca empire, maize was fulfilled multiple contextual roles. In some cases, it operated as a sacred crop” and “…since its first introduction to the region around 2500 BC”.

      (3) Line 161: IntCal is likely not the appropriate calibration curve for this region; dates should probably be calibrated using SHCal.  

      We greatly appreciate this important (and correct) observation. We have completely recalibrated the maize AMS result based on the southern hemisphere calibration curve, discussed the new calibrations, and have also invoked two other AMS dates on associated material, likewise calibrated with the southern hemisphere curve, for comparison. We are confident in a 15th century AD/CE age for the maize, most likely mid- to late 15th century.

      (4) Lines 167-169: The increase of G and A residues shown in Supplementary Figure S1a is just before the 5' end of the read within the reference genome context, and is related to fragmentation bias - a different process from postmortem deamination. Deamination leads to 5' C->T and 3' G->A, resulting in increased T at 5' ends and increased A at 3' ends, and the diagnostic damage curve. The reduction of C/T just before reads begin is not a result of deamination. 

      We have removed the “Both features are indicative of postmortem deamination patterns” in our revised manuscript.

      (5) Lines 187-196 This section presents a lot of important external information establishing hypotheses, and needs some references.  

      We have added the related references here.

      (6) Line 421: This makes it sound like damage masking was done BEFORE read mapping. However, this conflicts with the previous paragraph about mapDamage, and Supplementary Figure 1 still shows a slight but perceptible damage curve, which is impossible if all terminal Ts and As are hard-masked. This should be reconciled.

      Supplementary Figure 1 shows the raw ancient maize DNA data before damage masking. Specifically, in Step 1 we used mapDamage to check whether damage was present and produced Supplementary Figure 1. In Step 2, we used our own code to hard-mask the damaged bases and then performed read mapping.

      The purpose of Supplementary Figure 1 is to show the authenticity of aBM as archaeological maize. Therefore, it should show a slight but perceptible damage curve.

      (7) Line 460: PCA method is not given (just the LD pruning and the plotting).  

      PCA was performed on the merged SNP dataset for archaeological and modern maize using “plink --pca”.
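
      For readers unfamiliar with what a genotype PCA of this kind computes, the sketch below shows a roughly analogous calculation in NumPy. It is an illustration only, not the authors' pipeline and not PLINK's implementation: the function name `genotype_pca`, the toy dosage matrix, and the variance-standardization shown here are assumptions, and missing genotypes are not handled.

```python
import numpy as np


def genotype_pca(dosages: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Return the top principal components of a samples x SNPs dosage matrix (0/1/2)."""
    freqs = dosages.mean(axis=0) / 2.0           # allele frequency per SNP
    keep = (freqs > 0) & (freqs < 1)             # drop monomorphic SNPs
    g, p = dosages[:, keep], freqs[keep]
    z = (g - 2 * p) / np.sqrt(2 * p * (1 - p))   # variance-standardize each SNP
    grm = z @ z.T / z.shape[1]                   # samples x samples relationship matrix
    eigvals, eigvecs = np.linalg.eigh(grm)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order]


if __name__ == "__main__":
    # Toy data: two samples from each of two differentiated groups.
    toy = np.array([[0, 0, 2, 2],
                    [0, 1, 2, 2],
                    [2, 2, 0, 0],
                    [2, 1, 0, 0]], dtype=float)
    print(genotype_pca(toy))
```

      In the toy matrix, the first component separates the two groups by sign, which is the kind of clustering a PCA plot of ancient and modern samples is meant to reveal.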

      (8) "tropicalis" maize is not common usage, it is not clear to me what this refers to. 

      We have changed all instances of “tropicalis maize” to “tropical maize” in our revised manuscript.

      (9) The Figure 4 color palette is not accessible for colorblind/color-deficient vision.  

      We have changed the colors of Figure 4. Please find the new colors in our uploaded Figure 4.

      (10) Datasets S2 and S3 are not included with this submission. 

      Thank you for letting us know and your suggestion. We have included Datasets S2 and S3 here.

    1. Aside from issuing a brief statement on the organization’s website in August 2017 and another general statement on December 1, 2018, both of which condemned the actions of those they saw as using Confederate heritage for a hateful cause, the Daughters have remained silent and have refused to give media interviews, even after the vandalism of their headquarters. While some local chapters of the UDC have fought back by filing lawsuits or requesting restraining orders against monument removal in the face of local decisions, on the whole the organization’s silence speaks volumes about how it is now a shell of its former self

      This makes me think about how silence can be a political act. By refusing to engage, the UDC tries to preserve its version of history through absence rather than argument. Can a society truly move forward if those who created harmful narratives refuse to speak or take accountability?

    2. Locals had reinterpreted the Lee monument through protest art and it swiftly became a tourist destination, a place to take photos and to see a new kind of history in action

      This transformation raises questions about who “owns” history and how public spaces can evolve when the people demand new meanings. When citizens transform monuments like this, are they erasing history or creating a more honest version of it?

    1. eLife Assessment

      This study presents a meta-analysis of two independent genome-wide association studies (GWAS) that investigate the role of plasma proteins as potential biomarkers for enhancing the early detection of prostate cancer (PCa). The results provide useful confirmatory data that support existing evidence currently published. The evidence is incomplete: the study does not provide a comprehensive synthesis of all currently published work, does not explore other clinical outcomes related to prostatic disease, and its findings have not been validated through an external cohort study. These shortcomings notwithstanding, the work may be of interest to researchers studying correlates and predictors of prostate cancer risk.

    2. Reviewer #1 (Public review):

      Summary:

      In Causal associations between plasma proteins and prostate cancer: a Proteome-Wide Mendelian Randomization the authors present a manuscript which seeks to identify novel markers for prostate cancer through analysis of large biobank-based datasets, and to extend this analysis to potential therapeutic targets for drugs. This is an area which is already extensively researched, but remains important, due to the high burden and mortality of prostate cancer globally.

      Strengths:

      The main strengths of the manuscript are the identification and use of large biobank data assets, which provide large numbers of cases and controls, essential for achieving statistical power. The databases used (deCODE, FinnGen and the UK Biobank) allow for robust numbers of cases and controls. The analytical method chosen, Mendelian Randomization, however, may not be appropriate to the problem (without extensive validation, MR can be prone to false or misleading discoveries). The manuscript also integrates multi-omic datasets, here using protein data as well as GWAS sources to integrate genomic and proteomic data.

      Weaknesses:

      The main weaknesses of the manuscript relate to the following areas:

      (1) The failure of the study to analyse the data in the context of other closely related conditions such as benign prostatic hyperplasia (BPH) or lower urinary tract symptoms (LUTS), which have some pathways and biomarkers in common, such as inflammatory pathways (including complement) and specific markers such as KLK3. As a consequence, it is not possible for readers to know whether the findings are specific to prostate cancer, or whether they are generic to prostate dysfunction. Given the prevalence of prostate dysfunction (half of men reaching their sixth decade), the potential for false positives and overtreatment from non-specific biomarkers is a major problem, resulting in the evidence presented in this manuscript being weak. Other researchers have addressed this issue using the same data sources as presented here, for example in this paper looking at BPH in the UK Biobank population: https://www.nature.com/articles/s41467-018-06920-9

      (2) There is no discussion of Gleason scores with regard to either biomarkers or therapies, and a general lack of discussion around indolent disease as compared with more aggressive variants. These are crucial issues with regard to the triage and identification of genomically aggressive localized prostate cancers. See for example the work set out in: https://doi.org/10.1038/nature20788. In the revised version of the manuscript the authors set this out as a limitation, but this does not solve the core problem, which is that without this important biological context, the findings are unlikely to be robust.

      (3) An additional issue is that the field of PCa research is fast-moving. The manuscript cites ~80 references, but too few of these are from recent studies and many important and relevant papers are not included. The manuscript would be much stronger if it compared and contrasted its findings with more recent studies of PCa biomarkers and targets, especially those concerned with multi-omics and those including BPH. In the latest revised version of the manuscript, some changes have been made, but the source data are still too limited for in-depth analysis.

      (4) The Methods section provides no information on how the Controls were selected. There is no Table providing cohort data to allow the reader to know whether there were differences in age, BMI, ethnic grouping, social status or deprivation, or smoking status, between the Cases and Controls. These types of data are generally recorded in Biobank data; in the latest version of the manuscript the authors state that they don't have any ability to derive matched data, which again prevents deep analysis of the data.

      Assessing impact:

      Because of the weaknesses of the approach identified above, without further additions to the manuscript, the likely impact of the work on the field is minimal. There is no significant utility of the methods and data to the community, because the data are pre-existing and are not newly introduced to the community in this work, and mendelian randomization is a well-described approach in common use, and therefore the assets and methods described in the manuscript are not novel. In addition, Mendelian randomization is not always appropriate, especially when analysing publicly available data, see:

      Stender et al. Lipids in Health and Disease (2024) 23:286. https://doi.org/10.1186/s12944-024-02284-w

      With regard to the authors achieving their aims, without assessing specificity and without setting their findings in the context of the latest literature, the authors (and readers) cannot know or assess whether the biomarkers identified or the druggable targets will be useful in the clinic.

      In conclusion, adding additional context and analysis to the manuscript would both help readers interpret and understand the work, and would also greatly enhance its significance. For example, the UK Biobank includes data on men with BPH / LUTS, as analysed in this paper, for example, https://doi.org/10.1038/s41467-018-06920-9. In the latest version of the manuscript and through the responses to earlier review comments, the authors explain why this has not been possible, but this naturally limits the value of the research.

    3. Reviewer #2 (Public review):

      This is potentially interesting work, but the analyses are attempted in a rather scattergun way, with little evident critical thought. The structure of the work (Results before Methods) can work in some manuscripts, but it is not ideal here. The authors discuss results before we know anything about the underlying data that the results come from. It gives the impression that the authors regard data as a resource to be exploited, without really caring where the data comes from. The methods can provide meaningful insights if correctly used, but while I don't have reasons to doubt that the analyses were conducted correctly, findings are presented with little discussion or interpretation. No follow-up analyses are performed.

      This is much improved but there remain some small concerns and one large concern:

      Using numbering from the previous review:

      (1) This looks better, but I still don't understand the claim in the text: "We found 5 genetic risk loci contained at least one SNP passing the genome-wide significance threshold of P ≤ 5×10⁻⁸". Far more gene regions appear to cross 10^-8 in Figure 1. What am I missing?

      (6) I don't understand the authors' response here. Early detection is important, but MR is not the right tool to investigate biomarkers for early detection. Biomarkers for early detection do not have to be causal biomarkers. The authors replied to this point, but the manuscript was unchanged.

      (7) Again, the authors still state "193 proteins were associated with PCa risk" even though they acknowledge that their analysis does not test whether proteins associate with PCa risk or not. When an error is pointed out, and you acknowledge it, please change the manuscript to correct the text. Otherwise, what is the peer review process for?

      The large concern is that these analyses, while now better explained, are still the product of a semi-automated procedure. It is a good procedure, but the manuscript essentially takes public data from different sources and uses this to create a manuscript. Overall, I think there is enough novel synthesis to justify publication, but it is not automatic.

      Strengths:

      The data and methods used are state-of-the-art.

      Weaknesses:

      The reader will have to provide their own translational insight.

    4. Reviewer #3 (Public review):

      Summary of concerns about the revised submission from the Reviewing Editor:

      With respect to Originality of the work, in the last 18 months, there have been 38 publications on combined topics of: (i) UK Biobank data, (ii) Mendelian randomization, (iii) and prostate cancer. The authors should consider the literature addressing prostate cancer via Mendelian randomization--specifically those using the UK Biobank data--published from 2024 onwards. A proper and comprehensive synthesis of recent findings should be made, to allow readers to compare and contrast how the work supports (or differs) from the findings presented in these other published studies.

      With respect to the significance of the findings, we feel the study data are incomplete for the strength of evidence. Given the deluge of manuscripts and publications on similar topics, the study offers incremental evidence and lacks a synthesis of all currently published findings.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      In Causal associations between plasma proteins and prostate cancer: a Proteome-Wide Mendelian Randomization, the authors present a manuscript which seeks to identify novel markers for prostate cancer through analysis of large biobank-based datasets and to extend this analysis to potential therapeutic targets for drugs. This is an area that is already extensively researched, but remains important, due to the high burden and mortality of prostate cancer globally.

      Strengths:

      The main strengths of the manuscript are the identification and use of large biobank data assets, which provide large numbers of cases and controls, essential for achieving statistical power. The databases used (deCODE, FinnGen, and the UK Biobank) allow for robust numbers of cases and controls. The analytical method chosen, Mendelian Randomization, is appropriate to the problem. Another strength is the integration of multi-omic datasets, here using protein data as well as GWAS sources to integrate genomic and proteomic data.

      Thank you for your positive feedback on the overall quality of our work. We greatly appreciate the time and effort you took to review our manuscript.

      Weaknesses:

      The main weaknesses of the manuscript relate to the following areas:

      (1) The failure of the study to analyse the data in the context of other closely related conditions such as benign prostatic hyperplasia (BPH) or lower urinary tract symptoms (LUTS), which have some pathways and biomarkers in common, such as inflammatory pathways (including complement) and specific markers such as KLK3. As a consequence, it is not possible for readers to know whether the findings are specific to prostate cancer or whether they are generic to prostate dysfunction. Given the prevalence of prostate dysfunction (half of men reaching their sixth decade), the potential for false positives and overtreatment from non-specific biomarkers is a major problem, resulting in the evidence presented in this manuscript being weak. Other researchers have addressed this issue using the same data sources as presented here, for example, in this paper, looking at BPH in the UK Biobank population. https://www.nature.com/articles/s41467-018-06920-9

      Thank you for your valuable comment. We fully agree that biomarker development must prioritize specificity to avoid overtreatment. Our study is intended as a foundational step toward identifying potential therapeutic targets or complementary biomarkers for prostate cancer, not as a direct endorsement of these proteins for standalone clinical diagnosis. Mendelian randomization analysis strengthens causal inference by design, and we further ensured robustness through sensitivity analyses (e.g., MR-Egger regression for pleiotropy, Bonferroni correction for multiple testing). These methods distinguish true causal effects from nonspecific associations. Importantly, while PSA’s lack of specificity is widely recognized, its role in reducing PCa mortality underscores the value of biomarker-driven screening. Our findings align with the need to integrate multiple markers (e.g. combining a novel protein with PSA) to improve diagnostic precision. Translating these causal insights into clinical tools remains challenging but represents a necessary next step, and we emphasize that this work provides a rigorous starting point for future validation studies.

      (2) There is no discussion of Gleason scores with regard to either biomarkers or therapies, and a general lack of discussion around indolent disease as compared with more aggressive variants. These are crucial issues with regard to the triage and identification of genomically aggressive localized prostate cancers. See, for example, the work set out in: https://doi.org/10.1038/nature20788

      Thank you for pointing this out. We acknowledge that our original analysis did not directly address this critical issue due to a key data limitation: the publicly available GWAS summary statistics for PCa (from openGWAS and FinnGen) do not provide genetic associations stratified by phenotypic severity or molecular subtypes. This limitation precluded MR analysis of proteins specifically linked to aggressive disease. To partially bridge this gap, we integrate evidence from recent studies in the revised Discussion section to explore the relevance of potential biomarkers to aggressive PCa.

      (3) An additional issue is that the field of PCa research is fast-moving. The manuscript cites ~80 references, but too few of these are from recent studies, and many important and relevant papers are not included. The manuscript would be much stronger if it compared and contrasted its findings with more recent studies of PCa biomarkers and targets, especially those concerned with multi-omics and those including BPH.

      Thank you for your professional comments. We have rigorously updated the manuscript to include more recent publications and we systematically compare and contrast our findings with these recent studies in the revised Discussion section.

      (4) The Methods section provides no information on how the Controls were selected. There is no Table providing cohort data to allow the reader to know whether there were differences in age, BMI, ethnic grouping, social status or deprivation, or smoking status, between the Cases and Controls. These types of data are generally recorded in Biobank data, so this sort of analysis should be possible, or if not, the authors' inability to construct an appropriately matched set of Controls should be discussed as a Limitation.

      We thank the reviewer for raising this important methodological concern. We have expanded the Limitations section to state this explicitly:

      “Lastly, our analysis relied exclusively on publicly available GWAS summary statistics from openGWAS and FinnGen, which did not provide individual-level data on covariates, resulting in no direct assessment of demographic or clinical differences between cases and controls.”

      Reviewer #2 (Public review):

      This is potentially interesting work, but the analyses are attempted in a rather scattergun way, with little evident critical thought. The structure of the work (Results before Methods) can work in some manuscripts, but it is not ideal here. The authors discuss results before we know anything about the underlying data that the results come from. It gives the impression that the authors regard data as a resource to be exploited, without really caring where the data comes from. The methods can provide meaningful insights if correctly used, but while I don't have reasons to doubt that the analyses were conducted correctly, findings are presented with little discussion or interpretation. No follow-up analyses are performed.

      In summary, there are likely some gems here, but the whole manuscript is essentially the output from an analytic pipeline.

      We thank the reviewer for the thoughtful evaluation of our work. In response to the concerns regarding manuscript structure and interpretative depth, we have restructured the manuscript to present the Methods section before Results, ensuring transparency in data sources and analytical workflows. Additionally, the Discussion section has been substantially revised to provide mechanistic explanations for key findings (e.g., associated phenotype, causal proteins, druggable targets), contextualize results within recent multi-omics studies and highlight clinical implications.  These revisions aim to transform the work from a pipeline-driven analysis to a biologically grounded investigation, offering actionable insights into prostate cancer pathogenesis and therapeutic development.

      Taking the researchers' aims in turn:

      (1) Meta-GWAS - while combining two datasets together can provide additional insights, the contribution of this analysis above existing GWAS is not clear. The PRACTICAL consortium has already reported the GWAS of 70% of these data. What additional value does this analysis provide? (Likely some, but it's not clear from the text.) Also, the presentation of results is unclear - authors state that only 5 gene regions contained variants at p < 5×10⁻⁸, but Figure 1 shows dozens of hits above 5×10⁻⁸. Also, the red line in Figure 1 (supposedly at 5×10⁻⁸) is misplaced.

      Thank you very much for your feedback. Although the PRACTICAL consortium constituted the majority of PCa GWAS data, our meta-analysis integrating FinnGen data enhanced statistical power, enabling robust detection of variants with low minor allele frequencies. Moreover, FinnGen's Finnish ancestry (genetic isolate) helps distinguish population-specific effects. The presentation of results showed the top 5 gene regions that contained variants at p < 5×10⁻⁸. We apologize for not noticing that the red line was not displayed correctly in the original figures included in the manuscript. We have updated it in the revised manuscript.
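
      The claim that combining cohorts increases power can be made concrete with a small worked example. The sketch below uses a fixed-effect inverse-variance-weighted meta-analysis of per-SNP effect estimates; whether the authors used this exact model is an assumption, and the function name `fixed_effect_meta` and the numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis for a single SNP.

    Returns the pooled effect, its standard error, and a two-sided p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return pooled_beta, pooled_se, p_value


if __name__ == "__main__":
    # Hypothetical SNP with the same effect and standard error in two cohorts.
    print(fixed_effect_meta([0.10], [0.02]))              # one cohort alone
    print(fixed_effect_meta([0.10, 0.10], [0.02, 0.02]))  # both cohorts pooled
```

      With these illustrative inputs, the single-cohort result falls short of the genome-wide threshold of 5×10⁻⁸, while the pooled result passes it, because pooling shrinks the standard error without changing the effect estimate.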

      (2) Cross-phenotype analysis. It is not really clear what this analysis is, or why it is done. What is the iCPAGdb? A database? A statistical method? Why would we want to know cross-phenotype associations? What even are these? It seems that the authors have taken data from an online resource and have written a paragraph based on this existing data with little added value.

      We appreciate the opportunity to clarify this analysis. The cross-phenotype analysis was designed to systematically identify phenotypic traits that share genetic or molecular pathways with prostate cancer, thereby uncovering pleiotropic mechanisms or shared risk factors. Here, iCPAGdb (integrated Cross-Phenotype Association Genetics Database) is a curated repository that aggregates GWAS summary statistics and evaluates genetic pleiotropy using LD-proxy associations from the NHGRI-EBI GWAS Catalog. Prostate carcinogenesis involves multisystem interactions, spanning endocrine dysregulation, immune microenvironment remodeling and metabolic reprogramming, rather than isolated molecular pathway disruptions. Cross-phenotype analysis is therefore indispensable for discriminating primary pathogenic drivers from secondary compensatory responses, ultimately informing the development of precision therapeutic strategies.

      In response to your concerns, we have revised the Results section to explicitly define the rationale and methodology of cross-phenotype analysis and restructured the Discussion to interpret phenotype-PCa associations within unified biological frameworks (e.g., metabolic dysregulation, androgen signaling), rather than presenting them as isolated findings.

      (3) PW-MR. I can see the value of this work, but many details are unclear. Was this a two-sample MR using PRACTICAL + FinnGen data for the outcome? How many variants were used in key analyses? Again, the description of results is sparse and gives little added value.

      We thank you for raising this issue. Two-sample MR refers to an analytical design where genetic instruments for the exposure (plasma proteins) and genetic associations with the outcome (PCa) are derived from non-overlapping populations. This ensures complete sample independence between exposure and outcome datasets to avoid confounding biases, regardless of whether the outcome data originate from single or multiple cohorts. The meta-analysis of PRACTICAL and FinnGen GWAS generates 27,210 quality-controlled variants (p < 5×10⁻⁸, MAF ≥ 1%, LD-clumped r² < 0.1) used in key analyses. Regarding the concern about sparse interpretation, we have substantially expanded the Discussion by comparing significant protein findings (e.g., MSMB, SERPINA3) with results from existing functional studies and multi-omics datasets and unravelling new insights.
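
      The two-sample design described above can be sketched numerically. The example below computes a per-variant Wald ratio and a fixed-effect inverse-variance-weighted (IVW) estimate from summary statistics. It is a textbook illustration rather than the authors' pipeline: the function names and the summary statistics are hypothetical, and the standard errors use the common first-order approximation that ignores uncertainty in the SNP-exposure estimates.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Causal estimate from one instrument: SNP-outcome effect / SNP-exposure effect."""
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)  # first-order approximation
    return estimate, se


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate across several independent instruments."""
    num = sum(bx * by / so ** 2
              for bx, by, so in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / so ** 2
              for bx, so in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


if __name__ == "__main__":
    # Hypothetical pQTL effects on protein level (exposure) and on PCa risk (outcome).
    bx = [0.2, 0.4]
    by = [0.1, 0.2]
    so = [0.05, 0.05]
    print(wald_ratio(bx[0], by[0], so[0]))
    print(ivw_estimate(bx, by, so))
```

      Both functions take only summary-level associations, which is why the exposure and outcome data can come from different, non-overlapping cohorts. The resulting estimate concerns genetically predicted protein levels, not measured concentrations.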

      (4) Colocalization - seems clear to me.

      (5) Additional post-GWAS analyses (pathway + druggability) - again, the analyses seem to be performed appropriately, although little additional insight other than the reporting of output from the methods.

      The post-MR druggability and pathway analyses serve two primary scientific purposes: (1) therapeutic prioritization - systematically evaluating which MR-identified proteins represent tractable drug targets (either through existing FDA-approved agents or compounds in clinical development) with direct relevance to cancer or PCa management, and (2) mechanistic hypothesis generation - mapping these candidate proteins to coherent biological pathways to guide future functional validation studies investigating their causal roles in prostate carcinogenesis. In response to your feedback, we have restructured the Discussion section under the subheading “Biological Mechanisms and Druggable Targets” to synthesize these findings, explicitly linking biological pathways to therapeutic targets.

      Minor points:

      (6) The stated motivation for this work is "early detection". But causality isn't necessary for early detection. If the authors are interested in early detection, other analysis approaches are more appropriate.

      We appreciate your insightful feedback. Early detection is one motivation for this work; our goal is also to identify causally implicated proteins that may serve as intervention targets for PCa prevention or therapy. Establishing causality is critical for distinguishing biomarkers that drive disease pathogenesis from those that are secondary to disease progression, as the former hold greater specificity for early detection and prioritization of therapeutic targets. While we acknowledge that validation for early detection may require additional methodologies, MR analysis provides a foundational step by prioritizing candidate proteins with causal links to disease. This approach ensures that downstream efforts focus on biomarkers and targets with the greatest potential to alter disease trajectories, rather than merely correlative markers.

      (7) The authors state "193 proteins were associated with PCa risk", but they are looking at MR results - these analyses test for disease associations of genetically-predicted levels of proteins, not proteins themselves.

      True, in MR, the exposure of interest is the lifelong effect of genetically predicted protein levels. This approach is designed to infer causality while avoiding confounding and reverse causation, as genetic variants are fixed at conception and unaffected by disease processes. When we state “193 proteins were associated with PCa risk,” we specifically refer to proteins whose genetically predicted levels (based on instrument SNPs from protein QTLs) show causal links to PCa. Importantly, MR does not measure the direct association between observed protein concentrations and disease. Instead, it estimates the lifelong causal effect of protein levels predicted by genetics. This distinction is critical for disentangling cause from consequence. For example, a protein elevated due to tumor progression would not be identified as causal in MR if its genetic predictors are unrelated to PCa risk.

      We acknowledge that clinical translation requires further validation of these proteins in observational studies measuring actual protein levels. However, MR provides a robust first step by prioritizing candidates with causal roles, thereby reducing the risk of investing in biomarkers confounded by disease processes.

      Reviewer #1 (Recommendations for the authors):

      As outlined above, the major weakness of the manuscript is its failure to consider BPH / LUTS, and whether the markers and targets are specific to PCa or not. Specific improvements that the authors could consider might include a literature review of the features identified for their 20 high-risk proteins, and ideally also analyze whether these proteins are upregulated or downregulated in the databases they have analysed (for example it will be easy to analyze whether these proteins are dysregulated in BPH patients as these are specifically identified in the UK Biobank).

      The authors may be able to gain context for this approach by looking at papers analyzing BPH and the complement cascade and other proteins from the authors' top 10 or top 20, for example: https://doi.org/10.1002/pros.24639

      Other sources can be identified by examining the literature for recent omics papers analysing BPH, especially those that analyse and compare BPH / PCa specifically.

      Thank you for highlighting the critical need to distinguish PCa-specific biomarkers from those shared with BPH. In response, we conducted a literature review of multi-omics datasets and prospective cohort studies, systematically evaluating the specificity of prioritized proteins by comparing their expression trends in PCa and BPH or benign prostate tissues. These findings are now integrated into the revised Discussion section under the subheading "Plasma Proteins Causal Links to Prostate Cancer".

      In the Discussion, the paragraph (line 288) on PSA is extremely weak. The authors state that further research is needed, and yet only reference four articles (from 2008, 2010, 2012, 2014), none of which are from the last decade. Considerable amounts of research from the last ten years have been published on PSA, for example, see this article from 2018, which specifically analyses PSA in the context of the UK Biobank. This section should be made more up-to-date with the latest literature findings. https://doi.org/10.1038/s41467-018-06920-9

      Thank you very much for your feedback. We acknowledge the need to strengthen the discussion on PSA by incorporating recent literature. In the revised manuscript, we have expanded the PSA discussion to integrate contemporary research on the prognostic role of PSA in the progression of PCa and its limitations in cancer screening, ensuring that our discussion reflects the current consensus and controversies.

      Also in the Discussion, the analysis of phenotypic indicators is insufficiently comprehensive and should reference other recent research. For example, this recent UK Biobank study dealt with a wide range of conditions, including prostate cancer, and identified similar factors to those identified in this paper. The authors should compare and contrast their phenotypic findings with the existing literature. https://doi.org/10.1038/s41588-024-01898-1

      Thank you for addressing the comprehensiveness of the phenotypic analysis. We have reviewed recent large-scale phenome-wide analyses (including the study linked in your feedback) that explore multi-omics biomarkers and their associations with a range of different diseases. We have compared and contrasted our phenotypic findings with the existing literature and revised the Discussion section to interpret phenotype-PCa associations, emphasizing both shared pathways and disease-specific signals.

      Under Methods, there is too little information on how Controls were selected, whether any matching process was conducted, or whether there are fundamental differences between the cases and controls (such as smoking status, BMI, comorbidities). The authors use R, and a library such as MatchIt could be used to ensure that the Controls cohort is appropriately matched to the Cases.

      As outlined above, we acknowledge that our original analysis did not directly address this critical issue due to a key data limitation. The publicly available GWAS summary statistics for PCa (from openGWAS and FinnGen) do not provide individual-level data on covariates, which precludes any direct assessment of demographic or clinical differences between cases and controls.

      An important final point is that, as far as I can tell, no UK Biobank Application Number has been specified in the manuscript. This is vital to establish that there was an original hypothesis being investigated (as opposed to data dredging of open access resources), especially in light of the largely mechanistic flow of the manuscript and lack of PCa and relevant confounder-specific discussion. The authors may be aware of the work of Stender et al (2024) regarding formulaic papers using Mendelian randomization, especially that "[All] combinations of exposure and outcome results based on data available in IEU openGWAS (https://gwas.mrcieu.ac.uk/) can be browsed online on epigraphDB.org. In other words, these results are, in effect, already published. Reporting them again in a scientific paper adds nothing to what can be looked up online in minutes." The authors may wish to address this issue directly.

      Stender, S., Gellert-Kristensen, H. & Smith, G.D. Reclaiming Mendelian randomization from the deluge of papers and misleading findings. Lipids Health Dis 23, 286 (2024). https://doi.org/10.1186/s12944-024-02284-w

      We confirm that all data used in this study were obtained from publicly available GWAS summary statistics (e.g., PRACTICAL consortium, FinnGen) and proteomic datasets (deCODE, UKB-PPP). Our research was guided by a predefined hypothesis to investigate causal plasma protein biomarkers for prostate cancer, rather than exploratory data mining. The analytical pipelines and integrative approaches (e.g., colocalization, druggability assessment) were specifically designed to address this hypothesis, aligning with the ethical use of open-access resources.

      Reviewer #2 (Recommendations for the authors):

      There are several specific recommendations in the public review (e.g., clarify the contribution of the GWAS). Otherwise, there is nothing clearly incorrect, but translational insight is missing - the analyses are not clearly connected to the scientific literature. This is a limitation rather than a flaw - the manuscript will likely still be useful to readers.

      We thank you for highlighting the need to strengthen translational insights and contextualize our findings within existing literature. In the revised manuscript, we have expanded the Discussion section to systematically compare our results with prior mechanistic and clinical studies, including the shared pathways of associated phenotypes and the potential of significant proteins as biomarkers and therapeutic targets. These revisions ensure our analyses are firmly rooted in the scientific literature.

    1. eLife Assessment

      This global study compares environmental niche model outputs of the avian influenza pathogen niche constructed for two distinct periods, and uses differences between those outputs to suggest that the changed case numbers and distribution relate to intensification of chicken and duck farming, and extensive cultivation. While a useful update to existing niche models of highly pathogenic avian influenza, the justification for the use of environmental niche models to explore correlative relationships between land cover change and changed case epidemiology is incomplete. Key assumptions have not been adequately clarified for the reader's benefit, and in consequence the communication of the likely limitations of the work is not sufficiently clear.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aim to predict ecological suitability for transmission of highly pathogenic avian influenza (HPAI) using ecological niche models. This class of models identifies correlations between the locations of species or disease detections and the environment. These correlations are then used to predict habitat suitability (in this work, ecological suitability for disease transmission) in locations where surveillance of the species or disease has not been conducted. The authors fit separate models for HPAI detections in wild birds and farmed birds, for two strains of HPAI (H5N1 and H5Nx) and for two time periods, pre- and post-2020. The authors also validate models fitted to disease occurrence data from pre-2020 using post-2020 occurrence data. I thank the authors for taking the time to respond to my initial review and I provide some follow-up below.

      Detailed comments:

      In my review, I asked the authors to clarify the meaning of "spillover" within the HPAI transmission cycle. This term is still not entirely clear: at lines 409-410, the authors use the term with reference to transmission between wild birds and farmed birds, as distinct to transmission between farmed birds. It is implied but not explicitly stated that "spillover" is relevant to the transmission cycle in farmed birds only. The sentence, "we developed separate ecological niche models for wild and domestic bird HPAI occurrences ..." could have been supported by a clear sentence describing the transmission cycle, to prime the reader for why two separate models were necessary.

      I also queried the importance of (dead-end) mammalian infections to a model of the HPAI transmission risk, to which the authors responded: "While spillover events of HPAI into mammals have been documented, these detections are generally considered dead-end infections and do not currently represent sustained transmission chains. As such, they fall outside the scope of our study, which focuses on avian hosts and models ecological suitability for outbreaks in wild and domestic birds." I would argue that any infections, whether they are in dead-end or competent hosts, represent the presence of environmental conditions to support transmission so are certainly relevant to a niche model and therefore within scope. It is certainly understandable if the authors have not been able to access data of mammalian infections, but it is an oversight to dismiss these infections as irrelevant.

      Correlative ecological niche models, including BRTs, learn relationships between occurrence data and covariate data to make predictions, irrespective of correlations between covariates. I am not convinced that the authors can make any "interpretation" (line 298) that the covariates that are most informative to their models have any "influence" (line 282) on their response variable. Indeed, the observation that "land-use and climatic predictors do not play an important role in the niche ecological models" (line 286), while "intensive chicken population density emerges as a significant predictor" (line 282) begs the question: from an operational perspective, is the best (e.g., most interpretable and quickest to generate) model of HPAI risk a map of poultry farming intensity?

      I have more significant concerns about the authors' treatment of sampling bias: "We agree with the Reviewer's comment that poultry density could have potentially been considered to guide the sampling effort of the pseudo-absences to consider when training domestic bird models. We however prefer to keep using a human population density layer as a proxy for surveillance bias to define the relative probability to sample pseudo-absence points in the different pixels of the background area considered when training our ecological niche models. Indeed, given that poultry density is precisely one of the predictors that we aim to test, considering this environmental layer for defining the relative probability to sample pseudo-absences would introduce a certain level of circularity in our analytical procedure, e.g. by artificially increasing the influence of that particular variable in our models." The authors have elected to ignore a fundamental feature of distribution modelling with occurrence-only data: if we include a source of sampling bias as a covariate and do not include it when we sample background data, then that covariate would appear to be correlated with presence. They acknowledge this later in their response to my review: "...assuming a sampling bias correlated with poultry density would result in reducing its effect as a risk factor." In other words, the apparent predictive capacity of poultry density is a function of how the authors have constructed the sampling bias for their models. A reader of the manuscript can reasonably ask the question: to what degree is the model a model of HPAI transmission risk, and to what degree is the model a model of the observation process?
      The sentence at lines 474-477 is a helpful addition; however, the preceding sentence, "Another approach to sampling pseudo-absences would have been to distribute them according to the density of domestic poultry," (line 474) is included without acknowledgement of the flow-on consequence to one of the key findings of the manuscript, that "...intensive chicken population density emerges as a significant predictor..." (line 282). The additional context on the EMPRES-i dataset at lines 475-476 ("the locations of outbreaks ... are often georeferenced using place name nomenclatures") is in conflict with the description of the dataset at line 407 ("precise location coordinates"). Ultimately, the choices that the authors have made are entirely defensible through a clear, concise description of model features and assumptions, and precise language to guide the reader through interpretation of results. I am not satisfied that this is provided in the revised manuscript.
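
      The sampling-bias point can be illustrated with a toy sketch of weighted pseudo-absence sampling. This is not the authors' procedure: the five-pixel background, the layer values, and the function names below are all hypothetical, chosen only to show how the choice of bias layer changes the covariate values seen in the background sample.

```python
import random


def sample_pseudo_absences(bias_layer, n_points, seed=0):
    """Draw pixel indices with probability proportional to the bias layer."""
    rng = random.Random(seed)
    pixels = list(range(len(bias_layer)))
    return rng.choices(pixels, weights=bias_layer, k=n_points)


# Hypothetical five-pixel background with two candidate proxies for surveillance effort.
human_density = [0.0, 1.0, 5.0, 20.0, 0.0]
poultry_density = [10.0, 0.0, 0.0, 2.0, 8.0]

by_human = sample_pseudo_absences(human_density, n_points=1000)
by_poultry = sample_pseudo_absences(poultry_density, n_points=1000)


def mean_poultry(sample):
    """Average poultry density over the sampled background pixels."""
    return sum(poultry_density[i] for i in sample) / len(sample)


# Background drawn by human density sits in low-poultry pixels, so presences in
# high-poultry pixels contrast sharply with it; drawing by poultry density
# raises the background values and shrinks that contrast.
contrast_gap = mean_poultry(by_poultry) - mean_poultry(by_human)
```

      In this toy setting the background sampled by poultry density has a higher mean poultry value than the one sampled by human density, which is the mechanism by which the choice of bias layer can raise or lower the apparent importance of poultry density as a predictor.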

      The authors have slightly misunderstood my comment on "extrapolation": I referred to "environmental extrapolation" in my review without being particularly explicit about my meaning. By "environmental extrapolation", I meant to ask whether the models were predicting to environments that are outside the extent of environments included in the occurrence data used in the manuscript. The authors appear to have understood this to be a comment on geographic extrapolation, or predicting to areas outside the geographic extent included in occurrence data, e.g.: "For H5Nx post-2020, areas of high predicted ecological suitability, such as Brazil, Bolivia, the Caribbean islands, and Jilin province in China, likely result from extrapolations, as these regions reported few or no outbreaks in the training data" (lines 195-197). Is the model extrapolating in environmental space in these regions? This is unclear. I do not suggest that the authors should carry out further analysis, but the multivariate environmental similarity surface (MESS; see Elith et al., 2010: https://doi.org/10.1111/j.2041-210X.2010.00036.x) is a useful tool to visualise environmental extrapolation and aid model interpretation.
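
      For readers unfamiliar with MESS, a minimal sketch of the calculation as defined by Elith et al. (2010) follows. The function names and toy reference data are hypothetical, and the code assumes every covariate takes at least two distinct values in the reference set; it is not drawn from the manuscript under review.

```python
def variable_similarity(value, reference):
    """Similarity of one value to the reference sample for a single covariate."""
    lo, hi = min(reference), max(reference)
    span = hi - lo  # assumed non-zero: the covariate varies in the reference set
    pct_below = 100.0 * sum(r < value for r in reference) / len(reference)
    if pct_below == 0.0:
        return 100.0 * (value - lo) / span  # at or below the reference minimum
    if pct_below <= 50.0:
        return 2.0 * pct_below
    if pct_below < 100.0:
        return 2.0 * (100.0 - pct_below)
    return 100.0 * (hi - value) / span  # above the reference maximum


def mess(point, reference_points):
    """MESS for one prediction point: the minimum similarity across covariates."""
    n_covariates = len(point)
    return min(
        variable_similarity(point[j], [ref[j] for ref in reference_points])
        for j in range(n_covariates)
    )


# Hypothetical reference environments: (covariate 1, covariate 2) at training locations.
reference = [(0.0, 10.0), (1.0, 20.0), (2.0, 30.0), (3.0, 40.0)]

inside = mess((1.5, 25.0), reference)   # within the sampled range of both covariates
outside = mess((4.0, 25.0), reference)  # covariate 1 exceeds the reference maximum
```

      Negative values flag prediction points where at least one covariate lies outside the range of the reference data, which is the environmental extrapolation the comment refers to.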

      On the subject of "extrapolation", I am also concerned by the additions at lines 362-370: "...our models extrapolate environmental suitability for H5Nx in wild birds in areas where few or no outbreaks have been reported. This discrepancy may be explained by limited surveillance or underreporting in those regions." The "discrepancy" cited here is a feature of the input dataset, a function of the observation distribution that should be captured in pseudo-absence data. The authors state that Kazakhstan and Central Asia are areas of interest, and that the environments in this region are outside the extent of environments captured in the occurrence dataset, although it is unclear whether "extrapolation" is informed by a quantitative tool like a MESS or judged by some other qualitative test. The authors then cite Australia as an example of a region with some predicted suitability but no HPAI outbreaks to date, however this discussion point is not linked to the idea that the presence of environmental conditions to support transmission need not imply the occurrence of transmission (as in the addition, "...spatial isolation may imply a lower risk of actual occurrences..." at line 214). Ultimately, the authors have not added any clear comment on model uncertainty (e.g., variation between replicated BRTs) as I suggested might be helpful to support their description of model predictions.

      All of my criticisms are, of course, applied with the understanding that niche modelling is imperfect for a disease like HPAI, and that data may be biased/incomplete, etc.: these caveats are common across the niche modelling literature. However, if language around the transmission cycle, the niche, and the interpretation of any of the models is imprecise, which I find it to be in the revised manuscript, it undermines all of the science that is presented in this work.

    3. Reviewer #2 (Public review):

      Summary:

      The geographic range of highly pathogenic avian influenza cases changed substantially around 2020, and there is much interest in understanding why. Since 2020 the pathogen has irrupted in the Americas and the distribution in Asia has changed dramatically. This study aimed to determine which spatial factors (environmental, agronomic and socio-economic) explain the change in numbers and locations of cases reported since 2020 (2020–2023). That's a causal question, which they address by applying a correlative environmental niche modelling (ENM) approach to the avian influenza case data before 2020 (2015–2020) and after 2020 (2020–2023), and separately for confirmed cases in wild and domestic birds. To address their questions they compare the outputs of the respective models, and those of the first global model of the HPAI niche published by Dhingra et al 2016.

      ENM is a correlative approach useful for extrapolating understandings based on sparse geographically referenced observational data over un- or under-sampled areas with similar environmental characteristics in the form of a continuous map. In this case, because the selected covariates about land cover, use, population and environment are broadly available over the entire world, modelled associations between the response and those covariates can be projected (predicted) back to space in the form of a continuous map of the HPAI niche for the entire world.

      Strengths:

      The authors are clear about expected bias in the detection of cases, such as geographic variation in surveillance effort (testing of symptomatic or dead wildlife, testing domestic flocks) and in general more detections near areas of higher human population density (because if a tree falls in a forest and there is no-one there, etc), and take steps to ameliorate those. The authors use boosted regression trees to implement the ENM, which typically feature among the best performing models for this application (also known as habitat suitability models). They ran replicate sets of the analysis for each of their model targets (wild/domestic x pathogen variant), which can help produce stable predictions. Their code and data are provided, though I did not verify that the work was reproducible.

      The paper can be read as a partial update to the first global model of H5Nx transmission by Dhingra and others published in 2016 and explicitly follows many methodological elements. Because they use the same covariate sets as used by Dhingra et al 2016 (including the comparisons of the performance of the sets in spatial cross-validation) and for both time periods of interest in the current work, comparison of model outputs is possible. The authors further facilitate those comparisons with clear graphics and supplementary analyses and presentation. The models can also be explored interactively at a weblink provided in text, though it would be good to see the model training data there too.

      The authors' comparison of ENM model outputs generated from the distinct HPAI case datasets is interesting and worthwhile, though for me, only as a response to differently framed research questions.

      Weaknesses:

      This well-presented and technically well-executed paper has one major weakness to my mind. I don't believe that ENM models were an appropriate tool to address their stated goal, which was to identify the factors that "explain" changing HPAI epidemiology.

      Here is how I understand and unpack that weakness:

      (1) Because of their fundamentally correlative nature, ENMs are not a strong candidate for exploring or inferring causal relationships.

      (2) Generating ENMs for a species whose distribution is undergoing broad scale range change is complicated and requires particular caution and nuance in interpretation (e.g., Elith et al, 2010). An important general assumption of environmental niche models is that the target species is at some kind of distributional equilibrium (at time scales relevant to the model application). In practice that means the species has had an opportunity to reach all suitable habitats and therefore its absence from some can be interpreted as either unfavourable environment or interactions with other species. Here data sets for the response (H5N1 or H5Nx case data in domestic or wild birds) were divided into two periods, 2015–2020 and 2020–2023, based on the rationale that the geographic locations and host-species profile of cases detected in the latter period were suggestive of changed epidemiology. In comparing outputs from multiple ENMs for the same target from distinct time periods the authors are expertly working in, or even dancing around, what is a known grey area, and they need to make the necessary assumptions and caveats obvious to readers.

      (3) To generate global prediction maps via ENM, only variables that exist at appropriate resolution over the desired area can be supplied as covariates. What processes could influence changing epidemiology of a pathogen and are there covariates that represent them? Introduction to a new geographic area (continent) with naive population, immunity in previously exposed populations, control measures to limit spread such as vaccination or destruction of vulnerable populations or flocks? Might those control measures be more or less likely depending on the country as a function of its resources and governance? There aren't globally available datasets that speak to those factors, so the question is not why were they omitted but rather was the authors' decision to choose ENMs given their question justified? How valuable are insights based on patterns of correlation change when considering different temporal sets of HPAI cases in relation to a common and somewhat anachronistic set of covariates?

      (4) In general the study is somewhat incoherent with respect to time. Though the case data come from different time periods, each response dataset was modelled separately using exactly the same covariate dataset that predated both sets. That decision should be understood as a strong assumption on the part of the authors that conditions the interpretation: the world (as represented by the covariate set) is immutable, so the model has to return different correlative associations between the case data and the covariates to explain the new data. While the world represented by the selected covariates *may* be relatively stable (could be statistically confirmed), what about the world not represented by the covariates (see point 3)?

      References:

      Dhingra et al, 2016, Global mapping of highly pathogenic avian influenza H5N1 and H5Nx clade 2.3.4.4 viruses with spatial cross-validation, eLife 5, https://doi.org/10.7554/eLife.19571

      Elith, J., Kearney, M., & Phillips, S. (2010). The art of modelling range‐shifting species. Methods in Ecology and Evolution, 1(4), 330-342.

    4. Author response:

      The following is the authors’ response to the current reviews.

      Public Reviews:

      We thank the Reviewers for their thorough attention to our paper and the interesting discussion about the findings. Before responding to more specific comments, here are some general points we would like to clarify:

      (1) Ecological niche models are indeed correlative models, and we used them to highlight environmental factors associated with HPAI outbreaks within two host groups. We will further revise the terminology that could still unintentionally suggest causal inference. The few remaining ambiguities were mainly in the Discussion section, where our intent was to interpret the results in light of the broader scientific literature. Particularly, we will change the following expressions:

      -  “Which factors can explain…” to  “Which factors are associated with…” (line 75);

      -  “the environmental and anthropogenic factors influencing” to “the environmental and anthropogenic factors that are correlated with” (line 273);

      -  “underscoring the influence” to “underscoring the strong association” (line 282).

      (2) We respectfully disagree with the suggestion that an ecological niche modelling (ENM) approach is not appropriate for this work and the research question addressed therein. Ecological niche models are specifically designed to estimate the spatial distribution of the environmental suitability of species and pathogens, making them well suited to our research questions. In our study, we have also explicitly detailed the known limitations of ecological niche models in the Discussion section, in line with prior literature, to ensure their appropriate interpretation in the context of HPAI.

      (3) The environmental layers used in our models were restricted to those available at a global scale, as listed in Supplementary Information Resources S1 (https://github.com/sdellicour/h5nx_risk_mapping/blob/master/Scripts_%26_data/SI_Resource_S1.xlsx). Naturally, not all potentially relevant environmental factors could be included, but the selected layers are explicitly documented and only these were assessed for their importance. Despite this limitation, the performance metrics indicate that the models performed well, suggesting that the chosen covariates capture meaningful associations with HPAI occurrence at a global scale.

      Reviewer #1 (Public review):

      The authors aim to predict ecological suitability for transmission of highly pathogenic avian influenza (HPAI) using ecological niche models. This class of models identifies correlations between the locations of species or disease detections and the environment. These correlations are then used to predict habitat suitability (in this work, ecological suitability for disease transmission) in locations where surveillance of the species or disease has not been conducted. The authors fit separate models for HPAI detections in wild birds and farmed birds, for two strains of HPAI (H5N1 and H5Nx) and for two time periods, pre- and post-2020. The authors also validate models fitted to disease occurrence data from pre-2020 using post-2020 occurrence data. I thank the authors for taking the time to respond to my initial review and I provide some follow-up below.

      Detailed comments:

      In my review, I asked the authors to clarify the meaning of "spillover" within the HPAI transmission cycle. This term is still not entirely clear: at lines 409-410, the authors use the term with reference to transmission between wild birds and farmed birds, as distinct to transmission between farmed birds. It is implied but not explicitly stated that "spillover" is relevant to the transmission cycle in farmed birds only. The sentence, "we developed separate ecological niche models for wild and domestic bird HPAI occurrences ..." could have been supported by a clear sentence describing the transmission cycle, to prime the reader for why two separate models were necessary.

      We respectfully disagree that the term “spillover” is unclear in the manuscript. In both the Methods and Discussion sections (lines 387-391 and 409-414), we explicitly define “spillover” as the introduction of HPAI viruses from wild birds into domestic poultry, and we distinguish this from secondary farm-to-farm transmission. Our use of separate ecological niche models for wild and domestic outbreaks reflects not only the distinction between primary spillover and secondary transmission, but also the fundamentally different ecological processes, surveillance systems, and management implications that shape outbreaks in these two groups. We will clarify this choice in the revised manuscript when introducing the separate models. Furthermore, on line 83, we will add “as these two groups are influenced by different ecological processes, surveillance biases, and management contexts”.

      I also queried the importance of (dead-end) mammalian infections to a model of the HPAI transmission risk, to which the authors responded: "While spillover events of HPAI into mammals have been documented, these detections are generally considered dead-end infections and do not currently represent sustained transmission chains. As such, they fall outside the scope of our study, which focuses on avian hosts and models ecological suitability for outbreaks in wild and domestic birds." I would argue that any infections, whether they are in dead-end or competent hosts, represent the presence of environmental conditions to support transmission so are certainly relevant to a niche model and therefore within scope. It is certainly understandable if the authors have not been able to access data of mammalian infections, but it is an oversight to dismiss these infections as irrelevant.

      We understand the Reviewer’s point, but our study was designed to model HPAI occurrence in avian hosts only. We therefore restricted our analysis to wild birds and domestic poultry, which represent the primary hosts for HPAI circulation and the focus of surveillance and control measures. While mammalian detections have been reported, they are outside the scope of this work.

      Correlative ecological niche models, including BRTs, learn relationships between occurrence data and covariate data to make predictions, irrespective of correlations between covariates. I am not convinced that the authors can make any "interpretation" (line 298) that the covariates that are most informative to their models have any "influence" (line 282) on their response variable. Indeed, the observation that "land-use and climatic predictors do not play an important role in the niche ecological models" (line 286), while "intensive chicken population density emerges as a significant predictor" (line 282) begs the question: from an operational perspective, is the best (e.g., most interpretable and quickest to generate) model of HPAI risk a map of poultry farming intensity?

      We agree that poultry density may partly reflect reporting bias, but we also assumed it to be a meaningful predictor of HPAI risk. Its importance in our models is therefore expected. Importantly, our BRT framework does more than reproduce poultry distribution: it captures non-linear relationships and interactions with other covariates, allowing a more nuanced characterisation of risk than a simple poultry density map. Note also that our models distinguish between intensive chicken density, extensive chicken density, and duck density. Therefore, the output is not a “map of poultry farming intensity”.

      At line 282, we used the word “influence” while fully recognising that correlative models cannot establish causality. Indeed, in our analyses, “relative influence” refers to the importance metric produced by the BRT algorithm (Ridgeway, 2020), which measures correlative associations between environmental factors and outbreak occurrences. These scores are interpreted in light of the broader scientific literature; our interpretations therefore build on both our results and existing evidence, rather than on our models alone. However, in the next version of the paper, we will revise the sentence as: “underscoring the strong association of poultry farming practices with HPAI spread (Dhingra et al., 2016)”.

      I have more significant concerns about the authors' treatment of sampling bias: "We agree with the Reviewer's comment that poultry density could have potentially been considered to guide the sampling effort of the pseudo-absences to consider when training domestic bird models. We however prefer to keep using a human population density layer as a proxy for surveillance bias to define the relative probability to sample pseudo-absence points in the different pixels of the background area considered when training our ecological niche models. Indeed, given that poultry density is precisely one of the predictors that we aim to test, considering this environmental layer for defining the relative probability to sample pseudo-absences would introduce a certain level of circularity in our analytical procedure, e.g. by artificially increasing to influence of that particular variable in our models." The authors have elected to ignore a fundamental feature of distribution modelling with occurrence-only data: if we include a source of sampling bias as a covariate and do not include it when we sample background data, then that covariate would appear to be correlated with presence. They acknowledge this later in their response to my review: "...assuming a sampling bias correlated with poultry density would result in reducing its effect as a risk factor." In other words, the apparent predictive capacity of poultry density is a function of how the authors have constructed the sampling bias for their models. A reader of the manuscript can reasonably ask the question: to what degree is the model a model of HPAI transmission risk, and to what degree is the model a model of the observation process?
The sentence at lines 474-477 is a helpful addition, however the preceding sentence, "Another approach to sampling pseudo-absences would have been to distribute them according to the density of domestic poultry," (line 474) is included without acknowledgement of the flow-on consequence to one of the key findings of the manuscript, that "...intensive chicken population density emerges as a significant predictor..." (line 282). The additional context on the EMPRES-i dataset at line 475-476 ("the locations of outbreaks ... are often georeferenced using place name nomenclatures") is in conflict with the description of the dataset at line 407 ("precise location coordinates"). Ultimately, the choices that the authors have made are entirely defensible through a clear, concise description of model features and assumptions, and precise language to guide the reader through interpretation of results. I am not satisfied that this is provided in the revised manuscript.

      We thank the Reviewer for this important point. To address it, we compared model predictive performance and covariate relative influences obtained when pseudo-absences were weighted by poultry density versus human population density (Author response table 1). The results show that differences between the two approaches are marginal, both in predictive performance (ΔAUC ranging from -0.013 to +0.002) and in the ranking of key predictors (see below Author response images 1 and 2). For instance, intensive chicken density consistently emerged as an important predictor regardless of the bias layer used.

      Note: the comparison was conducted using a simplified BRT configuration for computational efficiency (fewer trees, fixed 5-fold random cross-validation, and standardised parameters). Therefore, absolute values of AUC and variable importance may differ slightly from those in the manuscript, but the relative ranking of predictors and the overall conclusions remain consistent.

      Given these small differences, we retained the approach using human population density. We agree that poultry density partly reflects surveillance bias as well as true epidemiological risk, and we will clarify this in the revised manuscript by noting that the predictive role of poultry density reflects both biological processes and surveillance systems. Furthermore, on line 289, we will add “We note, however, that intensive poultry density may reflect both surveillance intensity and epidemiological risk, and its predictive role in our models should be interpreted in light of both processes”.
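
      To make the bias-layer choice discussed above concrete, the following is a minimal Python sketch of drawing pseudo-absences with probability proportional to a bias layer. It is an illustration only and not the authors' code: the function name, the pixel identifiers, and the toy density values are all hypothetical, and sampling is done with replacement for simplicity.

```python
import random


def sample_pseudo_absences(bias_weights, n_points, seed=0):
    """Draw pixel ids with probability proportional to a bias layer.

    bias_weights maps a pixel id to a non-negative weight, for example
    the human population density or the poultry density in that pixel.
    Sampling is with replacement (an assumption of this sketch).
    """
    rng = random.Random(seed)
    pixels = list(bias_weights)
    weights = [bias_weights[p] for p in pixels]
    if sum(weights) <= 0:
        raise ValueError("bias layer needs at least one positive weight")
    return rng.choices(pixels, weights=weights, k=n_points)


# Hypothetical background area of four pixels (toy values, not real data).
human_density = {"a": 100.0, "b": 10.0, "c": 1.0, "d": 0.0}
poultry_density = {"a": 1.0, "b": 10.0, "c": 100.0, "d": 0.0}

by_human = sample_pseudo_absences(human_density, 1000)
by_poultry = sample_pseudo_absences(poultry_density, 1000)
```

      In this toy example the two layers concentrate the background points in different pixels, which is why the choice of layer can change how strongly a covariate such as poultry density appears to be associated with presence.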

      Author response table 1.

      Comparison of model predictive performance (AUC) when pseudo-absences were weighted by poultry density versus human population density, across host groups, virus types, and time periods. Differences in AUC values are shown as the value for poultry-weighted minus human-weighted pseudo-absences.

      Author response image 1.

      Comparison of variable relative influence (%) between models trained with pseudo-absences weighted by poultry density (red) and human population density (blue) for domestic bird outbreaks. Results are shown for four datasets: H5N1 (<2020), H5N1 (>2020), H5Nx (<2020), and H5Nx (>2020).

      Author response image 2.

      Comparison of variable relative influence (%) between models trained with pseudo-absences weighted by poultry density (red) and human population density (blue) for wild bird outbreaks. Results are shown for three datasets: H5N1 (>2020), H5Nx (<2020), and H5Nx (>2020).

      The authors have slightly misunderstood my comment on "extrapolation": I referred to "environmental extrapolation" in my review without being particularly explicit about my meaning. By "environmental extrapolation", I meant to ask whether the models were predicting to environments that are outside the extent of environments included in the occurrence data used in the manuscript. The authors appear to have understood this to be a comment on geographic extrapolation, or predicting to areas outside the geographic extent included in occurrence data, e.g.: "For H5Nx post-2020, areas of high predicted ecological suitability, such as Brazil, Bolivia, the Caribbean islands, and Jilin province in China, likely result from extrapolations, as these regions reported few or no outbreaks in the training data" (lines 195-197). Is the model extrapolating in environmental space in these regions? This is unclear. I do not suggest that the authors should carry out further analysis, but the multivariate environmental similarity surface (MESS; see Elith et al., 2010) is a useful tool to visualise environmental extrapolation and aid model interpretation.

      On the subject of "extrapolation", I am also concerned by the additions at lines 362-370: "...our models extrapolate environmental suitability for H5Nx in wild birds in areas where few or no outbreaks have been reported. This discrepancy may be explained by limited surveillance or underreporting in those regions." The "discrepancy" cited here is a feature of the input dataset, a function of the observation distribution that should be captured in pseudo-absence data. The authors state that Kazakhstan and Central Asia are areas of interest, and that the environments in this region are outside the extent of environments captured in the occurrence dataset, although it is unclear whether "extrapolation" is informed by a quantitative tool like a MESS or judged by some other qualitative test. The authors then cite Australia as an example of a region with some predicted suitability but no HPAI outbreaks to date, however this discussion point is not linked to the idea that the presence of environmental conditions to support transmission need not imply the occurrence of transmission (as in the addition, "...spatial isolation may imply a lower risk of actual occurrences..." at line 214). Ultimately, the authors have not added any clear comment on model uncertainty (e.g., variation between replicated BRTs) as I suggested might be helpful to support their description of model predictions.

      Many thanks for the clarification. Indeed, we interpreted your previous comments in terms of geographic extrapolations. We thank the Reviewer for these observations. We will adjust the wording to further clarify that predictions of ecological suitability in areas with few or no reported outbreaks (e.g., Central Asia, Australia) are not model errors but expected extrapolations, since ecological suitability does not imply confirmed transmission (for instance, on Line 362: “our models extrapolate environmental suitability” will be changed to “Interestingly, our models extrapolate geographical”). These predictions indicate potential environments favorable to circulation if the virus were introduced.

      In our study, model uncertainty is formally assessed when comparing the predictive performances of our models (Fig. S3, Table S1), the relative influence (Table S3) and response curves (Fig. 2) associated with each environmental factor (Table S2). All these results confirm good convergence between replicates. Finally, we indeed did not use a quantitative tool such as a MESS to assess extrapolation but did rely on qualitative interpretation of model outputs.
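
      For readers unfamiliar with the MESS mentioned above, the following minimal Python sketch follows the definition in Elith et al. (2010): for each covariate, a prediction point is scored against the range and distribution of that covariate in the reference (training) data, and the MESS is the minimum score across covariates, so that negative values flag environmental extrapolation. The covariate names and values are hypothetical toy data, and this analysis was not part of the study.

```python
def similarity(value, reference):
    """Per-covariate similarity of one value to a set of reference values."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference values must span a non-zero range")
    # Percentage of reference values strictly smaller than the new value.
    f = 100.0 * sum(r < value for r in reference) / len(reference)
    if f == 0:
        return 100.0 * (value - lo) / (hi - lo)  # at or below the minimum
    if f <= 50:
        return 2.0 * f
    if f < 100:
        return 2.0 * (100.0 - f)
    return 100.0 * (hi - value) / (hi - lo)  # above the maximum


def mess(point, reference_points):
    """Minimum similarity over all covariates; negative means extrapolation."""
    return min(
        similarity(point[name], [ref[name] for ref in reference_points])
        for name in point
    )


# Hypothetical reference environments (toy values, not real data).
reference_points = [
    {"temperature": t, "rainfall": r}
    for t, r in zip([0, 10, 20, 30, 40], [100, 200, 300, 400, 500])
]

inside = mess({"temperature": 20, "rainfall": 300}, reference_points)
outside = mess({"temperature": 50, "rainfall": 300}, reference_points)
```

      Here `inside` is positive because both covariates fall within the reference range, whereas `outside` is negative because the temperature exceeds every reference value.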

      All of my criticisms are, of course, applied with the understanding that niche modelling is imperfect for a disease like HPAI, and that data may be biased/incomplete, etc.: these caveats are common across the niche modelling literature. However, if language around the transmission cycle, the niche, and the interpretation of any of the models is imprecise, which I find it to be in the revised manuscript, it undermines all of the science that is presented in this work.

      We respectfully disagree with this comment. The scope of our study and the methods employed are clearly defined in the manuscript, and the limitations of ecological niche modelling in this context are explicitly acknowledged in the Discussion section. While we appreciate the Reviewer’s concern, the comment does not provide specific examples of unclear or imprecise language regarding the transmission cycle, niche, or interpretation of the models. Without such examples, it is difficult to identify further revisions that would improve clarity.

      Reviewer #2 (Public review):

      The geographic range of highly pathogenic avian influenza cases changed substantially around 2020, and there is much interest in understanding why. Since 2020 the pathogen irrupted in the Americas and the distribution in Asia changed dramatically. This study aimed to determine which spatial factors (environmental, agronomic and socio-economic) explain the change in numbers and locations of cases reported since 2020 (2020--2023). That's a causal question which they address by applying a correlative environmental niche modelling (ENM) approach to the avian influenza case data before (2015--2020) and after 2020 (2020--2023) and separately for confirmed cases in wild and domestic birds. To address their questions they compare the outputs of the respective models, and those of the first global model of the HPAI niche published by Dhingra et al 2016.

      We do not agree with this comment. In the manuscript, it is well established that we are quantitatively assessing factors that are associated with occurrence data before and after 2020. We do not claim to determine causality. One sentence of the Introduction section (lines 75-76) could be confusing, so we intend to modify it in the final revision of our manuscript.

      ENM is a correlative approach useful for extrapolating understandings based on sparse geographically referenced observational data over un- or under-sampled areas with similar environmental characteristics in the form of a continuous map. In this case, because the selected covariates about land cover, use, population and environment are broadly available over the entire world, modelled associations between the response and those covariates can be projected (predicted) back to space in the form of a continuous map of the HPAI niche for the entire world.

      We fully agree with this assessment of ENM approaches.

      Strengths:

      The authors are clear about expected bias in the detection of cases, such as geographic variation in surveillance effort (testing of symptomatic or dead wildlife, testing domestic flocks) and in general more detections near areas of higher human population density (because if a tree falls in a forest and there is no-one there, etc), and take steps to ameliorate those. The authors use boosted regression trees to implement the ENM, which typically feature among the best performing models for this application (also known as habitat suitability models). They ran replicate sets of the analysis for each of their model targets (wild/domestic x pathogen variant), which can help produce stable predictions. Their code and data are provided, though I did not verify that the work was reproducible.

      The paper can be read as a partial update to the first global model of H5Nx transmission by Dhingra and others published in 2016 and explicitly follows many methodological elements. Because they use the same covariate sets as used by Dhingra et al 2016 (including the comparisons of the performance of the sets in spatial cross-validation) and for both time periods of interest in the current work, comparison of model outputs is possible. The authors further facilitate those comparisons with clear graphics and supplementary analyses and presentation. The models can also be explored interactively at a weblink provided in text, though it would be good to see the model training data there too.

      The authors' comparison of ENM model outputs generated from the distinct HPAI case datasets is interesting and worthwhile, though for me, only as a response to differently framed research questions.

      Weaknesses:

      This well-presented and technically well-executed paper has one major weakness to my mind. I don't believe that ENM models were an appropriate tool to address their stated goal, which was to identify the factors that "explain" changing HPAI epidemiology.

      Here is how I understand and unpack that weakness:

      (1) Because of their fundamentally correlative nature, ENMs are not a strong candidate for exploring or inferring causal relationships.

      (2) Generating ENMs for a species whose distribution is undergoing broad scale range change is complicated and requires particular caution and nuance in interpretation (e.g., Elith et al., 2010). An important general assumption of environmental niche models is that the target species is at some kind of distributional equilibrium (at time scales relevant to the model application). In practice, that means the species has had an opportunity to reach all suitable habitats, and therefore its absence from some can be interpreted as either unfavourable environment or interactions with other species. Here, data sets for the response (H5N1 or H5Nx case data in domestic or wild birds) were divided into two periods, 2015--2020 and 2020--2023, based on the rationale that the geographic locations and host-species profile of cases detected in the latter period was suggestive of changed epidemiology. In comparing outputs from multiple ENMs for the same target from distinct time periods the authors are expertly working in, or even dancing around, what is a known grey area, and they need to make the necessary assumptions and caveats obvious to readers.

      We thank the Reviewer for this observation. First, we constrained pseudo-absence sampling to countries and regions where outbreaks had been reported, reducing the risk of interpreting non-affected areas as environmentally unsuitable. Second, we deliberately split the outbreak data into two periods (2015-2020 and 2020-2023) because we do not assume a single stable equilibrium across the full study timeframe. This division reflects known epidemiological changes around 2020 and allows each period to be modeled independently. Within each period, ENM outputs are interpreted as associations between outbreaks and covariates, not as equilibrium distributions. Finally, by testing prediction across periods, we assessed both niche stability and potential niche shifts. These clarifications will be added to the manuscript to make our assumptions and limitations explicit.

      Line 66, we will add: “Ecological niche model outputs for range-shifting pathogens must therefore be interpreted with caution (Elith et al., 2010). Despite this limitation, correlative ecological niche models remain useful for identifying broad-scale associations and potential shifts in distribution. To account for this, we analysed two distinct time periods (2015-2020 and 2020-2023).”

      Line 123, we will revise “These findings underscore the ability of pre-2020 models in forecasting the recent geographic distribution of ecological suitability for H5Nx and H5N1 occurrences” to “These results suggest that pre-2020 models captured broad patterns of suitability for H5Nx and H5N1 outbreaks, while post-2020 models provided a closer fit to the more recent epidemiological situation”.

      (3) To generate global prediction maps via ENM, only variables that exist at appropriate resolution over the desired area can be supplied as covariates. What processes could influence changing epidemiology of a pathogen and are there covariates that represent them? Introduction to a new geographic area (continent) with naive population, immunity in previously exposed populations, control measures to limit spread such as vaccination or destruction of vulnerable populations or flocks? Might those control measures be more or less likely depending on the country as a function of its resources and governance? There aren't globally available datasets that speak to those factors, so the question is not why were they omitted but rather was the authors' decision to choose ENMs given their question justified? How valuable are insights based on patterns of correlation change when considering different temporal sets of HPAI cases in relation to a common and somewhat anachronistic set of covariates?

      We agree that the ecological niche models trained in our study are limited to environmental and host factors, as described in the Methods section with the selection of predictors. While such models cannot capture causality or represent processes such as immunity, control measures, or governance, they remain a useful tool for identifying broad associations between outbreak occurrence and environmental context. Our study cannot infer the full mechanisms driving changes in HPAI epidemiology, but it does provide a globally consistent framework to examine how associations with available covariates vary across time periods.

      (4) In general the study is somewhat incoherent with respect to time. Though the case data come from different time periods, each response dataset was modelled separately using exactly the same covariate dataset that predated both sets. That decision should be understood as a strong assumption on the part of the authors that conditions the interpretation: the world (as represented by the covariate set) is immutable, so the model has to return different correlative associations between the case data and the covariates to explain the new data. While the world represented by the selected covariates *may* be relatively stable (could be statistically confirmed), what about the world not represented by the covariates (see point 3)?

      We used the same covariate layers for both periods, which indeed assumes that these environmental and host factors are relatively stable at the global scale over the short timeframe considered. We believe this assumption is reasonable, as poultry density, land cover, and climate baselines do not change drastically between 2015 and 2023 at the resolution of our analysis. We agree, however, that unmeasured processes such as control measures, immunity, or governance may have changed during this time and are not captured by our covariates.

      Recommendations for the Authors:

      Reviewer #1 (Recommendations for the authors):

      - Line 400-401: "over the 2003-2016 periods" has an extra "s"; "two host species" (with reference to wild and domestic birds) would be more precise as "two host groups".

      - Remove comma line 404

      Many thanks for these comments; we have modified the text accordingly.

      Reviewer #2 (Recommendations for the authors):

      Most of my work this round is encapsulated in the public part of the review.

      The authors responded positively to the review efforts from the previous round, but I was underwhelmed with the changes to the text that resulted. Particularly in regard to limiting assumptions - the way that they augmented the text to refer to limitations raised in review downplayed the importance of the assumptions they've made. So they acknowledge the significance of the limitation in their rejoinder, but in the amended text merely note the limitation without giving any sense of what it means for their interpretation of the findings of this study.

      The abstract and findings are essentially unchanged from the previous draft.

      I still feel the near causal statements of interpretation about the covariates are concerning. These models really are not a good candidate for supporting the inference that they are making and there seem to be very strong arguments in favour of adding covariates that are not globally available.

      We never claimed causal interpretation, and we have consistently framed our analyses in terms of associations rather than mechanisms. We acknowledge that one phrasing in the research questions (“Which factors can explain…”) could be misinterpreted, and we are correcting this in the revised version to read “Which factors are associated with…”. Our approach follows standard ecological niche modelling practice, which identifies statistical associations between occurrence data and covariates. As noted in the Discussion section, these associations should not be interpreted as direct causal mechanisms. Finally, all interpretive points in the manuscript are supported by published literature, and we consider this framing both appropriate and consistent with best practice in ecological niche modelling (ENM) studies.

      We assessed predictor contributions using the “relative influence” metric, the terminology reported by the R package “gbm” (Ridgeway, 2020). This metric quantifies the contribution of each variable to model fit across all trees, rescaled to sum to 100%, and should be interpreted as an association rather than a causal effect.
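
      As a rough illustration of the rescaling step described above, the following Python sketch sums a per-split improvement for each variable across all trees and rescales the totals to sum to 100%. It is a simplified sketch under stated assumptions, not the code of the “gbm” package: the input format (a flat list of variable and improvement pairs), the function name, and the toy numbers are hypothetical.

```python
from collections import defaultdict


def relative_influence(splits):
    """Rescale summed split improvements so that they total 100%.

    splits is an iterable of (variable_name, improvement) pairs, one per
    split across all trees (a hypothetical input format for this sketch).
    """
    totals = defaultdict(float)
    for variable, improvement in splits:
        totals[variable] += improvement
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total improvement must be positive")
    return {name: 100.0 * t / grand_total for name, t in totals.items()}


# Toy splits from a hypothetical fitted model (not real results).
splits = [
    ("intensive_chicken_density", 6.0),
    ("human_population_density", 3.0),
    ("intensive_chicken_density", 2.0),
    ("temperature", 1.0),
]
influence = relative_influence(splits)
```

      Because the scores are shares of a fixed total, a high value for one covariate says only that the model relied on it more than on the others; it carries no information about direction or causality.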

      L65-66 The general difficulty of interpreting ENM output with range-shifting species should be cited here to alert readers that they should not blithely attempt what follows at home.

      I believe that their analysis is interesting and technically very well executed, so it has been a disappointment and hard work to write this assessment. My rough-cut last paragraph of a reframed intro would go something like - there are many reasons in the literature not to do what we are about to do, but here's why we think it can be instructive and informative, within certain guardrails.

      To acknowledge this comment and the previous one, we revised lines 65-66 to: “However, recent outbreaks raise questions about whether earlier ecological niche models still accurately predict the current distribution of areas ecologically suitable for the local circulation of HPAI H5 viruses. Ecological niche model outputs for range-shifting pathogens must therefore be interpreted with caution (Elith et al., 2010). Despite this limitation, correlative ecological niche models remain useful for identifying broad-scale associations and potential shifts in distribution.”

      We respectfully disagree with the Reviewer’s statement that “there are many reasons in the literature not to do what we are about to do”. All modeling approaches, including mechanistic ones, have limitations, and the literature is clear on both the strengths and constraints of ecological niche models. Our manuscript openly acknowledges these limits and frames our findings accordingly. We therefore believe that our use of an ENM approach is justified and contributes valuable insights within these well-defined boundaries.

      Reference: Ridgeway, G. (2007). Generalized Boosted Models: A guide to the gbm package. Update, 1(1), 2007.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      I am concerned by the authors' conceptualisation of "niche" within the manuscript. Is the "niche" we are modelling the niche of the pathogen itself? The niche of the (wild) bird host species as a group? The niche of HPAI transmission within (wild) bird host species (i.e., an intersection of pathogen and bird niches)? Or the niche of HPAI transmission in poultry? The precise niche being modelled should be clarified in the Introduction or early in the Methods of the manuscript. The first two definitions of niche listed above are relevant, but separate from the niche modelled in the manuscript - this should be acknowledged.

      We acknowledge that these concepts were probably not defined clearly enough in the previous version of our manuscript, and we have now included an explicit definition in the fourth paragraph of the Introduction section: “We developed separate ecological niche models for wild and domestic bird HPAI occurrences, these models thus predicting the ecological suitability for the risk of local viral circulation leading to the detection of HPAI occurrences within each host group (rather than the niche of the virus or the host species alone).”

      The authors should consider the precise transmission cycle involved in each HPAI case: "index cases" in farmed poultry, caused by "spillover" from wild birds, are relevant to the wildlife transmission cycle, while the ecological conditions coinciding with subsequent transmission in farmed poultry are likely to be fundamentally different. (For example, subsequent transmission is not conditional on the presence of wild birds.) Modelling these two separate, but linked, transmission cycles together may omit important nuances from the modelling framework.

      We thank the Reviewer for highlighting the distinction between primary (wild-to-domestic) and secondary (farm-to-farm) transmission cycles. Our modelling framework was designed to assess the ecological suitability of HPAI occurrences in wild and domestic birds separately. In the domestic poultry models, the response variable is the confirmed outbreak data, which do not distinguish between cases resulting from primary and secondary infections.

      One of the aims of the study is to evaluate the spatial distribution of areas ecologically suitable for local H5N1/x circulation leading to either domestic or wild bird cases, i.e. to identify environmental conditions where the virus may have persisted or spread, whether as a result of introduction by wild birds or farm-to-farm transmission. Introducing mechanistic distinctions in the response variable would not necessarily improve or affect the ecological suitability maps, since each type of transmission is likely to be associated with different covariates that are included in the models.

      Also, the EMPRES-i database does not indicate whether each record corresponds to an index case or a secondary transmission event, so in practice it would not be possible to produce two different models. However, we agree that distinguishing between types of transmission is an interesting perspective for future research. This could be explored, for example, by mapping interfaces between wild and domestic bird populations or by inferring outbreak transmission trees using genomic data when available.

      To avoid confusion, we now explicitly clarify this aspect in the Materials and Methods section: “It is important to note that the EMPRES-i database does not distinguish between index cases (e.g., primary spillover from wild birds) and secondary farm-to-farm transmissions. As such, our ecological niche models are trained on confirmed HPAI outbreaks in poultry that may result from different transmission dynamics — including both initial introduction events influenced by environmental factors and subsequent spread within poultry systems.”

      We now also address this limitation in the Discussion section: “Finally, our models for domestic poultry do not distinguish between primary introduction events (e.g., spillover from wild birds) and secondary transmission between farms due to limitations in the available surveillance data. While environmental factors likely influence the risk of initial spillover events, secondary spread is more often driven by anthropogenic factors such as biosecurity practices and poultry trade, which are not included in our current modelling framework.”

      The authors should clarify the meaning of "spillover" within the HPAI transmission cycle: if spillover transmission is from wild birds to farmed poultry, then subsequent transmission in poultry is separate from the wildlife transmission cycle. This is particularly relevant to the Discussion paragraph beginning at line 244: does "farm to farm transmission" have a distinct ecological niche to transmission between wild birds, and transmission between wild birds and farmed birds? And while there has been a spillover of HPAI to mammals, could the authors clarify that these detections are dead-end? And not represented in the dataset? Dhingra et al., 2016 comment on the contrast between models of "directly transmitted" pathogens, such as HPAI, and vector-borne diseases: for vector-borne diseases, "clear eco-climatic boundaries of vectors can be mapped", whereas "HPAI is probably not as strongly environmentally constrained". This is an important piece of nuance in their Discussion and a comment to a similar effect may be of use in this manuscript.

      Following the Reviewer’s previous comment, we have now added clarifications in the Methods and Discussion sections defining spillover as the transmission of HPAI viruses from wild birds to domestic poultry (index cases), and secondary transmission as onward spread between farms. As mentioned in our answer above, we now emphasise that our models do not distinguish these dynamics, which are likely to be influenced by different drivers — ecological in the case of spillover, and often anthropogenic (e.g., poultry trade movement, biosecurity) in the case of farm-to-farm transmission.

      The discussion regarding farm-to-farm transmission and spillovers is indeed an interpretation derived from the covariate analysis (see the second paragraph in the Discussion section). Specifically, we observed a stronger association between HPAI occurrences and domestic bird density after 2020, which may suggest that secondary infections (e.g., farm-to-farm transmission) became more prominent or more frequently reported. However, we acknowledge that our data do not allow us to distinguish primary introductions from secondary transmission events, and we have added a sentence to explicitly clarify this: “However, this remains an interpretation, as the available data do not allow us to distinguish between index cases and secondary transmission events.”

      We thank the Reviewer for raising the point of mammalian infections. While spillover events of HPAI into mammals have been documented, these detections are generally considered dead-end infections and do not currently represent sustained transmission chains. As such, they fall outside the scope of our study, which focuses on avian hosts and models ecological suitability for outbreaks in wild and domestic birds. However, we agree that future work could explore the spatial overlap between mammalian outbreak detections and ecological suitability maps for wild birds to assess whether such spillovers may be linked to localised avian transmission dynamics.

      Finally, we have added a comment about the differences between HPAI and pathogens that are strongly constrained by the environment: “This suggests that HPAI H5Nx is not as strongly environmentally constrained as vector-borne pathogens, for which clear eco-climatic boundaries (e.g., vector borne diseases) can be mapped (Dhingra et al., 2016).” This aligns with the interpretation provided by Dhingra and colleagues (2016) and helps contextualise the predictive limitations of ecological niche models for directly transmitted pathogens like HPAI.

      There are several places where some simple clarification of language could answer my questions related to ecological niches. For example, on line 74, "the ecological niche" should be followed by "of the pathogen", or "of HPAI transmission in wild birds", or some other qualifier that is most appropriate to the Authors' conceptualisation of the niche modelled in the manuscript. Similarly, in the following sentence, "areas at risk" could be followed by "of transmission in wild birds", to make the transmission cycle that is the subject of modelling clear to the reader. On line 83, it is not clear who or what is the owner of "their ecological niches": is this "poultry and wild birds", or the pathogen?

      We agree with this suggestion and have now modified the relevant parts of the text accordingly (e.g., “areas at risk for local HPAI circulation” and “of HPAI in wild or domestic birds”).

      I am concerned by the authors' treatment of sampling bias in their BRT modelling framework. If we are modelling the niche of HPAI transmission, we would expect places that are more likely to be subject to disease surveillance to be represented in the set of locations where the disease has been detected. I do not agree that pseudo-absence points are sampled "to account for the lack of virus detection in some areas" - this description is misleading and does not match the following sentence ("pseudo-absence points sampled ... to reflect the greater surveillance efforts ..."). The distribution of pseudo-absences should aim to capture the distribution of probable disease surveillance, as these data act as a stand-in for missing negative surveillance records. It is sensible that pseudo-absences for disease detection in wild birds are sampled proportionately to human population density, as the disease is detected in dead wild birds, which are more likely to be identified close to areas of human occupation (as stated on line 163). However, I do not agree that the same applies to poultry - the density of farmed poultry is likely to be a better proxy for surveillance intensity in farmed birds. Human population density and farmed poultry density may be somewhat correlated (i.e., both are low in remote areas), but poultry density is likely to be higher in rural areas, which are assumed to have relatively lower surveillance intensity under the current approach. The authors allude to this in the Discussion: "monitoring areas with high intensive chicken densities ... remains crucial for the early detection and management of HPAI outbreaks".

      We agree with the Reviewer's comment that poultry density could have been used to guide the sampling of pseudo-absences when training the domestic bird models. However, we prefer to keep using a human population density layer as a proxy for surveillance bias to define the relative probability of sampling pseudo-absence points in the different pixels of the background area considered when training our ecological niche models. Indeed, given that poultry density is precisely one of the predictors that we aim to test, using this environmental layer to define the relative probability of sampling pseudo-absences would introduce a certain level of circularity in our analytical procedure, e.g. by artificially increasing the influence of that particular variable in our models.

      Furthermore, it is also worth noting that, to better account for variations in surveillance intensity, we also adjusted the sampling effort by allocating pseudo-absences in proportion to the number of confirmed outbreaks per administrative unit (country or sub-national regions for Russia and China). This approach aimed to reduce bias caused by uneven reporting and surveillance efforts between regions. Additionally, we restricted model training to countries or regions with a minimum surveillance threshold (at least five confirmed outbreaks per administrative unit). Therefore, both presence and pseudo-absence points originated from areas with more consistent surveillance data.

      We acknowledge in the Materials and Methods section that the approach proposed by the Reviewer could have been used: “Another approach to sampling pseudo-absences would have been to distribute them according to the density of domestic poultry.” Finally, our approach is also justified in our response to the next comment of the Reviewer.
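
      The sampling scheme described in this response can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the manuscript: the function names, the unit and pixel identifiers, and the total pseudo-absence budget are hypothetical, and only the five-outbreak threshold, the allocation in proportion to outbreak counts, and the weighting by human population density come from the text above.

```python
import random

MIN_OUTBREAKS = 5  # minimum confirmed outbreaks per administrative unit (from the response)


def allocate_pseudo_absences(outbreaks_per_unit, total_pseudo_absences):
    """Split a pseudo-absence budget across eligible units in proportion to outbreak counts.

    Units below the threshold are dropped. Rounding means the allocations
    may not sum exactly to the requested budget.
    """
    eligible = {u: n for u, n in outbreaks_per_unit.items() if n >= MIN_OUTBREAKS}
    total = sum(eligible.values())
    return {u: round(total_pseudo_absences * n / total) for u, n in eligible.items()}


def sample_pixels(pixel_density, n_points, rng):
    """Draw pixels (with replacement) with probability proportional to human population density."""
    pixels = list(pixel_density)
    weights = [pixel_density[p] for p in pixels]
    return rng.choices(pixels, weights=weights, k=n_points)


# Hypothetical example: unit "C" has fewer than five outbreaks and is excluded.
allocation = allocate_pseudo_absences({"A": 10, "B": 30, "C": 2}, total_pseudo_absences=400)
points = sample_pixels({"p1": 0.0, "p2": 5.0}, 10, random.Random(0))
```

      Replacing the human population density values in `pixel_density` with poultry density would give the alternative raised by the Reviewer, at the cost of the circularity discussed above.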

      Having written my review, including the paragraph above, I briefly scanned Dhingra et al., and found that they provide justification for the use of human population density to sample pseudoabsences in farmed birds: "the Empres-i database compiles outbreak locations data from very heterogeneous sources and in the absence of explicit GPS location data, the geo-referencing of individual cases is often through the use of place name gazetteers that will tend to force the outbreak location populated place, rather in the exact location of the farm where the disease was found, which would introduce a bias correlated with human population density." This context is entirely missing from the manuscript under review, however, I maintain the comment in the paragraph above - have the Authors trialled sampling pseudo-absences from poultry density layers?

      We agree with the Reviewer’s comment and have now added this clarification in the Materials and Methods section (in the third paragraph dedicated to ecological niche modelling): “However, as pointed out by Dhingra and colleagues (2016), the locations of outbreaks in the EMPRES-i database are often georeferenced using place name nomenclatures due to a lack of accurate GPS data, which could introduce a spatial bias towards populated areas.”

      The authors indirectly acknowledge the role of sampling bias in model predictions at line 163, however, this point could be clearer: there is sampling bias in the set of locations where HPAI has been observed and failure to adequately replicate this sampling bias in pseudo-absence data could lead covariates that are correlated with the observation distribution to appear to be correlated with the target distribution. This point is alluded to but should be clearly acknowledged to allow the reader to appropriately interpret your results. I understand the point being made on line 163 is that surveillance of HPAI in wild birds has become more structured and less opportunistic over time - if this is the case, a statement to this effect could replace "which could influence earlier data sets", which is a little ambiguous. The Authors acknowledge the role of sampling bias in lines 241-242 - this may be a good place to remind the reader that they have attempted to incorporate sampling bias through the selection of their pseudoabsence dataset, particularly for wild bird models.

      We thank the Reviewer for this comment. We have now clarified in the text that observed data on HPAI occurrence are inherently influenced by heterogeneous surveillance efforts and that failure to replicate this bias in pseudo-absence sampling could effectively lead to misleading correlations with covariates associated with surveillance effort rather than true ecological suitability. We have now rephrased the related sentence as follows: “This decline may indicate a reduced bias in observation data: typically, dead wild birds are more frequently found near human-populated areas due to opportunistic detections, whereas more recent surveillance efforts have become increasingly proactive (Giacinti et al., 2024).”

      Dhingra et al. aimed to account for the effect of mass vaccination of birds in China. This does not appear to be included in the updated models - is this a relevant covariate to consider in updated models? Are the models trained on pre-2020 data predicting to post-2020 given the same presence dataset as previous models? It may be helpful to provide a comment on this if we consider the pre-2020 models in this work to be representative of pre-2020 models as a cohort. Given the framing of the manuscript as an update to Dhingra et al., it may be useful for the authors to briefly summarise any differences between the existing models and updated models. Dhingra et al., also examine spatial extrapolation, which is not addressed here. Environmental extrapolation may be a useful metric to consider: are there areas where models are extrapolating that are predicted to be at high risk of HPAI transmission? Finally, they also provide some inset panels on global maps of model predictions - something similar here may also be useful.

      We thank the Reviewer for these comments. Vaccination coverage is indeed a relevant covariate for HPAI suitability in domestic birds. However, we did not include this variable in our updated models for two reasons. First, comprehensive vaccination data were only available for China, so it is not possible to include this variable in a global model. Second, available data were outdated and vaccination strategies can vary substantially over time.

      However, we agree with the Reviewer that the Materials and Methods section did not clearly describe the differences with Dhingra et al. (2016), and we now detail these differences at the beginning of the Materials and Methods section: “Our approach is similar to the one implemented by Dhingra and colleagues (2016). While Dhingra et al. (2016) developed their models only for domestic birds over the 2003-2016 periods, our models were developed for two host species separately (wild and domestic birds) and for two time periods (2016-2020 and 2020-2023).”

      We also detail the main difference concerning pseudo-absence sampling: Dhingra and colleagues (2016) used human population density to sample pseudo-absences to reflect potential surveillance bias and also applied spatial filtering (minimum and maximum distances from presence points). We adopted a similar strategy but also incorporated the outbreak count per country or province (in the case of China and Russia) into the pseudo-absence sampling process to further account for surveillance heterogeneity between these units. We have now added these specifications in the Materials and Methods section: “To account for heterogeneity in AIV surveillance and minimise the risk of sampling pseudo-absences in poorly monitored regions, we restricted our analysis to countries (or administrative level 1 units in China and Russia) with at least five confirmed outbreaks. Unlike Dhingra et al. (2016), who sampled pseudoabsences across a broader global extent, our sampling was limited to regions with demonstrated surveillance activity. In addition, we adjusted the density of pseudo-absence points according to the number of reported outbreaks in each country or admin-1 unit, as a proxy for surveillance effort — an approach not implemented in this previous study.”

      We have now also provided a comparison between the different outputs, particularly in the Results section: “Our findings were overall consistent with those previously reported by Dhingra and colleagues (Dhingra et al., 2016), who used data from January 2004 to March 2015 for domestic poultry. However, some differences were noted: their maps identified higher ecological suitability for H5 occurrences before 2016 in North America, West Africa, eastern Europe, and Bangladesh, while our maps mainly highlight ecologically suitable regions in China, South-East Asia, and Europe (Fig. S5). In India, analyses consistently identified high ecologically suitable areas for the risk of local H5Nx and H5N1 circulation for the three time periods (pre-2016, 2016-2020, and post-2020). Similar to the results reported by Dhingra and colleagues, we observed an increase in the ecological suitability estimated for H5N1 occurrence in South America's domestic bird populations post-2020. Finally, Dhingra and colleagues identified high suitability areas for H5Nx occurrence in North America, which are predicted to be associated with a low ecological suitability in the 2016-2020 models.”

      We acknowledge that some regions predicted as highly suitable correspond to areas where extrapolation likely occurs due to limited or no recorded outbreaks. We have now added these specifications when discussing the resulting suitability maps obtained for domestic birds: “For H5Nx post-2020, areas of high predicted ecological suitability, such as Brazil, Bolivia, the Caribbean islands, and Jilin province in China, likely result from extrapolations, as these regions reported few or no outbreaks in the training data”, and, for wild birds: “Some of the areas with high predicted ecological suitability reflect the result of extrapolations. This is particularly the case in coastal regions of West and North Africa, the Nile Basin, Central Asia (Kyrgyzstan, Tajikistan, Uzbekistan), Brazil (including the Amazon and coastal areas), southern Australia, and the Caribbean, where ecological conditions are similar to those in areas where outbreaks are known to occur but where records of outbreaks are still rare.”

      For wild birds (H5Nx, post-2020), high ecological suitability was predicted along the West and North African coasts, the Nile basin, Central Asia (e.g., Kyrgyzstan, Tajikistan, Uzbekistan), the Brazilian coast and Amazon region, Caribbean islands, southern Australia, and parts of Southeast Asia. Ecological suitability estimated in these regions may directly result from extrapolations and should therefore be interpreted cautiously.

      We also added a discussion of the extrapolation for wild birds (in the Discussion section): “Interestingly, our models extrapolate environmental suitability for H5Nx in wild birds in areas where few or no outbreaks have been reported. This discrepancy may be explained by limited surveillance or underreporting in those regions. For instance, there is significant evidence that Kazakhstan and Central Asia play a role as a centre for the transmission of avian influenza viruses through migratory birds (Amirgazin et al., 2022; FAO, 2005; Sultankulova et al., 2024). However, very few wild bird cases are reported in EMPRES-i. In contrast, Australia appears environmentally suitable in our models, yet no incursion of HPAI H5N1 2.3.4.4b has occurred despite the arrival of millions of migratory shorebirds and seabirds from Asia and North America. Extensive surveillance in 2022 and 2023 found no active infections nor evidence of prior exposure to the 2.3.4.4b lineage (Wille et al., 2024; Wille and Klaassen, 2023).”
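
      For readers interested in the environmental extrapolation metric raised by the Reviewer, one simple way to flag it is sketched below. This is an illustrative assumption and not an analysis reported in this response: a location is flagged when any covariate value falls outside the range seen in the training data. The covariate names and values are hypothetical.

```python
def training_ranges(training_rows):
    """Return the (min, max) of each covariate across the training locations."""
    names = training_rows[0].keys()
    return {
        name: (min(row[name] for row in training_rows),
               max(row[name] for row in training_rows))
        for name in names
    }


def extrapolated_covariates(location, ranges):
    """List the covariates whose value at `location` lies outside the training range."""
    return [name for name, (low, high) in ranges.items()
            if not low <= location[name] <= high]


# Hypothetical covariates: land surface temperature and domestic duck density.
training = [{"lst": 10.0, "duck": 0.0}, {"lst": 25.0, "duck": 40.0}]
ranges = training_ranges(training)
flagged = extrapolated_covariates({"lst": 30.0, "duck": 5.0}, ranges)
inside = extrapolated_covariates({"lst": 12.0, "duck": 5.0}, ranges)
```

      A range check of this kind only detects values outside the sampled extremes of each covariate taken separately; it does not detect novel combinations of covariates.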

      We agree that inset panels can be helpful for visualising global patterns. However, all resulting maps are available on the MOOD platform (https://app.mood-h2020.eu/core), which provides an interactive interface allowing users to zoom in and out, identify specific locations using a background map, and explore the results in greater detail. This resource is referenced in the manuscript to guide readers to the platform.

      Related to my review of the manuscript's conceptualisation above, there are several inconsistencies in terminology in the manuscript - clearing these up may help to make the methods and their justification clearer to the reader. The "signal" that the models are estimating is variously described as "susceptibility" and "risk" (lines 179-180), "HPAI H5 ecological suitability" (line 78), "likelihood of HPAI occurrences" (line 139), "risk of HPAI circulation" (line 187), "distribution of occurrence data" (line 428). Each of these quantities has slightly different meanings and it is confusing to the reader that all of these descriptors are used for model output. "Likelihood of HPAI occurrences" is particularly misleading: ecological niche models predict high suitability for a species in areas that are similar to environments where it has previously been identified, without imposing constraints on species movement. It is intuitively far more likely that there will be HPAI occurrences in areas where the disease is already established than in areas where an introduction event is required, however, the niche models in this work do not include spatial relationships in their predictions.

      We agree with the Reviewer’s comments. We have now modified the text so that, in the Results section, we refer to ecological suitability when describing the outputs of the models. In the Discussion section, we then interpret this ecological suitability in terms of risk, with areas of high ecological suitability being more likely to support local HPAI outbreaks.

      I also caution the authors in their interpretation of the results of BRTs, which are correlative models, so therefore do not tell us what causes a response variable, but rather what is correlated with it. On Line 31, "correlated with" may be more appropriate than "influenced by". On Line 82, "correlated with" is more appropriate than "driving". This is particularly true given the authors' treatment of sampling bias.

      We agree with the Reviewer’s comment and have now rephrased these sentences as follows: “The spatial distribution of HPAI H5 occurrences in wild birds appears to be primarily correlated with urban areas and open water regions” and “Our results provide a better understanding of HPAI dynamics by identifying key environmental factors correlated with the increase in H5Nx and H5N1 cases in poultry and wild birds, investigating potential shifts in their ecological niches, and improving the prediction of at-risk areas.”

      The following sentences in line 201 are ambiguous: "For both H5Nx and H5N1, however, isolated areas on the risk map should be interpreted with caution. These isolated areas may result from sparse data, model limitations, or local environmental conditions that may not accurately reflect true ecological suitability." By "isolated", do the authors mean remote? Or ecologically dissimilar from the set of locations where HPAI has been detected? Or ecologically dissimilar from the set of locations in the joint set of HPAI detection locations and pseudo-absences? Or ecologically similar to the set of locations where HPAI has been detected but spatially isolated? These four descriptors are each slightly different and change the meaning of the sentences. "Model limitations" are also ambiguous - could the authors clarify which specific model limitations they are referring to here? Ultimately, the point being made is probably that a model may predict high ecological suitability for HPAI transmission in areas where the disease has not yet been identified, or where a model is extrapolating in environmental space, however, uncertainty in these predictions may be greater than uncertainty in predictions in areas that are represented in surveillance data. A clear comment on model uncertainty and how it is related to the surveillance dataset and the covariate dataset is currently missing from the manuscript and would be appropriate in this paragraph.

      We understand the Reviewer’s concerns regarding these potential ambiguities, and have now rephrased these sentences as follows: “For both H5Nx and H5N1, certain areas of predicted high ecological suitability appear spatially isolated, i.e. surrounded by regions of low predicted ecological suitability. These areas likely meet the environmental conditions associated with past HPAI occurrences, but their spatial isolation may imply a lower risk of actual occurrences, particularly in the absence of nearby outbreaks or relevant wild bird movements.”

      I am concerned by the wording of the following sentence: "The risk maps reveal that high-risk areas have expanded after 2020" (line 203). This statement could be supported by an acknowledgement of the assumptions the models make of the HPAI niche: are we saying that the niche is unchanged in environmental space and that there are now more geographic areas accessible to the pathogen, or that the niche has shifted or expanded, and that there are now more geographic areas accessible to the pathogen? The authors should review the sentence beginning on line 117: if models trained on data from the old timepoint predicting to the new timepoint are almost as good as models trained on data from the new timepoint predicting to the new timepoint, doesn't this indicate that the niche, as the models are able to capture it, has not changed too much?

      We thank the Reviewer for this comment. The statement that "high-risk areas have expanded after 2020" indeed refers to an increase in the geographic extent of areas predicted to have high ecological suitability in models trained on post-2020 data. This expansion likely reflects new outbreak data from regions that had not previously reported cases, which in turn influenced model training.

      However, models trained on pre-2020 data retain reasonable predictive performance when applied to post-2020 data (see the AUC results reported in Table S1). The models therefore point to an expansion of the areas predicted to be ecologically suitable, but do not provide definitive evidence of a shift in the ecological niche. We have now added a statement at the end of this paragraph to clarify this point: “However, models trained on pre-2020 data maintained reasonable predictive performance when tested on post-2020 data, suggesting that the overall ecological niche of HPAI did not drastically shift over time.”
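
      As a reminder of what the AUC measures in this temporal comparison, the sketch below computes it as the probability that a randomly chosen presence point receives a higher predicted suitability than a randomly chosen pseudo-absence point, with ties counted as one half. The labels and scores are hypothetical and do not correspond to the values in Table S1.

```python
def auc(labels, scores):
    """Area under the ROC curve from pairwise comparisons of presences and pseudo-absences.

    labels: 1 for presence, 0 for pseudo-absence.
    scores: predicted ecological suitability for the same points.
    """
    presences = [s for y, s in zip(labels, scores) if y == 1]
    absences = [s for y, s in zip(labels, scores) if y == 0]
    if not presences or not absences:
        raise ValueError("AUC needs at least one presence and one pseudo-absence")
    wins = sum((p > a) + 0.5 * (p == a) for p in presences for a in absences)
    return wins / (len(presences) * len(absences))


# Hypothetical post-2020 test points scored by a model trained on pre-2020 data.
post_2020_labels = [1, 1, 0, 0]
pre_2020_model_scores = [0.9, 0.4, 0.5, 0.1]
transfer_auc = auc(post_2020_labels, pre_2020_model_scores)
```

      An AUC close to that of a model trained and tested on the same period would be consistent with a niche that has not shifted markedly, which is the argument made above.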

      The final two paragraphs of the Results might be more helpful to include at the beginning of the Results, as the data discussed there are inputs to the models. Is it possible that the "rise in Shannon index for sea birds" that "suggests a broadening of species diversity within this category from 2020 onwards" is caused by the increasingly structured surveillance of HPAI in wild birds alluded to earlier in the Results? Is the "prevalence" discussed in line 226 the frequency of the families Laridae and Sulidae being represented in HPAI detection data? Or the abundance of the bird species themselves? The language here is a little ambiguous. Discussion of particular values of Shannon/Simpson indices is slightly out of context as the meanings of the indices are in the Methods - perhaps a brief explanation of the uses of Shannon/Simpson indices may be helpful to the reader here. It may also be helpful to readers who are not acquainted with avian taxonomy to provide common names next to formal names (for example, in brackets) in the body of the text, as this manuscript is published in an interdisciplinary journal.

      We thank the Reviewer for these comments. First, we acknowledge that the paragraphs on species diversity and Shannon/Simpson indices describe important data, but we have chosen to present them after the main modelling results in order to maintain a logical narrative flow. Our manuscript first presents the ecological niche models and their predictive performance, followed by interpretations of the observed patterns, including changes in avian host diversity. Diversity indices were used primarily to support and contextualise the patterns observed in the modelling results.

      For clarity, we have revised the relevant paragraphs in the Results (i) to briefly remind readers of the interpretation of the Shannon and Simpson indices (“Note that these indices reflect the diversity of bird species detected in outbreak records, not necessarily their abundance in the wild”) and (ii) to clarify that “prevalence” refers to the frequency of HPAI detection in wild bird species of the Laridae (gulls) and Sulidae (boobies and gannets) families, and not their total abundance. A bird family includes several species, so the “common name” of a family can sometimes refer to species from other families. We have now added the common names for each family in the manuscript (although we acknowledge that “penguins” can be ambiguous).
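
      The two diversity indices can be illustrated with a short calculation on detection counts per species. This is a sketch under assumptions: the Shannon index is taken as H = -sum(p_i ln p_i), and the Simpson index is assumed here to be the Gini-Simpson form 1 - sum(p_i^2); the manuscript may use a different variant of the Simpson index. The counts are hypothetical.

```python
import math


def proportions(counts):
    """Convert detection counts per species into proportions, ignoring zero counts."""
    total = sum(counts)
    return [c / total for c in counts if c > 0]


def shannon_index(counts):
    """Shannon index H = -sum(p * ln p); 0 when a single species accounts for all detections."""
    return -sum(p * math.log(p) for p in proportions(counts))


def simpson_index(counts):
    """Gini-Simpson index 1 - sum(p^2) (assumed form); 0 when a single species dominates entirely."""
    return 1.0 - sum(p * p for p in proportions(counts))


# Hypothetical detection counts: two species detected equally often.
even_counts = [10, 10]
h_even = shannon_index(even_counts)
d_even = simpson_index(even_counts)
```

      Both indices are computed from the species recorded in outbreak data, so a rise can reflect broader detection as well as broader infection.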

      In the Methods, it is stated: "To address the heterogeneity of AIV surveillance efforts and to avoid misclassifying low-surveillance areas as unsuitable for virus circulation, we trained the ecological niche models only considering countries in which five or more cases have been confirmed." However, it is not clear how this processing step prevents low-surveillance areas from being misclassified. If pseudo-absences are appropriately sampled, low-surveillance areas should be less represented in the pseudo-absence dataset, which should lead the models to be uncertain in their predictions of these areas. Perhaps "To address the heterogeneity of AIV surveillance efforts and to avoid sampling pseudo-absence data in realistically low-surveillance areas" is a more accurate introduction to the paragraph. I am not entirely convinced that it is appropriate to remove detection data where the national number of cases is low. This may introduce further sampling bias into the dataset.

      We take this opportunity to further clarify this important step, which aims to mitigate bias associated with countries with substantial uncertainty in reporting and/or potentially insufficient HPAI surveillance data. While we acknowledge that this procedure may exclude countries that had effective surveillance but low virus detection, we argue that it constitutes a relevant conservative approach to minimising the risk of sampling a significant number of pseudo-absence points in areas associated with relatively high yet undetected local HPAI circulation due to insufficient surveillance. Furthermore, given that five cases over two decades is a relatively low threshold, particularly for a highly transmissible virus such as AIV, non-detection or non-reporting remains a more plausible explanation than true absence.

      To improve clarity, we have now revised the related sentence as follows: “To account for heterogeneity in AIV surveillance and minimise the risk of sampling pseudo-absences in poorly monitored regions, we restricted our analysis to countries (or administrative level 1 units in China and Russia) with at least five confirmed outbreaks.”

      The reporting of spatial and temporal resolution of data in the manuscript could be significantly clearer. Is there a reason why human population density is downscaled to 5 arcminutes (~10km at the equator) while environmental covariate data has a resolution of 1km? The projection used is not reported. The authors should clarify the time period/resolution of the covariate data assigned to the occurrence dataset, for example, does "day LST annual mean" represent a particular year pre- or post-2020? Or an average over a number of years? Given that disease detections are associated with observation and reporting dates, and that there may be seasonal patterns in HPAI occurrence, it would be helpful to the reader to include this information when the eco-climatic indices are described. It would also be helpful to the reader to summarise the source, spatial and temporal resolution of all covariates in a table, as in Dhingra et al. Could the Authors clarify whether the duck density layer is farmed ducks or wild ducks?

      The projection is WGS 84 (EPSG:4326) and the resolution of the output maps is around 0.0833 x 0.0833 decimal degrees (i.e. 5 arcmin, or approximately 10 km at the equator). We have now added these specifications in the text: “All maps are in a WGS84 projection with a spatial resolution of 0.0833 decimal degrees (i.e. 5 arcmin, or approximately 10 km at the equator).” In addition, for clarity, we have now specified in the text that “duck” refers to domestic ducks.
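
      The stated resolution can be checked with a short calculation: 5 arcmin is 5/60 = 0.0833 decimal degrees, and one degree at the equator spans roughly 111.3 km. The sketch below uses the WGS 84 equatorial circumference and a spherical cosine correction for the east-west width at other latitudes, which is an approximation; the function name is hypothetical.

```python
import math

EQUATORIAL_CIRCUMFERENCE_KM = 40075.017  # WGS 84 equatorial circumference


def cell_width_km(arcmin, latitude_deg=0.0):
    """Approximate east-west width of a grid cell of `arcmin` arcminutes at a given latitude."""
    degrees = arcmin / 60.0
    km_per_degree = EQUATORIAL_CIRCUMFERENCE_KM / 360.0
    return degrees * km_per_degree * math.cos(math.radians(latitude_deg))


resolution_deg = 5 / 60               # 0.0833 decimal degrees
width_equator = cell_width_km(5)      # about 9.3 km, i.e. roughly 10 km
width_60n = cell_width_km(5, 60.0)    # about half the equatorial width
```

      Because the grid is defined in degrees, cells cover a smaller ground area at high latitudes than at the equator.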

      The environmental variables retrieved for our analyses were available as values averaged over distinct periods of time (for further detail, see Supplementary Information Resources S1, which gives the description and source of each environmental variable included in the original sets of variables and is available at https://github.com/sdellicour/h5nx_risk_mapping). In future work, it would indeed be interesting to associate each occurrence with a specific season and with the corresponding seasonal values of the variables, especially for viruses such as HPAI whose occurrence has been found to be correlated with season. However, we did not conduct this type of analysis in the present study, in which occurrences are associated with averaged values of environmental data only.

      In line 407, the authors state a number of pseudo-absence points used in modelling, relative to the number of presence points, without clear justification. Note that relative weights can be assigned to occurrence data in most ECN software (e.g., R package gbm), to allow many pseudo-absence points to be sampled to represent the full extent of probable surveillance effort and subsequently down-weighted.

      We thank the Reviewer for this suggestion. We acknowledge that alternative approaches such as down-weighting pseudo-absence points could offer a certain degree of flexibility in representing surveillance effort. However, we opted for a fixed 1:3 ratio of pseudo-absences to presence points within each administrative unit to ensure a consistent and conservative sampling distribution. This approach aimed to limit overrepresentation of pseudo-absences in areas with sparse presence data, while still reflecting areas of likely surveillance.
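
      For comparison, the down-weighting alternative mentioned by the Reviewer can be sketched as follows. This illustrates the Reviewer's suggestion rather than the fixed-ratio approach used in the manuscript: many pseudo-absences are kept, and each receives a weight chosen so that the total pseudo-absence weight equals the total presence weight. The function name and the example labels are hypothetical.

```python
def balanced_weights(labels):
    """Per-observation weights: 1 for each presence, n_presence / n_absence for each pseudo-absence.

    labels: 1 for presence, 0 for pseudo-absence.
    """
    n_presence = sum(1 for y in labels if y == 1)
    n_absence = len(labels) - n_presence
    if n_presence == 0 or n_absence == 0:
        raise ValueError("need at least one presence and one pseudo-absence")
    absence_weight = n_presence / n_absence
    return [1.0 if y == 1 else absence_weight for y in labels]


# Hypothetical data set: 2 presences and 8 pseudo-absences.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
weights = balanced_weights(labels)
```

      Weights of this kind would be passed as per-observation weights when fitting the model, so that a larger pseudo-absence sample does not dominate the fit.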

      There are a number of typographical errors and phrasing issues in the manuscript. A nonexhaustive list is provided below.

      - Line 21: "its" should be "their" - Line 25: "HPAI cases"

      Modifications have been done.

      - Line 63: sentence beginning "However" is somewhat out of context - what is it (briefly) about recent outbreaks that challenge existing models?

      We have now edited that sentence as follows: “However, recent outbreaks raise questions about whether earlier ecological niche models still accurately predict the current distribution of areas ecologically suitable for the local circulation of HPAI H5 viruses.”

      - Lines 71 and 390: "AIV" is not defined in the text

      - Line 73: "do" ("are" and "what" are not capitalised)

      Modifications have been done.

      - Line 115: "predictability" should be "predictive capacity"

      We have now replaced “predictability” by “predictive performance”.

      - Line 180: omit "pinpointing"

      - Line 192 sentence beginning "In India," should be re-worded: is the point that there are detections of HPAI here and the model predicts high ecological suitability?

      - Line 195 sentence beginning "Finally," phrasing could be clearer: Dhingra et al. find high suitability areas for H5Nx in North America which are predicted to be low suitability in the new model.

      - Line 237: omit "the" in "with the those"

      - Line 374: missing "."

      - Line 375: "and" should be "to" (the same goes for line 421)

      - Line 448: Rephrase "Simpson index goes" to "The Simpson index ranges"

      Modifications have been done.

      Reviewer #2 (Public Review):

      What is the justification for separating the dataset at 2020? Is it just the gap in-between the avian influenza outbreaks?

      We chose 2020 as a cut-off based on a well-documented shift in HPAI epidemiology, notably the emergence and global spread of clade 2.3.4.4b, which may affect host dynamics and geographic patterns. We have now added this precision in the Materials and Methods section: “We selected 2020 as a cut-off point to reflect a well-documented shift in HPAI epidemiology, notably the emergence and global spread of clade 2.3.4.4b. This event marked a turning point in viral dynamics, influencing both the range of susceptible hosts and the geographical distribution of outbreaks.”

      If the analysis aims to look at changing case numbers and distribution over time, surely the covariate datasets should be contemporaneous with the response?

      Thank you for raising this important point. While we acknowledge that, ideally, covariates should match the response temporally, such high-resolution spatiotemporal environmental data were not available for most environmental factors considered in our ecological niche modelling analyses. We used predictors (e.g., land-use variables, poultry density) that reflect long-term ecological suitability, and we acknowledge that considering short-term seasonal variation instead could be an interesting perspective for future work, which is now explicitly stated in the Discussion section: “In addition, aligning outbreak occurrences with seasonally matched environmental variables could further refine predictions of HPAI risk linked to migratory dynamics.”

      I would expect quite different immunity dynamics between domestic and wild birds as a function of lifespan and birth rates - though no obvious sign of that in the raw data. A statement on assumptions in that respect would be good.

      Thank you for the comment. We agree that domestic and wild birds likely exhibit different immunity dynamics due to differences in lifespan, turnover rates, and exposure. However, our analyses did not explicitly model immunity processes, and the data did not show a clear signal of these differences.

      Decisions and analytical tactics from Dhingra et al are adopted here in a way that doesn't quite convey the rationale, or justify its use here.

      We thank the Reviewer for this observation. However, we do not agree with the notion that the rationale for using Dhingra et al.’s analytical framework is insufficiently conveyed. We adapted key components of their ecological niche modelling approach — such as the use of a boosted regression tree methodology and pseudo-absences sampling procedure — to ensure comparability with their previous findings, while also extending the analysis to additional time periods and host categories (wild vs. domestic birds). This framework aligns with the main objective of our study, which is to assess shifts in ecological suitability for HPAI over time and across host species, in light of changing viral dynamics.  

      Please go over the manuscript and harmonise the language about the model target - it is usually referred to as cases, but sometimes the pathogen, and others the wild and domestic birds where the cases were discovered.

      We agree and we have now modified the text to only use the “cases” or “occurrences” terminology when referring to the model inputs.

      Is the reporting of your BRT implementation correct? The text suggests that only 10 trees were run per replicate (of which there were 10 per response (domestic/wild x H5N1 / H5Nx) x distinct covariate set), but this would suggest that the authors were scarcely benefiting from the 'boosting' part of the BRTs that allow them to accurately estimate curvilinear functions. As additional trees are added, they should still be improving the loss function, and dramatically so in the early stages. The authors seem heavily guided by Elith et al's excellent paper[1] explaining BRTs and the companion tutorial piece, but in that work, the recommended approach is to run an initial model with a relatively quick learning rate that achieves the best fit to the held-out data at somewhere over 1000 trees, and then to refine the model to that number of trees with a slower learning rate. If the authors did indeed run only 10 trees I think that should be explained.

      For each model, we used the “gbm.step” function to fit boosted regression trees, initiating the process with 10 trees and allowing up to 10,000 trees in steps of 5. The optimal number of trees was automatically determined by minimising the cross-validated deviance, following the recommended approach of Elith and colleagues (2008, J. Anim. Ecol.). This setup allows the boosting algorithm to iteratively improve model performance while avoiding overfitting. These aspects are now further clarified in the Materials and Methods section: “All BRT analyses were run and averaged over 10 cross-validated replicates, with a tree complexity of 4, a learning rate of 0.01, a tolerance parameter of 0.001, and while considering 5 spatial folds. Each model was initiated with 10 trees, and additional trees were incrementally added (in steps of 5) up to a maximum of 10,000, with the optimal number selected based on cross-validation tests.”
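
      The tree-number selection described above (start at 10 trees, add trees in steps of 5 up to 10,000, keep the count with the lowest cross-validated deviance) can be illustrated with a simplified sketch. It is not the "gbm.step" implementation, which has its own stopping rule. Here the whole grid is evaluated, and the deviance curve is a hypothetical stand-in for a real cross-validation run.

```python
import math


def select_n_trees(cv_deviance, n_start=10, step=5, max_trees=10_000):
    """Return (n_trees, deviance) minimising `cv_deviance` over the grid
    n_start, n_start + step, ..., up to max_trees."""
    best_n, best_dev = None, float("inf")
    n = n_start
    while n <= max_trees:
        dev = cv_deviance(n)
        if dev < best_dev:
            best_n, best_dev = n, dev
        n += step
    return best_n, best_dev


def toy_cv_deviance(n_trees):
    """Hypothetical deviance curve: fast early gains from boosting,
    followed by a slow rise as the model starts to overfit."""
    return math.exp(-n_trees / 300) + n_trees / 5000


best_n, best_dev = select_n_trees(toy_cv_deviance)
print(best_n, round(best_dev, 4))
```

      With a curve of this shape the selected count lies well beyond the initial 10 trees, which is the point of the clarification: 10 is only the starting size of the ensemble, not its final size.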

      I'm uncomfortable with the strong interpretation of changes in indices such as those for diversity in the case of bird species with detected cases of avian influenza, and the relative influence of covariates in the environmental niche models. In the former case, if surveillance effort is increasing it might be expected that more species will be found to be infected. In the latter, I'm just not convinced that these fundamentally correlative models can support the interpretation of changing epidemiology as asserted by authors. This strikes me as particularly problematic in light of static and in some cases anachronistic predictor sets.

      We thank the Reviewer for drawing attention to how changes in surveillance intensity might influence our diversity estimates. We have now integrated a new analysis to evaluate the increase in the number of wild birds tested and discussed the potential impact of this increase on the comparison of the bird species diversity metrics presented in our study, which is now interpreted with more caution: “To evaluate whether the post-2020 increase in species diversity estimated for infected wild birds could result from an increase in the number of tests performed on wild birds, we compared European annual surveillance test counts (EFSA et al., 2025, 2019) before and after 2020 using a Wilcoxon rank-sum test. We relied on European data because it was readily accessible and offered standardised and systematically collected metrics across multiple years, making it suitable for a comparative analysis. Although borderline significant (p-value = 0.063), the Wilcoxon rank-sum test indeed highlighted a recent increase in the number of wild bird tests (on average >11,000/year pre-2020 and >22,000 post-2020), which indicates that the comparison of bird species diversity metrics should be interpreted with caution. However, such an increase in the number of tests conducted in the context of a passive surveillance framework would thus also be in line with an increase in the number of wild birds found dead and thus tested. Therefore, while the increase in the number of tests could indeed impact species diversity metrics such as the Shannon index, it can also reflect an absolute higher wild bird mortality in line with a broadened range of infected bird species.”
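
      The sensitivity of diversity indices to sampling effort can be illustrated with a minimal sketch. The species counts below are hypothetical, and the Simpson index is written in its Gini-Simpson form (1 minus the sum of squared proportions), which is an assumption since the response does not state which form was used.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over species with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def simpson_index(counts):
    """Gini-Simpson index 1 - sum(p_i^2); it ranges from 0 towards 1."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)


# Hypothetical counts of infected birds per species.
fewer_tests = [50, 30, 20]
# Same dominant species, plus two rare species detected only with more testing.
more_tests = [50, 30, 20, 1, 1]

print(shannon_index(fewer_tests) < shannon_index(more_tests))  # True
print(simpson_index(fewer_tests) < simpson_index(more_tests))  # True
```

      In this toy case both indices increase simply because additional testing picked up rare species, which is the caution expressed in the response.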

    1. How we understand ourselves

      Here to do How we understand ourselves in the world around us, that's like the central human task. Like that's what we're here to do is to figure that out and to act accordingly.

      understand to be understood

      Love to be Loved

      Amour Mundi

      cultivate an interest in the things u must do

      Learn what u need to Learn and Learn It

      Symmathesy mutual learning

      Symmatheticist in the medium of Software

      we all are

      Software design and development is a Symmathetic conversation not just in but through the medium of software

    1. A 3-by-4 design has 4 factors each at 3 level

      Not correct: a 3-by-4 design has 2 factors, one with 3 levels and one with 4 levels. The number of numbers in the name gives the number of factors, and the value of each number gives the number of levels of that factor.
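
      A small sketch makes the naming rule concrete. The function name is hypothetical, and factor levels are simply numbered from 1.

```python
from itertools import product


def describe_design(levels_per_factor):
    """Return the number of factors and every cell (combination of levels)
    of a full factorial design, e.g. (3, 4) for a 3-by-4 design."""
    n_factors = len(levels_per_factor)
    cells = list(product(*(range(1, k + 1) for k in levels_per_factor)))
    return n_factors, cells


n_factors, cells = describe_design((3, 4))
print(n_factors)   # 2 factors
print(len(cells))  # 12 cells (3 levels x 4 levels)
```

      By the same rule, a design with 4 factors at 3 levels each would be written 3-by-3-by-3-by-3 and has 81 cells.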

    1. He ate 720 eggs in a month! The result? Nobody expected this

      Based on YT video I Ate 720 Eggs in 1 Month. Here's What Happened to my Cholesterol:

      • Nick Norwitz, a Harvard medical student with a doctorate in human cerebral metabolism, ate 720 eggs in one month, averaging 24 eggs per day, to test the effects on his cholesterol levels.
      • Contrary to popular belief that eggs increase bad cholesterol (LDL), his LDL cholesterol dropped during the experiment. It decreased by 2% in the first two weeks and then by an additional 18% in the next two weeks.
      • The high cholesterol intake (about 133,000 mg from eggs) did not raise his blood cholesterol. This was explained by a hormone called cholesin, released when cholesterol binds to receptors in the gut, which signals the liver to reduce cholesterol production, maintaining balance in the body.
      • During the last two weeks, he also increased carbohydrate intake slightly, which further helped reduce LDL levels, possibly in combination with other dietary elements like fruits.
      • The experiment challenges the long-held medical advice limiting egg consumption due to cholesterol concerns, suggesting that dietary cholesterol may not significantly impact blood cholesterol levels in healthy individuals.
      • Norwitz documented his experiment publicly on YouTube, sparking discussions on nutrition and metabolism and encouraging re-evaluation of egg consumption guidelines.
      • Eggs are a veritable treasure trove of nutrients: vitamins A, D, E, and K, B vitamins, calcium, phosphorus, iron, sodium, and potassium. However, even such a valuable product should not replace other essential nutrients.
    1. Epigeneticist: a study of a supercentenarian shows that aging can be slowed
      • The world's oldest verified person, Maria Branyas Morera, lived to 117 years and 168 days, surpassing the average lifespan by over 30 years.
      • A comprehensive study analyzed her genetics, epigenetics, metabolism, immunity, and gut microbiome using blood, saliva, urine, and stool samples.
      • Despite her advanced age, she exhibited excellent health: low inflammation, healthy lipid profile, protection against age-related illnesses, and no cancer or neurodegenerative diseases.
      • Her biological age (measured by epigenetic markers) was much younger than her chronological age, suggesting her cells aged more slowly.
      • No single “longevity gene” was found; instead, her longevity was linked to a rare combination of genetic variants, a robust immune system, a healthy Mediterranean diet, daily physical activity, and an exceptionally diverse gut microbiome rich in Bifidobacterium.
      • Her lifestyle choices—never smoking or drinking alcohol, walking daily, and eating three yogurts a day—supported her gut health and helped keep inflammation low.
      • Researchers emphasized that both genetic factors and environmental/lifestyle influences interplay, and that aspects like diet, stress reduction, and supporting a healthy microbiome may help slow biological aging.
      • The study’s results suggest the importance of healthy aging (maintaining function and independence in old age), and reinforce the idea that everyone can potentially influence their rate of aging through lifestyle, even if they don’t carry rare longevity genes.
    1. eLife Assessment

      This study presents valuable findings regarding the basic molecular pathways leading to the cystogenesis of Autosomal Dominant Polycystic Kidney Disease, suggesting BICC1 functions as both a minor causative gene for PKD and a modifier of PKD severity. Solid data were supplied to show the functional and structural interactions between BICC-1 and PKD2 and their relevance to the pathogenesis of ADPKD, although the characterization of such interactions remains to be developed further and the clinical relevance is currently unclear.

    2. Reviewer #1 (Public review):

      In this manuscript, Tran et al. investigate the interaction between BICC1 and ADPKD genes in renal cystogenesis. Using biochemical approaches, they reveal a physical association between Bicc1 and PC1 or PC2 and identify the motifs in each protein required for binding. Through genetic analyses, they demonstrate that Bicc1 inactivation synergizes with Pkd1 or Pkd2 inactivation to exacerbate PKD-associated phenotypes in Xenopus embryos and potentially in mouse models. Furthermore, by analyzing a large cohort of PKD patients, the authors identify compound BICC1 variants alongside PKD1 or PKD2 variants in trans, as well as homozygous BICC1 variants in patients with early-onset and severe disease presentation. They also show that these BICC1 variants repress PC2 expression in cultured cells.

      Overall, the concept that BICC1 variants modify PKD severity is plausible, the data are robust, and the conclusions are largely supported.

      Comments on revision:

      My comments have been mostly addressed.

    3. Reviewer #2 (Public review):

      Tran and colleagues report evidence supporting the expected yet undemonstrated interaction between the Pkd1 and Pkd2 gene products Pc1 and Pc2 and the Bicc1 protein in vitro, in mice, and collaterally, in Xenopus and HEK293T cells. The authors go on to convincingly identify two large and non-overlapping regions of the Bicc1 protein important for each interaction and to perform gene dosage experiments in mice that suggest that Bicc1 loss of function may compound with Pkd1 and Pkd2 decreased function, resulting in PKD-like renal phenotypes of different severity. These results led to examining a cohort of very early onset PKD patients to find three instances of co-existing mutations in PKD1 (or PKD2) and BICC1. Finally, preliminary transcriptomics of edited lines gave variable and subtle differences that align with the theme that Bicc1 may contribute to the PKD defects, yet are mechanistically inconclusive.

      These results are potentially interesting, despite the limitation, also recognized by the authors, that BICC1 mutations seem exceedingly rare in PKD patients and may not "significantly contribute to the mutational load in ADPKD or ARPKD". The manuscript has several intrinsic limitations that must be addressed.

      The manuscript contains factual errors, imprecisions, and language ambiguities. This has the effect of making this reviewer wonder how thorough the research reported and analyses have been.

      Comments on revision:

      My comments have been addressed.

    4. Reviewer #3 (Public review):

      Summary:

      This study investigates the role of BICC1 in the regulation of PKD1 and PKD2 and its impact on cystogenesis in ADPKD. By utilizing co-IP and functional assays, the authors demonstrate physical, functional, and regulatory interactions between these three proteins.

      Strengths:

      (1) The scientific principles and methodology adopted in this study are excellent, logical, and reveal important insights into the molecular basis of cystogenesis.

      (2) The functional studies in animal models provide tantalizing data that may lead to a further understanding and may consequently lead to the ultimate goal of finding a molecular therapy for this incurable condition.

      (3) In describing the patients from the Arab cohort, the authors have provided excellent human data for further investigation in large ADPKD cohorts. Even though there was no patient material available, such as HUREC, the authors have studied the effects of BICC1 mutations and demonstrated its functional importance in a Xenopus model.

      Weaknesses:

      This is a well-conducted study and could have been even more impactful if primary patient material was available to the authors. A further study in HUREC cells investigating the critical regulatory role of BICC1 and potential interaction with mir-17 may yet lead to a modifiable therapeutic target.

      Conclusion:

      The authors achieve their aims. The results reliably demonstrate the physical and functional interaction between BICC1 and PKD1/PKD2 genes and their products.

      The impact is hopefully going to be manifold:

      (1) Progressing the understanding of the regulation of the expression of PKD1/PKD2 genes.

      Comments on revision:

      My comments have been addressed and sorted.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      (1) The authors devote significant effort to characterizing the physical interaction between Bicc1 and Pkd2. However, the study does not examine or discuss how this interaction relates to Bicc1's well-established role in posttranscriptional regulation of Pkd2 mRNA stability and translation efficiency.

      The reviewer is correct that the present study has not addressed the downstream consequences of this interaction considering that Bicc1 is a posttranscriptional regulator of Pkd2 (and potentially Pkd1). We think that the complex of Bicc1/Pkd1/Pkd2 retains Bicc1 in the cytoplasm and thus restricts its activity in participating in posttranscriptional regulation (see Author response image 1). We, however, do not yet have data to support this and thus have not included this model in the manuscript. Yet, we have updated the discussion of the manuscript to further elaborate on the potential mechanism of the Bicc1/Pkd1/Pkd2 complex.

      We have updated the discussion to include a discussion on the potential consequences on posttranscriptional regulation by Bicc1.

      Author response image 1.

      Model of BICC1, PC1 and PC2 self-regulation. In this model Bicc1 acts as a positive regulator of PKD gene expression. In the presence of ‘sufficient’ amounts of PC1/PC2 complex, it is tethered to the complex and remains biologically inactive (Fig. 1A). However, once the levels of the PC1/PC2 complex are reduced, Bicc1 is now present in the cytoplasm to promote expression of the PKD proteins, thereby raising their levels (Fig. 4B), which then in turn will ‘shutdown’ Bicc1 activity by again tethering it to the plasma membrane.

      (2) Bicc1 inactivation appears to downregulate Pkd1 expression, yet it remains unclear whether Bicc1 regulates Pkd1 through direct interaction or by antagonizing miR-17, as observed in Pkd2 regulation. This should be further examined or discussed.

      This is a very interesting comment. Vishal Patel published that PKD1 is regulated by a mir-17 binding site in its 3’UTR (PMID: 35965273). We, however, have not evaluated whether BICC1 participates in this regulation. A definitive answer would require utilization of the mice described in above reference, which is beyond the scope of this manuscript. We, however, have revised the discussion to elaborate on this potential mechanism. 

      We have updated the discussion to include a statement on the potential direct regulation of Pkd1 mRNA by Bicc1.

      (3) The evidence supporting Bicc1 and ADPKD gene cooperativity, particularly with Pkd1, in mouse models is not entirely convincing, likely due to substantial variability and the aggressive nature of Bpk/Bpk mice. Increasing the number of animals or using a milder Bicc1 strain, such as jcpk heterozygotes, could help substantiate the genetic interaction.

      We initially performed the analysis using our Bicc1 complete knockout, which we previously reported (PMID 20215348), focusing on compound heterozygotes. Yet, similar to the Pkd1/Pkd2 compound heterozygotes (PMID 12140187), no cyst development was observed when we sacrificed the mice as late as P21. Our strain is similar to the above-mentioned jcpk, which is characterized by a short, abnormal transcript thought to result in a null allele (PMID: 12682776). We thank the reviewer for pointing us to the reference showing that heterozygous mice exhibit glomerular cysts as adults (PMID: 7723240). This suggestion is an interesting idea we will investigate. In general, we agree with the reviewer that a better understanding of the contribution of Bicc1 to the adult PKD phenotype will be critical. To this end, we are currently generating a floxed allele of Bicc1 that will allow us to address the cooperativity in the adult kidney, e.g. when crossed to the Pkd1<sup>RC/RC</sup> mice. Yet, these experiments are beyond the timeframe for this revision.

      No changes were made in the revised manuscript. 

      Reviewer #2 (Public review):

      (1) These results are potentially interesting, despite the limitation, also recognized by the authors, that BICC1 mutations seem exceedingly rare in PKD patients and may not "significantly contribute to the mutational load in ADPKD or ARPKD". The manuscript has several intrinsic limitations that must be addressed. 

      As mentioned above, the study was designed to explore whether there is an interaction between BICC1 and the PKD1/PKD2 and whether this interaction is functionally important. How this translates into the clinical relevance will require additional studies (and we have addressed this in the discussion of the manuscript).

      (2) The manuscript contains factual errors, imprecisions, and language ambiguities. This has the effect of making this reviewer wonder how thorough the research reported and analyses have been. 

      We respectfully disagree with the reviewer on the latter interpretation. The study was performed with rigor. We have carefully assessed the critiques raised by the reviewer. As presented below, most of the criticisms raised by the reviewer have been easily addressed in the revised version of the manuscript. Yet, none of the critiques seems to directly impact the overall interpretation of the data. 

      Reviewer #1 (Recommendations for the authors):

      (1) The manuscript requires further editing. For example, figure panels and legends are mismatched in Figure 1

      We have corrected the labeling of Figure 1. 

      (2) Y-axis units and values are inconsistent in Figures 4b-4g, Supplementary Figures S2e and S2f are not referenced in the text, genotypes are missing in Supplementary Figure S3f, and numerous typographical errors are present.

      In respect to the y-axis in Figure 4b-g, the scale is different for each of them, but that is intentional as one would lose the differences if they were all scaled identically. But we have now mentioned this in the figure legend to make the reader aware of it. In respect to the Supplemental Figure S2e,f, we included the panels in the description of the mutant BICC1 lines, but unfortunately forgot to reference them. This has now been done.

      We have updated the labeling of the Y-axis for the cystic indices adding “[%]” as the unit and updated the figure legend of Figure 4. We have included the genotypes in Supplementary Figure S3f. The Supplementary Figure S2e,f is now mentioned in the supplemental material (page 9, 2<sup>nd</sup> paragraph). 

      Reviewer #2 (Recommendations for the authors):

      (1) "Previous data from mouse, Xenopus, and zebrafish suggest a crucial role for the RNA-binding protein Bicc1 in the pathogenesis of PKD, although BICC1 mutations in human PKD have not been previously reported." The cited sources (and others that were not cited) link Bicc1 mutations to renal cysts, similar to a report by Kraus (PMID: 21922595) that the authors cite later. However, a more direct link to PKD was reported by Lian and colleagues using whole Pkd1 mice (PMID: 20219263) and by Gamberi and colleagues using Pkd1 kidneys and human microarrays (PMID: 28406902). Although relevant, neither is cited here, and only the former is cited later in the manuscript.

      Thanks for pointing this out. We have added these three citations.

      We have added these three citations (PMID: 21922595, PMID: 20219263 and PMID: 28406902) in the indicated sentence.

      (2) In Figure 1B, the lanes do not seem to correspond among panels, particularly evident in the panel with myc-mBicc1. Hence, it is difficult to agree with the presented conclusions.

      We have corrected the labeling of the lanes in Figure 1b.

      (3) In the Figure 1 legend: "(g) Western blot analysis following co-IP experiments, using an anti-mouse Bicc1 or anti-goat PC2 antibody as bait, identified protein interactions between endogenous PC2 and BICC1 in UCL93 cells. Non-immune goat and mouse IgG were included as a negative control." There is no mention of panel H, although this reviewer can imagine what the authors meant. The capitalization differs in the figure and legend. More troublingly, in panel G, a non-defined star indicates a strong band present in both immune and non-immune control.

      We have corrected the figure legend of Figure 1 and clarified the non-specific band in the figure legend.

      (4) In Figure 4, the authors do not show the matched control for the Bicc1 Pkd1 interaction in panel d, nor do they show a scale bar in either a) or d). Thus, the phenotypic severity cannot be properly assessed.

      Thanks for pointing out the missing scale bars, which have now been added. In respect to the two kidneys shown in Figure 4d, the two kidneys shown are from littermates to illustrate the kidney size in agreement with the cumulative data shown in Figure 4e. Unfortunately, this litter did not have a wildtype control. As the data analysis in Figure 4e is based on littermates, mixing and matching kidneys of different litters does not seem appropriate. Thus, we have omitted showing a wildtype control in this panel. However, the size of the wildtype kidney can be seen in Figure 4a.

      We have added the scale bar to both panels and have updated the figure legend to emphasize that the kidneys shown are from littermates and that no wildtype littermate was present in this litter.

      (5) "Surprisingly, an 8-fold stronger interaction was observed between full-length PC1 and myc-mBicc1-ΔKH compared to myc-mBicc1 or myc-mBicc1-ΔSAM." Assuming all the controls for protein folding and expression levels have been carried out and not shown/mentioned, this sentence seems to contradict the previous statement that Bicc1deltaSAM reduced the interaction with PC1 by 55%. Because the full length and SAM deletion have different interaction strengths, the latter sentence makes no sense.

      The reduction in the levels of myc-mBicc1-ΔSAM compared to wildtype myc-mBicc1 with respect to PC1 binding was not significant. We have clarified this in the text.

      We have corrected the sentence and modified the Figure accordingly. 

      (6) Imprecise statements make a reader wonder how to interpret the data: "More than three independent experiments were analyzed." Stating the sample size or including it in the figure would save space and improve confidence in the data presented.

      We have stated the exact number of animals per conditions above each of the bars.

      (7) "Next, we performed a similar mouse study for Pkd1 by reducing the gene dose of Pkd1 postnatally in the collecting ducts using a Pkhd1-Cre as previously described40" What did the authors mean?

      The reference was included to cite the mouse strain, but we realized that it could be misinterpreted as meaning that the exact experiments had been performed previously. We have clarified this in the text.

      We have reworded the sentence to avoid misinterpretation. 

      (8) The authors examined the additive effects of knocking down Bicc1, Pkd1, and Pkd2 with morpholinos in Xenopus and, genetically, in mice. While the Bicc1[+/-] Pkd1 or 2[+/-] double heterozygote mice did not show phenotypes, the authors report that the Bicc1[-/-] Pkd1 or 2 [+/-] did instead show enlarged kidneys. What is the phenotype of a Bicc1[+/-] Pkd1 or 2 [-/-]? What we learn from the author's findings among the PKD population suggests that the latter situation would be potentially translationally relevant.

      The mouse experiments were designed to address a cooperativity between Bicc1 and either Pkd1 or Pkd2 and whether removal of one copy of Pkd1 or Pkd2 would further worsen the Bicc1 cystic kidney phenotype. Thus, the parental crosses were chosen to maximize the number of animals obtained for these genotypes. Unfortunately, these crosses did not yield the genotypes requested by the reviewer. To address the contribution of Bicc1 towards the PKD population, we will need to perform a different cross, where we eliminate Pkd1 or Pkd2 in a floxed background of Bicc1 postnatally in adult mice. While we are gearing up to perform such an experiment, this is timewise beyond the scope of the manuscript. In addition, please note that we have addressed the question about the translation towards the PKD population already in the discussion of the original submission (page 13/14, last/first paragraph).

      No changes have been made to the revised version of the manuscript.

      (9) How do the authors interpret the milder effects of the Bicc1[-/-] Pkd1[+/-] compared to Bicc1[-/-] Pkd2[+/-] relative to the respective protein-protein interactions?

      The milder effects are due to the nature of the crosses. While the Pkd2 mutant is a germline mutation, the Pkd1 mutant is a conditional allele eliminating Pkd1 only in the collecting ducts of the kidney. As such, we spare other nephron segments such as the proximal tubules, which also significantly contribute to the cyst load. As such these mouse data support the interaction between Pkd1 and Pkd2 with Bicc1, but do not allow us to directly compare the outcomes. While this was mentioned in the previous version of the manuscript, we have expanded on this in the revised version of the manuscript.

      We have expanded the results section in the revised version of the manuscript highlighting that the two different approaches cannot be directly compared.

      (10) How do the authors interpret that the strong Bicc1[Bpk] Pkd1 or Pkd2 double heterozygote mice did not have defects and "kidneys from Bicc1+/-:Pkd2+/- did not exhibit cysts (data not shown)", when the VEO PKD patients and - although not a genetic reduction - also the morpholino-treated Xenopus did?

      VEO PKD patients are characterized by a loss of function of PKD1 or PKD2 and, as we propose in this manuscript, BICC1 further aggravates the phenotype. Yet, we do not address in either the mouse or the Xenopus experiments whether BICC1 is a genetic modifier. We are simply addressing whether the two genes show a genetic interaction. In the mouse studies, we eliminate one copy of Pkd1 or Pkd2 in the background of a hypomorphic allele of Bicc1. Similarly, in the Xenopus experiments, we employ suboptimal doses of the morpholino oligomers, i.e., concentrations that did not yield a phenotypic change, and then ask whether removing both together shows cooperativity. It is important to state that this is based on a biological readout and is not defined by the amount of protein. While we have described this already in the original manuscript (page 7, first paragraph), we have amended our description of the Xenopus experiment to make this even clearer.

      Finally, we agree with the reviewer that if we were to address whether Bicc1 is a modifier of the PKD phenotype in mouse, we would need to reduce Bicc1 function in Pkd1 or Pkd2 mutants. Yet, we have recognized this already in the initial version of the manuscript in the discussion (page 14, first paragraph).

      We have expanded the results section when discussing the suboptimal amounts of the morpholino oligos (page 6, 1st paragraph).

      (11) Unclear: "While variants in BICC1 are very rare, we could identify two patients with BICC1 variants harboring an additional PKD2 or PKD1 variant in trans, respectively." Shortly after, the authors state in apparent contradiction that "the patients had no other variants in any of other PKD genes or genes which phenocopy PKD including PKD1, PKD2, PKHD1, HNF1s, GANAB, IFT140, DZIP1L, CYS1, DNAJB11, ALG5, ALG8, ALG9, LRP5, NEK8, OFD1, or PMM2."

      The reviewer is correct. This should have been phrased differently. We have now added “Besides the variants reported below” to clarify this more adequately.

      The sentence was changed to start with “Besides the variants reported below, […].”

      (12) "The demonstrated interaction of BICC1, PC1, and PC2 now provides a molecular mechanism that can explain some of the phenotypic variability in these families." How do the authors reconcile this statement with their reported ultra-rare occurrence of the BICC1 mutations?

      As mentioned in the manuscript and also in response to the other two reviewers, Bicc1 has been shown to regulate Pkd2 gene expression in mice and frogs via an interaction with the miR-17 family of microRNAs. Moreover, the miR-17 family has been demonstrated to be critical in PKD (PMID: 30760828, PMID: 35965273, PMID: 31515477). In fact, both other reviewers have pointed out that we should stress this more since Bicc1 is part of this regulatory pathway. Future experiments are needed to address whether Bicc1 contributes to the variability in ADPKD onset/severity. Yet, this is beyond the scope of this study.

      Based on the comments of the two other reviewers we have further addressed the Bicc1/miR-17 interaction.

      (13) The manuscript should use correct genetic conventions of italicization and capitalization. This is an issue affecting the entire manuscript. Some exemplary instances are listed below.

      (a) "We also demonstrate that Pkd1 and Pkd2 modifies the cystic phenotype in Bicc1 mice in a dose-dependent manner and that Bicc1 functionally interacts with Pkd1, Pkd2 and Pkhd1 in the pronephros of Xenopus embryos." Genes? Proteins?

      The data presented in this section show that a hypomorphic allele of Bicc1 in mouse and a knockdown in Xenopus yields this. As both affect the proteins, the spelling should reflect the proteins.

      No changes have been made in the revised manuscript.

      (b) The sentence seems to use both the human and mouse genetic capitalization, although it refers to experiments in the mouse system “to define the Bicc1 interacting domains for PC2 (Fig. 2d,e). Full-length PC2 (PC2-HA) interacted with full-length myc-mBICC1.”

      We agree with the reviewer that stating the species of the molecules used is critical. We have adopted a spelling of Bicc1 in which BICC1 is the human homologue, mBicc1 is the mouse homologue, and xBicc1 is the Xenopus one.

      We have highlighted the species spelling in the methods section and labeled the species accordingly throughout the manuscript and figures. 

      (14) “Together these data supported our biochemical interaction data and demonstrated that BICC1 cooperated with PKD1 and PKD2.” Are the authors implying that these results in mice will translate to the human protein?

      We agree that we have not formally shown that the same applies to the human proteins. Thus, we have changed the spelling accordingly.

      We have revised the capitalization of the proteins. 

      (15) The text is often unclear, terse, or inconsistent.

      (a) “These results suggested that the interaction between PC1 and Bicc1 involves the SAM but not the KH/KHL domains (or the first 132 amino acids of Bicc1). It also suggests that the N-terminus could have an inhibitory effect on PC1-BICC1 association.” How do the authors define the N-terminus? The first 132 aa? KH/KHL domains?

      This was illustrated in the original Figure 2A. The ΔKH constructs lack the first 351 amino acids.

      To make this more evident, we have specified this in the text as well.

      (b) Similarly, the authors state below, "Unlike PC1, PC2 interacted with myc-mBICC1-ΔSAM, but not myc-mBICC1-ΔKH suggesting that PC2 binding is dependent on the N-terminal domains but not the SAM domain." It is unclear if the authors refer to the KH/KHL domains or others. Whatever the reference to the N-terminal region, it should also be consistent with the section above.

      This is now specified in the text.

      (c) Unclear: "We have previously demonstrated that Pkd2 levels are reduced in a complete Bicc1 null mice,22 performing qRT-PCR of P4 kidneys (i.e. before the onset of a strong cystic phenotype), revealed that Bicc1, Pkd1 and Pkd2 were statistically significantly downregulated (Fig. 4h-j)".

      We have changed the text to clarify this. 

      (d) “Utilizing recombinant GST domains of PC1 and PC2, we demonstrated that BICC1 binds to both proteins in GST-pulldown assays (Fig. 1a, b)." GST-tagged domains? Fusions?

      We have changed the text to clarify this. 

      (e) "To study the interaction between BICC1, PKD1 and PKD2 we combined biochemical approaches, knockout studies in mice and Xenopus, genetic engineered human kidney cells" > genetically engineered.

      We have changed the text to clarify this.

      (f) Capitalization (e.g., see Figure S3, ref. the Bpk allele) and annotation (e.g., Gly821Glu and G821E) are inconsistent.

      We have homogenized the labeling of the capitalization and annotations throughout the manuscript. 

      (g) What do the authors mean by "homozygous evolutionarily well-conserved missense variant"?

      We have changed this in the revised version of the manuscript.

      Reviewer #3 (Public review/Recommendations to the authors):

      (1) A further study in HUREC cells investigating the critical regulatory role of BICC1 and potential interaction with mir-17 may yet lead to a modifiable therapeutic target.

      (2) This study should ideally include experiments in HUREC material obtained from patients/families with BICC1 mutations and studying its effects on the PKD1/2 complex in primary cell lines.

      This is an excellent suggestion. We agree with the reviewer that it would have been interesting to analyze HUREC material from the affected patients. Unfortunately, besides DNA and the phenotypic analysis described in the manuscript, neither human tissue nor primary patient-derived cells were collected once the two patients with the BICC1 p.Ser240Pro variant passed away.

      No changes to the revised manuscript have been made to address this point.

      (3) Please remove repeated words in the following sentence in paragraph 2 of the introduction: "BICC1 encodes an evolutionarily conserved protein that is characterized by 3 K-homology (KH) and 2 KH-like (KHL) RNA-binding domains at the N-terminus and a SAM domain at the C-terminus, which are separated by a by a disordered intervening sequence (IVS).23-28".

      This has been changed.

    1. eLife Assessment

      This study provides new single-cell multi-omics datasets that may be useful in the study of early cardiac lineages. However, the authors' conclusions regarding the mutual regulation of key regulators for cardiac specification and new cardiac lineage trajectories are inadequately supported by persuasive analysis and do not align with prior published studies. If revised to address the serious caveats adequately, the findings may be of interest to researchers in the field of cardiac development and congenital heart disease.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, the authors identified and described the transcriptional trajectories leading to CMs during early mouse development, and characterized the epigenetic landscapes that underlie early mesodermal lineage specification.

      The authors identified two transcriptomic trajectories from a mesodermal population to cardiomyocytes, the MJH and PSH trajectories. These trajectories are relevant to the current model for the First Heart Field (FHF) and the Second Heart Field (SHF) differentiation. Then, the authors characterized both gene expression and enhancer activity of the MJH and PSH trajectories, using a multiomics analysis. They highlighted the role of Gata4, Hand1, Foxf1, and Tead4 in the specification of the MJH trajectory. Finally, they performed a focused analysis of the role of Hand1 and Foxf1 in the MJH trajectory, showing their mutual regulation and their requirement for cardiac lineage specification.

      Strengths:

      The authors performed an extensive transcriptional and epigenetic analysis of early cardiac lineage specification and differentiation which will be of interest to investigators in the field of cardiac development and congenital heart disease. The authors considered the impact of the loss of Hand1 and Foxf1 in-vitro and Hand1 in-vivo.

      Weaknesses:

      The authors used previously published scRNA-seq data to generate two described transcriptomic trajectories.

      (1) Details of the re-analysis step should be added, including a careful characterization of the different clusters and marker genes, more details on the WOT analysis, and details on the time stamp distribution along the different pseudotimes. These details would be important to allow readers to gain confidence that the two major trajectories identified are realistic interpretations of the input data.

      The authors have also renamed the cardiac trajectories/lineages, departing from the convention applied in hundreds of papers, making the interpretation of their results challenging.

      (2) The concept of "reverse reasoning" applied to the Waddington-OT package for directional mass transfer is not adequately explained. While the authors correctly acknowledged Waddington-OT's ability to model cell transitions from ancestors to descendants (using optimal transport theory), the justification for using a "reverse reasoning" approach is missing. Clarifying the rationale behind this strategy would be beneficial.

      (3) As the authors used the EEM cell cluster as a starting point to build the MJH trajectory, it's unclear whether this trajectory truly represents the cardiac differentiation trajectory of the FHF progenitors:

      - This strategy infers that the FHF progenitors are mixed in the same cluster as the extra-embryonic mesoderm, but no specific characterization of potential different cell populations included in this cluster was performed to confirm this.

      - The authors identified the EEM cluster as a Juxta-cardiac field, without showing the expression of the principal marker Mab21l2 per cluster and/or on UMAPs.

      - As the FHF progenitors arise earlier than the Juxta-cardiac field cells, it must be possible to identify an early FHF progenitor population (Nkx2-5+; Mab21l2-) using the time stamp. It would be more accurate to use this FHF cluster as a starting point than the EEM cluster to infer the FHF cardiac differentiation trajectory.

      These concerns call into question the overall veracity of the trajectory analysis, and in fact, the discrepancies with prior published heart field trajectories are noted but the authors fail to validate their new interpretation. Because their trajectories are followed for the remainder of the paper, many of the interpretations and claims in the paper may be misleading. For example, these trajectories are used subsequently for annotation of the multiomic data, but any errors in the initial trajectories could result in errors in multiomic annotation, etc, etc.

      (4) As mentioned in the discussion, the authors identified the MJH and PSH trajectories as non-overlapping. But, the authors did not discuss major previously published data showing that both FHF and SHF arise from a common transcriptomic progenitor state in the primitive streak (DOI: 10.1126/science.aao4174; DOI: 10.1007/s11886-022-01681-w). The authors should consider and discuss the specifics of why they obtained two completely separate trajectories from the beginning, how these observations conflict with prior published work, and what efforts they have made at validation.

      (5) Figures 1D and E are confusing, as it's unclear why the authors selected only cells at E7.0. Also, panels 1D 'Trajectory' and 'Pseudotime' suggest that the CM trajectory moves from the PSH cells to the MJH. This result is confusing, and the authors should explain this observation.

      (6) Regarding the PSH trajectory, it's unclear how the authors can obtain a full cardiac differentiation trajectory from the SHF progenitors as the SHF-derived cardiomyocytes are just starting to invade the heart tube at E8.5 (DOI: 10.7554/eLife.30668).

      The points above note some of the discrepancies between the authors' trajectory analysis and the historical cardiac development literature. Overall, these discrepancies are glossed over and not adequately validated.

      (7) The authors mention analyzing "activated/inhibited genes" from Peng et al. 2019 but didn't specify when Peng's data was collected. Is it temporally relevant to the current study? How can "later stage" pathway enrichment be interpreted in the context of early-stage gene expression?

      (8) Motif enrichment: cluster-specific DAEs were analyzed for motifs, but the authors list specific TFs rather than TF families, which is all that motif enrichment can provide. The authors should either list TF families or state clearly that the specific TFs they list were not validated beyond motifs.

      (9) The core regulatory network is purely predictive. The authors again should refrain from language implying that the TFs in the CRN have any validated role.

      Regarding the in vivo analysis of Hand1 CKO embryos, Figures 6 and 7:

      (10) How can the authors explain the presence of a heart tube in the E9.5 Hand1 CKO embryos (Figure 6B) if, following the authors' model, the FHF/Juxta-cardiac field trajectory is disrupted by Hand1 CKO? A more detailed analysis of the cardiac phenotype of Hand1 CKO embryos would help to assess this question.

      (11) The cell proportion differences observed between Ctrl and Hand1 CKO in Figure 6D need to be replicated and an appropriate statistical analysis must be performed to definitely conclude the impact of Hand1 CKO on cell proportions.

      (12) The in-vitro cell differentiations are unlikely to recapitulate the complexity of the heart fields in-vivo, but they are analyzed and interpreted as if they do.

      (13) The schematic summary of Figure 7F is confusing and should be adjusted based on the following considerations:

      (a) the 'Wild-type' side presents 3 main trajectories (SHF, Early HT and JCF), but uses a 2-color code and the authors described only two trajectories everywhere else in the article (aka MJH and PSH). It's unclear how the SHF trajectory (blue line) can contribute to the Early HT, when the Early HT is supposed to be FHF-associated only (DOI: 10.7554/eLife.30668). As mentioned previously in Major comment 3, this model suggests a distinction between FHF and JCF trajectories, which is not investigated in the article.

      (b) the color code suggests that the MJH (FHF-related) trajectory will give rise to the right ventricle and outflow tract (green line), which is contrary to current knowledge.

      Minor comments:

      (1) How were genes selected to generate Figure 1F? Is this a list of top differentially expressed genes over each pseudotime and/or between pseudotimes?

      (2) Regarding Figure 1G, it's unclear how inhibited signaling can have an increased expression of underlying genes over pseudotimes. Can the authors give more details about this analysis and results?

      (3) How do the authors explain the visible Hand1 expression in Hand1 CKO in Figure S7C 'EEM markers'? Is this an expected expression in terms of RNA which is not converted into proteins?

      (4) The authors do not address the potential presence of doublets (merged cells) within their newly generated dataset. While they mention using "SCTransform" for normalization and artifact removal, it's unclear if doublet removal was explicitly performed.

      Comments on revised version:

      Summary:

      The authors have not addressed the major philosophical problems with the initial submission. They interpret their data without care to conform to years of prior publications in the field. This causes the authors to draw fanciful conclusions that are highly likely to be inaccurate (at best).

      Q1R1: The authors gave more details about the characterization of cell types and the two identified trajectories.

      a) It remains unclear how the authors generated this list. Are they manually selected genes based on relevant literature or an unbiased marker gene identification analysis? Either references should be added, or the bioinformatics explanation should be included in the method section.

      b) Revised text satisfies the comment.

      c) Revised text satisfies the comment.

      Other comments:

      Figure 1F: left annotation needs to be corrected (two "JCF specific").

      Q2R1: Revised text satisfies the comment.

      Q3R1 (1): Revised text satisfies the comment.

      Q3R1 (2): a) The explanation of how the authors built the JCF trajectory makes sense and the renaming from "MJH" to "JCF" is correct and better represents the identification that was made using time points from E7.5 to E8.5. However, the explanation given does not answer our original question. Our original comment asked about the FHF differentiation trajectory. The authors built the "MJH" trajectory as the combined "FHF/JCF" trajectory, however, it is not directly established whether the FHF and JCF progenitor differentiation trajectories are the same. The authors did not directly try to identify the FHF and JCF trajectories separately using appropriate real time windows but only assumed that they were the same. Every link between JCF and FHF trajectories assuming that they are shared without prior identification of the FHF progenitor differentiation trajectory should be removed from the manuscript (e.g. page 4: "namely the JCF trajectory (the Hand1-expressing early extraembryonic mesoderm - JCF and FHF - CM)").

      b) Adding the Mab21l2 ICA plot satisfies the comment.

      c) The explanation given by the authors regarding the FHF trajectory analysis is missing important details. The authors started the reverse trajectory analysis from E7.75 cardiomyocytes as being the FHF.

      - The authors should be mindful of the distinction between FHF progenitors and FHF-derived cardiomyocytes.

      - It is unclear whether cells called after the starting point (E7.75 CMs) in the reverse FHF trajectory were collected prior to E7.75. Can the authors add more details, and a real time point distribution along the FHF pseudotime, to their analysis? Also, what cells belong to the FHF trajectory after the E7.75 CMs in the reverse direction? These cells should be shown as in Figure 1A and 1B for the JCF and SHF trajectories.

      - As the FHF arises first and differentiates into the cardiac crescent prior to or at the same time the JCF and SHF emerge, it is impossible for late progenitors (JCF and SHF) to contribute to the early FHF progenitor pool. Therefore, the observation that "both JCF and SHF lineages contribute to the early FHF progenitor population" cannot be correct. It is also not what Dominguez et al. showed. This misinterpretation goes against the current literature (e.g. DOI: 10.1038/ncb3024) and will lead to confusion.

      Q4R1: Revised text and figure satisfy the comment.

      Q5R1: The answer satisfies the comment.

      Q6R1: a) The authors did not address the question and did not change their language in the manuscript. As SHF-derived cardiomyocytes are missing (because they are generated after E8.5), the part of the SHF trajectory going from SHF progenitors to the E8.5 heart tube must be inaccurate.

      b) As the authors correctly mentioned, both JCF and SHF will contribute to the four-chamber heart. However, as the dataset used by the authors spans only to E8.5 (which is days before the completion of the four-chamber heart), and all SHF and the vast majority of JCF contributions don't reach the heart until after E8.5, any claims about trajectories from JCF/SHF progenitor pools to cardiomyocytes should be removed because they do not correspond to prior published and accepted work.

      Q7R1: Especially because gene expression levels change over time, the authors might have considered genes as specific and restricted to a pathway based on their expression at a given time (e.g. later time), but at another time (e.g. earlier time), the same genes could have another expression pattern and not be pathway-specific anymore.

      Q8R1: Revised text satisfies the comment.

      Q9R1: Revised text satisfies the comment.

      Q10R1: Thank you for analyzing the cardiac phenotype of the Hand1 cKO embryos in more depth.

      Regarding the presence of a heart tube even though, following the authors' model, the FHF/JCF trajectory is disrupted:

      - Renaming the "MJH" to "JCF" is more accurate to the data shown by the authors as mainly the EEM is altered after Hand1 cKO.

      - The presence of the heart tube suggests that even if the JCF is altered, the FHF can still produce a cardiac crescent and a heart tube (as observed in Hand1-null embryos DOI: 10.1038/ng0398-266). The schematic Figure 7F suggests that only the SHF contribution will allow the formation of the heart tube. This unorthodox idea would need to be assessed by an alternate approach. More likely is that the model simply ignores the FHF contribution (the most important up to E8.5). The schematic is therefore incomplete and inaccurate and should be removed or edited to correspond to the prior literature.

      Q11R1: It is unclear what "replicates" mean in the authors' answer, as if they have been pooled without replicate-specific barcodes they are no longer replicates and should be considered as a single sample. This should be explicitly written in the method section.

      Thank you for your IF staining/quantification. If DAPI was used, it should be written in the figure caption.

      Q12R1: Revised text satisfies the comment.

      Q13R1: The answer given by the authors did not satisfy the comment because of the following:

      - The authors investigated two differentiation trajectories (JCF and SHF) in the article but Figure 7F presents three trajectories (JCF, SHF, and Early HT). The "Early HT" is neither mentioned nor discussed in the manuscript.

      - Figure 7F suggests that the "Early HT" trajectory corresponds to a combination of the SHF and JCF trajectories but does not mention the early FHF trajectory. This goes against the current literature. This relates to the comments of Q10R1.

      - As the authors rightly point out, the SHF will be contributing to the heart tube, but through a cell invasion of the already differentiated heart tube (10.1016/j.devcel.2023.01.010). Our prior comments did not question the involvement of the SHF in the looping and ballooning process but mentioned that the heart tube arises before the invasion from SHF and is FHF-derived. Figure 7F in the context of Hand1-null suggests that the heart tube will form from the SHF lineage, which is confusing as the SHF is known to contribute by invasion of the (already-formed) FHF-derived heart tube. The FHF lineage is missing from the authors' model.

      - In the revised manuscript, the FHF trajectory analysis is still unclear and suggests that the JCF and SHF progenitors contribute to the FHF progenitor, which goes against current literature. This relates to the comments of Q3R1 (2).

      Overall, the schematic Figure 7F is very confusing as it does not follow already published data without being fully validated and therefore is inaccurate and misleading.

      Minor comments:

      The answers satisfy the minor comments.

    3. Reviewer #2 (Public review):

      Summary of goals:

      The aims of the study were to identify new lineage trajectories for the cardiac lineages of the heart, and to use computational and cell and animal studies to identify and validate new gene regulatory mechanisms involved in these trajectories.

      Strengths:

      Overall: the study addresses the long-standing yet still not fully answered questions of what drives the earliest specification mechanisms of the heart lineages. The introduction demonstrates a good understanding of the relevant lineage trajectories that have been previously established, and the significance of the work is well described. The study takes advantage of several recently published data sets and attempts to use these in combination to uncover any new mechanisms underlying early mesoderm/cardiac specification mechanisms. A strength of the study is the use of an in vitro model system (mESCs) to assess the functional relevance of the key players identified in the computational analysis, including innovative technology such as CRISPR-guided enhancer modulations. Lastly, the study generates mesoderm-specific Hand1 LOF embryos and assesses the differentiation trajectories in these animals, which represents a strong complementary approach to the in vitro and computational analysis earlier in the paper. The manuscript is clearly written and the methods section is detailed and comprehensive.

      Comments and Weaknesses:

      I unfortunately still have the same concerns I had for the original submission. There are many strong claims about lineage trajectories and population relationships that are based purely on the analysis of a number of datasets, some published and a few new datasets.

      The methods used involve significant input bias, and some of the less user-biased approaches, such as the new RNA velocity analysis on the JCF/SHF trajectories, are included only in the response to reviewers but not in the manuscript (R1R2), as far as I can tell. This analysis does not seem to suggest that CMs are generated from both trajectories, but it is difficult to know as they provide so little information on what exactly they did.

      The conclusions are particularly concerning not only because they are largely based on computational analysis, but also because they contradict well-described concepts (which are supported by in vivo lineage tracing).

      I want to give them credit for having done some additional experiments. That said, the new data added for the validation of some of their concepts (mESC Fig 5F and embryos Fig S8C) do not strengthen their conclusions in my opinion. The mESC data were not quantified, and the embryo data looks like quantifications were done in different planes of a single embryo, but it's hard to tell as little information is provided. Even with accurate quantification, I believe the IF analysis for VIM in Hand1 cKO embryos is not sufficient to back up their claims on the role of Hand1 in driving the JCF lineage.

    4. Reviewer #3 (Public review):

      In this manuscript, Xie et al. delineate two cardiac lineage trajectories using pseudo-time and epigenetic analyses, tracing development from E6.5 to E8.5, culminating in cardiomyocytes (CMs). The authors propose that mutual regulation between the transcription factors Hand1 and Foxf1 plays a role in specifying a first cardiac lineage.

      Following the first round of revision, the authors have renamed their EEM-JCF/FHF (MJH) and PM-SHF (PSH) trajectories JCF and SHF. However, their use of this terminology is confusing. The so-called JCF trajectory appears to represent a mixture of JCF and FHF, as Hand1-expressing early extraembryonic mesoderm contributes to FHF-derived cardiomyocytes (e.g., HCN4+, Tbx5+). The authors then argue that JCF arises from Hand1+ cells and is therefore distinct from FHF, yet elsewhere suggest that both JCF and SHF contribute to FHF. This introduces conceptual inconsistencies.

      Furthermore, the expression of Hand1, Foxf1, and Bmp4 in the lateral plate mesoderm complicates the assertion that JCF is distinct from FHF (Development 2015; 142: 3307-3320; Nat Rev Mol Cell Biol, https://www.nature.com/articles/nrm2618; Circ Res 2021, https://doi.org/10.1161/CIRCRESAHA.121.318943). Mab21l2 expression also overlaps with the cardiac crescent. The designation of Tbx20 as a "key JCF-specific gene" is problematic: why should it not equally be considered an FHF-specific marker (https://pmc.ncbi.nlm.nih.gov/articles/PMC10629681)? Perhaps the JCF trajectory represents a subset of FHF. A designation such as "JCF/FHF" may therefore be more appropriate.

      In Figure 1A, the decision to define a single CM state as the endpoint of both trajectories is also problematic. FHF and SHF are known to give rise to distinct CM subtypes, yet in the authors' reconstruction both lineages converge on one CM population. This was the point raised in Question 1 of my initial review. If both trajectories converge on the same CM state, are they truly independent lineages? This interpretation remains unclear and potentially misleading.

    5. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors identified and described the transcriptional trajectories leading to CMs during early mouse development, and characterized the epigenetic landscapes that underlie early mesodermal lineage specification.

      The authors identified two transcriptomic trajectories from a mesodermal population to cardiomyocytes, the MJH and PSH trajectories. These trajectories are relevant to the current model for the First Heart Field (FHF) and the Second Heart Field (SHF) differentiation. Then, the authors characterized both gene expression and enhancer activity of the MJH and PSH trajectories, using a multiomics analysis. They highlighted the role of Gata4, Hand1, Foxf1, and Tead4 in the specification of the MJH trajectory. Finally, they performed a focused analysis of the role of Hand1 and Foxf1 in the MJH trajectory, showing their mutual regulation and their requirement for cardiac lineage specification.

      Strengths:

      The authors performed an extensive transcriptional and epigenetic analysis of early cardiac lineage specification and differentiation which will be of interest to investigators in the field of cardiac development and congenital heart disease. The authors considered the impact of the loss of Hand1 and Foxf1 in-vitro and Hand1 in-vivo.

      Weaknesses:

      The authors used previously published scRNA-seq data to generate two described transcriptomic trajectories.

      We agree that a two-route cardiac development model has been described, which is consistent with our analyses. However, the developmental origins and key events of early lineage specification are unclear. Our study provides new insights in the following respects:

      a) Computational analyses inferred the earliest cardiac fate segregation by E6.75-7.0.

      b) Provided newly generated E7.0 multi-omics data, which revealed the transcriptomic and chromatin accessibility landscape.

      c) Utilized multi-omics and ChIP-seq data to construct a core regulatory network underlying the JCF lineage specification.

      d) Applied in vitro and in vivo analyses, which elucidated the synergistic and different roles of key transcription factors, HAND1 and FOXF1.

      Q1R1: Details of the re-analysis step should be added, including a careful characterization of the different clusters and marker genes, more details on the WOT analysis, and details on the time stamp distribution along the different pseudotimes. These details would be important to allow readers to gain confidence that the two major trajectories identified are realistic interpretations of the input data.

      R1R1: Thank you for the valuable suggestion. In the last version, we characterized the two major trajectories by identifying their common or specific gene sets, and by profiling the expression dynamics along pseudotime (Figure 1F). But we realized a careful description was not provided. In the revised manuscript, we have made the following improvements:

      a) Provided marker gene analyses based on cell types as well as developmental lineages to support the E7.0 progenitor clusters (Figure S1F).

      b) For Figure 1F: revised the text and introduced characteristic genes for the two trajectories.

      c) For WOT analysis: provided more details in the first paragraph of the ‘Results’ section.

      Q2R1: The authors have also renamed the cardiac trajectories/lineages, departing from the convention applied in hundreds of papers, making the interpretation of their results challenging.

      R2R1: Agreed. We have renamed MJH as the JCF lineage and PSH as the SHF lineage.

      Q3R1: The concept of "reverse reasoning" applied to the Waddington-OT package for directional mass transfer is not adequately explained. While the authors correctly acknowledged Waddington-OT's ability to model cell transitions from ancestors to descendants (using optimal transport theory), the justification for using a "reverse reasoning" approach is missing. Clarifying the rationale behind this strategy would be beneficial.

      R3R1: Thank you for pointing out the unclear explanation. As mentioned in R1R1, we have clarified the rationale in the revised manuscript. 

      We would like to provide some additional details: WOT is designed for time-series scRNA-seq data in which the time/stage of each single cell is given. For any two adjacent time points t<sub>i</sub> and t<sub>i+1</sub>, WOT estimates the transition probability from every cell at t<sub>i</sub> to every cell at t<sub>i+1</sub>. One can select a cell set of interest at any time point t<sub>i</sub> and infer its ancestors at t<sub>i-1</sub> or its descendants at t<sub>i+1</sub> by summing the transition probabilities. As introduced in the original paper, WOT allows for both ‘forward’ and ‘reverse’ inference (DOI: 10.1016/j.cell.2019.01.006).
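
      The forward and reverse inference described above can be illustrated on a toy example. The sketch below is not the WOT package API: it uses plain NumPy, and the matrix values, the cell counts, and the helper names `pull_back` and `push_forward` are hypothetical, chosen only to show how summing transition probabilities yields ancestor or descendant distributions.

```python
import numpy as np

def pull_back(transport, target_mask):
    """Ancestor distribution at t_i for a cell set selected at t_{i+1}.

    transport[a, b] is the (toy) transition mass from cell a at t_i
    to cell b at t_{i+1}; target_mask selects cells at t_{i+1}.
    """
    mass = transport[:, target_mask].sum(axis=1)  # sum over selected descendants
    return mass / mass.sum()                      # normalize to a distribution

def push_forward(transport, source_mask):
    """Descendant distribution at t_{i+1} for a cell set selected at t_i."""
    mass = transport[source_mask, :].sum(axis=0)  # sum over selected ancestors
    return mass / mass.sum()

# Hypothetical coupling: 3 cells at t_i (rows), 4 cells at t_{i+1} (columns).
T = np.array([
    [0.20, 0.10, 0.00, 0.00],
    [0.05, 0.15, 0.10, 0.00],
    [0.00, 0.00, 0.15, 0.25],
])

# 'Reverse' inference: which cells at t_i are likely ancestors of cells 2 and 3?
selected_late = np.array([False, False, True, True])
ancestors = pull_back(T, selected_late)

# 'Forward' inference: where does cell 0 at t_i send its mass?
selected_early = np.array([True, False, False])
descendants = push_forward(T, selected_early)

print(ancestors)
print(descendants)
```

      Chaining `pull_back` across consecutive time points, starting from a set chosen at the latest stage, is the idea behind the reverse lineage tracing described in this response; the real analysis operates on transport maps estimated from the data rather than on a hand-written matrix.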

      Q3R1: As the authors used the EEM cell cluster as a starting point to build the MJH trajectory, it's unclear whether this trajectory truly represents the cardiac differentiation trajectory of the FHF progenitors:

      - This strategy infers that the FHF progenitors are mixed in the same cluster as the extra-embryonic mesoderm, but no specific characterization of potential different cell populations included in this cluster was performed to confirm this.

      To build the MJH trajectory, we performed a two-step analysis:

      (1) Firstly, we used E8.5 CM cells as a starting point to perform WOT computational reverse lineage tracing and identify CM progenitors at each time point.

      (2) Secondly, we selected EEM cells from the E7.5 CM progenitor pool, as a starting point to perform WOT analysis. Cells along this trajectory consist of the JCF lineage (Figure 1B).

      The reason why we chose to use this subset of E7.5 EEM cells was due to its purity. It is distinct from the SHF lineage as suggested by their separation in the UMAP. It is also different from FHF cells as no FHF/CM markers were detected by E7.5. 

      We acknowledge that it is infeasible to achieve 100% purity in this single-cell omics analysis, but we believe the current strategy of defining the JCF lineage is reasonable. The distinct gene expression dynamics (Figure 1F) and spatial mapping results (Figure 1C) of the JCF and SHF lineages also support our conclusion.

      - The authors identified the EEM cluster as a Juxta-cardiac field, without showing the expression of the principal marker Mab21l2 per cluster and/or on UMAPs.

      Thank you for your suggestion. We have added Mab21l2 expression plots in the ICA layout (new Figure S1D), showing its transient expression dynamics, consistent with Tyser et al (DOI: 10.1126/science.abb2986).

      - As the FHF progenitors arise earlier than the Juxta-cardiac field cells, it must be possible to identify an early FHF progenitor population (Nkx2-5+; Mab21l2-) using the time stamp. It would be more accurate to use this FHF cluster as a starting point than the EEM cluster to infer the FHF cardiac differentiation trajectory.

      We appreciate your insights. We used the early FHF progenitor population (E7.75 Nkx2-5+; Mab21l2- CM cells) as the starting point and identified its progenitor cells by E7.0 (Figure S2A). Results suggest both JCF and SHF lineages contribute to the early FHF progenitor population, consistent with live imaging-based single cell tracing by Dominguez et al (DOI: 10.1016/j.cell.2023.01.001).

      These concerns call into question the overall veracity of the trajectory analysis, and in fact, the discrepancies with prior published heart field trajectories are noted but the authors fail to validate their new interpretation. Because their trajectories are followed for the remainder of the paper, many of the interpretations and claims in the paper may be misleading. For example, these trajectories are used subsequently for annotation of the multiomic data, but any errors in the initial trajectories could result in errors in multiomic annotation, etc, etc.

      Thank you for your valuable comments. In the revised manuscript, we have added details about the trajectory analysis including the procedure of WOT lineage inference, marker gene expression and early FHF lineage tracing. We also renamed the two trajectories to avoid confusion with prior published heart field trajectories. Generally, our trajectories are consistent with the published evidence about two major lineages contributing to the linear heart tube:

      a) Clonal analysis: two trajectories exist which demonstrate differential contribution to the E8.5 cardiac tube (Meilhac et al, DOI: 10.1016/s1534-5807(04)00133-9).

      b) Live imaging: JCF cells contribute to the forming heart (Tyser et al, DOI: 10.1126/science.abb2986; Dominguez et al, DOI: 10.1016/j.cell.2023.01.001).

      c) Genetic labelling based lineage tracing: early Hand1+ mesodermal cells differentiate and contribute to the cardiac crescent (Zhang et al, DOI: 10.1161/CIRCRESAHA.121.318943).

      The molecular events at the initial segregation of the two lineages, which are the main focus of our paper, were not characterized before. Our analyses suggest that the JCF lineage segregates earlier from the nascent/mixed mesoderm status, which is also consistent with the clonal analysis (Meilhac et al, DOI: 10.1016/s1534-5807(04)00133-9).

      Q4R1: As mentioned in the discussion, the authors identified the MJH and PSH trajectories as nonoverlapping. But, the authors did not discuss major previously published data showing that both FHF and SHF arise from a common transcriptomic progenitor state in the primitive streak (DOI: 10.1126/science.aao4174; DOI: 10.1007/s11886-022-01681-w). The authors should consider and discuss the specifics of why they obtained two completely separate trajectories from the beginning, how these observations conflict with prior published work, and what efforts they have made at validation.

      R4R1: Thank you for the important question. For trajectory analysis, we assigned cells to the trajectory with higher fate probability, resulting in ‘non-overlapping’ cell sets. However, the statement of ‘two non-overlapping trajectories’ is inaccurate. We performed analysis of fate divergence between two trajectories (which was not shown in the first version), which suggests, before E7.0, mesodermal cells have similar probabilities to choose either trajectory (Figure S1E). We agree with you and previously published data that the JCF and SHF arise from a common progenitor pool. Correction has been made in the revised manuscript.
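
      The assignment rule described in R4R1 can be sketched as follows. This is a minimal illustration in plain NumPy, not the authors' code: the fate probabilities are invented, and the divergence measure (absolute difference between the two normalized fate probabilities) is an assumption, since the response does not state which measure underlies Figure S1E.

```python
import numpy as np

def assign_trajectory(fate_probs, labels=("JCF", "SHF")):
    """Assign each cell to the trajectory with the higher fate probability."""
    idx = np.argmax(fate_probs, axis=1)
    return np.array(labels)[idx]

def fate_divergence(fate_probs):
    """Assumed measure: |p_JCF - p_SHF| after normalizing each row.

    Values near 0 mean a cell is about equally likely to take either
    trajectory; values near 1 mean it is strongly biased toward one.
    """
    p = fate_probs / fate_probs.sum(axis=1, keepdims=True)
    return np.abs(p[:, 0] - p[:, 1])

# Hypothetical fate probabilities (columns: JCF, SHF) for three cells.
probs = np.array([
    [0.52, 0.48],  # nearly undecided cell
    [0.90, 0.10],  # JCF-biased cell
    [0.15, 0.85],  # SHF-biased cell
])

assigned = assign_trajectory(probs)
divergence = fate_divergence(probs)
print(assigned)
print(divergence)
```

      The first toy cell shows why hard assignment produces non-overlapping cell sets even when the underlying probabilities are nearly equal, which is the distinction the response draws between 'non-overlapping' sets and truly separate trajectories.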

      Q5R1: Figures 1D and E are confusing, as it's unclear why the authors selected only cells at E7.0. Also, panels 1D 'Trajectory' and 'Pseudotime' suggest that the CM trajectory moves from the PSH cells to the MJH. This result is confusing, and the authors should explain this observation.

      R5R1: Thank you for pointing out the confusion. As mentioned in R4R1, trajectory analysis indicates JCF/SHF fate segregation by E7.0, and we used Figures 1D and E to characterize the cellular status. By E7.0, JCF progenitors are at EEM or MM status, while SHF progenitors are still at the earlier differentiation stage (NM). This result is consistent with previous clonal analysis (Meilhac et al, DOI: 10.1016/s1534-5807(04)00133-9), which demonstrates an apparent earlier segregation of the first lineage. Our interpretation of the pseudotime analysis is that it represents different levels of differentiation, rather than developmental direction.

      Q6R1: Regarding the PSH trajectory, it's unclear how the authors can obtain a full cardiac differentiation trajectory from the SHF progenitors as the SHF-derived cardiomyocytes are just starting to invade the heart tube at E8.5 (DOI: 10.7554/eLife.30668).

      R6R1.1: We agree with your opinion. Our trajectory analysis covers E8.5 SHF-derived CM cells and progenitors. Cells that differentiate as CM cells after E8.5 were missed.

      The above notes some of the discrepancies between the author's trajectory analysis and the historical cardiac development literature. Overall, the discrepancies between the author's trajectory analysis and the historical cardiac development literature are glossed over and not adequately validated.

      R6R1.2: The historical cardiac development literature has provided evidence, using multiple techniques, that supports the existence of two cardiac lineages with common progenitors at the beginning and overlapping contributions to the four-chamber heart. Our trajectory analysis is in agreement with this model and provides more detailed molecular insights about lineage segregation by E7.0. Thank you for pointing out our mistakes in describing the observations. We have corrected the text and provided additional data (Figure S1D-F and S2), aiming to resolve the confusion.

      Q7R1: The authors mention analyzing "activated/inhibited genes" from Peng et al. 2019 but didn't specify when Peng's data was collected. Is it temporally relevant to the current study? How can "later stage" pathway enrichment be interpreted in the context of early-stage gene expression?

      R7R1: The gene sets of "activated/inhibited genes" were collected from several published perturbation datasets (Gene Expression Omnibus accession numbers GSE48092, GSE41260, GSE17879, GSE69669, GSE15268 and GSE31544) using mouse ES cells or embryos. For a specific pathway, the gene set is fixed but the gene expression levels, which change over time, reflect the pathway enrichment. This explains the differential pathway enrichment between early and late stages.

      Q8R1: Motif enrichment: cluster-specific DAEs were analyzed for motifs, but the authors list specific TFs rather than TF families, which is all that motif enrichment can provide. The authors should either list TF families or state clearly that the specific TFs they list were not validated beyond motifs.

      R8R1: Thank you for your comment. For the DAE motif analysis, we firstly inferred the motif and TF families, then tested which specific TFs are expressed in the corresponding cell cluster. We have added this information in the legend of Figure 2D.

      Q9R1: The core regulatory network is purely predictive. The authors again should refrain from language implying that the TFs in the CRN have any validated role.

      R9R1: Thank you for your kind suggestion. We have revised the manuscript to avoid any misleading implications, as follows:

      “Through single-cell multi-omics analysis, a predicted core regulatory network (CRN) in JCF is identified, consisting of transcription factors (TFs) GATA4, TEAD4, HAND1 and FOXF1.”

      Q10R1: Regarding the in vivo analysis of Hand1 CKO embryos, Figures 6 and 7:

      How can the authors explain the presence of a heart tube in the E9.5 Hand1 CKO embryos (Figure 6B) if, following the authors' model, the FHF/Juxta-cardiac field trajectory is disrupted by Hand1 CKO? A more detailed analysis of the cardiac phenotype of Hand1 CKO embryos would help to assess this question.

      R10R1: Thank you for your valuable suggestion. In the revised manuscript, we have added a detailed analysis of the cardiac phenotype of Hand1 CKO embryos (Figure S8C). The data suggest that by E8.5, when heart looping initiates in the control group (14/17), the hearts of Hand1 CKO embryos (3/3) still show a linear tube morphology. By E9.5, when the atrium and ventricle become distinct in WT embryos, heart looping of Hand1 CKO embryos is abnormal. The cardiac defects of our MESP1CRE driven Hand1 conditional KO are consistent with those of Hand1-null mutant mice (DOI: 10.1038/ng0398-266; DOI: 10.1038/ng0398-271).

      Author response image 1.

      Bright-field images of E8.5-E9.5 Ctrl and Hand1 CKO mouse embryos. The arrows indicate the embryonic heart (h) and head folds (hf). Scale bars (E8.5): 200 μm; scale bars (E9.5): 500 μm.

      Q11R1: The cell proportion differences observed between Ctrl and Hand1 CKO in Figure 6D need to be replicated and an appropriate statistical analysis must be performed to definitely conclude the impact of Hand1 CKO on cell proportions.

      R11R1: We appreciate your valuable suggestion. As Figure 6D is based on a scRNA-seq experiment in which replicates were merged into a single sequencing library, statistical analysis is infeasible. To address potential concerns about cell proportions, we added IF staining experiments for the EEM marker gene Vim in serial embryo sections (Figure S8D). Statistical analysis indicates a significant decrease in the proportion of VIM+ EEM cells in Hand1 CKO embryos.

      Q12R1: The in-vitro cell differentiations are unlikely to recapitulate the complexity of the heart fields in-vivo, but they are analyzed and interpreted as if they do.

      R12R1: We agree with your opinion. In the revised manuscript, we toned down the interpretation of the in-vitro cell differentiation data.

      Previous version:

      I.  “The analysis indicated that HAND1 and FOXF1 could dually regulate MJH specification through directly activating the MJH specific genes and inhibiting the PSH specific genes.”

      II. “Together, our data indicated that mutual regulation between HAND1 and FOXF1 could play a key role in MJH cardiac progenitor specification.”

      III. “Thus, our data further supported the specific and synergistic roles of HAND1 and FOXF1 in MJH cardiac progenitor specification.”

      Revised version:

      I.  “The analysis indicated that HAND1 and FOXF1 were able to directly activate the JCF specific genes.”

      II. “Together, our in vitro experimental data indicated that mutual regulation between HAND1 and FOXF1 could play a key role in activation of JCF specific genes.”

      III. “These results suggest that HAND1 and FOXF1 may cooperatively regulate early cardiac lineage specification by promoting JCF-associated gene expression and suppressing alternative mesodermal programs.”

      Q13R1: The schematic summary of Figure 7F is confusing and should be adjusted based on the following considerations:

      (a) the 'Wild-type' side presents 3 main trajectories (SHF, Early HT and JCF), but uses a 2-color code and the authors described only two trajectories everywhere else in the article (aka MJH and PSH). It's unclear how the SHF trajectory (blue line) can contribute to the Early HT, when the Early HT is supposed to be FHF-associated only (DOI: 10.7554/eLife.30668). As mentioned previously in Major comment 3., this model suggests a distinction between FHF and JCF trajectories, which is not investigated in the article.

      R13R1(a): Thank you for your great insights. The paper you mentioned used Nkx2.5cre/+; Rosa26tdtomato+/- and Nkx2.5eGFP embryos to reconstruct the cardiac morphologies between E7.5 and E8.2. Their 3D models clearly demonstrate the transition from yolk sac to FHF and then SHF (Figure 2A’ and A’’). The location of yolk sac is defined as JCF in later literature (DOI: 10.1126/science.abb2986). However, as Nkx2.5 mainly marks cells after their entry into the heart tube, it is unable to reflect the lineage contribution by JCF or SHF. As in R3R1, more and more evidence supports the contribution of both lineages to the Early HT, which is discussed in a recent review paper (DOI: 10.1016/j.devcel.2023.01.010).

      (b) the color code suggests that the MJH (FHF-related) trajectory will give rise to the right ventricle and outflow tract (green line), which is contrary to current knowledge.

      R13R1(b): Thank you for pointing out the confusion. The coloring of outflow tract is not an indication of JCF lineage contribution. We have changed the color of JCF/SHF trajectory in the revised model.

      Minor comments:

      Q14R1: How were genes selected to generate Figure 1F? Is this a list of top differentially expressed genes over each pseudotime and/or between pseudotimes?

      R14R1: For each trajectory, we ranked genes by the correlation between expression levels and pseudotime. The top 1000 genes for each group were selected.
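
      The ranking step in R14R1 can be sketched as below. This is a hedged illustration rather than the authors' pipeline: the response does not say which correlation coefficient was used or how the groups were defined, so the use of Pearson correlation, ranking by absolute value, the function name `top_pseudotime_genes`, and the toy gene names are all assumptions.

```python
import numpy as np

def top_pseudotime_genes(expr, pseudotime, gene_names, n_top=1000):
    """Rank genes by |Pearson correlation| between expression and pseudotime.

    expr: array of shape (cells, genes); pseudotime: array of shape (cells,).
    Returns up to n_top (gene, correlation) pairs, strongest first.
    """
    x = pseudotime - pseudotime.mean()
    y = expr - expr.mean(axis=0)
    denom = np.sqrt((x ** 2).sum() * (y ** 2).sum(axis=0))
    with np.errstate(divide="ignore", invalid="ignore"):
        # Genes with zero variance get a correlation of 0 instead of NaN.
        corr = np.where(denom > 0, (x @ y) / denom, 0.0)
    order = np.argsort(-np.abs(corr))[:n_top]
    return [(gene_names[i], float(corr[i])) for i in order]

# Toy data: 5 cells ordered along pseudotime, 3 hypothetical genes.
pseudotime = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
expr = np.array([
    [0.0, 2.0, 4.0],
    [1.0, 2.0, 3.0],
    [2.0, 2.0, 3.0],
    [3.0, 2.0, 1.0],
    [4.0, 2.0, 0.0],
])
genes = ["geneA", "geneB", "geneC"]

ranked = top_pseudotime_genes(expr, pseudotime, genes, n_top=2)
print(ranked)
```

      In this toy case the rising gene and the falling gene are both retained while the flat gene is dropped; splitting the ranked list by the sign of the correlation would give separate activated and repressed groups, if that is what 'each group' refers to.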

      Q15R1: Regarding Figure 1G, it's unclear how inhibited signaling can have an increased expression of underlying genes over pseudotimes. Can the authors give more details about this analysis and results?

      R15R1: The increased expression of ‘inhibited genes’ could be explained as an indication of decreasing signaling levels or compensation effect by other signaling pathways. We appreciate your kind suggestion. Details about this analysis have been added in the Method section.

      Q16R1: How do the authors explain the visible Hand1 expression in Hand1 CKO in Figure S7C 'EEM markers'? Is this an expected expression in terms of RNA which is not converted into proteins?

      R16R1: In our opinion, the visible Hand1 expression is caused by the imperfect knockout efficiency of the Mesp1-Cre-driven system.

      Q17R1: The authors do not address the potential presence of doublets (merged cells) within their newly generated dataset. While they mention using "SCTransform" for normalization and artifact removal, it's unclear if doublet removal was explicitly performed.

      R17R1: We appreciate your kind reminder. Doublet removal was performed using R package ‘DoubletFinder’ (DOI: 10.1016/j.cels.2019.03.003). We have added this information in the revised manuscript.

      Reviewer #2 (Public review):

      Summary of goals:

      The aims of the study were to identify new lineage trajectories for the cardiac lineages of the heart, and to use computational and cell and animal studies to identify and validate new gene regulatory mechanisms involved in these trajectories.

      Strengths:

      The study addresses the long-standing yet still not fully answered questions of what drives the earliest specification mechanisms of the heart lineages. The introduction demonstrates a good understanding of the relevant lineage trajectories that have been previously established, and the significance of the work is well described. The study takes advantage of several recently published data sets and attempts to use these in combination to uncover any new mechanisms underlying early mesoderm/cardiac specification mechanisms. A strength of the study is the use of an in vitro model system (mESCs) to assess the functional relevance of the key players identified in the computational analysis, including innovative technology such as CRISPR-guided enhancer modulations. Lastly, the study generates mesoderm-specific Hand1 LOF embryos and assesses the differentiation trajectories in these animals, which represents a strong complementary approach to the in vitro and computational analysis earlier in the paper. The manuscript is clearly written and the methods section is detailed and comprehensive.

      Comments and Weaknesses:

      Overall: The computational analysis presented here integrates a large number of published data sets with one new data point (E7.0 single cell ATAC and RNA sequencing). This represents an elegant approach to identifying new information using available data. However, the data presentation at times becomes rather confusing, and relatively strong statements and conclusions are made based on trajectory analysis or other inferred mechanisms while jumping from one data set to another. The cell and in vivo work on Hand1 and Foxf1 is an important part of the study. Some additional experiments in both of these model systems could strongly support the novel aspects that were identified by the computational studies leading into the work.

      We appreciate your positive comments and insightful suggestions. In the revised manuscript, we have incorporated additional analyses and experimental validations to address the concerns raised. Specifically, we added RNA velocity analysis to independently support the identification of the MJH and PSH trajectories, performed immunofluorescence staining of mesodermal and cardiac markers in Hand1 and Foxf1 knockout models, and included Vim staining-based quantification in Hand1 CKO embryos to assess developmental outcomes in vivo. Furthermore, we revised potentially overinterpreted conclusions and clarified methodological details of the WOT analysis. These revisions have strengthened both the rigor and clarity of the manuscript.

      Q1R2: Definition of MJH and PSH trajectory:

      The study uses previously published data sets to identify two main new differentiation trajectories: the MJH and the PSH trajectory (Figure 1). A large majority of subsequent conclusions are based on in-depth analysis of these two trajectories. For this reason, the method used to identify these trajectories (WOT, which seems a highly biased analysis with many manually chosen set points) should be supported by other commonly used methods such as, for example, RNA velocity analysis. This would inspire some additional confidence that the MJH and PSH trajectories were chosen in as unbiased and rigorous a manner as possible and that any follow-up analysis is biologically relevant.

      R1R2: We appreciate your valuable comments. We fully agree that other commonly used methods would help strengthen our conclusion about the two main trajectories. To this end, we performed RNA velocity analysis of cardiac specification. The results support the contribution to CM along the MJH and PSH routes.

      Author response image 2.

      UMAP layout is colored by cell types. Developmental directions, shown as arrows, are inferred by RNA-velocity analysis.

      Indeed, several recent studies have indicated a convergent model of cardiac development in which progenitors reach a myocardial state along two trajectories (DOI: 10.1016/j.devcel.2023.01.010). However, when and how the two routes are specified was unclear. Our data and analysis revealed a clear fate separation by E7.0 from transcriptomic and epigenetic perspectives, where unbiased RNA velocity analysis was performed (Figure 2C).

      We would like to clarify how we performed the WOT analysis (DOI: 10.1016/j.cell.2019.01.006): the only manually chosen cell set was the starting set for computational reverse lineage tracing, which comprised all cardiomyocytes at E8.5. The ancestor cells were predicted in an unbiased manner among all mesodermal cells.

      Q2R2.1: Identification of MJH and PSH trajectory progenitors:

      The study defines various mesoderm populations from the published data set (Figure 1A-E), including nascent mesoderm, mixed mesoderm, and extraembryonic mesoderm. It further assigns these mesoderm populations to the newly identified MJH/PSH trajectories. Based on the trajectory definition in Figure 1A it appears that both trajectories include all 3 mesoderm populations, albeit at different proportions and it seems thus challenging to assign these as unique progenitor populations for a distinct trajectory, as is done in the epigenetic study by comparing clusters 8 (MJH) and 2 (PSH)(Figure 2). 

      R2R2.1: According to our model, the most significant difference between the two trajectories is their enrichment of EEM and PM cell types (Figure 1B), which represent the middle stages of cardiac development. Both trajectories begin as Mesp1+ nascent mesoderm cells (Figure 1F), which is supported by Mesp1 lineage tracing (DOI: 10.1161/CIRCRESAHA.121.318943), and end as cardiomyocytes. Our epigenetic analysis focused on the E7.0 stage, when the two trajectories could be clearly separated and when the JCF and SHF lineages were at mixed mesoderm and nascent mesoderm states, respectively. However, the SHF lineage was predicted to bypass the mixed mesoderm state later on.

      Q2R2.2: Along similar lines, the epigenetic analysis of clusters 2 and 8 did not reveal any distinct differences in H3K4me1, H3K27ac, or H3K4me3 at any of the time points analyzed (Figure 2F). While conceptually very interesting, the data presented do not seem to identify any distinct temporal patterns or differences in clones 2 and 8 (Figure 2H), and thus don't support the conclusion as stated: "the combined transcriptome and chromatin accessibility analysis further supported the early lineage segregation of MJH and the epigenetic priming at gastrulation stage for early cardiac genes".

      R2R2.2: In the epigenetic analysis, we delineated the temporal dynamics of E7.0 cluster-specific DAEs by selecting earlier (E6.5) and later (E7.5) time points. DAEs of C8 and C2 represent regulatory elements for the JCF and SHF lineages, respectively. We also included C1 DAEs as a reference to demonstrate the relative activity of C8 and C2. The overall temporal pattern suggests activation of C8 & C2, as their H3K4me1 and H3K27ac levels surpass C1 over time. Between C8 and C2, the following distinctions could be observed:

      a) H3K4me1 levels of C8 are higher by E6.5 and E7.0, with low H3K27ac levels, indicating early priming of C8 DAEs.

      b) By E7.5, H3K4me1 levels of C8 are caught up by C2 in E7.5 anterior mesoderm (E7.5_AM, Figure 2F column 3), where cardiac mesoderm is located.

      c) H3K4me1 and H3K27ac levels of C8 are similar to those of C1 in the posterior mesoderm (E7.5_P, Figure 2F column 4) and much higher than those of C2.

      d) From the perspective of chromatin accessibility, hundreds of characteristic DAEs were identified for C2 and C8 (Figure 2D), exemplified by the primed and active enhancers which were predicted to interact with cluster-specific genes (Figure 2H).

      Together with the transcriptomic analyses (Figure 2C), these data are consistent with our conclusion about early lineage segregation and epigenetic priming.

      Q3R2: Function of Hand1 and Foxf1 during early cardiac differentiation:

      The study incorporated some functional studies by generating Hand1 and Foxf1 KO mESCs and differentiated them into mesoderm cells for RNA sequencing. These lines would present relevant tools to assess the role of Hand1 and Foxf1 in mesoderm formation, and a number of experiments would further support the conclusions, which are made for the most part on transcriptional analysis. For example, the study would benefit from quantification of mesoderm cells and subsequent cardiomyocytes during differentiation (via IF, or more quantitatively, via flow cytometry analysis). These data would help interpret any of the findings in the bulk RNAseq data, and help to assess the function of Hand1 and Foxf1 in generating the cardiac lineages. Conclusions such as "the analysis indicated that HAND1 and FOXF1 could dually regulate MJH specification through directly activating the MJH specific genes and inhibiting PSH specific genes" seem rather strong given the data currently provided.

      R3R2: Thank you for your kind suggestions. We added IF staining of mesodermal (Zic3), JCF (Hand1) and cardiac markers (Tnnt2), followed by cell quantification. Results indicate that Hand1 and Foxf1 knockout leads to reduced commitment to the JCF lineage, evidenced by the loss of Hand1 expression, accumulation of undifferentiated Zic3+ mesoderm, and impaired cardiomyocyte formation (Tnnt2+), consistent with the up-regulation of JCF lineage specific genes and the downregulation of SHF lineage specific genes.

      We also revised the conclusion as “These results suggest that HAND1 and FOXF1 may cooperatively regulate early cardiac lineage specification by promoting JCF-associated gene expression and suppressing alternative mesodermal programs.”.

      Q4R2: Analysis of Hand1 cKO embryos:

      Adding a mouse model to support the computational analysis is a strong way to conclude the study. Given the availability of these early embryos, some of the findings could be strengthened by performing a similar analysis to Figure 7B&C and by including some of the specific EEM markers found to be differentially regulated to complement the structural analysis of the embryos.

      R4R2: Thank you for your positive comments and help. In the revised manuscript, we performed IF staining of the EEM marker Vim in a similar fashion to Figure 7B&C (Figure S8D). In comparison with control embryos, the Hand1 CKO embryos showed significantly fewer Vim+ cells, further strengthening the conclusion that Hand1 CKO blocked developmental progression toward the JCF direction.

      Q5R2: Current findings in the context of previous findings:

      The introduction carefully introduces the concept of lineage specification and different progenitor pools. Given the enormous amount of knowledge already available on Hand1 and Foxf1, and their role in specific lineages of the early heart, some of this information should be added, ideally to the discussion where it can be put into context of what the present findings add to the existing understanding of these transcription factors and their role in early cardiac specification.

      R5R2: We appreciate your positive comments and kind reminder. We have added discussion about how our study could be put into the body of findings on Hand1 and Foxf1. Although these two genes have been validated to be functionally important for heart development, it is unclear when and how they affect this process. Using in-vivo and in-vitro models and single cell multi-omics analyses, we provided evidence to fill the gaps from multiple aspects, including cell state temporal dynamics, regulatory network, and epigenetic regulation underlying the very early cardiac lineage specification.

      Reviewer #3 (Public review):

      Q1R3: In Figure 1A, could the authors justify using E8.5 CMs as the endpoint for the second lineage and better clarify the chamber identities of the E8.5 CMs analysed? Why are the atrial genes in Figure 1C of the PSH trajectory not present in Table S1.1, which lists pseudotime-dependent genes for the MJH/PSH trajectories from Figure 1F?

      R1R3: Thank you for your comments. We used E8.5 CMs as the endpoint of the second (SHF) lineage because this stage represents a critical point where SHF-derived cardiomyocytes have begun distinct differentiation, allowing us to capture terminal lineage states reliably. The chamber identities of E8.5 CMs were determined based on known marker genes (DOI: 10.1186/s13059-025-03633-3). The atrial genes shown in Figure 1C reflect cluster-specific markers that may not meet the strict pseudotime-dependency criteria used to generate Table S1.1, which lists genes dynamically changing along the MJH/PSH trajectories.

      Q2R3: Could the authors increase the resolution of their trajectory and genomic analyses to distinguish between the FHF (Tbx5+ HCN4+) and the JCF (Mab21l2+/ Hand1+) within the MJH lineage? Also, clarify if the early extraembryonic mesoderm contributes to the FHF.

      R2R3: Thank you for your great suggestions. To distinguish between the FHF and JCF trajectories, we used the early FHF progenitor population (E7.75 Nkx2-5+; Mab21l2- CM cells) as the starting point and performed WOT lineage inference (Figure S2A). The results suggest that both JCF and SHF progenitors contribute to the FHF, consistent with live imaging-based single cell tracing by Dominguez et al (DOI: 10.1016/j.cell.2023.01.001) and lineage tracing results by Zhang et al (DOI: 10.1161/CIRCRESAHA.121.318943). We also analyzed the expression levels of FHF marker genes (Tbx5, Hcn4) and observed their activation along both trajectories (Figure S2B).

      Q3R3: The authors strongly assume that the juxta-cardiac field (JCF), defined by Mab21l2 expression at E7.5 in the extraembryonic mesoderm, contributes to CMs. Could the authors explain the evidence for this? Could the authors identify Mab21l2 expression in the left ventricle (LV) myocardium and septum transversum at E8.5 (see Saito et al., 2013, Biol Open, 2(8): 779-788)? If such a JCF contribution to CMs exists, the extent to which it influences heart development should be clarified or discussed.

      R3R3: Thank you for the important question. For the JCF contribution to the heart tube, several lines of evidence have been published in recent years using micro-dissection of mouse embryonic heart (DOI: 10.1126/science.abb2986), live imaging (DOI: 10.1016/j.cell.2023.01.001) and lineage tracing approaches (DOI: 10.1161/CIRCRESAHA.121.318943). According to Tyser et al (DOI: 10.1126/science.abb2986), Mab21l2 expression is detected in the septum transversum at E8.5 and the Mab21l2+ lineage contributes to the LV, broadly consistent with the literature you mentioned (Saito et al., 2013, Biol Open, 2(8): 779-788). Our lineage inference analyses further support the model and suggest earlier specification by the JCF. However, the focus of our work is the transcriptional and epigenetic regulation underlying the JCF developmental trajectory.

      Q4R3: Could the authors distinguish the Hand1+ pericardium from JCF progenitors in their single-cell data and explain why they excluded other cell types, such as the endocardium/endothelium and pericardium, or even the endoderm, as endpoints of their trajectory analysis? At the NM and MM mesoderm stages, how did the authors distinguish the earliest cardiac cells from the surrounding developing mesoderm?

      R4R3: We appreciate your insightful question. In our other study (DOI: 10.1186/s13059-025-03633-3), we tried to further divide the CM cells into subclusters, and their differences appear to be driven mainly by the segmentation of the heart tube (e.g. LV, RV, OFT etc.). By the E8.5 stage, we are unable to identify the Hand1+ pericardium cluster.

      Also, it seems infeasible to distinguish endocardium from other endothelial cells using only single-cell data; high-resolution spatial transcriptome data would be required. Alternatively, we analyzed the E7.0 mesodermal lineages and determined C5/6 as hematoendothelial progenitors. Marker gene analysis indicates that their lineage segregation has started by this stage (Figure S4C and Author response image 3).

      Author response image 3.

      UMAP layout, using scRNA-seq (Reference data) and snRNA-seq (Multiome data), is colored by cell types (left). Expression of hematoendothelial progenitor marker genes is shown (right).

      We did observe a difference between the earliest cardiac cells and the surrounding developing mesoderm. As in Figure 1D, cells belonging to the JCF lineage (Hand1 high/Lefty2 low) were clustered at the EEM/MM end, in contrast to the NM cells.

      Q5R3: Could the authors contrast their trajectory analysis with those of Lescroart et al. (2018), Zhang et al., Tyser et al., and Krup et al.?

      R5R3: Thank you for the valuable suggestion. We compared our model with the suggested ones and summarized as follows:

      (1) Lescroart et al: The JCF and SHF progenitor cells match their DCT2 (Bmp4+) and DCT3 (Foxc2+) clusters, respectively.

      (2) Zhang et al: The JCF lineage matches their EEM-DC (developing CM)-CM trajectory. The SHF lineage is consistent with their NM-LPM (lateral plate mesoderm)-DC (developing CM)-CM trajectory. Notably, their EEM-DC-CM also expressed FHF marker (Tbx5) at later stages.

      (3) Tyser et al: we performed data integration analysis and found the correspondence between JCF progenitors (EEM cells from the cardiac trajectory) and their Me5, as well as SHF progenitors (PM cells from the cardiac trajectory) with Me7. In their model, both Me5 and Me7 contribute to Me4 (representing the FHF), consistent with our results (see Tyser et al., 2021 and Pijuan-Sala et al., 2019).

      (4) Krup et al also performed URD lineage inference, providing a model with CM (12) and Cardiac mesoderm (29) as cardiac end points. Their model did not seem to suggest distinct trajectories between JCF and SHF lineages, as both JCF (Hand1) and SHF (Isl1) markers were co-expressed in CM.

      Q6R3: Previous studies suggest that Mesp2 expression starts at E8 in the presomitic mesoderm (Saga et al., 1997). Could the authors provide in situ hybridization or HCR staining to confirm the early E7 Mesp2 expression suggested by the pseudo-time analysis of the second lineage?

      R6R3: We validated the expression of E7 Mesp2 using Geo-seq spatial transcriptome data (Author response image 4, upper). Results suggest the high spatial enrichment of Mesp2 expression in primitive streak (T+) and/or nascent mesoderm (Mesp1+) cells, which correspond to the progenitors of the second lineage.

      In situ hybridization data (PMID: 17360776) also support the early expression of Mesp2 by E7 (Author response image 4, lower).

      Author response image 4.

      (Upper) E7 Geo-seq data for selected genes: T, Mesp1, and Mesp2. (Lower) Mesp2 expression during early development; image acquired from Morimoto et al. (PMID: 17360776).

      Q7R3: Could the authors also confirm the complementary Hand1 and Lefty2 expression patterns at E7 using HCR or in situ hybridization? Hand1 expression in the first lineage is plausible, considering lineage tracing results from Zhang et al.

      R7R3: Thank you for your great suggestion. We observed spatially complementary expression patterns of Hand1 and Lefty2 in the Geo-seq spatial transcriptomic data. In the mesoderm layer, Hand1 is highly expressed at the proximal end, while Lefty2+ cells show a preference for the distal direction.

      Author response image 5.

      E7 Geo-seq data for selected genes: Hand1 and Lefty2.

      Q8R3: Could the authors explain why Hand1 and Lefty2+ cells are more likely to be multipotent progenitors, as mentioned in the text?

      R8R3: Thank you for your question. Here, we observed E7.0 Mesp1+ and Lefty2+ nascent mesodermal cells assigned to both the JCF and SHF lineages (Figure 1D), indicating their multipotency. On the other hand, we also found low expression of the JCF markers Hand1 and Msx2 at the early stage of the SHF trajectory (Figure 1F). Thus, we concluded that both Hand1+ and Lefty2+ E7.0 mesodermal cells are likely to be multipotent.

      Q9R3: Could the authors comment on the low Mesp1 expression in the mesodermal cells (MM) of the MJH trajectory at E7 (Figure 1D)? Is Mesp1 transiently expressed early in MJH progenitors and then turned off by E7? Have all FHF/JCF/SHF cells expressed Mesp1?

      R9R3: Thank you for the insightful questions. Zhang et al. (PMID: 34162224) performed scRNA-seq analysis of Mesp1 lineage-traced cells, which indicates the contribution of Mesp1+ cells to FHF, JCF, and SHF. This is also supported by Dominguez et al. utilizing live imaging approaches (PMID: 36736300). Our temporal dynamics analysis suggests that along the JCF trajectory, Mesp1 is turned off as JCF characteristic genes are upregulated (Figure 1F and S1D).

      Q10R3: Could the authors clarify if their analysis at E7 comprises a mixture of embryonic stages or a precisely defined embryonic stage for both the trajectory and epigenetic analyses? How do the authors know that cells of the second lineage are readily present in the E7 mesoderm they analysed (clusters 0, 1, and 2 for the multiomic analysis)?

      R10R3: Thank you for your questions. Although embryos were collected at E7.0, the developmental stages could be variable. As exemplified by Karl Theiler’s book, “The House Mouse: Atlas of Embryonic Development”, mesoderm was visible in some E7.0 egg cylinders but not in others. To test whether cells of the second lineage are present in the E7.0 mesoderm, we analyzed the WOT lineage tracing results and the cell type composition at E7.0 (Author response image 6, left panel). Most cells belong to the nascent mesoderm (NM) or mixed mesoderm (MM), while almost no cells were assigned to the primitive streak (PS). To exclude the possibility that the E7.0 embryos represented later stages, we also analyzed the E6.75 cells of the second lineage (Author response image 6, middle panel). The results suggest that NM cells were still the dominant contributors to the second lineage, although ~22.6% of cells were assigned to the PS. The above analyses were performed using the scRNA-seq data. The embryos of the E7.0 single-cell multi-omics represent similar developmental stages to the scRNA-seq data, as suggested by the well-aligned UMAPs (Figure S1D, right panel). Thus, we conclude that for the multi-omics data, the cells of the second lineage are also readily present in the mesoderm.

      Author response image 6.

      (Left and middle) Lineage inference and cell type composition at E7.0 and E6.75. (Right) UMAPs of E7.0 multi-omics and scRNA-seq data.

      Q11R3: Could the authors further comment on the active Notch signaling observed in the first and second lineages, considering that Notch's role in the early steps of endocardial lineage commitment, but not of CMs, during gastrulation has been previously described by Lescroart et al. (2018)?

      R11R3: We appreciate your kind suggestion. As reported by Lescroart et al. (2018), using Notch1CreERT2/Rosa-tdTomato mice and tamoxifen administration at E6.5, early expression of Notch1 mostly marked endocardial cells (ECs, 76.9-83.9%), with minor contribution to the cardiomyocytes (6.0-16.6%) and to the epicardial cells (EPs, 6.0-6.5%). The lineage specificity of Notch1 is consistent with our E7.0 multi-omics data, where its expression was mainly observed in the NM and hematoendothelial progenitors (Author response image 7). Interestingly, expression of other NOTCH receptor genes (Notch2 and Notch3) and ligand genes (Dll1 and Dll3) was observed in the CM lineages. Notch3 showed higher expression in the first lineage, while Dll1 and Dll3 were highly expressed in the second lineage. The study by Lescroart et al. (2018) emphasized the role of Notch1 as an EC lineage marker, while our analyses addressed the activity of the NOTCH pathway.

      Author response image 7.

      Expression of representative NOTCH genes at E7.0 (multi-omics data).

      Q12R3: In cluster 8, Figure 2D, it seems that levels of accessibility in cluster 8 are relatively high for genes associated with endothelium/endocardium development in addition to MJH genes. Could the authors comment and/or provide further analysis?

      R12R3: Thank you for raising this interesting point. To confirm the association of these genes with endothelium (EC) and/or MJH, we analyzed their expression levels at E7.0 (progenitor stage) and E8.0 (differentiated stage) (Author response image 8). Among target genes of MJH-specific DAEs (cluster 3/7/8 in Figure 2D), Pmp22, Mest, Npr1, Pkp2, and Pdgfb were expressed in the hematoendothelial progenitors. The Nrp1 gene and PDGF pathway play critical roles in endothelial development by modulating cell migration (PMID: 15920019 and 28167492), which is also important for MJH cells. In addition, we observed common ATAC-seq peaks in both hematoendothelial and MJH clusters (Author response image 9), indicating shared regulatory elements. Interestingly, although Pdgfb is not expressed by CMs in vivo, it is actively expressed in the CMs of the in vitro system (Author response image 9). These results indicate regulatory and functional closeness between the hematoendothelial and MJH cell groups at early stages of lineage establishment.

      Author response image 8.

      Regulatory connection between MJH and endothelial cells (ECs).

      Author response image 9.

      Representative genome browser snapshots of scATAC-seq (aggregated gene expression and chromatin accessibility for each cluster) and RNA-seq at the Pdgfb locus.

      Q13R3: Can the authors clarify why they state that cluster 8 DAEs are primed before the full activation of their target genes, considering that Bmp4 and Hand1 peak activities seem to coincide with their gene expression in Figure 2G?

      R13R3: Thanks for your great question. The overall analyses indicate low to medium levels of H3K4me1 and H3K27ac by E6.5-7.0 at cluster 8 DAEs, which were fully activated by E7.5 (Figure 2F). Further inspections suggest different epigenetic status of individual DAEs (Figure 3H), which could be active (K4me1+/K27ac+), primed (K4me1+/K27ac-), or inactive (K4me1-/K27ac-). Thus, we concluded that many DAEs could be primed before full activation. The coincidence of enhancer peak activities and gene expression was observed by aggregating single cell clusters at a single stage E7.0, which does not rule out the possibility that these enhancers are epigenetically primed at earlier stages.
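      The three enhancer states named above can be written as a small decision rule. The following Python sketch is illustrative only: the function name, the boolean inputs (assumed to be pre-computed presence/absence calls for each histone mark), and the example DAE identifiers are hypothetical and are not taken from the manuscript.

```python
from collections import Counter


def classify_enhancer(k4me1: bool, k27ac: bool) -> str:
    """Map presence/absence calls for two histone marks to an enhancer state."""
    if k4me1 and k27ac:
        return "active"      # K4me1+/K27ac+
    if k4me1:
        return "primed"      # K4me1+/K27ac-
    if not k27ac:
        return "inactive"    # K4me1-/K27ac-
    return "unclassified"    # K4me1-/K27ac+ is not one of the three named states


# Hypothetical mark calls for four DAEs (placeholder identifiers, not real data).
example_calls = {
    "dae_1": (True, True),
    "dae_2": (True, False),
    "dae_3": (False, False),
    "dae_4": (True, False),
}
states = {name: classify_enhancer(*marks) for name, marks in example_calls.items()}
state_counts = Counter(states.values())
```

      Tallying the states per stage in this way would show whether primed enhancers at an earlier stage become active later; how the marks are thresholded into presence/absence calls is left open here.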

      Q14R3: Did the authors extend the multiomic analysis to Nanog+ epiblast cells at E7 and investigate if cardiac/mesodermal priming exists before mesodermal induction (defined by T/Mesp1 onset of expression)?

      R14R3: We appreciate your kind suggestion. We observed low levels of T/Mesp1 expression in the E7.0 Nanog+ epiblast cells (Author response image 10). Interestingly, the T+/Mesp1+ cells were not clustered toward any specific differentiation direction in the UMAP. We also analyzed DAE activities in each single cell by averaging over the C1/C2/C8 DAE sets. The C2 and C8 DAEs were clearly less active than the C1 DAEs, but C2/C8-DAE active cells were nevertheless observed among the E7.0 Nanog+ epiblast cells. These data indicate that early priming exists in epiblast cells before the commitment to cardiac/mesodermal differentiation.
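      The per-cell averaging over DAE sets mentioned above could look roughly like the following Python sketch. It assumes a normalized accessibility value is already available for each cell and peak; the data layout, the function name, and the cell and peak identifiers are hypothetical, and the response does not specify the exact procedure used.

```python
from statistics import mean


def dae_set_activity(accessibility, dae_sets):
    """Average accessibility per cell over each named set of DAEs.

    accessibility: dict mapping cell id -> dict mapping peak id -> value
    dae_sets:      dict mapping set name -> list of peak ids
    Peaks missing for a cell are treated as 0.0 (an assumption of this sketch).
    """
    return {
        cell: {
            name: mean(peaks.get(peak, 0.0) for peak in members)
            for name, members in dae_sets.items()
        }
        for cell, peaks in accessibility.items()
    }


# Toy input with placeholder identifiers (not real data).
toy_accessibility = {
    "cell_a": {"peak_1": 1.0, "peak_2": 0.5, "peak_3": 0.0},
    "cell_b": {"peak_1": 0.0, "peak_2": 0.5, "peak_3": 1.0},
}
toy_sets = {"C1": ["peak_1", "peak_2"], "C8": ["peak_3"]}
activity = dae_set_activity(toy_accessibility, toy_sets)
```

      The resulting per-cell scores are the kind of values that can then be overlaid on a UMAP layout to show which cells carry active DAE sets.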

      Author response image 10.

      Gene expression and DAE activity levels of E7.0 Nanog+ epiblast cells shown in UMAP layout.

      Q15R3: In the absence of duplicates, it is impossible to statistically compare the proportions of mesodermal cell populations in Hand1 wild-type and knockout (KO) embryos or to assess for abnormal accumulation of PS, NM, and MM cells. Could the authors analyse the proportions of cells by careful imaging of Hand1 wild-type and KO embryos instead?

      R15R3: Thank you for your important question. To assess the proportions of mesodermal cell populations in E7.25 wild-type and Hand1-CKO embryos, we analyzed serial coronal sections of the extraembryonic portions and performed staining for Vim, which marks the extra-embryonic mesodermal (EEM) cells (Figure S8D). We then counted the numbers of mesodermal and Vim+ EEM cells and calculated the relative proportion of Vim+ EEM cells in each section. The proportion of Vim+ EEM cells was significantly lower in the Hand1-CKO embryo, consistent with our model that Hand1 deletion led to blocked MJH specification.
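      The per-section quantification described above can be sketched in Python as follows. The response does not state which statistical test was applied, so the exact permutation test below is an assumption chosen for illustration, and all cell counts are invented placeholders rather than the authors' data.

```python
from itertools import combinations
from statistics import mean


def proportions(section_counts):
    """Convert (Vim+ cells, total mesodermal cells) per section into proportions."""
    return [positive / total for positive, total in section_counts]


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means."""
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n = len(pooled)
    hits = 0
    total = 0
    for chosen in combinations(range(n), len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen_set]
        b = [pooled[i] for i in range(n) if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Placeholder counts per section: (Vim+ EEM cells, all mesodermal cells).
control_sections = [(30, 100), (28, 90), (35, 110), (32, 105)]
cko_sections = [(10, 95), (12, 100), (8, 90), (11, 98)]

control_props = proportions(control_sections)
cko_props = proportions(cko_sections)
p_value = permutation_p_value(control_props, cko_props)
```

      Note that sections taken from a single embryo are not independent biological replicates, so a test of this kind addresses section-to-section variation only.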

      Q16R3: Could the authors provide high-resolution images for Figure 7 B-C-D as they are currently hard to interpret?

      R16R3: Thank you for your suggestion. We have replaced Figure 7B-C-D with high-resolution images.

      Recommendations for the authors:  

      Reviewing Editor Comments:

      Discussions among reviewers emphasize the importance of better addressing and validating the trajectory analysis by using more common and alternative bioinformatics and spatial approaches. Further discussion on whether there is a common transcriptional progenitor between the two trajectories is also required to enhance the significance of the study. For functional analysis, further validations are needed as the current data only partially support the claims. Please see public reviews for details.

      Reviewer #2 (Recommendations For The Authors):

      Beyond the suggestions made in the public review, below are some minor aspects for consideration:

      The manuscript is well written overall but may benefit from a thorough read-through and editing of some minor grammatical errors.

      We have carefully read through the manuscript and corrected minor grammatical errors to improve clarity and readability.

      Figure 2C: RNA velocity information gets largely lost due to the color choice of EEM and MM (black) on which the direction of arrows can't be appreciated.

      We have updated the color scheme in Figure 2C.

      Figure 6D: sample information is partially cut off in the graph.

      Sample information is completely shown now.

      The last paragraph of the discussion has some formatting issues with the references.

      We have corrected the formatting issues with the references.

      The methods and results section does not comment on if, or how many embryos were pooled for the sequencing analysis performed for this study.

      We have added the numbers of embryos for sequencing analyses in the methods section.

      Reviewer #3 (Recommendations For The Authors):

      Minor:

      In the discussion, authors could reconsider the sentence: "The process of cardiac lineage segregation is a complex one that may involve TF regulatory networks and signaling pathways," as it is not informative.

      We have re-written the sentence as: “Thus, additional regulation must exist and instructs the process of JCF-SHF lineage segregation.”

    1. eLife Assessment

      This study provides valuable information on the impact of Lamin A/C knockdown on gene expression using RNA-Seq analysis, as well as on telomere dynamics through live cell imaging. However, the conclusions remain inadequately supported by the current data, and several of the major technical concerns raised in the first round have not yet been fully resolved.

    2. Reviewer #1 (Public review):

      I am afraid that the manuscript has not improved much. The authors have barely addressed my specific comments, and the manuscript remains descriptive with little logic in the analyses, and no coherence between the RNA-seq work and the telomere dynamics analysis. The revised title still suggests more causality/mechanism than is demonstrated in the results.

      Of my three main technical concerns, two critical ones were not properly addressed, and for the third concern the answer is not entirely clear. So on balance, in my view the revised manuscript still does not meet the scientific standards of the field.

      (1) Knockdowns should be verified at the protein level:

      Authors state that they are working on this, but the results are not included in the revised manuscript.

      (2) Multiple shRNAs for each protein, or an alternative method such as CRISPR deletion or degron technology, must be tested to rule out such off-target effects:

      Authors state that they are working on this, but have not included the results in the revised manuscript.

      (3) It was not clear whether the replicate experiments are true biological replicates (i.e. done on different days).

      Authors give a somewhat ambiguous answer in the rebuttal: "samples [...] were derived from independently prepared cultures in separate experimental setups". A simple answer would have been "yes they were done on different days", but this is not what is stated, so I still wonder about the experimental design. The Methods text only states "Each experiment was performed with a minimum of three biological replicates" without clarifying how this was implemented.

    3. Reviewer #2 (Public review):

      Summary:

      This study focused on the roles of the nuclear envelope proteins lamin A and C, as well as nesprin-2, encoded by the LMNA and SYNE2 genes, respectively, on gene expression and chromatin mobility. It is motivated by the established role of lamins in tethering heterochromatin to the nuclear periphery in lamina-associated domains (LADs) and modulating chromatin organization. The authors show that depletion of lamin A, lamin A and C, or nesprin-2 results in differential effects on mRNA and lncRNA expression, primarily affecting genes outside established LADs. In addition, the authors used fluorescent dCas9 labeling of telomeric genomic regions combined with live-cell imaging to demonstrate that depletion of either lamin A, lamin A/C, or nesprin-2 increased the mobility of chromatin, suggesting an important role of lamins and nesprin-2 in chromatin dynamics.

      Strengths:

      The major strength of this study is the detailed characterization of changes in transcript levels and isoforms resulting from depletion of either lamin A, lamin A/C, or nesprin-2 in human osteosarcoma (U2OS) cells. The authors use a variety of advanced tools to demonstrate the effect of protein depletion on specific gene isoforms and to compare the effects on mRNA and lncRNA levels.

      The TIRF imaging of dCas9 labeled telomeres allows for high resolution tracking of multiple telomeres per cell, thus enabling the authors to obtain detailed measurements of the mobility of telomeres within living cells and the effect of lamin A/C or nesprin-2 depletion.

      Weaknesses:

      Although the findings presented by the authors overall confirm existing knowledge about the ability of lamins A/C and nesprin to broadly affect gene expression, chromatin organization, and chromatin dynamics, the specific interpretation and the conclusions drawn from the data presented in this manuscript are limited by several technical and conceptual challenges.

      One major limitation is that the authors only assess the knockdown of their target genes on the mRNA level, where they observe reductions of around 70%. Given that lamins A and C have long half-lives, the effect at the protein level might be even lower. This incomplete and poorly characterized depletion on the protein level makes interpretation of the results difficult. Assessing the effect of the knockdown on the protein level would provide more detailed information both on the extent of the actual protein depletion and the effect on specific lamin isoforms. Similarly, nesprin-2 has numerous isoforms resulting from alternative splicing and transcription initiation, and in the current form of the manuscript it remains unclear which specific nesprin-2 isoforms were depleted, and to what extent (on the protein level).

      Another substantial limitation of the manuscript is that the current analysis, with exception of the chromatin mobility measurements, is exclusively based on transcriptomic measurements by RNA-seq and qRT-PCR, without any experimental validation of the predicted protein levels or proposed functional consequences. As such, conclusions about the importance of lamin A/C on RNA synthesis and other functions are derived entirely from gene ontology terms and are not sufficiently supported by experimental data. Thus, the true functional consequences of lamin A/C or nesprin depletion remain unclear.

      Another substantial weakness is that the data and analysis presented in the manuscript raise some concerns about the robustness of the findings. Given that the 'shLMNA' construct is expected to deplete both lamin A and C, i.e., its effect encompasses the depletion of lamin A, which is achieved by the 'shLaminA' construct, one would expect a substantial overlap between the DEGs in the shLMNA and shLaminA conditions, with the shLMNA depletion producing a broader effect as it targets both lamin A and C. However, the Venn Diagram in Figure 4a, the genomic loci distribution in Figure 4b, and the correlation analysis in Suppl. Fig. S2 show little overlap between the shLMNA and shLaminA conditions, which is quite surprising. In the mapping of the DEGs shown in Fig. 4b, it is also surprising not to see the gene targeted by the shRNA, LMNA, found on chromosome 1, in the results for the shLMNA and shLaminA depletions.
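      The expected relationship between the two DEG lists can be checked with simple set arithmetic. The Python sketch below is only an illustration of that check: the function name is hypothetical and the gene identifiers are placeholders, not genes reported in the manuscript.

```python
def overlap_summary(narrow_degs, broad_degs):
    """Summarize how far a narrower DEG list is contained in a broader one."""
    narrow, broad = set(narrow_degs), set(broad_degs)
    shared = narrow & broad
    union = narrow | broad
    return {
        "shared": len(shared),
        # Fraction of the narrower knockdown's DEGs recovered by the broader one.
        "fraction_of_narrow_in_broad": len(shared) / len(narrow) if narrow else 0.0,
        # Symmetric similarity of the two lists.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder DEG lists for the lamin A-only and lamin A/C knockdowns.
sh_lamin_a_degs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
sh_lmna_degs = ["GENE_C", "GENE_D", "GENE_E", "GENE_F", "GENE_G", "GENE_H"]
summary = overlap_summary(sh_lamin_a_degs, sh_lmna_degs)
```

      If the shLMNA knockdown truly subsumes the shLaminA knockdown, the containment fraction should be close to 1; a value near 0 would correspond to the small overlap noted in the Venn diagram.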

      The correlation analysis in Suppl. Figure S2 raises further questions. The authors use dox-inducible shRNA constructs to target lamin A (shLaminA), lamin A/C (shLMNA), or nesprin-2 (shSYNE2). Thus, the no-dox control (Ctr) for each of these constructs would be expected to be very similar to the non-target scrambled controls (Ctrl.shScramble and Dox.shScramble). However, in the correlation matrix, each of the no-dox controls clusters more closely with the corresponding dox-induced shRNA condition than with the Ctrl.shScramble or Dox.shScramble conditions, suggesting either a very leaky dox-inducible system, effects from clonal selection (although less likely, given the pooling of three clones), or substantial batch effects in the processing. Any of these scenarios could substantially affect the interpretation of the findings.

      The premise of the authors that lamins would only affect peripheral chromatin and genes at LADs neglects the fact that lamins A and C are also found in the nuclear interior, where they form stable structures and influence chromatin organization, and the fact that lamins A and C and nesprins additionally interact with numerous transcriptional regulators such as Rb, c-Fos, and beta-catenins, which could further modulate gene expression when lamins or nesprins are depleted.

      The comparison of the identified DEGs to genes contained in LADs might be confounded by the fact that the authors relied on the identification of LADs from a previous study, which used a different human cell type (human skin fibroblasts) instead of the U2OS osteosarcoma cells used in the present study. As LADs are often highly cell type specific, the use of the fibroblast data set could lead to substantial differences in LADs.

      Overall appraisal and context:

      Despite its limitations, the present study further illustrates the important roles the nuclear envelope proteins lamin A, lamin C, and nesprin-2 have in chromatin organization, dynamics, and gene expression. It thus confirms results previously reported for lamin A/C depletion. For example, the effect of lamin A/C depletion on increasing the mobility of chromatin had already been demonstrated by several other groups, such as Bronshtein et al. Nature Comm 2015 (PMID: 26299252) and Ranade et al. BMC Mol Cel Biol 2019 (PMID: 31117946). Additionally, the effect of lamin A/C depletion on gene and protein expression has already been extensively studied in a variety of other cell lines and model systems, including detailed proteomic studies (PMIDs 23990565 and 35896617).

      The finding that lamin A/C or nesprin depletion affects genes not only at the nuclear periphery but also in the nuclear interior is not particularly surprising given the previous studies and the fact that lamins A and C are also found within the nuclear interior, where they affect chromatin organization and dynamics, and that lamins A/C and nesprins directly interact with numerous transcriptional regulators that could further affect gene expression independently of their role in chromatin organization.

      The isoform specific effects of LMNA depletion on chromatin mobility and gene expression are not entirely surprising, as recent work by the Medalia group identified a lamin A-specific chromatin binding site not present in lamin C (PMID: 40750945). This work should be cited in the manuscript.

      The authors provide a detailed analysis of isoform switching in response to lamin A/C or nesprin depletion, but the underlying mechanism remains unclear. Similarly, their analysis of the genomic location of the observed DEGs shows the wide-ranging effects of lamin A/C or nesprin depletion, but leaves the reader wondering how these effects are mediated. A more in-depth analysis of predicted regulatory factors and their potential interaction with lamins A/C or nesprin would be beneficial in gaining more mechanistic insights.

      Additional note regarding the revised manuscript:

      The authors have made several revisions to the manuscript, including the title and abstract. The above comments have been updated to reflect the latest manuscript version.

      These text revisions made by the authors provide some more detailed discussion of context and interpretation of the work, improving the clarity of the manuscript. However, they do not fundamentally alleviate many of the concerns previously expressed regarding the lack of mechanistic insights and various technical aspects of the study, i.e., use of a single shRNA for knockdown, lack of knockdown validation on the protein level, potential off-target effects of the shRNA, batch effects in the transcriptomic analysis, cell-type specific differences in LADs, etc. Without further experimental data, the manuscript offers a mostly descriptive analysis of the effect of LMNA and SYNE2 depletion on gene expression and telomere mobility. The manuscript might be useful as a reference data set for comparison with other LMNA or SYNE2 depletion studies, albeit with various caveats regarding its interpretation due to the technical concerns raised by the reviewers.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      This manuscript reports a descriptive study of changes in gene expression after knockdown of the nuclear envelope proteins lamin A/C and Nesprin2/SYNE2 in human U2OS cells. The readout is RNA-seq, which is analyzed at the level of gene ontology and focused investigation of isoform variants and non-coding RNAs. In addition, the mobility of telomeres is studied after these knockdowns, although the rationale in relation to the RNA-seq analyses is rather unclear.

      We sincerely thank the reviewer for the thoughtful summary and valuable feedback. Regarding the telomere mobility analyses, our intention was to provide additional evidence supporting the hypothesis that knockdown of lamins and nesprins disrupts nuclear architecture. Although the connection to the RNA-seq data was not explicitly detailed, we believe that the increased telomere mobility may reflect broader changes in chromatin organization, which could contribute to the observed differential gene expression. We have revised the manuscript to clarify this rationale and improve the integration between the two analyses.

      RNA-seq after knockdown of lamin proteins has been reported many times, and the current study does not provide significant new insights that help us to understand how lamins control gene expression. This is particularly because the vast majority of the observed effects on gene expression appear to occur in regions that are not bound by lamin A. It seems likely that these effects are indirect. There is also virtually no overlap between genes affected by laminA/C and by SYNE2, which remains unexplained; for example, it would be good to know whether laminA/C and SYNE2 bind to different genomic regions. The claim in the Title and Abstract that LMNA governs gene expression / acts through chromatin organization appears to be based only on an enrichment of gene ontology terms "DNA conformation change" and "covalent chromatin conformation" in the RNA-seq data. This is a gross over-interpretation, as no experimental data on chromatin conformation are shown in this study. The analyses of transcript isoform switching and ncRNA expression are potentially interesting but lack a mechanistic rationale: why and how would these nuclear envelope proteins regulate these aspects of RNA expression? The effects of lamin A on telomere movements have been reported before; the effects of SYNE2 on telomere mobility are novel (to my knowledge), but should be discussed in the light of previously documented effects of SUN1/2 on the dynamics of dysfunctional telomeres (Lottersberger et al, Cell 2015).

      We sincerely thank the reviewer for this thoughtful and detailed critique. We agree that RNA-seq following knockdown of lamin proteins has been previously reported and appreciate the concern regarding the novelty and mechanistic interpretation of our findings. However, our study reveals distinct isoform switching and lncRNA changes caused by depletion of lamins and nesprins, which have not been reported in previous studies. Furthermore, we show that not only lamin A but also nesprin-2 can affect chromatin mobility.

      An analysis of LMNA ChIP-seq data from human fibroblasts (Kohta Ikegami, 2021) revealed that Lamin A/C modulates gene expression through interactions with enhancers. The pathogenesis of disorders associated with LMNA mutations may stem primarily from disruptions in this gene regulatory function, rather than from impaired tethering of chromatin to LADs.

      We acknowledge the reviewer’s concern that gene ontology enrichment related to chromatin conformation alone is insufficient to support claims about chromatin structural changes. We have therefore revised the “Title” and “Abstract” to avoid overstating conclusions and to more accurately reflect the scope of our data.

      Regarding telomere dynamics, while Lamin A's role has indeed been previously documented, our study provides evidence that SYNE2/Nesprin-2 also regulates telomere mobility. We have now expanded the discussion to include prior work, particularly the findings of Lottersberger et al. (Cell, 2015), to better contextualize our results and distinguish the contributions of SYNE2.

      Finally, we appreciate the reviewer’s suggestion about transcript isoform and noncoding RNA expression. While our study primarily provides descriptive data, we agree that further mechanistic investigation is warranted. We have clarified this point in the “Discussion” and framed our findings as a foundation for future studies exploring the broader regulatory roles of nuclear envelope proteins.

      We are grateful for the reviewer’s comments, which have helped us improve the clarity and rigor of our manuscript. Please see the highlighted changes in our revised manuscript.

      As indicated below, I have substantial concerns about the experimental design of the knockdown experiments.

      Altogether, the results presented here are primarily descriptive and do not offer a significant advance in our understanding of the roles of LaminA and SYNE2 in gene regulation or chromatin biology, because the results remain unexplained mechanistically and functionally. Furthermore, the RNAseq datasets should be interpreted with caution until off-target effects of the shRNAs can be ruled out.

      We fully acknowledge that the original version of our manuscript lacked sufficient mechanistic insight. In response, we have revised the manuscript to include additional analyses and explanations that clarify the potential functional relevance of our findings. For example, we added the following text: “These findings further underscore the functional relevance of lamin A in coordinating transcriptional programs through modulation of nuclear architecture. In contrast, LMNA knockdown led to differential expression of genes enriched in pathways related to chromatin organization, suggesting potential disruptions in chromatin regulatory networks. Although direct measurements of chromatin conformation were not performed, these transcriptional changes indicate that LMNA may contribute to maintaining nuclear architecture and genomic stability, which aligns with its established involvement in laminopathies and genome integrity disorders.” Further analyses can be found in the main text.

      Regarding the concern about off-target effects of the shRNA-based knockdowns, we agree that this is an important consideration. While shRNA approaches inherently carry the risk of off-target effects, we have now performed additional analyses that help address this issue. These analyses support the specificity of our observations and suggest that the majority of gene expression changes are likely to be directly related to the targeted knockdown. Nonetheless, we have clearly stated the limitations of the approach in the revised discussion and emphasized the need for future validation using complementary methods.

      We hope that these revisions strengthen the overall impact and interpretability of our study.

      Specific comments:

      (1) Knockdowns were only monitored by qPCR. Efficiency at the protein level (e.g., Western blots) needs to be determined.

      We agree that complementary protein-level validation (e.g., by Western blot) would strengthen the findings, and we are in the process of obtaining suitable reagents to address this point in future experiments. We have now clarified this limitation in the revised manuscript.

      (2) For each knockdown, only a single shRNA was used. shRNAs are infamous for off-target effects; therefore, multiple shRNAs for each protein, or an alternative method such as CRISPR deletion or degron technology, must be tested to rule out such off-target effects.

      We fully acknowledge the concern regarding the use of only a single shRNA per knockdown and agree that shRNAs are prone to off-target effects. We recognize the importance of validating our findings using multiple independent shRNAs or alternative knockdown strategies, such as CRISPR deletion or degron-based approaches, to ensure specificity. To address this concern, we have confirmed by qPCR the knockdown of the target genes observed in the RNA-seq data, further supporting the validity of our data. In line with this, we are currently optimizing an auxin-inducible degron system (AtAFB2) for targeted and controlled depletion of lamin C. Our preliminary results indicate approximately 40% knockdown efficiency after 16 hours of auxin induction, highlighting the need for further system optimization (Author response image 1). Future experiments will integrate this improved degron technology alongside multiple independent approaches to rigorously address and mitigate concerns about off-target effects, thereby enhancing the robustness and reproducibility of our data.

      Author response image 1.

      FACS analysis of the lamin C degron system at 0, 1, 3, and 16 hours post-induction with 500 μM indole-3-acetic acid (IAA) (Sigma).

      (3) It is not clear whether the replicate experiments are true biological replicates (i.e., done on different days) or simply parallel dishes of cells done in a single experiment (= technical replicates). The extremely small standard deviations in the RT-qPCR data suggest the latter, which would not be adequate.

      We appreciate the reviewer’s insightful comment regarding the nature of our replicates. The RT-qPCR experiments were indeed performed as true biological replicates, with samples collected on different days and from independently cultured cell batches. We have added this to the manuscript Methods. While we observed some variability in the Scramble control group, the low standard deviations in the shRNA-treated samples likely reflect the consistent and efficient knockdown of target genes.

      For the RNA-seq experiments, samples were collected as two batches during RNA extraction and library preparation. The samples still represent biological replicates, as they were derived from independently prepared cultures in separate experimental setups. This approach was chosen to strike a balance between biological variation and technical consistency, thereby improving the reliability of the RNA-seq results.

      Reviewer #2 (Public review):

      Summary:

      This study focused on the roles of the nuclear envelope proteins lamin A and C, as well as nesprin-2, encoded by the LMNA and SYNE2 genes, respectively, on gene expression and chromatin mobility. It is motivated by the established role of lamins in tethering heterochromatin to the nuclear periphery in lamina-associated domains (LADs) and modulating chromatin organization. The authors show that depletion of lamin A, lamin A and C, or nesprin-2 results in differential effects on mRNA and lncRNA expression, primarily affecting genes outside established LADs. In addition, the authors used fluorescent dCas9 labeling of telomeric genomic regions combined with live-cell imaging to demonstrate that depletion of either lamin A, lamin A/C, or nesprin-2 increased the mobility of chromatin, suggesting an important role of lamins and nesprin-2 in chromatin dynamics.

      We sincerely appreciate the reviewer’s thoughtful summary of our study and the key findings. Our work is indeed motivated by the well-established roles of lamin A/C in chromatin tethering at the nuclear periphery and the emerging understanding of their broader influence on chromatin organization and gene regulation. In our study, we aimed to further explore these roles by examining the consequences of depleting lamin A, lamin A/C, and nesprin-2 (SYNE2) on both gene expression and chromatin mobility.

      As the reviewer accurately notes, we observed differential effects on mRNA and lncRNA expression, with many changes occurring outside of previously defined LADs. This finding suggests that lamins and nesprin-2 may also influence transcriptional regulation through mechanisms beyond direct LAD association. Furthermore, using live-cell imaging of fluorescently labeled telomeric regions, we demonstrated that loss of these nuclear envelope components leads to increased chromatin mobility, supporting their role in maintaining chromatin stability and nuclear architecture.

      We thank the reviewer for highlighting these aspects, which we believe contribute to a more nuanced understanding of how nuclear envelope proteins modulate chromatin behavior and gene regulation.

      Strengths:

      The major strength of this study is the detailed characterization of changes in transcript levels and isoforms resulting from depletion of either lamin A, lamin A/C, or nesprin-2 in human osteosarcoma (U2OS) cells. The authors use a variety of advanced tools to demonstrate the effect of protein depletion on specific gene isoforms and to compare the effects on mRNA and lncRNA levels.

      The TIRF imaging of dCas9-labeled telomeres allows for high-resolution tracking of multiple telomeres per cell, thus enabling the authors to obtain detailed measurements of the mobility of telomeres within living cells and the effect of lamin A/C or nesprin-2 depletion.

      We are grateful that the reviewer recognized the comprehensive analysis of transcript and isoform changes upon depletion of lamin A, lamin A/C, or nesprin-2 in U2OS cells. We also thank the reviewer for acknowledging our use of advanced tools to investigate isoform-specific effects and to distinguish between changes in mRNA and lncRNA expression.

      Furthermore, we are pleased that the reviewer highlighted the strength of our TIRF imaging approach using dCas9-labeled telomeres. This technique enabled us to capture high-resolution, multi-locus dynamics within single living cells, and we agree that it is instrumental in revealing the impact of lamin A/C and nesprin-2 depletion on telomere mobility.

      Weaknesses:

      Although the findings presented by the authors overall confirm existing knowledge about the ability of lamins A/C and nesprin to broadly affect gene expression, chromatin organization, and chromatin dynamics, the specific interpretation and the conclusions drawn from the data presented in this manuscript are limited by several technical and conceptual challenges.

      One major limitation is that the authors only assess the knockdown of their target genes on the mRNA level, where they observe reductions of around 70%. Given that lamins A and C have long half-lives, the effect at the protein level might be even lower. This incomplete and poorly characterized depletion on the protein level makes interpretation of the results difficult. The description for the shRNA targeting the LMNA gene encoding lamins A and C given by the authors is at times difficult to follow and might confuse some readers, as the authors do not clearly indicate which regions of the gene are targeted by the shRNA, and they do not make it obvious that lamin A and C result from alternative splicing of the same LMNA gene. Based on the shRNA sequences provided in the manuscript, one can conclude that the shLaminA shRNA targets the 3' UTR region of the LMNA gene specific to prelamin A (which undergoes posttranslational processing in the cell to yield lamin A). In contrast, the shRNA described by the authors as 'shLMNA' targets a region within the coding sequence of the LMNA gene that is common to both lamin A and C, i.e., the region corresponding to amino acids 122-129 (KKEGDLIA) of lamin A and C. The authors confirm the isoform-specific effect of the shLaminA construct, although they seem somewhat surprised by it, but do not confirm the effect of the shLMNA construct. Assessing the effect of the knockdown on the protein level would provide more detailed information both on the extent of the actual protein depletion and the effect on specific lamin isoforms. Similarly, given that nesprin-2 has numerous isoforms resulting from alternative splicing and transcription initiation, it remains unclear in the current form of the manuscript which specific nesprin-2 isoforms were depleted, and to what extent (on the protein level).

      We have revised the Methods section to include a clearer and more detailed description of the shRNA design, including the specific regions of the LMNA gene targeted by each construct, as well as the relationship between lamin A and C isoforms resulting from alternative splicing. We agree that this clarification will help prevent confusion for readers.

      Regarding the shLMNA construct, we acknowledge the importance of confirming the knockdown at the protein level, especially given the long half-lives of lamin proteins. In our revised manuscript, we now refer to Supplementary Figure S2, which demonstrates that the shLMNA construct effectively reduces both lamin A and lamin C transcript levels. While we initially focused on mRNA quantification, we recognize that additional protein-level validation is valuable and have accordingly emphasized this point in the revised discussion.

      We also appreciate the comment on nesprin-2 isoforms. Given the complexity of nesprin-2 splicing, we are currently working to further characterize the specific isoforms affected and will aim to include protein-level data in a future study. 

      Another substantial limitation of the manuscript is that the current analysis, with the exception of the chromatin mobility measurements, is exclusively based on transcriptomic measurements by RNA-seq and qRT-PCR, without any experimental validation of the predicted protein levels or proposed functional consequences. As such, conclusions about the importance of lamin A/C on RNA synthesis and other functions are derived entirely from gene ontology terms and are not sufficiently supported by experimental data. Thus, the true functional consequences of lamin A/C or nesprin depletion remain unclear. Statements included in the manuscript such as "our findings reveal that lamin A is essential for RNA synthesis, ..." (Lines 79-80) are thus either inaccurate or misleading, as the current data do not show that lamin A is ESSENTIAL for RNA synthesis, and lamin A/C and lamin A deficient cells and mice are viable, suggesting that they are capable of RNA synthesis.

      We agree that our current data do not support the claim that lamin A is essential for RNA synthesis, and we acknowledge the importance of distinguishing between correlation and causal relations in our conclusions. In light of this, we have revised the statement in the manuscript to more accurately reflect our findings:

      “Our findings suggest that lamin A contributes to RNA synthesis, supports chromatin spatial organization through LMNA, and that SYNE2 influences chromatin modifications as reflected in transcript levels.”

      We hope this revision better aligns with the limitations of our dataset and addresses the reviewer’s concerns regarding the interpretation of functional consequences based solely on transcriptomic data.

      Another substantial weakness is that the data and analysis presented in the manuscript raise some concerns about the robustness of the findings. Given that the 'shLMNA' construct is expected to deplete both lamin A and C, i.e., its effect encompasses the depletion of lamin A, which is achieved by the 'shLaminA' construct, one would expect a substantial overlap between the DEGs in the shLMNA and shLaminA conditions, with the shLMNA depletion producing a broader effect as it targets both lamin A and C. However, the Venn Diagram in Figure 4a, the genomic loci distribution in Figure 4b, and the correlation analysis in Supplementary Figure S2 show little overlap between the shLMNA and shLaminA conditions, which is quite surprising. In the mapping of the DEGs shown in Figure 4b, it is also surprising not to see the gene targeted by the shRNA, LMNA, found on chromosome 1, in the results for the shLMNA and shLaminA depletion.

      We have added the following discussion to the revised manuscript: “Interestingly, although both shLMNA and shLaminA constructs target lamin A, with shLMNA additionally depleting lamin C, the DEGs identified under these two conditions show limited overlap. This unexpected finding suggests that depletion of lamin C in the shLMNA condition may trigger distinct or compensatory transcriptional responses that are not elicited by lamin A knockdown alone. Furthermore, variation in shRNA efficiency or off-target effects may contribute to these differences. Notably, despite directly targeting LMNA, the overlap in DEGs between the two conditions remained limited under our stringent threshold criteria. Together, these observations highlight the complex and non-linear regulatory roles of lamin isoforms in gene expression and underscore the need for further mechanistic studies to dissect their individual and combined contributions [28,29].”
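      The "limited overlap" between two DEG lists discussed above can be made quantitative. The following is a minimal sketch, not the authors' analysis: it assumes each condition yields a set of DEG identifiers drawn from a shared background of tested genes, and it reports a Jaccard index together with a one-sided hypergeometric tail probability for the observed overlap. The function names and the gene identifiers are hypothetical.

```python
from math import comb


def jaccard_index(genes_a, genes_b):
    """Fraction of genes shared by two DEG lists (0 = disjoint, 1 = identical)."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def overlap_p_value(n_background, n_a, n_b, n_shared):
    """P(overlap >= n_shared) if list B were drawn at random from the background.

    Hypergeometric upper tail: n_background tested genes, n_a of them in
    list A, n_b genes drawn for list B.
    """
    total = comb(n_background, n_b)
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_background - n_a, n_b - k)
        for k in range(n_shared, upper + 1)
    )
    return tail / total


# Hypothetical DEG lists; the identifiers are placeholders, not real results.
deg_shLMNA = {"GENE_A", "GENE_B", "GENE_C"}
deg_shLaminA = {"GENE_B", "GENE_C", "GENE_D"}

shared = len(deg_shLMNA & deg_shLaminA)
print("Jaccard index:", jaccard_index(deg_shLMNA, deg_shLaminA))
print("Overlap p-value:", overlap_p_value(10, len(deg_shLMNA), len(deg_shLaminA), shared))
```

      A small Jaccard index together with a non-significant tail probability would indicate that the two knockdowns share no more DEGs than expected by chance, which is the pattern the reviewer found surprising.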

      The correlation analysis in Supplementary Figure S2 raises further questions. The authors use dox-inducible shRNA constructs to target lamin A (shLaminA), lamin A/C (shLMNA), or nesprin-2 (shSYNE2). Thus, the no-dox control (Ctr) for each of these constructs would be expected to be very similar to the non-target scrambled controls (Ctrl.shScramble and Dox.shScramble). However, in the correlation matrix, each of the no-dox controls clusters more closely with the corresponding dox-induced shRNA condition than with the Ctrl.shScramble or Dox.shScramble conditions, suggesting either a very leaky dox-inducible system, strong effects from clonal selection, or substantial batch effects in the processing. Any of these scenarios could substantially affect the interpretation of the findings. For example, differences between different clonal cell lines used for the studies, independent of the targeted gene, could explain the limited overlap between the different shRNA constructs and result in apparent differences when comparing these clones to the scrambled controls, which were derived from different clones.

      We thank the reviewer for this thoughtful observation. We would like to clarify that the samples shown in Supplementary Figure S2 were processed and sequenced in two separate batches, and the data presented in the correlation matrix are unnormalized. As such, batch effects are indeed present and likely contribute to the clustering pattern observed, particularly the closer similarity between the dox-induced and no-dox samples for each individual shRNA construct.

      Importantly, our analyses focus on within-construct comparisons (i.e., doxycycline-treated vs untreated samples for the same shRNA), rather than direct comparisons across different constructs or scrambled controls. Each experimental pair (dox vs no-dox) was processed in parallel within its respective batch to ensure internal consistency. Thus, while the global clustering pattern may reflect batch-related differences or baseline variations between independently derived cell lines, these factors do not affect the main conclusions drawn from the within-construct differential expression analysis.
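      The reasoning behind a within-construct comparison can be illustrated with a toy calculation. This is a sketch under a strong assumption: the batch effect is additive on the log2 scale and identical for the dox and no-dox samples of the same pair. All sample names, the gene name, and the numeric values are hypothetical and are not taken from the study.

```python
def log2_fold_change(log2_expr, treated_sample, reference_sample):
    """Per-gene difference in log2 expression between two samples."""
    treated = log2_expr[treated_sample]
    reference = log2_expr[reference_sample]
    return {gene: treated[gene] - reference[gene] for gene in treated if gene in reference}


# Toy model: log2 expression = baseline + batch offset + knockdown effect.
BASELINE = 5.0
BATCH_OFFSET = {"batch1": 1.0, "batch2": -0.5}  # assumed additive batch shifts
TRUE_EFFECT = -2.0                               # assumed effect of the knockdown

log2_expr = {
    # The shLMNA pair is processed together in batch 1.
    "shLMNA_nodox": {"GENE_X": BASELINE + BATCH_OFFSET["batch1"]},
    "shLMNA_dox": {"GENE_X": BASELINE + BATCH_OFFSET["batch1"] + TRUE_EFFECT},
    # The scrambled control is processed in batch 2.
    "shScramble_dox": {"GENE_X": BASELINE + BATCH_OFFSET["batch2"]},
}

# Within-construct contrast: the shared batch offset cancels.
paired = log2_fold_change(log2_expr, "shLMNA_dox", "shLMNA_nodox")
# Cross-batch contrast: the difference in batch offsets contaminates the estimate.
cross = log2_fold_change(log2_expr, "shLMNA_dox", "shScramble_dox")

print("within-construct:", paired["GENE_X"])
print("across batches:  ", cross["GENE_X"])
```

      The cancellation only holds while the assumption does. A leaky promoter or clone-specific baseline differences, as raised by the reviewer, would not be removed by this pairing.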

      The manuscript also contains several factually inaccurate or incorrect statements or depictions. For example, the depiction of the nuclear envelope in Figure 1 shows a single bilipid layer, instead of the actual double bi-lipid layer of the inner and outer nuclear membranes that span the nuclear lumen. The depiction further lacks SUN domain proteins, which, together with nesprins, form the LINC complex essential to transmit forces across the nuclear envelope. The statement in line 214 that "Linker of nucleoskeleton and cytoskeleton (LINC) complex component nesprin-2 locates in the nuclear envelope to link the actin cytoskeleton and the nuclear lamina" is not quite accurate, as nesprin-2 also links to microtubules via dynein and kinesin.

      We sincerely thank the reviewer for pointing out these important inaccuracies. In response, we have revised Figure 1 to accurately depict the nuclear envelope as a double bi-lipid membrane and included SUN domain proteins to better reflect the structural components of the LINC complex. Additionally, we have updated the statement and citations.

      The revised text incorporated in the manuscript reads: “The linker of nucleoskeleton and cytoskeleton (LINC) complex component nesprin-2 is a nuclear envelope protein that connects the nucleus to the cytoskeleton by interacting not only with actin filaments but also with microtubules through motor proteins such as dynein and kinesin. This structural linkage contributes to cellular architecture and facilitates mechanotransduction between the nuclear interior and the extracellular matrix (ECM) [8,21].”

      We appreciate the reviewer’s insights, which have helped improve the accuracy and clarity of our manuscript.

      The statement that "Our data show that Lamin A knockdown specifically reduced the usage of its primary isoform, suggesting a potential role in chromatin architecture regulation, while other LMNA isoforms remained unaffected, highlighting a selective effect" (lines 407-409) is confusing, as the 'shLaminA' shRNA specifically targets the 3' UTR of lamin A that is not present in the other isoforms. Thus, the observed effect is entirely consistent with the shRNA-mediated depletion, independent of any effects on chromatin architecture.

      We have rephrased the statement as follows: “Our data show that knockdown with shLaminA, which specifically targets the 3' UTR unique to the lamin A isoform, selectively reduced lamin A expression without affecting other LMNA isoforms.”

      The premise of the authors that lamins would only affect peripheral chromatin and genes at LADs neglects the fact that lamins A and C are also found in the nuclear interior, where they form stable structures and influence chromatin organization, and the fact that lamins A and C and nesprins additionally interact with numerous transcriptional regulators such as Rb, c-Fos, and beta-catenins, which could further modulate gene expression when lamins or nesprins are depleted.

      Based on the reviewer’s comment, we have added the following statement to the Discussion: “Beyond their well-established role in tethering heterochromatin at the nuclear periphery through lamina-associated domains (LADs), A-type lamins (lamins A and C) also localize to the nuclear interior, where they contribute to chromatin organization and gene regulation independently of LADs [27,28]. Nuclear lamins can form intranuclear foci that associate with active chromatin and are implicated in supporting transcriptional activity. Additionally, both lamins and nesprins participate in diverse protein-protein interactions that may influence transcriptional regulation. For example, lamin A/C interacts with the retinoblastoma protein (Rb) to modulate E2F-dependent transcription [29], and with c-Fos to regulate its nuclear retention and activity [30]. In addition, β-catenin, which acts as a co-activator in Wnt signaling, relies on nuclear translocation and interaction with transcriptional complexes, and evidence suggests that nuclear architecture and envelope components, including nesprins, can influence this process [31]. Therefore, the observed gene expression changes following depletion of lamins or nesprins are likely not restricted to genes located within lamina-associated domains (LADs), but may also result from broader perturbations in nuclear architecture and transcriptional regulatory networks. This is consistent with our findings that lamins and nesprins influence gene expression in distal, non-LAD regions.”

      The comparison of the identified DEGs to genes contained in LADs might be confounded by the fact that the authors relied on the identification of LADs from a previous study (ref #28), which used a different human cell type (human skin fibroblasts) instead of the U2OS osteosarcoma cells used in the present study. As LADs are often highly cell-type specific, the use of the fibroblast data set could lead to substantial differences in LADs.

      DamID in various mammalian cell types has shown that some LADs are cell-type invariant (constitutive LADs [cLADs]), while others interact with the NL in only certain cell types (facultative LADs [fLADs]) (Bas van Steensel, 2017). We agree that fLADs, which comprise approximately half of all LADs, are often highly cell-type specific. We acknowledge that this specificity may influence the interpretation of our findings. At present, publicly available LAD datasets for U2OS cells are limited to those associated with LMNB. We concur that generating LMNA-specific LAD maps in U2OS cells would enhance the accuracy and relevance of our analyses, and we view this as an important direction for future research.

      Another limitation of the current manuscript is that, in the current form, some of the figures and results depicted in the figures are difficult to interpret for a reader not deeply familiar with the techniques, based in part on the insufficient labeling and figure legends. This applies, for example, to the isoform use analysis shown in Figure 3d or the GenometriCorr analysis quantifying spatial distance between LADs and DEGs shown in Figure 4c.

      For Figure 3, we added the following text to the caption to make the figure more readable: “Isoform switching analysis reveals differential expression of alternative transcript variants between conditions, highlighting a shift in predominant isoform usage.” For Figure 4c, we added the following text to the caption: “GenometriCorr analysis was used to quantify the spatial relationship between LADs and DEGs, evaluating whether the observed genomic proximity deviates from random expectation through empirical distribution-based statistical testing of pairwise distances between genomic intervals.” We also added this description to the “Methods”.
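      The idea of testing whether DEGs lie closer to LADs than expected at random can be sketched as a simple permutation test. The code below is not GenometriCorr and does not reproduce its statistics. It assumes a single chromosome, LADs given as closed intervals, DEGs reduced to single positions, and a uniform null placement. All coordinates and function names are hypothetical.

```python
import random


def distance_to_nearest(position, intervals):
    """Distance from a position to the closest interval (0 if inside one)."""
    best = None
    for start, end in intervals:
        if start <= position <= end:
            return 0
        gap = start - position if position < start else position - end
        if best is None or gap < best:
            best = gap
    return best


def proximity_p_value(positions, intervals, chrom_length, n_perm=1000, seed=0):
    """One-sided empirical p-value that positions are closer to intervals than random.

    Returns (observed mean distance, p-value). The null distribution is built
    by placing the same number of positions uniformly along the chromosome.
    """
    rng = random.Random(seed)

    def mean_distance(points):
        return sum(distance_to_nearest(p, intervals) for p in points) / len(points)

    observed = mean_distance(positions)
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.randrange(chrom_length) for _ in positions]
        if mean_distance(shuffled) <= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical coordinates on a 1 Mb toy chromosome.
lads = [(1000, 2000)]
degs_inside = [1100, 1500, 1900]
observed, p_value = proximity_p_value(degs_inside, lads, 1_000_000, n_perm=200)
print("mean distance to nearest LAD:", observed, "empirical p:", p_value)
```

      A uniform null ignores gene density and chromosome structure, so a real analysis would need a more careful background model than this sketch provides.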

      Overall appraisal and context:

      Despite its limitations, the present study further illustrates the important roles the nuclear envelope proteins lamin A, lamin C, and nesprin-2 have in chromatin organization, dynamics, and gene expression. It thus confirms results from previous studies (not always fully acknowledged in the current manuscript) previously reported for lamin A/C depletion. For example, the effect of lamin A/C depletion on increasing mobility of chromatin had already been demonstrated by several other groups, such as Bronshtein et al. Nature Comm 2015 (PMID: 26299252) and Ranade et al. BMC Mol Cel Biol 2019 (PMID: 31117946). Additionally, the effect of lamin A/C depletion on gene and protein expression has already been extensively studied in a variety of other cell lines and model systems, including detailed proteomic studies (PMIDs 23990565 and 35896617).

      We have added further discussion, as follows: “Our findings reinforce the pivotal roles of nuclear envelope proteins lamin A, LMNA and nesprin-2 in regulating chromatin organization, chromatin mobility, and gene expression. These results are consistent with and extend prior studies investigating the consequences of lamin depletion. For instance, increased chromatin mobility following the loss of lamin A/C has been previously demonstrated using live-cell imaging approaches [26,35], supporting our observations of nuclear structural relaxation and chromatin redistribution. Additionally, proteomic profiling following lamin A depletion has been extensively documented across both cellular and mouse models, providing valuable insights into the molecular consequences of nuclear envelope disruption [36,37]. While these earlier studies provide a strong foundation, our work contributes novel insights by integrating isoform-specific perturbations with spatial chromatin measurements. This approach emphasizes context-dependent regulatory mechanisms that involve not only lamina-associated regions but also nesprin-associated domains and distal genomic loci, thereby expanding the current understanding of nuclear envelope protein function in gene regulation.”

      The finding that lamin A/C or nesprin depletion not only affects genes at the nuclear periphery but also the nuclear interior is not particularly surprising given the previous studies and the fact that lamins A and C are also found within the nuclear interior, where they affect chromatin organization and dynamics, and that lamins A/C and nesprins directly interact with numerous transcriptional regulators that could further affect gene expression independent from their role in chromatin organization.

      We have added the following statement to the Discussion: “Beyond their well-established role in tethering heterochromatin at the nuclear periphery through lamina-associated domains (LADs), A-type lamins (lamins A and C) also localize to the nuclear interior, where they contribute to chromatin organization and gene regulation independently of LADs [27,28]. Nuclear lamins can form intranuclear foci that associate with active chromatin and are implicated in supporting transcriptional activity. Additionally, both lamins and nesprins participate in diverse protein-protein interactions that may influence transcriptional regulation. For example, lamin A/C interacts with the retinoblastoma protein (Rb) to modulate E2F-dependent transcription [29], and with c-Fos to regulate its nuclear retention and activity [30]. In addition, β-catenin, which acts as a co-activator in Wnt signaling, relies on nuclear translocation and interaction with transcriptional complexes, and evidence suggests that nuclear architecture and envelope components, including nesprins, can influence this process [31]. Therefore, the observed gene expression changes following depletion of lamins or nesprins are likely not restricted to genes located within lamina-associated domains (LADs), but may also result from broader perturbations in nuclear architecture and transcriptional regulatory networks. This is consistent with our findings that lamins and nesprins influence gene expression in distal, non-LAD regions.”

      The authors provide a detailed analysis of isoform switching in response to lamin A/C or nesprin depletion, but the underlying mechanism remains unclear. Similarly, their analysis of the genomic location of the observed DEGs shows the wide-ranging effects of lamin A/C or nesprin depletion, but leaves the reader wondering how these effects are mediated. A more in-depth analysis of predicted regulatory factors and their potential interaction with lamins A/C or nesprin would be beneficial in gaining more mechanistic insights.

      We agree that the current findings, while highlighting the broad impact of lamin A/C or nesprin depletion on isoform usage and gene expression, do not fully elucidate the underlying regulatory mechanisms. We acknowledge the importance of identifying upstream regulators and understanding their potential interactions with lamins and nesprins. Future investigations integrating epigenetic approaches, such as ChIP-seq for transcription factors and chromatin-associated proteins, will be essential to clarify how lamins and nesprins contribute to isoform switching and to uncover the mechanistic basis of these regulatory effects.

      Reviewer #3 (Public review):

      Summary:

      This manuscript describes DOX inducible RNAi KD of Lamin A, LMNA coded isoforms as a group, and the LINC component SYNE2. The authors report on differentially expressed genes, on differentially expressed isoforms, on the large numbers of differentially expressed genes that are in iLADs rather than LADs, and on telomere mobility changes induced by 2 of the 3 knockdowns.

      Strengths:

      Overall, the manuscript might be useful as a description for reference data sets that could be of value to the community.

      We acknowledge that the initial version of our manuscript lacked comprehensive comparisons with previous studies. In our revised manuscript, we have included more detailed discussions highlighting how our findings complement and extend existing knowledge. Specifically, our study presents novel insights into the role of lamins and nesprins in regulating non-coding RNAs and isoform switching, areas that have not been extensively explored in the prior literature. We hope these additions will clarify the contribution of our work and demonstrate the potential value to the field.

      Weaknesses:

      The results are presented as a type of data description without formulation of models or explanations of the questions being asked and without follow-up. Thus, conceptually, the manuscript doesn't appear to break new ground.

      In our study, we proposed a conceptual model in which gene expression changes are linked to RNA synthesis, chromatin conformation alterations, and chromatin modifications, potentially mediated by lamin A, LMNA, and nesprin-2 at the transcriptional level. However, we acknowledge that this model remains preliminary and largely unexplored. We agree that additional mechanistic insights and identification of specific regulatory factors are needed to strengthen this framework. Future studies will aim to experimentally validate these hypotheses and clarify the pathways and regulators involved.

      Not discussed is the previous extensive work by others on the nucleoplasmic forms of LMNA isoforms. Also not discussed are similar experiments- for instance, gene expression changes others have seen after lamin A knockdowns or knockouts, or the effect of lamina on chromatin mobility, including telomere mobility - see, for example, a review by Roland Foisner (doi.org/10.1242/jcs.203430) on nucleoplasmic lamina. The authors need to do a thorough search of the literature and compare their results as much as possible with previous work.

      We sincerely thank the reviewer for pointing out the important body of previous work on the nucleoplasmic forms of LMNA isoforms and the impact of lamin A depletion on gene expression and chromatin mobility. In the revised version, we have now included relevant citations. Please see the highlights in the Discussion.

      The authors don't seem to make any attempt to explore the correlation of their findings with any of the previous data or correlate their observed differential gene expression with other epigenetic and chromatin features. There is no attempt to explore the direction of changes in gene expression with changes in nuclear positioning or to ask whether the genes affected are those that interact with nucleoplasmic pools of LMNA isoforms. The authors speculate that the DEG might be related to changing mechanical properties of the cells, but do not develop that further.

      We sincerely appreciate the reviewer’s insightful comments. In our revised manuscript, we have addressed this concern by comparing our telomere mobility results with previously published data (Bronshtein et al., 2015), and we observe consistent findings showing that lamin A depletion leads to increased telomere motility. Furthermore, our study provides novel evidence that nesprin-2 depletion similarly enhances telomere migration, suggesting a broader role for nuclear envelope components in chromatin dynamics.

      We acknowledge the importance of integrating gene expression data with epigenetic and chromatin features. However, to our knowledge, such datasets are currently limited for U2OS cells, particularly in the context of lamin and nesprin perturbation. We agree that understanding the correlation between differentially expressed genes and nuclear positioning or interactions with nucleoplasmic pools of LMNA isoforms is a promising direction. We are actively planning future studies that include chromatin profiling and mechanical perturbation assays to further explore these mechanisms.

      The technical concerns include: 1) Use of only one shRNA per target. Use of additional shRNAs would have reduced concern about possible off-target knockdown of other genes; 2) Use of only one cell clone per inducible shRNA construct. Here, the concern is that some of the observed changes with shRNA KDs might show clonal effects, particularly given that the cell line used is aneuploid. 3) Use of a single, "scrambled" control shRNA rather than a true scrambled shRNA for each target shRNA.

      (1) Regarding the use of a single shRNA per target, we agree that utilizing multiple independent shRNAs would strengthen the conclusions. In our study, we selected validated shRNA sequences with minimal predicted off-targets and confirmed knockdown efficiency at the mRNA level (by qPCR).

      (2) As for the use of a single cell clone per inducible construct, we understand the concern that clonal variability, particularly in an aneuploid cell line, could influence the observed phenotypes. To clarify this, we have revised the manuscript to state: “Multiple independent clones per shRNA were screened for knockdown efficiency using reverse transcription quantitative real-time PCR (RT-qPCR). Three clones demonstrating robust and consistent knockdown were selected and expanded. These clones were subsequently pooled to minimize clonal variability and used for downstream analyses, including RNA-seq.” To further mitigate clonal effects, we ensured consistent results across biological replicates and used inducible systems to reduce variability introduced by random integration.

      (3) We also acknowledge that the use of a single scrambled shRNA control, rather than matched scrambled controls for each construct, is a limitation. While we used a standard non-targeting scrambled shRNA commonly applied in similar studies, we understand that distinct scrambled sequences might better control for construct-specific effects.

      Reviewer #1 (Recommendations for the authors):

      Please make the processed RNA-seq data available for each individual experiment, not only the raw reads and averaged data.

      In response to your suggestion, we have now included the raw count data for each individual experiment in Supplementary Table S5 to enhance transparency and reproducibility.   

      Reviewer #2 (Recommendations for the authors):

      The current text contains numerous typos, and some of the text could benefit from additional editing for clarity and conciseness. In addition, several statements, particularly in the section encompassing lines 321-329, lack supporting references.

      In our revised version, we have carefully edited the text for clarity and conciseness.

      We have included related citations for lines 321-329: “The majority of genes located within LADs tend to be transcriptionally repressed or expressed at low levels. This is because LADs are associated with heterochromatin, a tightly packed form of DNA that is generally inaccessible to the cellular machinery required for gene expression 12,23. Lamin mutations and altered lamin levels have been shown to disrupt LAD organization and gene expression, and have been implicated in various diseases, including cancer and laminopathies 24,25.”

      The figures would benefit from better labeling, including a clear schematic of which specific regions of the LMNA and SYNE2 genes are targeted by the different shRNA constructs, and by labeling the different isoforms in Figure S1 with the common names. Furthermore, note that lamin A arises from posttranslational processing of prelamin A, not from a different transcript. Likely, the "different LMNA genes" shown in Supplementary Figure S1 are just different annotations, with the exceptions of the splice isoforms lamin C and lamin delta10.

      In the Methods, we have clearly described the design of the corresponding shRNAs as suggested: “The shRNA designated as shLMNA targets a region within the coding sequence of LMNA that is shared by both lamin A and lamin C, corresponding to amino acids 122–129 (KKEGDLIA) of lamin A/C (RefSeq: NM_001406985.1). The shRNA against SYNE2 (shSYNE2) targets a sequence encoding amino acids 5133–5140 (KRYERTEF) of the SYNE2 protein (RefSeq: NM_182914.3).”

      For Figure S1, we have added common isoform names to the figure and caption: “lamin A (ENST00000368300.9), LMNA 227 (ENST00000675431.1), pre-lamin A/C (ENST00000676385.2), and lamin C (ENST00000677389.1).”

      Several statements about the novelty of the findings or approach are inaccurate. For example, the authors state in the introduction that "However, whether lamins and nesprins actively govern chromatin remodeling and isoform switching beyond their well-characterized functions in mechanotransduction remains an open question", as several previous studies have provided detailed characterization of lamin A/C depletion or mutations on chromatin organization, mobility, and gene expression. The authors should revise these statements and better acknowledge the previous work.

      We have added the citations of previous works and revised the text “While significant progress has been made in understanding the role of lamins in genome organization, the precise mechanisms by which lamins and nesprins regulate gene expression through distal chromatin interactions remain incompletely understood [10,11]. Notably, recent evidence suggests a reciprocal interplay between transcription and chromatin conformation, where gene activity can influence chromatin folding and vice versa [12]. However, whether lamins and nesprins actively govern chromatin remodeling and isoform switching beyond their well-characterized functions in mechanotransduction remains an open question.”

      Reviewer #3 (Recommendations for the authors):

      Overall, the manuscript might be useful as a description for reference data sets that could be of value to the community. Otherwise, I did not derive meaningful biological insights from the manuscript. It was not clear to me also how much might be repeating previous work already reported in the literature (see below). For example, I cited a review on nucleoplasmic lamins by Roland Foisner at the end of the specific comments - scanning it very quickly shows that there are already papers on increased chromatin mobility after lamin perturbations, including telomeres. I know there have also been studies of changes in gene expression after lamin A and B KD. The authors need to do a thorough search of the literature and compare their results as much as possible with previous work.

      We acknowledge that the roles of lamins in regulating chromatin dynamics and gene expression, including the effects of lamin perturbations on chromatin mobility and telomere behavior, have been previously reported. In response, we have revised the manuscript to incorporate relevant citations and to better contextualize our results within the existing literature. Importantly, to our knowledge, the finding that nesprin-2 influences telomere mobility has not been previously reported, and we have highlighted this novel observation in the revised text.

      In response, we have now conducted a more comprehensive literature review and revised the manuscript accordingly to better contextualize our findings. Specifically, we have added comparisons to prior studies reporting chromatin mobility changes following lamin A/C depletion. We also now emphasize the novel aspects of our study, such as the isoform-specific perturbations and the integration of spatial chromatin organization with transcriptomic outcomes.

      We hope these revisions strengthen the manuscript’s contribution as both a useful resource and a mechanistic investigation.

      Not even acknowledged is the previous extensive work on the nucleoplasmic forms of LMNA isoforms - I know Robert Goldman published extensively on this, implicating lamin A, for example, on DNA replication in the nuclear interior as well as transcription. More recently, Roland Foisner worked on this, including with molecular approaches. For example, a 2017 review mentions previous ChIP-seq mapping of lamin A binding to iLAD genes and also describes previous work on chromatin mobility, including telomere mobility. Yet the entire writing in the manuscript seems to only discuss the role of LMNA isoforms in the nuclear lamina per se, explaining the surprise in seeing many iLAD genes differentially expressed after KD.

      We have added related studies as suggested by the reviewer and added the following statement: “Nucleoplasmic lamins bind to chromatin and have been indicated to regulate chromatin accessibility and spatial chromatin organization [24]. Lamins in the nuclear interior regulate gene expression by dynamically binding to heterochromatic and euchromatic regions, influencing epigenetic pathways and chromatin accessibility. They also contribute to chromatin organization and may mediate mechanosignaling [25]. However, the contribution of nesprins and lamins to isoform switching and chromatin dynamics has not been fully understood [7,10,26].”

      Overall, I found a surprising lack of review and citation of previous work (see Specific comments below), including the lack of citations for various declarative statements about previous conclusions in the field about lamin A.

      (1) Introduction:

      "However, the contribution of nesprins and lamins to gene expression has not been fully understood."

      There is a literature about changes in gene expression- at least for lamin KD and KO- both in vitro and in vivo- that the authors could and should review and summarize here.

      To address this, we have now revised the manuscript to include a more comprehensive discussion of the relevant literature and added appropriate citations in the corresponding section. We hope this addition provides better context for our current findings and clarifies the contribution of lamins and nesprins to gene regulation.

      (2) Results:

      "A fragment of shRNA that targeting 3' untranslated region (UTR) in LMNA genes was chosen to knockdown lamin A (shLaminA). A fragment of shRNA that targeting coding sequence (CDS) region in LMNA genes was chosen to knockdown LMNA (shLMNA)". The authors should explain more - does one KD both lamin A and C (shLMNA), versus the other being specific to lamin A but not lamin C? It appears so from later text, but the authors should explicitly explain their targeting strategy right at the beginning to make this clear.

      To make the method clearer, we have added the text “The shRNA against lamin A (shLaminA) targets the 3′ untranslated region (UTR) of the LMNA gene, specific to prelamin A, which is post-translationally processed into mature lamin A. The shRNA designated as shLMNA targets a region within the coding sequence of LMNA that is shared by both lamin A and lamin C, corresponding to amino acids 122–129 (KKEGDLIA) of lamin A/C (RefSeq: NM_001406985.1). The shRNA against SYNE2 (shSYNE2) targets a sequence encoding amino acids 5133–5140 (KRYERTEF) of the SYNE2 protein (RefSeq: NM_182914.3).”

      But more importantly, the convention with RNAi is to demonstrate consistent results with at least two different small RNAs. This is to rule out that a physiological result is due to the KD of a non-target gene(s) rather than the target gene. The scrambled shRNA controls are not sufficient for this as they test a general effect of the shRNA culture conditions, including transfection and dox treatment, etc, rather than a specific KD of a different gene(s) than the target due to off-target RNAi.

      We fully acknowledge the concern regarding the use of only a single shRNA per knockdown and agree that shRNAs are prone to off-target effects. We have confirmed key RNA-seq findings by qPCR, which supports the validity of the observed expression changes, although it does not by itself exclude off-target effects. We therefore recognize the importance of validating our findings using multiple independent shRNAs or alternative knockdown strategies, such as CRISPR deletion or degron-based approaches. To address this rigorously, we are currently optimizing an auxin-inducible degron system (AtAFB2) for targeted depletion of lamin C. Our preliminary data indicate approximately 40% knockdown efficiency after 16 hours of auxin induction, so further optimization is ongoing (Author response image 1). Future experiments will integrate this improved degron system and multiple independent shRNAs to further substantiate our results and rule out potential off-target effects, thereby enhancing the robustness and reproducibility of our data.

      (3) "Single-cell clones were subsequently isolated and expanded in the presence of 2 μg ml⁻¹ puromycin to establish doxycycline-inducible shRNA-knockdown stable cell lines."

      The authors need to describe explicitly in the Results how exactly they did these experiments. Did they do their analysis using a single clone from each lentivirus shRNA transduction? Did they do analysis - ie RNA-seq- on several clones from the same shRNA transduction and compare? Did they pool clones together?

      In our study, single-cell clones were isolated following lentiviral transduction with doxycycline-inducible shRNA constructs and selection with 2 μg/ml puromycin. For each shRNA, we screened multiple clones for knockdown efficiency and selected three clones exhibiting robust and consistent knockdown. These three clones were pooled, and all downstream analyses, including RNA-seq, were performed on the pooled clones. We have now revised the Methods section to explicitly describe this experimental design: “Multiple independent clones per shRNA were screened for knockdown efficiency using reverse transcription quantitative real-time PCR (RT-qPCR). Three clones demonstrating robust and consistent knockdown were selected and expanded. These clones were subsequently pooled to minimize clonal variability and used for downstream analyses, including RNA-seq.”

      One confounding problem is that there are clonal differences among cells cloned from a single cell line. This is particularly true for aneuploid cell lines like U2OS. Ideally, they would use mixed clones, but if not, they should at least explain what they did.

      We added the following text to the Methods: “Three single-cell clones exhibiting robust knockdown efficiency were individually expanded and subsequently pooled. The pooled clones were maintained in medium containing 2 µg ml⁻¹ puromycin to establish stable cell lines with doxycycline-inducible shRNA expression. Multiple independent clones per shRNA were screened for knockdown efficiency using reverse transcription quantitative real-time PCR (RT-qPCR). Three clones demonstrating robust and consistent knockdown were selected and expanded. These clones were subsequently pooled to minimize clonal variability and used for downstream analyses, including RNA-seq.”

      (4) I am confused by their shScramble control. This is typically done for each shRNA- ie, a separate scrambled control for each of the different target shRNAs. This is because there are nucleotide composition effects, so the scrambled idea is to keep the nucleotide composition the same.

      However, looking at STable 1 and SFig. 2- shows they used a single scrambled control, thus not controlling for different nucleotide composition among the three shRNAs that they used.

      In our study, we used a single non-targeting shRNA (shScramble) as a control to account for potential effects of the shRNA vector and delivery system. This approach is commonly accepted in the field when the scrambled sequence is validated as non-targeting and does not share significant homology with the genes of interest. While we acknowledge that using separate scrambled controls matched in nucleotide composition for each targeting shRNA can further minimize sequence-dependent effects, we believe that the use of a single validated scramble control is appropriate for the scope of this study.

      (5) In Figure 2 - what is on the x-axis? Number of DEG? Please state this explicitly in the figure legend.

      We have added “Counts” as figure legend, and added the caption “Gene counts are displayed on the x-axis.”

      (6) More importantly, in Figure 2 they only show pathway analysis of DEG. They should show more: a) Fold-change of DEG displayed for all DEG; b) Same for genes in LADs vs iLADs. More explicitly, are the DEG primarily in LADs or iLADs, or a mix? Are the DEGs in LADs biased towards increased expression, as might be expected for LAD derepression? Conversely, what about iLADs - is there a bias towards increased or decreased expression?

      We agree that a more detailed characterization of the differentially expressed genes (DEGs) will strengthen the conclusions. In response, we have revised the manuscript as follows: “Furthermore, differential expression analysis revealed that the majority of DEGs following depletion of lamins and nesprins were located outside lamina-associated domains (non-LADs). Specifically, for shLaminA knockdown, 8 DEGs within LADs were downregulated and 8 were upregulated, whereas 59 non-LAD DEGs were downregulated and 79 were upregulated. For shLMNA, 7 LAD-associated DEGs were downregulated and 15 were upregulated, with 88 downregulated and 140 upregulated DEGs in non-LAD regions. In the case of shSYNE2 knockdown, 161 LAD DEGs were downregulated and 108 were upregulated, while 2,009 non-LAD DEGs were downregulated and 1,851 were upregulated (Figure 2d). These results indicate that the transcriptional changes resulting from the loss of lamins or nesprins predominantly occur at non-LAD genomic regions.”

      We appreciate the reviewer’s comments, which helped improve the clarity and depth of our analysis.
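
      The counts quoted in this response can be tallied directly to check how strongly the DEGs skew towards non-LAD regions and whether LAD-resident DEGs lean towards upregulation. The following is a minimal sketch of that arithmetic only; the dictionary layout and function name are illustrative assumptions, and the numbers are copied from the quoted text rather than recomputed from the data.

```python
# DEG counts as quoted in the response: (downregulated, upregulated).
# The structure and names below are illustrative, not the authors' code.
DEG_COUNTS = {
    "shLaminA": {"LAD": (8, 8), "non-LAD": (59, 79)},
    "shLMNA": {"LAD": (7, 15), "non-LAD": (88, 140)},
    "shSYNE2": {"LAD": (161, 108), "non-LAD": (2009, 1851)},
}


def summarize(counts):
    """Return total DEGs, the non-LAD share, and the upregulated share per region."""
    lad_down, lad_up = counts["LAD"]
    non_down, non_up = counts["non-LAD"]
    lad_total = lad_down + lad_up
    non_total = non_down + non_up
    return {
        "total": lad_total + non_total,
        "non_lad_fraction": non_total / (lad_total + non_total),
        "lad_up_fraction": lad_up / lad_total,
        "non_lad_up_fraction": non_up / non_total,
    }


if __name__ == "__main__":
    for name, counts in DEG_COUNTS.items():
        s = summarize(counts)
        print(
            f"{name}: total={s['total']}, "
            f"non-LAD share={s['non_lad_fraction']:.1%}, "
            f"LAD up={s['lad_up_fraction']:.1%}, "
            f"non-LAD up={s['non_lad_up_fraction']:.1%}"
        )
```

      On the quoted counts, roughly 90% or more of the DEGs in each knockdown fall outside LADs, which is the basis for the statement that changes occur predominantly at non-LAD regions.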

      (7) Is there a scientific rationale for the authors' focus on DE of isoforms? Is this somehow biologically meaningful and different from the overall DE of all genes? The authors should explain in the Results section what their motivation was in deciding to do this analysis.

      We have added the following statement in response to the reviewer: “To uncover transcript-specific regulatory changes, we performed isoform-level differential expression analysis. Many genes produce functionally distinct isoforms, and shifts in their usage can occur without changes in total gene expression, making isoform-level analysis essential for detecting subtle but meaningful transcriptional regulation. Our analysis demonstrated that depletion of lamins and nesprins induced significant alterations in specific transcript isoforms, indicating regulatory changes in alternative splicing or transcription initiation that are not captured by gene-level differential expression analysis.”
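
      The point that isoform usage can shift while total gene expression stays constant can be illustrated with a toy calculation of isoform fractions. This is a hedged sketch, not the authors' pipeline: the function names are hypothetical, and the expression values are invented purely for illustration.

```python
def isoform_fractions(isoform_expr):
    """Fraction of a gene's total expression contributed by each isoform."""
    total = sum(isoform_expr.values())
    if total == 0:
        return {iso: 0.0 for iso in isoform_expr}
    return {iso: value / total for iso, value in isoform_expr.items()}


def isoform_usage_change(control, knockdown):
    """Difference in isoform fraction (knockdown minus control) per isoform."""
    f_control = isoform_fractions(control)
    f_knockdown = isoform_fractions(knockdown)
    isoforms = sorted(set(f_control) | set(f_knockdown))
    return {
        iso: f_knockdown.get(iso, 0.0) - f_control.get(iso, 0.0)
        for iso in isoforms
    }


if __name__ == "__main__":
    # Hypothetical expression values for one gene with two isoforms.
    control = {"isoform_1": 80.0, "isoform_2": 20.0}
    knockdown = {"isoform_1": 30.0, "isoform_2": 70.0}

    # Gene-level totals are identical, so gene-level DE would see no change.
    print(sum(control.values()), sum(knockdown.values()))
    # Isoform usage, however, shifts by half of the gene's output.
    print(isoform_usage_change(control, knockdown))
```

      In this toy case the gene-level total is unchanged while the dominant isoform switches, which is the kind of event a gene-level test would miss.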

      (8) "Expectedly, the DEGs from depletion of lamin A, LMNA, and SYNE2 seldom intersected with genes in LADs (Figure 4a)."

      Why was this expected? The authors have only cited one review paper. Others have seen significant numbers of genes in LADs that are DE after KD of lamina proteins. What was the fold cutoff used for DE? Was there a cutoff for the level of expression prior to KD? The authors should cite relevant primary literature showing that there are active genes in LADs and that some perturbations of the lamina proteins do result in DE of genes in LADs.

      We acknowledge the reviewer's concerns regarding our statement: "Expectedly, the DEGs from depletion of lamin A, LMNA, and SYNE2 seldom intersected with genes in LADs (Figure 4a)." To clarify, this expectation stems from previous observations that LAD-associated genes are typically transcriptionally silent or expressed at very low levels (Guelen et al., 2008). However, dynamic changes in LADs and gene expression status do occur during cellular differentiation (Peric-Hupkes et al., 2010), and some LAD-resident genes can become active and transcriptionally responsive under specific conditions, such as T cell activation. We applied specific fold-change and baseline expression level thresholds in our analysis, as detailed in the Methods section. We added the following text to the Methods: “Differential gene expression analysis was performed using thresholds of baseMean > 50, absolute log fold change > 0.5, and p-value < 0.05.” We agree that additional relevant primary literature demonstrating active gene expression changes within LADs upon perturbation of lamina proteins should be cited, and we have added the following statement:

      “LADs exhibit dynamic reorganization and changes in gene expression during cellular differentiation [30]. Although genes within LADs are generally transcriptionally silent or expressed at low levels [31], some LAD-resident genes remain active and can be transcriptionally modulated in response to specific stimuli, such as T cell activation [32].”

      (9) "Expectedly, the DEGs from depletion of lamin A, LMNA, and SYNE2 were seldomly intersected with genes in LADs (Figure 4a)." I disagree with the wording of "seldom" which by definition means rarely. I don't see that this applies to the significant number of genes that are in LADs that are DE as shown in the Venn diagram, Fig. 4a. For example, this includes 57 genes for the shLamin A and ~400 genes for the shSYNE2.

      Is there anything of note about which genes are DE within LADs?

      We have rephrased the text as follows: “The Venn diagram analysis revealed limited overlap between DEGs resulting from knockdown of lamin A (shLaminA), LMNA (shLMNA), or SYNE2 (shSYNE2) and genes located within lamina-associated domains (LADs). Specifically, only a small subset of DEGs intersected with LAD-associated genes across all three knockdowns, suggesting that the majority of transcriptional changes occur outside LAD regions.” The DEGs in LADs and non-LADs are listed in Supplementary Table S4.

      (10) "The relative distance from DE genes (query features) to LADs (reference feature) is plotted by GenometriCorr package (v 1.1.24). The color depicting deviation from the expected distribution and the line indicating the density of the data at relative distance are shown." The authors should explicitly describe what the reference "expected distribution" was based on. This is all very cryptic right now, so we can't assess the possible biological significance. Third, they should clearly explain what is plotted on the x and y axes of Figure 4C. I really don't have a clue. I assume the x-axis is some measure of "relative distance" but what on earth does that mean? I really don't understand this plot, which is crucial to the whole story. What is on the y-axis? Density of DEGs? What? And they need to explain not only what is plotted on the x and y axes but also provide units.

      We have revised the text to clarify that the GenometriCorr analysis (v1.1.24) was used to assess the spatial association between differentially expressed genes (DEGs, query features) and lamina-associated domains (LADs, reference features). Specifically, this method evaluates whether the observed distances between query and reference genomic intervals significantly deviate from a null distribution generated by random permutation of query features across the genome, while preserving size and chromosomal context.

      In the revised figure legend and main text, we now clarify that the x-axis represents the relative genomic distance between each differentially expressed gene (DEG) and the nearest LAD, scaled between –1 and 1, where values near 0 indicate close proximity, and values approaching –1 or 1 reflect greater distances on either side of the LADs. The y-axis denotes the density (or proportion) of query features (DEGs) at each relative distance bin. The color gradient overlays the plot to indicate deviation from the expected null distribution (based on randomized query positions): red indicates enrichment (closer than expected), while blue indicates depletion (further than expected).

      “GenometriCorr analysis (v1.1.24) was used to assess the spatial relationship between DEGs (query) and LADs (reference) [48]. The x-axis shows the relative genomic distance between each DEG and the nearest LAD, scaled from –1 (far upstream) to 1 (far downstream), with 0 indicating closest proximity. The y-axis represents the density of DEGs at each distance bin. A color gradient indicates deviation from a randomized null distribution: red signifies enrichment (closer than expected), and blue signifies depletion. Statistical significance was determined using the Jaccard test (p < 0.05).”
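
      The idea of comparing observed DEG-to-LAD distances against a randomized null can be sketched in a few lines. The example below is an illustration of the general permutation logic only, under stated assumptions: it is not the GenometriCorr implementation, it treats each DEG as a single position on one chromosome, it draws null positions uniformly at random, and all names and coordinates are hypothetical.

```python
import bisect
import random


def distance_to_nearest(pos, intervals):
    """Distance in bp from pos to the nearest interval (0 if pos lies inside one).

    intervals must be a non-empty, sorted list of non-overlapping (start, end) tuples.
    """
    starts = [start for start, _ in intervals]
    i = bisect.bisect_right(starts, pos)
    candidates = []
    if i > 0:
        _, end = intervals[i - 1]
        candidates.append(0 if pos <= end else pos - end)
    if i < len(intervals):
        candidates.append(intervals[i][0] - pos)
    return min(candidates)


def mean_distance(positions, intervals):
    """Mean distance from a set of positions to their nearest intervals."""
    return sum(distance_to_nearest(p, intervals) for p in positions) / len(positions)


def permutation_p_value(query_positions, intervals, chrom_length, n_perm=1000, seed=0):
    """One-sided p-value that queries lie closer to the intervals than random positions."""
    rng = random.Random(seed)
    observed = mean_distance(query_positions, intervals)
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.randrange(chrom_length) for _ in query_positions]
        if mean_distance(shuffled, intervals) <= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    lads = [(100, 200), (500, 600)]        # hypothetical LAD coordinates
    degs = [150, 160, 170, 550, 560]       # hypothetical DEG positions
    print(mean_distance(degs, lads))
    print(permutation_p_value(degs, lads, chrom_length=10_000, n_perm=200))
```

      A real analysis would need to preserve feature sizes and chromosomal context during randomization, as described in the response; the uniform shuffle here is a deliberate simplification.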

      Second, to correlate with other features and to give more meaning, the authors should show the chromosome location of the DEGs and scale this by the actual DNA sequence distances. This will be needed to correlate with other features from other studies.

      The genomic positions of DEGs have now been displayed in Figure 4b, with distances shown in base pairs to facilitate cross-reference with other features in future studies.

      Third, they should attempt some kind of analysis themselves to try to understand what might correlate with the DEGs. To begin with, they might try to correlate with lamin A ChIP-seq or other molecular proximity assays. Others in fact have shown that lamin A interacts with 5' regulatory regions of a subset of genes- presumably this is the diffuse nucleoplasmic pool of lamin A that has been studied by others in the past.

      We agree that understanding potential regulatory mechanisms underlying DEG distribution is essential. In response, we have expanded our analysis (Figure 2d) to highlight that a substantial portion of DEGs are located outside of LADs, suggesting potential regulation by the nucleoplasmic pool of lamin A. This is consistent with previous studies showing lamin A interaction with regulatory elements such as 5′ UTRs and enhancers, independent of LAD localization. We have now cited relevant literature to support this hypothesis.

      Fourth, in the table, they should go beyond just giving the fold change in expression. Particularly for genes that are expressed at very low levels, this is not particularly meaningful as it is very sensitive to noise. They should provide a metric related to levels of expression both before and after the KD.

      We acknowledge the reviewer’s concern regarding fold-change interpretation for low-abundance transcripts. To improve clarity and interpretability, we have now included Supplementary Table S4, which provides the raw counts and baseMean values (average normalized expression across all samples) for all DEGs. Additionally, we note that in our differential expression analysis, genes with baseMean < 50 and absolute log<sub>2</sub> fold change > 0.5 were filtered out to reduce potential noise from low-expression genes.

      (11) The figure legend and description in the Results section were completely inadequate. I had little understanding of what was being plotted. It is not sufficient to simply state the name of some software package that they used to measure "XYZ" and to show the results. It has no meaning for the average reader.

      Without some type of explanation of rationale, questions being asked, and conclusions made of biological relevance, this section made zero impact on me.

      Yes, details can be provided in the Methods. But the conceptual underpinnings of the approach, the question being asked, and the rationale for the approach, along with the significance of the results, need to be developed in the Results section.

      In response, we have revised the “Results” section to better articulate the rationale behind the analysis, the specific biological questions we aimed to address, and the conceptual relevance of the method used. We have also clarified the meaning of the plotted data and how it supports our conclusions.

      While technical details remain in the “Methods” section, we now provide a more accessible narrative in the Results to guide the reader through the approach and highlight the biological significance of our findings. We hope these revisions make the section more informative and impactful.

      (12) The telomere movement part of the manuscript seems to come out of nowhere. Why telomeres? Where are telomeres normally positioned, particularly relative to the nuclear lamina? Does this change with the KDs - particularly for those that increase motion? The MSD for SYNE2 appears unconstrained- they should explore longer delta time periods to see if it reaches a point of constrained movement.

      If the telomeres are simply tethered at the nuclear lamina, then is that the explanation- that they become untethered? But if they are not typically at the periphery, then where are they relative to other nuclear compartments? And why is there mobility changing? Is it related to the loss of nuclear lamina positioning of adjacent LAD regions to the telomeres? Is it an indirect, secondary effect? What would they see after an acute KD? What about other chromosome regions? Again, there is little explanation for the rationale for these observations. It is one of many possible experiments they could have done. Why did they do this one?

      We added the following explanation: “Although telomeres are not uniformly tethered to the nuclear lamina, they can transiently associate with the nuclear periphery, particularly during post-mitotic nuclear reassembly, through interactions involving SUN1 and RAP1 36. Given that lamins and nesprins are key components of the nuclear envelope that regulate chromatin organization and mechanics 37,38, we examined telomere dynamics as a proxy for changes in nuclear architecture. Using EGFP-tagged dCas9 to label telomeric regions in live U2OS cells, we assessed whether knockdown of these proteins leads to increased telomere mobility, reflecting a loss of structural constraint or altered chromatin–nuclear envelope interactions 17.” and “To probe how nuclear envelope components regulate chromatin dynamics, we tracked telomeres as a representative genomic locus whose mobility reflects changes in nuclear mechanics and chromatin organization. Although telomeres are not stably tethered to the nuclear lamina, their motion can be influenced by nuclear architecture and transient peripheral associations [36]. Upon depletion of lamin A, LMNA, or SYNE2, we observed significantly increased telomere mobility and nuclear area explored, quantified by mean square displacement and net displacement (Figure 6b–c, Supplementary Movie S1). These changes likely reflect altered chromatin–lamina interactions or disrupted nuclear mechanical constraints, consistent with prior studies showing that lamins modulate chromatin dynamics and nuclear stiffness [37,38,39]. Thus, our findings support a role for lamins and nesprins in constraining chromatin motion through nuclear structural integrity.”
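
      For readers unfamiliar with the two mobility metrics named above, the following is a minimal sketch of how mean square displacement (MSD) and net displacement can be computed from a single tracked spot. It is an illustration only and not the authors' analysis pipeline: the function names, the two-dimensional `(x, y)` track format, and the toy track are all assumptions made for this example.

```python
import math


def mean_square_displacement(track, max_lag):
    """Time-averaged MSD for one track of (x, y) positions at equal time steps.

    Returns a list whose k-th entry (0-based) is the MSD at lag k + 1 frames.
    """
    if not 1 <= max_lag < len(track):
        raise ValueError("max_lag must be between 1 and len(track) - 1")
    msd = []
    for lag in range(1, max_lag + 1):
        # Squared distance between every pair of positions `lag` frames apart.
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


def net_displacement(track):
    """Straight-line distance between the first and last position of a track."""
    (x0, y0), (x1, y1) = track[0], track[-1]
    return math.hypot(x1 - x0, y1 - y0)


# Hypothetical toy track: a spot moving one unit per frame along x.
toy_track = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(mean_square_displacement(toy_track, 3))
print(net_displacement(toy_track))
```

      In this toy case the MSD keeps growing with lag, as expected for unconstrained directed motion. A constrained locus would instead show the MSD curve flattening at longer lags, which is the behaviour the reviewer asks about when suggesting longer delta time periods for SYNE2.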

      (13) "Notably, Lamin A depletion led to enrichment of pathways associated with RNA biosynthesis, supporting its previously suggested role in transcriptional activation and ribonucleotide metabolism."

      There is a literature on this. Say more and cite the references.

      Notably, lamin A depletion led to enrichment of pathways associated with RNA biosynthesis, supporting its previously suggested role in transcriptional activation and ribonucleotide metabolism 45.  

      (14) "This aligns with prior studies indicating that Lamin A contributes to chromatin accessibility and RNA polymerase activity." Again, there is a literature on this. Say more and cite the references.

      This aligns with prior studies indicating that lamin A contributes to chromatin accessibility and RNA polymerase activity 46. These findings further underscore the functional relevance of lamin A in coordinating transcriptional programs through modulation of nuclear architecture.

      (15) "In contrast, LMNA knockdown was linked to alterations in chromatin conformation." No. The authors show gene ontology and implicate perturbed RNA levels for genes implicated in "chromatin conformation". That is not the same thing as measuring chromatin conformation, which is not done, and showing changes in conformation.

      Based on the reviewer’s comment, we have revised the text as follows: “In contrast, LMNA knockdown led to differential expression of genes enriched in pathways related to chromatin organization, suggesting potential disruptions in chromatin regulatory networks. Although direct measurements of chromatin conformation were not performed, these transcriptional changes indicate that LMNA may contribute to maintaining nuclear architecture and genomic stability, which aligns with its established involvement in laminopathies and genome integrity disorders.”

      (16) "The findings that DEGs are predominantly located in non-LAD regions highlight a unique regulatory aspect of lamins and nesprins, emphasizing their spatial specificity in gene expression". Is this novel? Can the authors separate direct from indirect effects? Is the percentage of genes in LADs that are altered in expression different from the percentage of genes in iLADs that are altered in expression? There are many more active genes in iLADs, so one expects more DEGs in iLADs even if this is random. Also - how does this correlate with lamin A binding near 5' regulatory regions detected by ChIP-seq? See the following review for references to this question and also previous work on lamin A versus chromatin mobility, including telomeres. J Cell Sci (2017) 130 (13): 2087-2096. https://doi.org/10.1242/jcs.203430

      We appreciate the reviewer’s valuable comments and have revised the manuscript as follows to address them. “Furthermore, differential expression analysis revealed that the majority of DEGs following depletion of lamins and nesprins were located outside lamina-associated domains (non-LADs). Specifically, for shLaminA knockdown, 8 DEGs within LADs were downregulated and 8 were upregulated, whereas 59 non-LAD DEGs were downregulated and 79 were upregulated. For shLMNA, 7 LAD-associated DEGs were downregulated and 15 were upregulated, with 88 downregulated and 140 upregulated DEGs in non-LAD regions. In the case of shSYNE2 knockdown, 161 LAD DEGs were downregulated and 108 were upregulated, while 2,009 non-LAD DEGs were downregulated and 1,851 were upregulated (Figure 2d, Supplementary Table S4). These results indicate that the transcriptional changes resulting from the loss of lamins or nesprins predominantly occur at non-LAD genomic regions.

      The percentage of DEGs was consistently higher in non-LADs, which are gene-rich and transcriptionally active, whereas LADs, known to be enriched for silent or lowly expressed genes, showed fewer expression changes. These findings are consistent with previous studies demonstrating that active genes are more prevalent in non-LADs and that LAD-associated genes are generally repressed or less responsive to perturbation [27,28]. Together, these results support a model in which lamins and nesprins influence gene expression through both structural organization and promoter-proximal interactions, particularly within euchromatic nuclear regions [10,26,29].”
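
      The share of DEGs falling outside LADs can be tallied directly from the counts quoted above. The sketch below is illustrative only: the dictionary layout and function name are assumptions, and the numbers are copied from the quoted text as (downregulated, upregulated) pairs. It does not answer the reviewer's question about background rates, because that would also need the total number of expressed genes in LADs and non-LADs, which is not given here.

```python
# DEG counts per knockdown as (downregulated, upregulated), taken from the quoted text.
deg_counts = {
    "shLaminA": {"LAD": (8, 8), "non-LAD": (59, 79)},
    "shLMNA": {"LAD": (7, 15), "non-LAD": (88, 140)},
    "shSYNE2": {"LAD": (161, 108), "non-LAD": (2009, 1851)},
}


def non_lad_fraction(counts):
    """Fraction of all DEGs (up + down) that lie outside LADs."""
    lad_total = sum(counts["LAD"])
    non_lad_total = sum(counts["non-LAD"])
    return non_lad_total / (lad_total + non_lad_total)


for knockdown, counts in deg_counts.items():
    print(f"{knockdown}: {non_lad_fraction(counts):.1%} of DEGs are in non-LADs")
```

      With these counts, roughly nine in ten DEGs lie in non-LADs for each knockdown. Whether that exceeds what is expected by chance depends on how many active genes each compartment contains.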

    1. XG385, XG752, and XH855) that contained a βgeo fusion gene inserted into intron B of Slain1 (as illustrated Fig. 8A) were obtained from the Mutant Mouse Regional Resource Center (http://www.mmrrc.org/)

      DOI: 10.1016/j.ydbio.2006.01.023

      Resource: Mutant Mouse Regional Resource Center (RRID:SCR_002953)

      Curator: @AleksanderDrozdz

      SciCrunch record: RRID:SCR_002953


      What is this?

    1. There will also a be refugee flow going to the Netherlands from other parts of Europe, some of these people also require care, it is assumed they will flow in via the regular healthcare chain (via a general practitioner).

      rewrite

    Annotators

    1. eLife Assessment

      This important work represents an advance in our understanding of resident myeloid cells in the zebrafish brain, particularly as it provides a molecular definition of dendritic cell subtypes associated with their localization. Combined evidence from single cell transcriptomics and histology is compelling. The associated atlas will be used as a resource by the zebrafish community and beyond.

    2. Reviewer #1 (Public review):

      Using several zebrafish reporter lines, the authors characterized immune cells in the adult zebrafish brain, identifying a population of DC-like cells with distinct regional distribution and transcriptional profiles. These cells were distinct from microglia and other macrophages, closely resembling murine cDC1s. Analysis of different mutants revealed that this DC population depends on Irf8, Batf3 and Csf1rb, but not Csf1ra.

      This elegantly designed study provides compelling evidence for additional heterogeneity among brain mononuclear phagocytes in zebrafish, encompassing microglia, macrophages, and DC-like cells. It advances our understanding of the immune landscape in the zebrafish brain and facilitates better distinction of these cell types from microglia.

    3. Reviewer #2 (Public review):

      The authors made an atlas of single-cell transcriptomes from a pure population of leukocytes isolated from the brain of adult Tg(cd45:DsRed) transgenic animals by flow cytometry. Seven major leukocyte populations were identified, comprising microglia, macrophages, dendritic-like cells, T cells, natural killer cells, innate lymphoid-like cells and neutrophils. Each cluster was analyzed to characterize subclusters. Among lymphocytes, in addition to 2 subclusters expressing typical T cell markers, a group of il4+ il13+ gata3+ cells was identified as possible ILC2. This hypothesis is supported by the presence of this population in rag2KO fish, in which the frequency of lck and zap70+ cells is strongly reduced. The use of KO lines for such validations is a strength of this work (and the zebrafish model).

      The subcluster analysis of mpeg1.1+ myeloid cells identified 4 groups of microglial cells, one novel group of macrophage-like cells (expressing s100a10b, sftpbb, icn, fthl27, anxa5b, f13a1b and spi1b), and several groups of DC-like cells expressing the markers siglec15l, ccl19a.1, ccr7, id2a, xcr1a.1, batf3, flt3, chl1a and hepacam2. Combining these new markers and transgenic reporter fish lines, the authors then clarified the location of leukocyte subsets within the brain, showing for example that DC-like cells stand as a parenchymal population along with microglia. Reporter lines were also used to perform detailed analysis of cell subsets, and a cross with a batf3 mutant demonstrated that DC-like cells are batf3 dependent, which was similar to mouse and human cDC1. Finally, analysis of classical mononuclear phagocyte deficient zebrafish lines showed they have reduced numbers of microglia but exhibit distinct DC-like cell phenotypes. A weakness of this study is that it is mainly based on FACS sorting, which might modify the proportion of different subtypes.

      This atlas of zebrafish brain leukocytes is an important new resource to scientists using the zebrafish models for neurology, immunology and infectiology, and for those interested in the evolution of brain and immune system.

    4. Reviewer #3 (Public review):

      Rovira et al. aim to characterize immune cells in the brain parenchyma and identify a novel macrophage population referred to as "dendritic-like cells". They use a combination of single-cell transcriptomics, immunohistochemistry, and genetic mutants to conclude the presence of this "dendritic-like cell" population in the brain. The strength of this manuscript is the identification of dendritic cells in the brain, which are typically found in the meningeal layers and choroid plexus. In addition, the findings of Rovira et al. are supported by the findings of the Wen lab and a recent Cell Reports paper. Congratulations on the nice work!

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Weaknesses:

      While scRNA-seq data clearly revealed different subsets of microglia, macrophages, and DCs in the brain, it remains somewhat challenging to distinguish DC-like cells from P2ry12- macrophages by immunohistochemistry or flow cytometry.

      Indeed, in flow cytometry analyses of adult brain samples, the p2ry12<sup>-</sup>; mpeg1<sup>+</sup> fraction could, in theory, encompass not only DC-like cells but also other macrophage subsets, as well as B cells, since B cells have been reported to express mpeg1 in zebrafish (Ferrero et al., 2020; Moyse et al., 2020). Nevertheless, our data strongly indicate that within the brain parenchyma, DC-like cells represent the predominant component of this population. This conclusion is supported by the pronounced reduction of p2ry12<sup>-</sup>; mpeg1<sup>+</sup> cells in brain sections from batf3 mutants, in which DC development is impaired. Currently, further phenotypic resolution is constrained by the limited availability of zebrafish-specific antibodies and the restricted palette of fluorescent reporter lines capable of distinguishing MNP subsets. We anticipate that future efforts, including the generation of novel transgenic lines informed by our dataset (initiatives already underway in our group), will enable more precise discrimination among these distinct subsets.

      Reviewer #2 (Public Review):

      A weakness of this study is that it is mainly based on FACS sorting, which might modify the proportion of different subtypes.

      We agree that reliance solely on FACS could potentially introduce biases in the proportions of different subtypes. To minimize this concern, we complemented our flow cytometry data with quantification performed directly on brain sections using immunohistochemistry. This approach allowed us to validate cell population distributions in situ, thereby confirming that the trends observed by FACS accurately reflect the cellular composition of microglia and DC-like cells within the brain parenchyma.

      Reviewer #3 (Public Review):

      A weakness is the lack of specific reporters or labeling of this dendritic cell population using specific genes found in their single-cell dataset. Additionally, it is difficult to remove the meningeal layers from the brain samples and thus can lead to confounding conclusions. Overall, I believe this study should be accepted contingent on sufficient labeling of this population and addressing comments.

      While the generation of DC-like specific transgenic lines is indeed a promising direction (and such efforts are currently underway in our group), creating and validating these lines is time-consuming. Importantly, although these additional tools will be valuable for future functional investigations, we believe they would not impact the main conclusions or core message of our current work, where we already provide detailed spatial information on DC-like cells, and we demonstrated their lineage identity through the use of our newly generated batf3 mutant line. 

      Recommendations for the authors:

      Major Comments: 

      The authors should discuss another recent report demonstrating DCs in the zebrafish brain, which also developed independently of Csf1ra, and compare the two datasets (Zhou et al. Cell reports, 2023).

      Thank you for highlighting the study by Zhou et al., which offers complementary insight into the dendritic cell population in the zebrafish brain. We note that in this work, the authors reclassify ccl34b.1<sup>-</sup> mpeg1<sup>+</sup> brain-resident cells as conventional DCs, thus revising their earlier interpretation of these cells as microglia (Wu et al., 2020). This shift in interpretation is based on their transcriptional comparison between the previously characterized ccl34b.1<sup>-</sup> mpeg1<sup>+</sup> population and a new dataset of brain mpeg1<sup>+</sup> cells. This updated classification aligns closely with our findings. Given that our data already demonstrate the equivalence between the DC-like cells described in our study and the ccl34b.1<sup>-</sup> mpeg1<sup>+</sup> population, repeating a direct transcriptional comparison would be redundant. We have now included a discussion of this work in the revised manuscript. Specifically, we have added the following sentences in the discussion: “Importantly, since the submission of our manuscript, the Wen lab published an independent study in which they now reclassify the ccl34b.1<sup>-</sup> mpeg1<sup>+</sup> cells in the zebrafish brain as cDCs, revising their earlier interpretation of these cells as microglia (Zhou et al., 2023)”. 

      Data reported in Figure 5 should be quantified (cell numbers, how many brains analyzed). 

      Thank you for this comment. We would like to clarify that the primary purpose of Figure 5 (and Figure 5 supplement 1) is to provide an initial qualitative overview of the different MNP subsets present in the adult brain, using the currently available transgenic and immunohistochemical tools. These descriptive analyses were instrumental in identifying the most reliable combination, namely the Tg(p2ry12:p2ry12GFP; mpeg1.1:mCherry) double transgenic line in conjunction with L-plastin immunostaining, to distinguish microglia from other parenchymal MNPs. Quantitative analyses using this optimized strategy are presented in Figure 7 (Figure 7 supplement 1), where we systematically enumerate the different MNPs. We therefore believe that performing additional quantification in Figure 5 would be redundant with the more robust data already shown in Figure 7. As requested, we have now included in the Figure 5 legend that images are representative of brain tissue sections from 2-3 fish. 

      The title mentions an "atlas", but there is no searchable database or website associated with the paper. Please provide one.

      We agree and fully support the importance of data accessibility. To facilitate use of our dataset by the scientific community, we have developed a user-friendly, searchable web interface that allows users to explore gene expression patterns within our dataset. This website is available at https://scrna-analysis zebrafish.shinyapps.io/scatlas/

      This information has now been included in the “Data availability statement” section of the manuscript.  

      Reviewer #1 (Recommendations For The Authors): 

      Specific comments: 

      The authors should discuss another recent report demonstrating DCs in the zebrafish brain, which also developed independently of Csf1ra, and compare the two datasets (Zhou et al. Cell reports, 2023). 

      Thank you for this suggestion. Please refer to our response in the major comments section, where we address this point in detail.

      Within macrophages, the authors identified 5 clusters including 4 microglia clusters and 1 MF cluster (Figure 4). Does the latter relate to 'BAMs' and express markers previously described in murine BAMs, including Lyve1, CD206, etc.? Or to monocytes? By flow cytometry, monocytes were detected (Figure 1B), but not by scRNA-seq.  

      You have raised an important point here. As described in lines 197-202 (“results” section), the cells in the MF cluster exhibit a macrophage identity, based on their expression of classical macrophage markers such as marco, mfap4 or csf1ra. However, we were unable to confidently annotate this cluster more specifically. We also considered whether this population might resemble mammalian BAMs or monocytes, cell types that, to our knowledge, have not yet been clearly identified in zebrafish. However, orthologous markers typically associated with murine BAMs were not detected (lyve1) or not specifically enriched (mrc1a/mrc1b) in the MF cluster (see below). Based on these findings, we can only cautiously propose that this cluster may represent blood-derived macrophages and / or monocytes.

      To further address your suggestion, we performed a cell type enrichment analysis using the marker genes of the MF cluster, following the same strategy as for the microglia and DC-like clusters presented in Figure 4 supplement 2 C,D. This analysis revealed significant enrichment for “monocytes” and “macrophages”, further supporting a general monocytic/macrophage identity (see below). At present, further characterization of this cluster is limited by the lack of zebrafish-specific antibodies and the restricted palette of fluorescent reporter lines that distinguish among MNP subsets. We anticipate that future studies, including the development of new transgenic lines guided by our dataset, will allow for a more precise analysis of this distinct population. 

      Author response image 1.

      Do all 4 DC clusters identified by scRNA-seq represent cDC1s, or are there also cDC2s and cDC3s present?  

      In our analyses, the four dendritic cell clusters identified by scRNA-seq (DC1-DC4) exhibit transcriptional profiles consistent with a conventional type 1 dendritic cell (cDC1) identity. These clusters uniformly express hallmark cDC1-associated genes, while lacking expression of markers typically associated with mammalian cDC2 or plasmacytoid dendritic cells (pDCs). For instance, irf4, a key transcription factor required for cDC2 development, is not detected in our dataset. Similarly, we do not observe expression of genes characteristic of pDCs. 

      That said, the absence of cDC2 or pDC-like signatures in our dataset does not rule out the presence of these populations in zebrafish.  

      While they show that DC-like cells did not express Csf1rb (Figure 4D) or other macrophage/microglia genes, DC-like cells were affected in the Csf1rb mutants and in double mutants, demonstrating that their development depends on Csf1rb signaling, as known for macrophages but not DCs. Can the authors discuss this in more detail with regard to DC differentiation/precursors? 

      Thank you for pointing this out. As previously demonstrated, CSF1R signaling in zebrafish is more complex than in mammals, due to the presence of two paralogs, csf1ra and csf1rb, which exhibit partially non-overlapping functions (Ferrero et al., 2021). We and others have shown that csf1rb signaling is implicated in the regulation of definitive hematopoiesis, particularly in the regulation of hematopoietic stem cell (HSC)-derived myelopoiesis. Although the developmental origin of zebrafish brain DC-like cells remains uncharacterized, their reduced numbers in the csf1rb mutant, despite their lack of csf1rb expression, support the current model in which csf1rb acts at the progenitor level, promoting myeloid lineage commitment. According to this model, csf1rb disruption affects the differentiation of multiple myeloid subsets, which likely include DC-like cells. We have developed this point in the discussion section (lines 502-506).  

      Do the DCs express Csf1ra? 

      Csf1ra transcripts are not found in DCs in our dataset. As shown below, csf1ra expression is restricted to the microglia and macrophage clusters. These observations are in line with those made by Zhou et al., 2023.

      Author response image 2.

      Fig. 5, the number of brains analyzed should be added, and also quantifications of cell numbers included. It is mentioned (line 260) that P2ry12GFP+mpeg1mCherry+ microglia are abundant across brain regions while P2ry12GFP- mpeg1mCherry+ cells particularly localize in the ventral part of the posterior brain parenchyma. It would be nice if images of the different brain regions were provided. 

      Regarding the quantification, we refer to our response in the major comments section, where we explain that detailed quantification of microglia and other MNP subsets is provided in Figure 7, using a more refined strategy for distinguishing cell types.

      As requested, we have now included representative sections from the forebrain, midbrain and hindbrain of adult Tg(mhc2dab:GFP; cd45:DsRed) fish. These images illustrate the spatial distribution of DC-like cells across brain regions. Notably, DC-like cells are most abundant in the ventral areas of the midbrain and hindbrain, and are also present in the posterior telencephalon, particularly concentrated in the region of the commissura anterior. This regional annotation is based on the zebrafish brain atlas by Wullimann et al., 1996 (Neuroanatomy of the zebrafish brain, https://doi.org/10.1007/978-3-0348-8979-7).

      These additional images have been included in Figure 5 Supplement 1 (A-E).

      It is sometimes not evident whether the P2ry12- cells included DC-like cells and macrophages, which should be discussed. 

      Thank you for bringing this to our attention. Upon review, we agree this point required clearer explanation throughout the text, particularly beginning with the description of putative DC-like cells in Figure 5. We have now revised the manuscript to improve clarity and better guide readers through the phenotypic identification of DC-like cells using the Tg(p2ry12:p2ry12-GFP;mpeg1:mCherry) line. Specifically, we have modified the titles in the results section from page 5 to page 9, so that readers can more easily follow the step-by-step approach we used to distinguish DC-like cells from microglia. 

      To directly address your comment: the p2ry12<sup>-</sup>; mpeg1<sup>+</sup> fraction may, in theory, include not only DC-like cells but also other macrophage subsets and B cells, as B cells have been shown to express mpeg1 in zebrafish (Ferrero et al., 2020; Moyse et al., 2020). Nevertheless, our data strongly indicate that within the brain parenchyma, DC-like cells represent the predominant component of this population. This conclusion is supported by the pronounced reduction of p2ry12<sup>-</sup>; mpeg1<sup>+</sup> cells in brain sections from batf3 mutants, in which DC development is impaired. 

      We have revised the text accordingly to clarify this point in the results section of the manuscript (line 355).

      For example, the DC-like cell population in Figure 6C appears to include two populations of cells. Thus, it is unclear whether the sorted mhc2dab:GFP+;CD45:DsRedhi population for bulk-seq also contains the MF population identified in Fig. 2. 

      Thank you for this thoughtful observation. During the course of this study, we indeed considered how best to isolate non-microglial macrophages in order to specifically recover the MF population identified in our scRNA-seq analysis. However, with the current repertoire of fluorescent transgenic zebrafish lines, it remains technically challenging to selectively isolate non-microglial macrophages from the adult brain. As a result, the mhc2dab:GFP<sup>+</sup>; cd45:DsRed<sup>hi</sup> sorted population used for bulk RNA-seq may indeed include a mixture of DC-like and other mononuclear phagocytes, potentially the MF population. In contrast, our data demonstrate that the Tg(p2ry12:p2ry12-GFP) line provides a more selective tool for isolating microglia, minimizing contamination from other mononuclear phagocyte subsets.

      In Figure 7, a reduction of GFP-mpeg+ cells can be seen in batf3 mutants. Could the remaining cells be the (non-microglia) macrophages? Or in Figure 8, could the remaining P2ry12GFP-Lcp1+ cells in Irf8 mutants be macrophages? 

      Indeed, we believe it is likely that the remaining mpeg1<sup>+</sup> cells observed in batf3 mutants include non-microglial macrophages and/or B cells, as we and others previously showed that zebrafish B cells express mpeg1.1 transcripts and are labeled in the mpeg1.1 reporters (Ferrero et al., 2020). This interpretation is further supported by the observation that the reduction in mpeg1+ cells is more pronounced in brain sections than in flow cytometry samples, where non-parenchymal mpeg1+ cells, such as peripheral macrophages or B cells, are likely enriched. To explore this possibility, we attempted to assess the expression of MF- and B cell-specific markers in the remaining mpeg1+ population isolated from batf3 mutants. However, due to the very low numbers of cells recovered per animal, we were limited to analyzing only a few markers. Despite multiple attempts, qPCR analyses proved inconclusive, likely due to low transcript abundance. We thank you for your understanding of the technical limitations that currently prevent a more definitive characterization of these remaining cells.  

      Regarding the irf8 mutants (Figure 8), irf8 is a well-established master regulator of mononuclear phagocyte development. In mice, Irf8 deficiency results in developmental defects and functional impairments across multiple myeloid lineages, including microglia, which exhibit reduced density (Kierdorf et al., 2013) and an immature phenotype (Vanhove et al., 2019). Similarly, in zebrafish, irf8 mutants show abnormal macrophage development, with an accumulation of immature and apoptotic cells during embryonic and larval stages (Shiau et al., 2014). Based on these findings, it is plausible that the residual p2ry12:GFP<sup>-</sup> Lcp1<sup>+</sup> cells observed in the irf8 mutant brains represent immature or arrested mononuclear phagocytes, possibly including both microglia and DC-like cells. This is supported by their distinct morphology and specific localization along the ventricle borders. However, as previously noted, our current tools do not permit conclusive identification of these cells.

      Reviewer #2 (Recommendations For The Authors): 

      A few sentences are not easy to understand for a "non zebrafish specialist". 

      (1) Page 3 line 111 The sentence "Interestingly, analyses of brain cell suspensions from double transgenics showed p2ry12:GFP+ microglia accounted for half of cd45:DsRed+ cells (50.9 % ± 2.9; n=4) (Figure 1D,E). Considering that mpeg1:GFP+ cells comprised ~75% of all leukocytes, these results indicated that approximately 25% of brain mononuclear phagocytes do not express the microglial p2ry12:GFP+ transgene." is not clear. This point is significant and deserves a more detailed explanation. 

      We apologize for the lack of clarity in this section. The quantification presented in Figure 1 refers specifically to cd45:DsRed<sup>+</sup> leukocytes, meaning that the reported percentages of p2ry12:GFP<sup>+</sup> and mpeg1:GFP<sup>+</sup> cells are calculated relative to the total cd45+ population (defined as 100%). Specifically, we observed that approximately 51% of all cd45+ cells were p2ry12:GFP<sup>+</sup> microglia, while around 75% were mpeg1:GFP<sup>+</sup>. From these values, we infer that about 25% of mpeg1:GFP<sup>+</sup> leukocytes do not express the p2ry12:GFP transgene and therefore likely represent non-microglial mononuclear phagocytes. We agree that this distinction is important and have revised the text accordingly to clarify the interpretation for readers who may be less familiar with zebrafish transgenic lines or gating strategies. See page 3, lines 107-117.
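
      The arithmetic behind this inference can be sketched as follows. This is an illustrative calculation rather than part of the original analysis: the function name is hypothetical, the inputs are the percentages quoted above (50.9% and roughly 75% of cd45+ cells), and it assumes that every p2ry12:GFP+ microglial cell is also contained in the mpeg1:GFP+ fraction.

```python
def non_microglial_share(microglia_pct, mpeg1_pct):
    """Share of mpeg1+ cells lacking p2ry12:GFP, given two percentages of cd45+ cells.

    Assumes all p2ry12:GFP+ microglia fall within the mpeg1+ fraction.
    Returns (percentage of all cd45+ cells, percentage of mpeg1+ cells).
    """
    of_leukocytes = mpeg1_pct - microglia_pct
    of_mpeg1 = of_leukocytes / mpeg1_pct * 100
    return of_leukocytes, of_mpeg1


of_leukocytes, of_mpeg1 = non_microglial_share(50.9, 75.0)
print(f"{of_leukocytes:.1f}% of cd45+ cells are mpeg1+ but p2ry12:GFP-")
print(f"{of_mpeg1:.1f}% of mpeg1+ cells are p2ry12:GFP-")
```

      Under that assumption, the difference of about 24 percentage points is expressed relative to all cd45+ leukocytes. The same cells make up a somewhat larger share when expressed relative to the mpeg1:GFP+ fraction alone, so the denominator should be stated explicitly when quoting the figure.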

      (2) Line 522; Like human and mouse ILC2s, "these cells do not express the T cell receptor cd4-1" is confusing (T cell receptor should be reserved to the ag specific TCR). Also, was TCR isotypes expression analyzed (and how was genome annotation used in this case ?) 

      Thank you for this insightful comment. We agree that the term “T cell receptor” should be used specifically to refer to antigen-specific TCRs, and we have revised the discussion accordingly to avoid any confusion. Regarding your question on the analysis of TCR isotype expression and the use of genome annotation: due to technical limitations, we did not pursue TCR isotype-level analysis in this study. Instead, we relied on established markers such as cd4-1 and cd8a to distinguish T cell populations, acknowledging that cd4-1 is not expressed by ILC2-like cells in our dataset. We have clarified these points in the relevant sections of the manuscript (see lines 168 and 535).

      The analysis of single-cell data might be more detailed, with more explanation about possible doublet identification and normalization procedures. 

      Thank you for highlighting the need for additional clarity regarding our scRNA-seq analysis.

      As noted in the Seurat tutorial, “cell doublets or multiplets often exhibit abnormally high gene count” (https://satijalab.org/seurat/archive/v3.0/pbmc3k_tutorial). To evaluate this, we performed a dedicated doublet detection analysis using the scDblFinder R package (https://rdrr.io/bioc/scDblFinder/f/vignettes/2_scDblFinder.Rmd). Our results indicated that the proportion of predicted doublets is low (see Figure below), and when present, these doublets are distributed among the different clusters. This contrasts with the typical clustering of doublets into discrete groups and indicates that our single-cell sequencing workflow was sufficiently robust to predominantly capture singlets.
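
      The heuristic quoted from the Seurat tutorial, that multiplets tend to show an unusually high number of detected genes, can be illustrated with a small Python sketch. This is neither scDblFinder's method nor the authors' R workflow; the toy counts, the function name, and the "median + 3 MAD" cutoff are all assumptions chosen only for illustration.

```python
from statistics import median


def flag_high_gene_count(genes_per_cell, n_mads=3.0):
    """Flag cells whose detected-gene count exceeds median + n_mads * MAD."""
    med = median(genes_per_cell)
    # Median absolute deviation: a spread estimate that is robust to outliers.
    mad = median(abs(g - med) for g in genes_per_cell)
    cutoff = med + n_mads * mad
    return [g > cutoff for g in genes_per_cell]


# Toy data (hypothetical): genes detected in eight cells, one clear outlier.
genes_per_cell = [1200, 1350, 1100, 1280, 1420, 1310, 3900, 1250]
flags = flag_high_gene_count(genes_per_cell)
candidates = [i for i, flagged in enumerate(flags) if flagged]
print(candidates)  # indices of cells that would be inspected as possible doublets
```

      A count-based cutoff like this only flags candidates for inspection; dedicated tools such as the package cited above are designed for the actual classification.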

      Regarding normalization, we have clarified this in the manuscript. Briefly, single-cell data were normalized using Seurat’s SCTransform method with the following custom parameters: “variable.features.n=4000 and return.only.var.genes=F”. These settings are now clearly described to ensure reproducibility.

      Author response image 3.

      Reviewer #3 (Recommendations For The Authors):

      Major issues

      Though batf3 mutants were generated, the manuscript will greatly benefit from in situ labeling by RNAscope or the generation of transgenic reporters to conclusively localize this dendritic cell population and address any potential contamination issues. 

      We thank you for this constructive suggestion. We agree that in situ labeling approaches such as RNAscope would offer valuable complementary insights. In our current study, however, we already provide detailed spatial information on DC-like cells, and we demonstrated their lineage identity through the use of our newly generated batf3 mutant line. 

      To address concerns regarding potential contamination, we have carefully analyzed more than two dozen adult brains to date and consistently observed abundant DC-like cells within the brain parenchyma, exhibiting a reproducible and specific spatial distribution, as described in the manuscript. This consistent localization across multiple samples strongly supports the genuine presence of these cells in the brain rather than artifactual contamination.

      While the generation of DC-like specific transgenic lines is indeed a promising direction (and such efforts are currently underway in our group) we note that creating and validating these lines is time-consuming and falls beyond the scope of the present study. Importantly, although these additional tools will be valuable for future functional investigations, we believe they would not impact the main conclusions or core message of our current work. 

      The morphological characterization of CD45:DsRed+ macrophages stained with May-Grunwald-Giemsa has been previously reported in the paper, "Characterization of the mononuclear phagocyte system in the zebrafish" Wittamer et al., 2011. "Morphologic analyses revealed that the majority of cells exhibited the characteristics of monocytes/macrophages namely low nuclear to cytoplasm ratios and a high number of cytoplasmic vacuoles (Figure 3B)." 

      We thank you for pointing out the reference to Wittamer et al., 2011. In that study, we indeed provided the first morphological characterization of mononuclear phagocytes (MNPs) in various adult zebrafish organs using the cd45:DsRed line in combination with the mhc2dab:GFP reporter. The focus was primarily on MNPs across peripheral tissues. In the current study, our aim is broader: we investigate the full diversity of brain immune cells, using cd45 as a general marker for leukocytes. As part of this comprehensive characterization, we applied MGG staining, a widely accepted cytological technique, to gain morphological insight into the sorted CD45:DsRed+ population. This method remains a valuable and rapid approach to visually assess cell type heterogeneity, especially when evaluating samples where multiple immune cell lineages may be present. 

      While there is some overlap with the methodology used in Wittamer et al., the context, scope, and tissue examined differ substantially. Thus, the inclusion of MGG staining in this study serves to complement our broader transcriptomic analyses by providing supporting morphological evidence specific to brain-resident immune cells.

      We have now clarified this distinction in the revised manuscript to better differentiate the current work from our previous findings (see line 85).

      Figure 5 data should be quantified.

      Please refer to our response in the major comments section, where we address this question in detail.

      Figure 7- Figure Supplement 1. J, K has no CD45:DsRed positive cells in batf3 mutants, which is counterintuitive because CD45:DsRed should capture all hematopoietic cells and is not specific to dendritic cells. 

      It is correct that cd45 is a general leukocyte marker, labeling all immune cells, including dendritic cells. In this Figure, we used the Tg(cd45:DsRed) transgenic line to visualize the phenotype because it offers an alternative to IHC, with the advantage of strong endogenous fluorescence and easier screening of vibratome sections. However, this technique has limitations: due to fixation, only cells with high fluorescence (e.g. cd45<sup>high</sup> dendritic cells) are captured, while those with medium/low expression (e.g. cd45<sup>low</sup> microglia) are often not visible. This explains why fewer cells are observed in both wild-type and batf3 mutant brains (Figure 5K-N, Figure 7 – supplement 1 J, K). While this approach is quicker and allows for thicker sections, IHC remains the preferred method for the rest of the analyses, including the use of additional markers to identify all relevant cell populations. 

      Thank you for bringing this point of confusion to our attention. To improve clarity, we have amended the text in the relevant sections (see lines 704-706, and legend of Figure 7 Supplement 1).

      Minor issues: 

      The terms in the title, "A single-cell transcriptomic atlas..." are used. What is meant by "atlas"? A searchable database or website is not provided.

      Please refer to our response in the major comments section, where we explain that we have made our dataset accessible through a searchable web interface (https://scrna-analysiszebrafish.shinyapps.io/scatlas/) which is now referenced in the Data Availability Statement.

      This reviewer considers that it is offensive to use terminology such as "poorly characterized" in reference to others' work. 

      Thank you for pointing this out. We understand the concern and have revised the wording to ensure it remains respectful and neutral when referring to previous work. The changes are reflected in lines 20 and 49.

      The introduction of this manuscript should consider restructuring and editing. Example: Lines 51-57 introduce the importance of immune cells in zebrafish regeneration studies. However, this study does not investigate such processes. Additionally, the authors focus on the concept of immune heterogeneity in the brain throughout the text; however, these studies have been conducted previously by others (Silva et al., 2021) at single-cell level.

      The novelty of this manuscript is the identification of "dendritic-like cells" and yet the introduction and text are limited to lines 68-71. The introduction would benefit by introducing this cell type "dendritic-like cells" and differences between vertebrates. 

      Thank you for these valuable comments. In response, we have revised the introduction to better align with the focus of the study (see edited text on page 2). We now emphasize that, while macrophages have been extensively studied in zebrafish, dendritic cells remain much less well characterized in this model. Also, while we acknowledge that Silva et al. addressed aspects of immune heterogeneity in the zebrafish brain, their study primarily focused on mononuclear phagocytes. In contrast, our work provides a broader and more detailed characterization of the brain immune landscape, integrating transcriptomic data with multiple fluorescent reporter lines and hematopoietic mutants to strengthen cell identity assignments. Importantly, we note that Silva et al. classified DC-like cells within the microglial compartment, whereas our findings support that these cells represent a distinct population. While our data challenge this specific aspect of their conclusions, we believe both studies offer complementary insights that collectively advance our understanding of zebrafish brain immunity. 

      Though Figure 6 is a great confirmation of scRNA sequencing, it seems redundant and should be supplemental data.

      We respectfully disagree with the reviewer’s suggestion. We believe that presenting the data in Figure 6 as the main figure enhances its visibility and impact, particularly highlighting the distinction between microglia and DC-like cells, an aspect we consider highly valuable information for the zebrafish research community. This is especially important given that our conclusions challenge two previous independent reports, further underscoring the relevance of these findings to the field.

    1. It’s not for everyone, but online dating does not carry the stigma it once did.

      Really interesting how taboo it used to be, especially how it was seen as a "risky last resort" at a point. Online dating is definitely more normalized now and I think it is a regular way to meet people. I still think that these apps are a bit of a lost way of meeting people though because you do not get the same first interaction you would if you met them organically. They really depend though and work out for many!

    2. It is worth noting that there are also niche fandoms that probably would not exist without the aid of digital networks.

      This, absolutely!! So many of the artists I listen to have smaller fandoms and to even find people in person who are in them is impossible. Unfortunately, social media is so important for reach and smaller creators.

    3. Yes, there are people who are “Instagram famous” or “YouTube” famous, but the biggest stars in our cultural world still have many ties to legacy media.

      I agree, a lot of these Instagram or YouTube famous people do not last either. Or try to, but have a hard time staying relevant.

    4. rules.

      Interesting example. With the advances of social media, there has been a further experience that has been offered among fans in recent years. As someone who has been a part of multiple fan clubs heavily in the past, I can vouch for how the internet connected me with people that had the same common interest as me that I never would have met otherwise. I met many people from miles away in groups and was able to watch the fan-star interaction that would not have been the same without online platforms.

    5. The only constant in digital culture is change, which may sound cliché, but the underlying ICT structures shift so often that it can be difficult for cultural trends to take hold.

      This goes along with how fast trends are coming and going on social media. Not only trends like audios and such, but the digital world pushing trends within things like clothing and decor is what comes to mind for me. Trends that are not lasting long and are what are leading to consumerism currently.

    1. eLife Assessment

      This work provides high-precision single-cell data on the relationship between DnaA activity and cell size, offering important insights for the field of cell cycle control. These findings motivate a novel and intriguing hypothesis for DNA replication initiation, the "extrusion model", in which DNA-binding proteins modulate free DnaA availability in response to biomass-DNA imbalance. While the current indirect evidence does not fully establish the model, an experimental perturbation involving H-NS offers convincing support for its plausibility, laying the groundwork for future investigation.

    2. Reviewer #1 (Public review):

      Summary:

      The study by Li and coworkers addresses the important and fundamental question of replication initiation in Escherichia coli, which remains open despite many classic and recent works. It leverages single-cell mRNA-FISH experiments in strains with titratable DnaA and novel DnaA activity reporters to monitor DnaA activity peaks versus size. The authors find oscillations in DnaA activity and show that their peaks correlate well with the estimated population-average replication initiation volume across conditions and imposed dnaA transcription levels. The study also proposes a novel and interesting extrusion model where DNA-binding proteins regulate free DnaA availability in response to biomass-DNA imbalance. Experimental perturbations of H-NS support the model validity, addressing key gaps in current replication control frameworks.

      Strengths:

      I find the study interesting and well conducted, and I think its main strong points are (i) the novel reporters obtained with systematic synthetic biology methods, and combined with a titratable dnaA strain, (ii) the interesting perturbations (titration, production arrest and H-NS) and (iii) the use of single-cell mRNA FISH to monitor transcripts directly. The proposed extrusion model is also interesting, though not fully validated, and I think it will contribute positively to the future debate.

      Weaknesses and Limitations

      A relevant limitation in novelty is that DnaA activity and concentration oscillations have been reported by the cited Iuliani and coworkers previously by dynamic microscopy, and to a smaller extent by the other cited study by Pountain and coworkers using mRNA FISH.

      An important limitation is that the study is not dynamic. While monitoring mRNA is interesting and relevant, the current study is based on concentrations and not time variations (or nascent mRNA). Conversely, the study by Iuliani and coworkers, while having the drawback of monitoring proteins, can directly access production rates. It would be interesting for future studies to monitor the strains and reporters dynamically, as well as using (as a control) the technique of this study on the chromosomal reporters used by Iuliani et al.

      While the implemented code is made available and the parameter values are given in the text, important details are missing regarding the mathematical models (mathematical definitions, clear discussions of ingredients and main assumptions, and choices made in the deployment of such models, which are presented briefly in the Methods section). The reader is not given sufficient tools to understand the predictions of different models and no analytical estimates are used and the falsification procedures are not clear. More transparency and depth in the analysis would be needed to use the models as more than a heuristic tool for qualitative arguments. The Berger model for example has many parameters and many regimes and behaviors. When models are compared to data (e.g. in fig. 2G) it is not clear how parameters were fixed, and whether and how the model prediction depends on adjustable parameters.

      Importantly, the statement about tight correlations of peak volumes and average estimated initiation volume does not establish coincidence. Crucially, the data rely on average initiation volumes, and the estimate procedure relies on assumptions that could lead to systematic biases and uncertainties added to the population variability (in any case error bars are not provided).

      The delays observed by the authors (in both directions) between the peaks of DnaA-activity conditional averages with respect to volume and the average estimated initiation volumes are not incompatible with those observed dynamically by Iuliani and coworkers. The direct experiment to prove the authors' point would be to use a direct proxy of replication initiation such as SeqA or DnaN and monitor initiations and quantify DnaA activity peaks jointly, with dynamic measurements.

      While not being an expert, I wondered whether the fact that the reporters are on a plasmid (despite a normalization control that seems very sensible) might affect the measurements. The approach is different from the aforementioned previous study, which used a chromosomal reporter placed symmetrically, at the same distance from the origin of replication as the original dnaA promoter.

      Overall Appraisal:

      In summary, this appears to me as a very interesting study providing valuable high-precision data and a novel testable hypothesis, the extrusion model, supported by relevant perturbation experiments and open to future explorations.

      Comments on revisions:

      I am happy with the replies and the revisions.

      The main outstanding point remains that reconstructing the mathematical model details from the text (and having to rely on the code) is not optimal for a reader. However, I do understand that the authors intend to use the models as a heuristic tool only and possibly plan a theoretical study where they explore the models more systematically.

    3. Reviewer #2 (Public review):

      Summary:

      The authors show that in E. coli the initiator protein DnaA oscillates post-translationally: its activity rises and peaks exactly when DNA replication begins, even if dnaA transcription is held constant. To explain this, they propose an "extrusion" mechanism in which nucleoid-associated proteins such as H-NS, whose amount grows with cell volume, dislodge DnaA from chromosomal binding sites; modelling and H-NS perturbations reproduce the observed drop in initiation mass and extra initiations seen after dnaA shut-down. Together, the data and model link biomass growth to replication timing through chromosome-driven, post-translational control of DnaA, filling gaps left by classic titration and ATP/ADP-switch models.

      Strengths:

      (1) Introduces an "extrusion" model that adds a new post-translational layer to replication control and explains data unexplained by classic titration or ATP/ADP-switch frameworks.

      (2) A major asset of the study is that it bridges the longstanding gap between DnaA oscillations and DNA-replication initiation, providing direct single-cell evidence that pulses of DnaA activity peak exactly at the moment of initiation across multiple growth conditions and genetic perturbations.

      (3) A tunable dnaA strain and targeted H-NS manipulations shift initiation mass exactly as the model predicts, giving model-driven validation across growth conditions.

      (4) A purpose-built Psyn66 reporter combined with mRNA-FISH captures DnaA-activity pulses with cell-cycle resolution, providing direct, compelling data.

      Weaknesses:

      (1) What happens to the (C+D) period and initiation time as the dnaA mRNA level changes? This is not discussed in the text or figure and should be addressed.

      (2) It is unclear what is meant by "relative dnaA mRNA level." Relative to what? Wild-type expression? Maximum expression? This should be explicitly defined.

      (3) It would be helpful to provide some intuition for why an increase in dnaA mRNA level leads to a decrease in initiation mass per ori and an increase in oriC copy number.

      (4) The titration and switch models do not explicitly include dnaA mRNA in the dynamics of DnaA protein. Yet, in Figure 2G, initiation mass is shown to decrease linearly with dnaA mRNA level in these models. How was dnaA mRNA level represented or approximated in these simulations?

      (5) Is Schaechter's law (i.e., exponential scaling of average cell size with growth rate) still valid under the different dnaA mRNA expression conditions tested?

      (6) The manuscript should explain more explicitly how the extrusion model implements post-translational control of DnaA and, in particular, how this yields the nonlinear drop in relative initiation mass versus dnaA mRNA seen in Fig. 6E. Please provide the governing equation that links total DnaA, the volume-dependent "extruder" pool, and the threshold of free DnaA at initiation, and show, briefly but quantitatively, how this equation produces the observed concave curve.

      (7) Does the extrusion model give the well-known adder per origin, i.e., is initiation to initiation an adder?

      (8) DnaA protein or activity is never measured; mRNA is treated as a linear proxy. Yet the authors' own narrative stresses post-translational (not transcriptional) control of DnaA. Without parallel immunoblots or activity readouts, it is impossible to know whether a six-fold mRNA increase truly yields a proportional rise in active DnaA.

      (9) Figure 2 infers both initiation mass and oriC copy number from bulk measurements (OD₆₀₀ per cell and rifampicin-cephalexin run-out) instead of measuring them directly in single cells. Any DnaA-dependent changes in cell size, shape, or antibiotic permeability could skew these bulk proxies, so the plotted relationships may not accurately reflect true initiation events.

      Comments on revisions:

      The authors have addressed all of my previous concerns, questions, and suggestions sufficiently.

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The study by Li and coworkers addresses the important and fundamental question of replication initiation in Escherichia coli, which remains open, despite many classic and recent works. It leverages single-cell mRNA-FISH experiments in strains with titratable DnaA and novel DnaA activity reporters to monitor DnaA activity peaks versus size. The authors find oscillations in DnaA activity and show that their peaks correlate well with the estimated population-average replication initiation volume across conditions and imposed dnaA transcription levels. The study also proposes a novel extrusion model where DNA-binding proteins regulate free DnaA availability in response to biomass-DNA imbalance. Experimental perturbations of H-NS support the model validity, addressing key gaps in current replication control frameworks.

      Strengths:

      I find the study interesting and well conducted, and I think its main strong points are:

      (1) the novel reporters obtained with systematic synthetic biology methods, and combined with a titratable dnaA strain.

      (2) the interesting perturbations (titration, production arrest, and H-NS).

      (3) the use of single-cell mRNA FISH to monitor transcripts directly.

      The proposed extrusion model is also interesting, though not fully validated, and I think it will contribute positively to the future debate.

      We thank the reviewer for acknowledging the strengths of our study.

      Weaknesses and Limitations:

      (1) A relevant limitation in novelty is that DnaA activity and concentration oscillations have been reported by the cited Iuliani and coworkers previously by dynamic microscopy, and to a smaller extent by the other cited study by Pountain and coworkers using mRNA FISH.

      (2) An important limitation is that the study is not dynamic. While monitoring mRNA is interesting and relevant, the current study is based on concentrations and not time variations (or nascent mRNA). Conversely, the study by Iuliani and coworkers, while having the drawback of monitoring proteins, can directly assess production rates. It would be interesting for future studies or revisions to monitor the strains and reporters dynamically, as well as using (as a control) the technique of this study on the chromosomal reporters used by Iuliani et al.

      We acknowledge the value of dynamic measurements and clarify our methodological rationale.

      While Iuliani et al. provided valuable temporal resolution through protein dynamics, our mRNA FISH approach achieves direct decoupling of transcriptional vs. post-translational regulation (Fig 4F-H), and condition flexibility across 7 growth rates (30-66 min doubling times). This trade-off sacrifices temporal resolution for enhanced population-scale resolution and perturbation flexibility. To directly address temporal coupling, future work will implement dual-color live imaging of DnaA activity concurrent with replication initiation events.

      (3) Regarding the mathematical models, a lot of details are missing regarding the definitions and the use of such models, which are only presented briefly in the Methods section. The reader is not given any tools to understand the predictions of different models, and no analytical estimates are used. The falsification procedures are not clear. More transparency and depth in the analysis are needed, unless the models are just used as a heuristic tool for qualitative arguments (but this would weaken the claims). The Berger model, for example, has many parameters and many regimes and behaviors. When models are compared to data (e.g., in Figure 2G), it is not clear which parameters were used, how they were fixed, and whether and how the model prediction depends on parameters.

      We agree that model transparency is essential for quantitative validation. To address this, all model parameters (DnaA synthesis rate, activation/deactivation rates etc.) are explicitly tabulated in Supplementary Information Table S6. For the titration (Hansen et al. 1991) and extrusion models, we derive analytical expressions for initiation mass (IM) sensitivity to DnaA expression in Supplementary Note 1. For Figure 2G/S6, we used published parameters (Berger & Wolde 2022 SI Table 2) with the experimental growth conditions (μ = 1.54 h<sup>-1</sup>).

      The extrusion model's validation relies primarily on its ability to resolve paradoxical initiation events under dnaA shutdown (Fig 6C), a test where other models fail categorically. While the Berger titration-switch hybrid can fit steady-state IM trends (Fig S6A), it cannot reproduce post-shutdown dynamics without ad hoc modifications (Fig S6B). We acknowledge that comprehensive analysis of all model regimes exceeds this study's scope but provide full simulation code for independent verification: https://github.com/BaiYangBqdq/dynamics_of_biomass_DNA_coordination

      (4) Importantly, the main statement about tight correlations of peak volumes and average estimated initiation volume does not establish coincidence, and some of the claims by the authors are unclear in these respects (e.g., when they say "we resolve a 1:1 coupling between DnaA activity thresholds and replication initiation", the statement could be correct but is ambiguous). Crucially, the data rely on average initiation volumes (on which there seems to be an eternally open debate, also involving the authors), and the estimate procedure relies on assumptions that could lead to biases and uncertainties added to the population variability (in any case, error bars are not provided).

      We acknowledge the limitations of population-level inference and have refined our claims: “Replication initiation volume scales proportionally with peak DnaA activity volume with a slope of 1.0 (R<sup>2</sup>=0.98, Fig 7G), indicating predictive correspondence rather than absolute coincidence. While population-level 𝑉<sub>𝑖</sub> estimation cannot resolve single-cell stochasticity, the consistent 𝑉*: 𝑉<sub>𝑖</sub> relationship across 20 conditions suggests DnaA activity thresholds predict initiation timing within physiological error margins”. Future work will simultaneously monitor DnaA activity and replication forks using microfluidic single-cell tracking.

      (5) The delays observed by the authors (in both directions) between the peaks of DnaA-activity conditional averages with respect to volume and the average estimated initiation volumes are not incompatible with those observed dynamically by Iuliani and coworkers. The direct experiment to prove the authors' point would be to use a direct proxy of replication initiation, such as SeqA or DnaN, and monitor initiations and quantify DnaA activity peaks jointly, with dynamic measurements.

      We acknowledge the observed temporal deviations between DnaA activity peaks (𝑉*) and population-derived volumes at initiation (𝑉<sub>𝑖</sub>) in certain conditions, in line with the findings of Iuliani et al. These deviations might be mechanistically consistent with the time required for orisome assembly or oriC sequestration. They do not contradict our core finding that initiation occurs at a defined DnaA activity threshold (slope=1.0, R<sup>2</sup>=0.98 in 𝑉*: 𝑉<sub>𝑖</sub> correlation).
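
      For readers less familiar with how a slope and R<sup>2</sup> of this kind are obtained, the following Python sketch fits an ordinary least-squares line and reports the coefficient of determination. The numbers are hypothetical placeholders in arbitrary units, not data from the study, and the function and variable names are assumptions introduced here; the authors' actual fitting procedure is not described in this response.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept; also returns R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical volumes (arbitrary units), NOT measurements from the study.
v_star = [0.8, 1.0, 1.3, 1.7, 2.1]       # volume at peak DnaA activity
v_init = [0.85, 0.95, 1.35, 1.65, 2.15]  # estimated initiation volume
slope, intercept, r2 = fit_line(v_star, v_init)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.3f}")
```

      A slope near 1 with a high R<sup>2</sup> says that the two volumes change together across conditions; whether they coincide also depends on the intercept, which is why the sketch reports it alongside the slope.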

      (6) While not being an expert, I had some doubt that the fact that the reporters are on plasmid (despite a normalization control that seems very sensible) might affect the measurements. Also, I did not understand how the authors validated the assumptions that the reporters are sensitive to DnaA-ATP specifically. It seems this assumption is validated by previous studies only.

      We employed a plasmid-based reporter system to circumvent the significant confounding effects of chromosomal position on promoter activity, as extensively documented by Pountain et al., where local genomic context (e.g., nucleoid occlusion, supercoiling gradients, and neighboring operons) introduces uncontrolled variability. By housing the P<sub>syn66</sub> test promoter and P<sub>con</sub> normalization control in identical low-copy pSC101 vectors (<8 copies/cell, Peterson & Phillips, Plasmid 2008), we ensured they experience equivalent physical and biochemical environments. This ratiometric design, where DnaA activity is calculated, actively corrects for global fluctuations in RNA polymerase availability, nucleotide pools, and plasmid copy number. Critically, P<sub>syn66</sub>’s architecture emulates natural DnaA-responsive elements: its strong DnaA boxes report free DnaA concentration, while its weak box is preferentially bound by DnaA-ATP (Speck et al., EMBO journal 1999), mirroring the nucleotide-state sensitivity of oriC and the native dnaA promoter. This system was indispensable for our central finding, as it uniquely enabled the decoupling of DnaA activity oscillations from transcriptional feedback (Fig. 4F-H), an experiment fundamentally impossible with chromosomally integrated reporters due to autoregulatory interference.
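
      Why a ratio against a co-housed constitutive control cancels shared fluctuations can be shown with a toy calculation. The multiplicative signal model, the function names, and all numbers below are assumptions made for illustration only; this response does not state the exact formula used to compute DnaA activity.

```python
def reporter_signal(promoter_activity, copy_number, global_factor):
    """Toy model: mRNA signal = promoter activity x plasmid copies x global factor."""
    return promoter_activity * copy_number * global_factor


def ratiometric_activity(test_signal, control_signal):
    """Normalize the test promoter signal by the constitutive control signal."""
    return test_signal / control_signal


# Two hypothetical cells with the same DnaA-dependent promoter activity (0.6)
# but different plasmid copy numbers and global expression capacity.
cells = [
    {"copy_number": 4, "global_factor": 1.0},
    {"copy_number": 7, "global_factor": 1.5},
]
ratios = []
for cell in cells:
    test = reporter_signal(0.6, **cell)     # stands in for the P_syn66 reporter
    control = reporter_signal(1.0, **cell)  # stands in for the P_con control
    ratios.append(ratiometric_activity(test, control))
print([round(r, 6) for r in ratios])
```

      In this toy model the raw test signals differ between the two cells, but the ratios agree because copy number and the global factor multiply both promoters equally. Any factor that affected the two promoters differently would not cancel.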

      Overall Appraisal:

      In summary, this appears as a very interesting study, providing valuable data and a novel hypothesis, the extrusion model, open to future explorations. However, given several limitations, some of the claims appear overstated. Finally, the text contains some self-evaluations, such as "our findings redefine the paradigm for replication control", etc., that appear exaggerated.

      We thank the reviewer for highlighting the need for precise language in framing our conclusions. We have implemented the following substantive revisions throughout the manuscript to ensure claims align strictly with empirical evidence:

      (1) Changed "redefine the paradigm for replication control" into "advance the paradigm for replication control" (Introduction)

      (2) Changed "redefine bacterial cell cycle control" into "refine bacterial cell cycle control as a dynamic interplay..." (Discussion)

      (3) Removed the term "spatial" from the Discussion's description of DnaA-chromosome interactions (Discussion, first paragraph).

      (4) Changed "provides a blueprint" into "provides a valuable tool for dissecting spatial regulation..." (Discussion, final paragraph)

      (5) Scrutinized all superlatives (e.g., "critical feat" into "important capability"; "fundamental principle of cellular organization" into "potential organizational strategy")

      (6) Replaced the instances of "robust" with evidence-backed descriptors (e.g., "sensitive," "consistent")

      (7) We agree that the extrusion model requires further validation and have emphasized this in Discussion: "While H-NS perturbation supports extrusion mechanism, future work should identify the full extruder interactome and elucidate how metabolic signals modulate their activity" (final paragraph)

      This calibrated language more accurately represents our study as a conceptual advance with testable mechanisms, not a complete paradigm shift.

      Reviewer #2 (Public review):

      Summary:

      The authors show that in E. coli, the initiator protein DnaA oscillates post-translationally: its activity rises and peaks exactly when DNA replication begins, even if dnaA transcription is held constant. To explain this, they propose an "extrusion" mechanism in which nucleoid-associated proteins such as H-NS, whose amount grows with cell volume, dislodge DnaA from chromosomal binding sites; modelling and H-NS perturbations reproduce the observed drop in initiation mass and extra initiations seen after dnaA shut-down. Together, the data and model link biomass growth to replication timing through chromosome-driven, post-translational control of DnaA, filling gaps left by classic titration and ATP/ADP-switch models.

      Strengths:

      (1) Introduces an "extrusion" model that adds a new post-translational layer to replication control and explains data unexplained by classic titration or ATP/ADP-switch frameworks.

      (2) A major asset of the study is that it bridges the longstanding gap between DnaA oscillations and DNA-replication initiation, providing direct single-cell evidence that pulses of DnaA activity peak exactly at the moment of initiation across multiple growth conditions and genetic perturbations.

      (3) A tunable dnaA strain and targeted H-NS manipulations shift initiation mass exactly as the model predicts, giving model-driven validation across growth conditions.

      (4) A purpose-built Psyn66 reporter combined with mRNA-FISH captures DnaA-activity pulses with cell-cycle resolution, providing direct, compelling data.

      We thank the reviewer for acknowledging the strengths of our study.

      Weaknesses:

      (1) What happens to the (C+D) period and initiation time as the dnaA mRNA level changes? This is not discussed in the text or figure and should be addressed.

      We thank the reviewer for this important observation. Our data demonstrate that increased dnaA mRNA levels induce two compensatory changes in cell cycle progression:

      (1) Earlier replication initiation, manifested as a reduced initiation mass: the initiation mass decreased from 5.6 to 2.6 (OD<sub>600</sub>·ml per 10<sup>10</sup> cells) as the relative dnaA mRNA level increased from 0.2 to 7.2 (normalized to the wild-type level) (Fig. 2F, red).

      (2) Prolonged C+D period: Increased by approximately 60% (from 1.05 to 1.66 hours, Fig. 2F blue).

      The complete quantitative relationship is now explicitly described in the Results section: “Concurrently, the initiation mass was reduced by 50%, and the period from initiation to division (C+D) was increased by ~60% (Fig. 2F)”
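As a quick arithmetic check on the percentages quoted above, the following minimal sketch recomputes the relative changes from the endpoint values given in the response (5.6 to 2.6 for initiation mass, 1.05 to 1.66 hours for C+D). The helper name `percent_change` is an illustrative assumption, not something taken from the manuscript.

```python
def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0

# Endpoint values quoted in the response (Fig. 2F).
mass_change = percent_change(5.6, 2.6)    # initiation mass, OD600·ml per 10^10 cells
cd_change = percent_change(1.05, 1.66)    # C+D period, hours

print(f"initiation mass: {mass_change:.1f}%")
print(f"C+D period: {cd_change:.1f}%")
```

Under these endpoint values the reduction in initiation mass comes out slightly above one half and the C+D increase slightly below 60%, in line with the rounded figures in the quoted Results sentence.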

      (2) It is unclear what is meant by "relative dnaA mRNA level." Relative to what? Wild-type expression? Maximum expression? This should be explicitly defined.

      The relative dnaA mRNA level was obtained by normalizing to that in wild-type MG1655 cells grown in the same medium. To clarify this point, we have now marked the wild-type level in Fig. 1B, and a clear description of this has also been included in the figure caption.

      (3) It would be helpful to provide some intuition for why an increase in dnaA mRNA level leads to a decrease in initiation mass per ori and an increase in oriC copy number.

      Thank you for your valuable suggestion. Increased dnaA mRNA accelerates DnaA accumulation, causing cells to reach the initiation threshold at a smaller cell size (reducing initiation mass, Fig. 2F red). This earlier initiation increases oriC copies per cell at the population level (Fig. 2E). This mechanistic interpretation now appears in the Results: “As the DnaA expression level increases, DnaA activity reaches the initiation threshold earlier. Given that cell mass remained nearly unchanged, this earlier initiation led to an increase in population-averaged cellular oriC numbers (Fig. 2E).”

      (4) The titration and switch models do not explicitly include dnaA mRNA in the dynamics of DnaA protein. Yet, in Figure 2G, initiation mass is shown to decrease linearly with dnaA mRNA level in these models. How was dnaA mRNA level represented or approximated in these simulations?

      All models presented in this article omit explicit modeling of dnaA mRNA dynamics for simplicity. However, at steady state, the relative level of dnaA mRNA can be approximated by the relative expression rate of DnaA protein, as both reflect the expression level of DnaA. This detail is now clarified in the caption of Figure 2G.

      (5) Is Schaechter's law (i.e., exponential scaling of average cell size with growth rate) still valid under the different dnaA mRNA expression conditions tested?

      Schaechter's law describes the exponential scaling of average cell size with growth rate in bacteria. In our prior work (Zheng et al., Nature Microbiology 2020), we demonstrated that Schaechter's law fails in slow-growth regimes. However, in the current study, growth rate remained constant across different dnaA expression levels (Fig. 2C), and cell mass showed no significant change (Fig. 2D). Since Schaechter's law specifically addresses how cell size scales with growth rate, it does not apply here: growth rate was invariant in our perturbations, which selectively alter replication initiation dynamics, not growth rate or size scaling.

      (6) The manuscript should explain more explicitly how the extrusion model implements post-translational control of DnaA and, in particular, how this yields the nonlinear drop in relative initiation mass versus dnaA mRNA seen in Figure 6E. Please provide the governing equation that links total DnaA, the volume-dependent "extruder" pool, and the threshold of free DnaA at initiation, and show - briefly but quantitatively - how this equation produces the observed concave curve.

      The governing equations linking initiation mass and DnaA expression level are now provided in Supplementary Note S1 for both the titration and the extrusion model. In general, the dependence of initiation mass (𝑉<sub>𝐼</sub>) on dnaA expression level (𝛼<sub>𝐴</sub>) takes an inverse proportionality form: 𝑉<sub>𝐼</sub> ∝ 1/𝛼<sub>𝐴</sub>. In the extrusion model, the incorporated extruder protein is assumed to have similar synthesis dynamics as DnaA and can release DnaA from DnaA-box. After denoting the synthesis rate of the extruder as 𝛼<sub>𝐻</sub>, the combined effect of DnaA and the extruder on replication initiation can be briefly described as: 𝑉<sub>𝐼</sub> ∝ 1/(𝛼<sub>𝐴</sub> + 𝛼<sub>𝐻</sub>). Then the additive contribution of 𝛼<sub>𝐻</sub> dampens the sensitivity of initiation mass to changes in 𝛼<sub>𝐴</sub>, resulting in a significantly flattened curve. As a result, the predicted 𝑉<sub>𝐼</sub> − 𝛼<sub>𝐴</sub> relationship has a concave shape in the semi-log plots.
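To illustrate the damping argument numerically, here is a minimal sketch that compares a pure inverse-proportional dependence of initiation mass on the DnaA synthesis rate with one in which an extruder synthesis rate is added in the denominator. The functional forms, the function names, and the value `alpha_h=3.0` are assumptions chosen for illustration only; the authors' actual equations are in their Supplementary Note S1 and may differ.

```python
def initiation_mass_titration(alpha_a, k=1.0):
    """Assumed titration-like form: V_I proportional to 1 / alpha_A."""
    return k / alpha_a

def initiation_mass_extrusion(alpha_a, alpha_h, k=1.0):
    """Assumed extrusion-like form: V_I proportional to 1 / (alpha_A + alpha_H)."""
    return k / (alpha_a + alpha_h)

def relative_curve(model, levels, **kwargs):
    """Initiation mass at each expression level, normalized to the level 1.0."""
    reference = model(1.0, **kwargs)
    return [model(a, **kwargs) / reference for a in levels]

# Relative dnaA expression levels spanning the range quoted in the response.
levels = [0.2, 1.0, 7.2]
titration = relative_curve(initiation_mass_titration, levels)
extrusion = relative_curve(initiation_mass_extrusion, levels, alpha_h=3.0)

# Fold change in initiation mass across the whole expression range.
titration_fold = titration[0] / titration[-1]
extrusion_fold = extrusion[0] / extrusion[-1]
print(titration_fold, extrusion_fold)
```

With these illustrative numbers the pure inverse form changes far more steeply across the expression range than the additive form does, which is the flattening the response describes. No fit to the experimental data is implied.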

      (7) Does this extrusion model give the well-known adder per origin, i.e., is initiation to initiation an adder?

      Yes, the extrusion model can reproduce the initiation-to-initiation adder phenomenon; this information is provided in Fig. S3C.

      (8) DnaA protein or activity is never measured; mRNA is treated as a linear proxy. Yet the authors' own narrative stresses post-translational (not transcriptional) control of DnaA. Without parallel immunoblots or activity readouts, it is impossible to know whether a sixfold mRNA increase truly yields a proportional rise in active DnaA.

      We acknowledge the reviewer's valid concern regarding the indirect nature of our DnaA activity measurements. While mRNA levels alone cannot resolve active DnaA dynamics, our approach integrates functional replication outcomes with a validated synthetic reporter to infer activity. Crucially, elevated dnaA mRNA causes demonstrable biological effects: earlier replication initiation (Fig. 2F) and increased oriC copies (Fig. 2E), directly confirming enhanced functional DnaA activity at the oriC locus. The P<sub>syn66</sub> reporter, engineered with DnaA-boxes mirroring oriC's architecture, provides orthogonal validation, showing progressive repression in response to dnaA induction (Fig. 3C). Our operational metric, based on P<sub>syn66</sub>, responds sensitively to DnaA-chromosome interactions within its characterized 8-fold dynamic range (Fig. 3C). Immunoblots would be inadequate here, as they cannot distinguish functionally critical pools: free versus chromosome-bound DnaA, or DnaA-ATP versus DnaA-ADP, precisely the post-translational states our study implicates in regulation. We therefore prioritize functional readouts (initiation timing) and the P<sub>syn66</sub> reporter, which probes the biologically active fraction relevant to replication control.

      (9) Figure 2 infers both initiation mass and oriC copy number from bulk measurements (OD<sub>600</sub> per cell and rifampicin-cephalexin run-out) instead of measuring them directly in single cells. Any DnaA-dependent changes in cell size, shape, or antibiotic permeability could skew these bulk proxies, so the plotted relationships may not accurately reflect true initiation events.

      We acknowledge the reviewer's valid methodological concern and clarify that while bulk measurements carry inherent limitations, our approach is grounded in established techniques with demonstrated reliability. Cell mass was inferred from OD<sub>600</sub>/cell, which correlates strongly with direct dry weight measurements and microscopic cell volumes across diverse growth conditions, as validated in our prior work (Zheng et al., Nature Microbiology 2020). Crucially, cell mass remained invariant across dnaA expression levels (Fig. 2D).

      Regarding oriC quantification, the rifampicin-cephalexin run-out assay is widely applied in replication initiation studies. Our data show the expected 2<sup>n</sup> oriC distributions without abnormal ploidy (as shown below). While single-cell methods offer superior resolution, our bulk approach provides accurate population-level trends.

      Author response image 1.

      Recommendations for the authors:

      Reviewing Editor Comments:

      The reviewers felt that the mathematical modeling was not adequately explained in the paper, and that this affected the readability of the manuscript. The authors are encouraged to elaborate on this aspect of the paper (in addition to strengthening other claims, if possible, per the reviewers' comments).

      We thank the editor and reviewers for their constructive feedback. We have comprehensively strengthened the mathematical modeling framework to enhance clarity and rigor.

      Reviewer #1 (Recommendations for the authors):

      The only revision I would do is a recalibration of the claims and a major effort to clarify the modeling part (including a detailed SI appendix), without necessarily performing additional work.

      To enhance the transparency of the mathematical modeling, we have completed the model description in the Methods section and added a parameter table with literature-sourced values in Supplementary Information Table S6. Moreover, analytical derivations of the initiation mass dependencies are presented in Supplementary Information Note S1.

      Of course, there are extra experiments (mentioned in the public review) that would help support some of the big claims, but that can be considered a different project.

      Thank you for your suggestion. This will be addressed in our future work.

      Minor suggestion: please put signposts or plot jointly to compare the maxima/minima in Figures 4D, E, G, and H.

      We added dashed lines in Figures 4D and E to align DnaA activity peaks and transcriptional minima across panels, facilitating direct biological comparisons.

      Reviewer #2 (Recommendations for the authors):

      (1) Should define what DnaA activity is.

      We have explicitly defined DnaA activity in the Introduction as “the capacity to initiate replication…” and noted that it is “governed by free DnaA concentration, DnaA-ATP/-ADP ratio, and orisome assembly competence”.

      (2) Word repetition - “...grown in in Luria-Bertani (LB) medium...”.

      Corrected.

      (3) Typographical error - “FISH ... was preformed" should be "performed”.

      Corrected.

      (4) The manuscript alternates between “ng ml<sup>-1</sup>” and “ng·ml<sup>-1</sup>”; choose one style and apply it uniformly.

      Standardized the units to ng·ml<sup>-1</sup> throughout.

      (5) Reference duplicates - Some citations appear twice in the bibliography (e.g., "Bintu et al., 2005a/b" and "Bintu et al., 2005b" listed again later).

      The studies by Bintu et al. (2005a, 2005b) represent separate works: 2005a details applications, and 2005b develops models.

    1. eLife Assessment

      This manuscript provides an improved version of an important cancer risk estimation tool and refines and expands upon resources that are currently available to the cancer genetics community. The new program is validated in a set of clinical pedigrees demonstrating its practical accuracy and relevance to the field. Collectively, the data are compelling and support the major conclusions of this manuscript.

    2. Reviewer #1 (Public review):

      Summary:

      Although consanguinity is a rare clinical occurrence, it results in essentially a failure state for pedigree analysis algorithms by introducing loops that prevent accurate risk estimation. Therefore, Kubista et al. developed the graph-based "breakloops" function to allow their PanelPRO risk estimator (PMID 34406119) to successfully process consanguineous pedigrees.

      Strengths:

      This function allows them to first identify a loop in a pedigree, then decide which of two separate algorithms to best apply, Prim's or greedy, to optimize the introduction of clones to break these loops. As this function is automatic, it represents an improvement over previous similar algorithms, and also allows for the optimal algorithm to be chosen. The inclusion of pseudocode in the manuscript provides a succinct summary of the logic behind the above: it greatly enhances the understanding of the function for those not necessarily computationally inclined.

      After simulating a variety of consanguineous possibilities, the authors leveraged clinical pedigree data to validate their function. Integration of clinical pedigrees was extremely helpful in demonstrating the real-life applicability of this update. The successful inclusion of these clinical data justifies the claims they make regarding the ability to assess cancer risk in a wider range of family structures.

      Weaknesses:

      As consanguinity is inextricably linked with autosomal recessive disease, the discussion on the clinical implications of this new function is lacking.

    3. Reviewer #2 (Public review):

      Summary:

      This paper introduces a new function within the Fam3Pro package that addresses the problem of breaking loops in family structures. When a loop is present, standard genotype peeling algorithms fail, as they cannot update genotypes correctly. The solution is to break these loops, but until now, this could not be done automatically and optimally.

      The manuscript provides useful background on constructing graphs and trees from family data, detecting loops, and determining how to break them optimally for the case of no loops with multiple matings. For this situation, the algorithm switches between Prim's algorithm and a simple greedy approach and provides a solution. However, here, an optimal solution is not guaranteed.
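The general idea of using Prim's algorithm to find which links close a loop can be sketched as follows. This is a minimal illustration on a toy undirected graph, not the authors' breakloops implementation: the function names, the edge weights (read here as a hypothetical cost of cutting each link), and the toy data are all assumptions. The sketch grows a maximum-weight spanning tree, so the links left out of the tree are the cheapest ones whose removal makes the graph loop-free.

```python
import heapq

def prim_max_spanning_tree(nodes, edges):
    """Return the edges of a maximum-weight spanning forest.

    edges maps (u, v) pairs of an undirected graph to a numeric weight.
    """
    adjacency = {n: [] for n in nodes}
    for (u, v), w in edges.items():
        adjacency[u].append((w, v))
        adjacency[v].append((w, u))

    tree, visited = set(), set()
    for start in nodes:                      # also covers disconnected parts
        if start in visited:
            continue
        visited.add(start)
        # Negate weights so Python's min-heap pops the heaviest edge first.
        heap = [(-w, start, v) for w, v in adjacency[start]]
        heapq.heapify(heap)
        while heap:
            _, u, v = heapq.heappop(heap)
            if v in visited:
                continue
            visited.add(v)
            tree.add(frozenset((u, v)))
            for w, nxt in adjacency[v]:
                if nxt not in visited:
                    heapq.heappush(heap, (-w, v, nxt))
    return tree

def loop_closing_edges(nodes, edges):
    """Edges left out of the spanning forest; each one closes a loop."""
    tree = prim_max_spanning_tree(nodes, edges)
    return [e for e in edges if frozenset(e) not in tree]

# Toy graph with one loop (A-B-C-A) and a pendant node D.
nodes = ["A", "B", "C", "D"]
edges = {("A", "B"): 3, ("B", "C"): 2, ("A", "C"): 1, ("C", "D"): 5}
to_break = loop_closing_edges(nodes, edges)
print(to_break)
```

In this toy case the lowest-weight link of the single loop is the one reported. How a pedigree is mapped to such a graph, and how cutting a link translates into cloning an individual, is specific to the software under review and is not modeled here.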

      The theoretical foundations, such as the representation of families as graphs or trees and the identification of loops, are clearly explained and well-illustrated with example pedigrees. The practical utility of the new function is demonstrated by applying it to a dataset containing families with loops.

      This work has the potential for considerable impact, especially for medical researchers and individuals from families with loops. These families could previously not be analysed automatically and optimally. The new function changes that, enabling risk assessments and genetic calculations that were previously infeasible.

      Strengths:

      (1) The theoretical explanation of graphs, trees, and loop detection is clear and well-structured.

      (2) The idea of switching between algorithms is original and appears effective.

      (3) The function is well implemented, with minimal additional computational cost.

      Weaknesses:

      (1) In cases with multiple matings, the notion of a "close-to-optimal" solution is not clearly defined. It would be helpful to explain what this means: whether it refers to empirical performance, theoretical bounds, or something else.

      (2) In the example pedigree discussed, multiple options exist for breaking loops, but it is unclear which is optimal.

      (3) No example is provided where the optimal solution is demonstrably not reached.

      (4) It is also unclear whether the software provides a warning when the solution might not be optimal.

    4. Author response:

      Response to Reviewer #1:

      We plan to extend the discussion section to discuss the clinical implications of this new function. We will note the algorithm's applicability to broader genetic counseling contexts beyond cancer risk assessment.

      Response to Reviewer #2:

      We will clarify the four points raised:

      (1) "Close-to-optimal" definition: We will explain that in multiple-mating cases, finding the global optimum is NP-hard (equivalent to the Weighted Feedback Vertex Set problem). We will clarify that our greedy algorithm provides practically efficient solutions suitable for clinical use, though without theoretical optimality guarantees.

      (2) Example clarity: We will improve Figure 1's caption to explain the cost calculations and note that with equal weights, both shown solutions are equivalent.

      (3) Non-optimal examples: We will describe scenarios where the greedy algorithm may not achieve the global optimum, particularly in multiple-mating cases with heterogeneous weights.

      (4) Warning message: The current version does not provide a warning when the solution might be non-optimal. This may be added to the function in the future.

      We appreciate your feedback and suggestions to help improve the manuscript.

    1. As we can see, increasing the size of a model and the amount of training examples increases its quality, but also the energetic cost.

      Results match predictions.

    2. To that end, we proposed the use of a new evaluation based on Pareto optimality to give an equivalent importance to both model quality and their energy consumption. This places computational complexity and resources at the heart of the research process. It should be noted that our approach is generic and could be applied to any type of model or input data.

      Essentially, current AI audio generation models are inefficient and unsustainable; a new model needs to be developed.
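The Pareto-optimality idea in the quoted passage can be sketched in a few lines: a model stays on the Pareto front if no other model is at least as good on both quality and energy and strictly better on one of them. The model names and numbers below are made-up placeholders for illustration, not results from the paper, and `pareto_front` is a hypothetical helper name.

```python
def pareto_front(models):
    """Return names of models not dominated on (quality up, energy down).

    models is a list of (name, quality_score, energy_kwh) tuples.
    """
    front = []
    for name, quality, energy in models:
        dominated = any(
            (q2 >= quality and e2 <= energy) and (q2 > quality or e2 < energy)
            for _, q2, e2 in models
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical configurations: (name, quality score, training energy in kWh).
models = [
    ("small", 3.1, 10.0),
    ("medium", 3.8, 25.0),
    ("large", 4.0, 65.0),
    ("wasteful", 3.7, 40.0),
]
front = pareto_front(models)
print(front)
```

Here the "wasteful" configuration drops out because "medium" is both better and cheaper, while the other three each represent a different quality-versus-energy trade-off.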

    3. This is the first study on energy consumption for waveform generation and a primer attempt to include energy efficiency in the entire evaluation procedure.

      I didn't know information on the energy use of AI audio generators would be so hard to find.

    4. Mean Opinion Score (MOS). It is a subjective measure ranging from 1 to 5, based on a qualitative test where participants are asked to rate as 1 the lowest perceived quality and 5 the highest when comparing a set of results

      Is this a good measurement system?

    5. it consumes 64.8 kWh, which is slightly higher than our initial estimation (certainly due to CPU and DRAM energy draw). At this point, we believe that comparing models purely on the basis of these estimations is questionable, and argue that the real energy should have been recorded.

      Proves that there are inefficiencies with the way these models work.

    6. Approximated energy consumption for training several state-of-art generative audio models.

      For comparison, the average energy use of these five models is about the same amount as a fridge uses in one month.

    7. These include the hardware used to train the model, such as the type of GPU and total training time in hours. Surprisingly, we found out that only five of the studies properly specified both criteria.

      It is hard to find truly credible data on the topic of AI energy use, because none of these AI companies want to expose the truth.
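A rough energy estimate of the kind the quoted passage relies on (GPU type plus training hours) can be sketched as power multiplied by time. The formula below is a common back-of-the-envelope approximation, not the paper's exact procedure, and the wattage, GPU count, and duration are hypothetical placeholders. It ignores CPU, DRAM, and cooling, which is one reason a measured value can come out higher than such an estimate.

```python
def estimate_training_energy_kwh(gpu_power_watts, n_gpus, hours):
    """Approximate GPU energy use: rated power x number of GPUs x time."""
    return gpu_power_watts * n_gpus * hours / 1000.0

# Hypothetical setup: 8 GPUs rated at 250 W each, trained for 24 hours.
estimate = estimate_training_energy_kwh(gpu_power_watts=250, n_gpus=8, hours=24)
print(f"{estimate} kWh")
```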

    8. “Carbontracker”, which tracks and predicts energy consumption and carbon emissions for training deep learning models. This provides a more accurate estimation while being user-friendly.

      I wonder if this will be used in the near future to solve AI energy use problems.

    9. the amount of energy required to train a model (until convergence), and the amount of energy required by the model for inference (generating a sample in the case of audio synthesis)

      Convergence is the point in training at which the model's error stops improving in any meaningful way, so further training brings little benefit; inference is the use of the trained model to generate new outputs from inputs it has not seen before. Training until convergence happens first and is usually the larger one-time energy cost, while inference is the final product, and its energy cost is paid every time a sample is generated.

    10. However, the generated samples tend to be slightly blurry compared to recent adversarial networks, such as WaveGan [10] or GANSynth [11]. These show impressive reconstruction abilities but lack latent expressivity and are difficult to optimize due to unstable training dynamics.

      Direct correlation between the amount of training data and audio quality, less training data means lower quality audio.

    11. Furthermore, they also provide almost no direct control on the generative process.

      You can't manipulate what it gives you, there's already one set output per prompt.

    12. In red (left) those that refer to the quality of the generated samples, and in green (right) those that refer to their algorithmic complexity and performances.

      The graph compares quality to generation complexity.

    13. Then, we propose the use of a multi-objective Pareto optimality criterion to provide fair comparisons simultaneously on generation quality and energy efficiency. For that purpose, we focus on a recent model called WaveFlow [13], and measure the effective energy consumption of training and inferring new samples for five alternative configurations.

      The research question of this paper is "Can AI audio be generated with less energy without sacrificing quality?"

    14. green computing

      This is interesting. I'd like to do some more research on this, as it seems like it could be a solution to the extremely high energy use of AI.

    15. The training time needed for them to converge along with their complexity

      These audio-generating AI models also require a lot of training data, which uses a lot of energy.